Publications of Torsten Hoefler
Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Peer-Reviewed Conference or Journal Articles

SC14
[1] J. Domke, T. Hoefler, S. Matsuoka:
 Fail-in-Place Network Design: Interaction between Topology, Routing Algorithm and Failures presented in New Orleans, LA, USA, Nov. 2014, accepted at IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394)
SC14
[2] K. B. Ferreira, P. Widener, S. Levy, D. Arnold, T. Hoefler:
 Understanding the Effects of Communication and Coordination on Checkpointing at Scale presented in New Orleans, LA, USA, Nov. 2014, accepted at IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394)
SC14
[3] M. Besta, T. Hoefler:
 Slim Fly: A Cost Effective Low-Diameter Network Topology presented in New Orleans, LA, USA, Nov. 2014, accepted at IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394) SC14 Best Student Paper Finalist
Adv MPI
[4] W. Gropp, T. Hoefler, R. Thakur, E. Lusk:
 Using Advanced MPI: Modern Features of the Message-Passing Interface presented in Cambridge, MA, MIT Press, ISBN: 978-0262527637, Nov. 2014,
SuperFri
[5] T. Hoefler, D. Moor:
 Energy, Memory, and Runtime Tradeoffs for Implementing Collective Communication Operations Journal of Supercomputing Frontiers and Innovations. Vol 1, Nr. 2, pages 58--75, SuperFri Open Journal, Oct. 2014,
EuroMPI'14
[6] P. Widener, K. Ferreira, S. Levy, T. Hoefler:
 Exploring the effect of noise on the performance benefit of nonblocking allreduce In Proceedings of the 21st European MPI Users' Group Meeting, presented in Kyoto, Japan, pages 77:77--77:82, ACM, ISBN: 978-1-4503-2875-3, Sep. 2014,
PACT'14
[7] A. Bhattacharyya, T. Hoefler:
 PEMOGEN: Automatic Adaptive Performance Modeling during Program Runtime Accepted at the 23rd International Conference on Parallel Architectures and Compilation Techniques (PACT'14), presented in Edmonton, Alberta, Canada, ACM, Aug. 2014,
HPDC'14
[8] B. Prisacari, G. Rodriguez, P. Heidelberger, D. Chen, C. Minkenberg, T. Hoefler:
 Efficient Task Placement and Routing in Dragonfly Networks In Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), presented in Vancouver, Canada, ACM, Jun. 2014, (acceptance rate: 16%, 21/130)
HPDC'14
[9] M. Besta, T. Hoefler:
 Fault Tolerance for Remote Memory Access Programming Models In Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), presented in Vancouver, Canada, ACM, Jun. 2014, (acceptance rate: 16%, 21/130) Best Paper Nominee at HPDC'14 (3/21)
SPAA'14
[10] T. Hoefler, G. Kwasniewski:
 Automatic Complexity Analysis of Explicitly Parallel Programs In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'14), presented in Prague, Czech Republic, ACM, Jun. 2014, (acceptance rate: 25%, 30/122)
Computing
[11] T. Schneider, R. Gerstenberger, T. Hoefler:
 Application-oriented ping-pong benchmarking: how to assess the real communication overheads Journal of Computing. Vol 96, Nr. 4, pages 279-292, Springer Vienna, ISSN: 0010-485X, Apr. 2014, In a journal special issue on top picks from EuroMPI'12.
IPDPS'14
[12] A. Arteaga, O. Fuhrer, T. Hoefler:
 Designing Bit-Reproducible Portable High-Performance Applications In Proceedings of the 28th IEEE International Parallel and Distributed Processing Symposium (IPDPS), presented in Phoenix, AZ, USA, IEEE Computer Society, Apr. 2014, (acceptance rate: 21.1%, 114/541)
Cluster Computing
[13] S. Li, T. Hoefler, C. Hu, M. Snir:
 Improved MPI collectives for MPI processes in shared address spaces Journal of Cluster Computing. pages 1-17, Springer US, ISSN: 1386-7857, Mar. 2014,
ACM TACO
[14] B. Prisacari, G. Rodriguez, C. Minkenberg, T. Hoefler:
 Fast Pattern-Specific Routing for Fat Tree Networks ACM Transactions on Architecture and Code Optimization. Vol 10, Nr. 4, presented in New York, NY, USA, pages 36:1--36:25, ACM, ISSN: 1544-3566, Dec. 2013, (acceptance rate: 24% (2011))
SC13
[15] A. Calotoiu, T. Hoefler, M. Poke, F. Wolf:
 Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes In IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 45:1--45:12, ACM, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457)
SC13
[16] R. Gerstenberger, M. Besta, T. Hoefler:
 Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided In IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 53:1--53:12, ACM, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457) SC13 Best Paper (1/92) and Best Student Paper Finalist (8/92)
SC13
[17] A. Friedley, G. Bronevetsky, A. Lumsdaine, T. Hoefler:
 Hybrid MPI: Efficient Message Passing for Multi-core Systems In IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 18:1--18:11, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457)
PMBS'13
[18] S. Levy, B. Topp, K. Ferreira, D. Arnold, T. Hoefler, P. Widener:
 Using Simulation to Evaluate the Performance of Resilience Strategies at Scale presented in Denver, CO, USA, Nov. 2013, accepted at 4th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS13)
ICPP'13
[19] T. Schneider, T. Hoefler, R. Grant, B. Barrett, R. Brightwell:
 Protocols for Fully Offloaded Collective Operations on Accelerated Network Adapters In Parallel Processing (ICPP), 2013 42nd International Conference on, presented in Lyon, France, pages 593-602, ISSN: 0190-3918, Oct. 2013,
EuroMPI'13
[20] T. Schneider and F. Kjolstad and T. Hoefler:
 MPI Datatype Processing using Runtime Compilation In Proceedings of the 20th European MPI Users' Group Meeting, presented in Madrid, Spain, pages 19--24, ACM, ISBN: 978-1-4503-1903-4, Sep. 2013, Best Paper Award at EuroMPI'13 (1/25)
LCPC'13
[21] T. Schneider, R. Gerstenberger, T. Hoefler:
 Compiler Optimizations for Non-Contiguous Remote Data Movement presented in Santa Clara, CA, USA, Sep. 2013, accepted at The 26th International Workshop on Languages and Compilers for Parallel Computing
HPDC'13
[22] S. Ramos and T. Hoefler:
 Modeling Communication in Cache-Coherent SMP Systems - A Case-Study with Xeon Phi In Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, presented in New York City, NY, USA, pages 97--108, ACM, ISBN: 978-1-4503-1910-2, Jun. 2013, (acceptance rate: 15%, 20/131)
HPDC'13
[23] S. Li, T. Hoefler and M. Snir:
 NUMA-Aware Shared Memory Collective Communication for MPI In Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, presented in New York City, NY, USA, pages 85--96, ACM, ISBN: 978-1-4503-1910-2, Jun. 2013, (acceptance rate: 15%, 20/131) Nominated for Best Paper Award at HPDC'13 (3/20)
ICS'13
[24] B. Prisacari, G. Rodriguez, C. Minkenberg and T. Hoefler:
 Bandwidth-optimal All-to-all Exchanges in Fat Tree Networks In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, presented in Eugene, OR, USA, pages 139--148, ACM, ISBN: 978-1-4503-2130-3, Jun. 2013, (acceptance rate: 21%, 41/198)
Computing
[25] T. Hoefler, J. Dinan, D. Buntinas, P. Balaji, B. Barrett, R. Brightwell, W. Gropp, V. Kale and R. Thakur:
 MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory Journal of Computing. Springer, May 2013, doi: 10.1007/s00607-013-0324-2
PPoPP'13
[26] A. Friedley, T. Hoefler, G. Bronevetsky, A. Lumsdaine:
 Ownership Passing: Efficient Distributed Memory Programming on Multi-core Systems In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, presented in Shenzhen, China, pages 177--186, ACM, ISBN: 978-1-4503-1922-5, Feb. 2013, (acceptance rate: 18%, 26/146)
SC12
[27] T. Hoefler, T. Schneider:
 Optimization Principles for Collective Neighborhood Communications In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, presented in Salt Lake City, Utah, USA, pages 98:1--98:10, IEEE Computer Society Press, ISBN: 978-1-4673-0804-5, Nov. 2012, (acceptance rate: 21%, 100/472)
EuroMPI'12
[28] T. Schneider, R. Gerstenberger, T. Hoefler:
 Micro-Applications for Communication Data Access Patterns and MPI Datatypes Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, pages 121-131, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, In a journal special issue on top picks from EuroMPI'12.
EuroMPI'12
[29] S. Pellegrini, T. Hoefler, T. Fahringer:
 Exact Dependence Analysis for Increased Communication Overlap In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, Springer, ISBN: 978-3-642-33517-4, Sep. 2012,
EuroMPI'12
[30] T. Hoefler, J. Dinan, D. Buntinas, P. Balaji, B. Barrett, R. Brightwell, W. Gropp, V. Kale, R. Thakur:
 Leveraging MPI's One-Sided Communication Interface for Shared-Memory Programming Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, In a journal special issue on top picks from EuroMPI'12.
PACT'12
[31] T. Hoefler, T. Schneider:
 Runtime Detection and Optimization of Collective Communication Patterns In Proceedings of the 21st international conference on Parallel Architectures and Compilation Techniques (PACT), presented in Minneapolis, MN, USA, pages 263--272, ACM, ISBN: 978-1-4503-1182-3, Sep. 2012, (acceptance rate: 18.9%, 39/207)
Cluster'12
[32] S. Pellegrini, T. Hoefler, T. Fahringer:
 On the Effects of CPU Caches on MPI Point-to-Point Communications In Proceedings of the 2012 IEEE International Conference on Cluster Computing, presented in Beijing, China, pages 495--503, IEEE Computer Society, ISBN: 978-0-7695-4807-4, Sep. 2012, (acceptance rate: 28.9%, 58/200)
CCGrid'12
[33] P. Gottschling and T. Hoefler:
 Productive Parallel Linear Algebra Programming with Unstructured Topology Adaption In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), presented in Ottawa, Canada, pages 9--16, IEEE Computer Society, ISBN: 978-0-7695-4691-9, May 2012, (acceptance rate: 27%, 83/302)
CCGrid'12
[34] G. Bauer, S. Gottlieb and T. Hoefler:
 Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3_rmd In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), presented in Ottawa, Canada, pages 652--659, IEEE Computer Society, ISBN: 978-0-7695-4691-9, May 2012, (acceptance rate: 27%, 83/302)
PDP'12
[35] K. Kharbas, D. Kim, T. Hoefler and F. Mueller:
 Assessing HPC Failure Detectors for MPI Jobs In Proceedings of the 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, presented in Munich, Germany, pages 81--88, IEEE Computer Society, ISBN: 978-0-7695-4633-9, Feb. 2012,
PPoPP'12
[36] T. Hoefler and T. Schneider:
 Communication-Centric Optimizations by Dynamically Detecting Collective Operations In Proceedings of the 17th ACM symposium on Principles and practice of parallel programming, Feb. 2012, (poster paper) (acceptance rate (posters): 17%, 32/185)
PPoPP'12
[37] F. Kjolstad, T. Hoefler and M. Snir:
 Automatic Datatype Generation and Optimization In Proceedings of the 17th ACM symposium on Principles and practice of parallel programming, Feb. 2012, (poster paper) (acceptance rate (posters): 17%, 32/185)
SC11
[38] T. Hoefler, W. Gropp, M. Snir and W. Kramer:
 Performance Modeling for Systematic Performance Tuning In International Conference for High Performance Computing, Networking, Storage and Analysis (SC'11), SotP Session, Nov. 2011,
EuroMPI'11
[39] W. Gropp, T. Hoefler, R. Thakur and J. L. Traeff:
 Performance Expectations and Guidelines for MPI Derived Datatypes Vol 6960, In Recent Advances in the Message Passing Interface (EuroMPI'11), presented in Santorini, Greece, pages 150-159, Springer, ISBN: 978-3-642-24448-3, Sep. 2011,
EuroMPI'11
[40] V. Venkatesan, M. Chaarawi, E. Gabriel and T. Hoefler:
 Design and Evaluation of Nonblocking Collective I/O Operations Vol 6960, In Recent Advances in the Message Passing Interface (EuroMPI'11), presented in Santorini, Greece, pages 90-98, Springer, ISBN: 978-3-642-24448-3, Sep. 2011,
EuroMPI'11
[41] T. Hoefler and M. Snir:
 Writing Parallel Libraries with MPI - Common Practice, Issues, and Extensions Vol 6960, In Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, Santorini, Greece, September 18-21, 2011. Proceedings, presented in Santorini, Greece, pages 345--355, Springer, ISBN: 978-3-642-24448-3, Sep. 2011, Keynote paper at IMUDI/EuroMPI 2011.
EuroPar'11
[42] T. Schneider, S. Eckelmann, T. Hoefler, and W. Rehm:
 Kernel-Based Offload of Collective Operations - Implementation, Evaluation and Lessons Learned In Proceedings of the 17th international conference on Parallel processing - Volume Part II, presented in Bordeaux, France, pages 264--275, Springer-Verlag, ISBN: 978-3-642-23396-8, Aug. 2011, (acceptance rate 29.9%, 81/271)
TG'11
[43] S. Harrell, P. Smith, D. Smith, T. Hoefler, A. Labutina and T. Overmeyer:
 Methods of Creating Student Cluster Competition Teams In Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, presented in Salt Lake City, Utah, pages 50:1--50:6, ACM, Jul. 2011,
ICS'11
[44] T. Hoefler and M. Snir:
 Generic Topology Mapping Strategies for Large-scale Parallel Architectures In Proceedings of the 2011 ACM International Conference on Supercomputing (ICS'11), presented in Tucson, AZ, pages 75--85, ACM, ISBN: 978-1-4503-0102-2, Jun. 2011, (acceptance rate 21.7%, 35/161)
ICS'11
[45] J. Willcock, T. Hoefler, N. Edmonds and A. Lumsdaine:
 Active Pebbles: Parallel Programming for Data-Driven Applications In Proceedings of the 2011 ACM International Conference on Supercomputing (ICS'11), presented in Tucson, AZ, pages 235--245, ACM, ISBN: 978-1-4503-0102-2, Jun. 2011, (acceptance rate 21.7%, 35/161)
LSAP'11
[46] T. Hoefler and M. Snir:
 Performance Engineering: A Must for Petaflops and Beyond Jun. 2011, Extended Abstract for Keynote at Large-scale System and Application Performance Workshop 2011 Keynote Paper at LSAP'11
IPDPS'11
[47] J. Domke, T. Hoefler and W. Nagel:
 Deadlock-Free Oblivious Routing for Arbitrary Topologies In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), presented in Anchorage, AK, USA, pages 613--624, IEEE Computer Society, ISBN: 0-7695-4385-7, May 2011, (acceptance rate: 19.6%, 112/571)
PPL
[48] P. Balaji, D. Buntinas, D. Goodell, W. Gropp, T. Hoefler, S. Kumar, E. Lusk, R. Thakur and J. L. Traeff:
 MPI on Millions of Cores Parallel Processing Letters (PPL). Vol 21, Nr. 1, pages 45-60, World Scientific Publishing Company, Mar. 2011,
PPoPP'11
[49] J. Willcock, T. Hoefler, N. Edmonds and A. Lumsdaine:
 Active Pebbles: A Programming Model For Highly Parallel Fine-Grained Data-Driven Computations In Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, pages 305--306, ISBN: 978-1-4503-0119-0, Feb. 2011, (poster paper) (acceptance rate: 25%, 26/165 papers + 16/165 poster) PPoPP'11 Best Poster Award
PADL'11
[50] E. Holk, W. E. Byrd, J. Willcock, T. Hoefler, A. Chauhan and A. Lumsdaine:
 Kanor -- A Declarative Language for Explicit Communication In Proceedings of the 13th international conference on Practical aspects of declarative languages, presented in Austin, TX, USA, pages 190--204, Springer-Verlag, ISBN: 978-3-642-18377-5, Jan. 2011,
CiSE
[51] T. Hoefler:
 Software and Hardware Techniques for Power-Efficient HPC Networking Computing in Science and Engineering (CiSE). Vol 12, Nr. 6, pages 30-37, IEEE Computer Society, ISSN: 0740-7475, Dec. 2010,
HiPC'10
[52] N. Edmonds, T. Hoefler and A. Lumsdaine:
 A Space-Efficient Parallel Algorithm for Computing Betweenness Centrality in Distributed Memory In International Conference on High Performance Computing, presented in Goa, India, pages 1-10, ISBN: 978-1-4244-8518-5, Dec. 2010, (acceptance rate: 19.2%)
HiPC'10
[53] N. Edmonds, J. Willcock, T. Hoefler and A. Lumsdaine:
 Design of a Large-Scale Hybrid-Parallel Graph Library In International Conference on High Performance Computing, Student Research Symposium, presented in Goa, India, IEEE, Dec. 2010,
PROPER'10
[54] T. Hoefler:
 Bridging Performance Analysis Tools and Analytic Performance Modeling for HPC In Proceedings of Workshop on Productivity and Performance (PROPER 2010), presented in Ischia, Italy, Springer, Dec. 2010, Keynote extended abstract for PROPER'10.
SC10
[55] T. Hoefler, T. Schneider and A. Lumsdaine:
 Characterizing the Influence of System Noise on Large-Scale Applications by Simulation In International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10), Nov. 2010, (acceptance rate 19.8%, 50/253) SC10 Best Paper Award
EuroMPI'10
[56] T. Hoefler, G. Bronevetsky, B. Barrett, B. R. de Supinski and A. Lumsdaine:
 Efficient MPI Support for Advanced Hybrid Programming Models Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 50--61, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,
EuroMPI'10
[57] T. Hoefler, W. Gropp, R. Thakur and J. L. Traeff:
 Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 21--30, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,
EuroMPI'10
[58] T. Hoefler and S. Gottlieb:
 Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient using MPI Datatypes Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 132--141, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,
PACT'10
[59] J. Willcock, T. Hoefler, N. Edmonds and A. Lumsdaine:
 AM++: A Generalized Active Message Framework In Proceedings of the 19th international conference on Parallel architectures and compilation techniques, presented in Vienna, Austria, pages 401--410, ACM, ISBN: 978-1-4503-0178-7, Sep. 2010, (acceptance rate: 17%, 46/266)
CCPE
[60] T. Hoefler, R. Rabenseifner, H. Ritzdorf, B. R. de Supinski, R. Thakur and J. L. Traeff:
 The Scalable Process Topology Interface of MPI 2.2 Concurrency and Computation: Practice and Experience. Vol 23, Nr. 4, pages 293-310, John Wiley & Sons, Ltd., ISSN: 1532-0634, Aug. 2010,
HotI'10
[61] B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni and R. Rajamony:
 The PERCS High-Performance Interconnect IBM. In Proceedings of 18th Symposium on High-Performance Interconnects (Hot Interconnects 2010), IEEE, Aug. 2010,
IJPEDS
[62] T. Hoefler, T. Schneider and A. Lumsdaine:
 Accurately Measuring Overhead, Communication Time and Progression of Blocking and Nonblocking Collective Operations at Massive Scale International Journal of Parallel, Emergent and Distributed Systems. Vol 25, Nr. 4, pages 241-258, Taylor & Francis Group, ISSN: 1744-5779, Jul. 2010,
LSAP'10
[63] T. Hoefler, T. Schneider and A. Lumsdaine:
 LogGOPSim - Simulating Large-Scale Applications in the LogGOPS Model In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, presented in Chicago, Illinois, pages 597--604, ACM, ISBN: 978-1-60558-942-8, Jun. 2010, LSAP'10 Best Paper Award
AMP'10
[64] T. Hoefler, J. Willcock, A. Chauhan and A. Lumsdaine:
 The Case for Collective Pattern Specification Jun. 2010, Accepted at the 1st ACM Workshop on Advances in Message Passing (AMP'10)
SciDAC'10
[65] R. Thakur, P. Balaji, D. Buntinas, D. Goodell, W. Gropp, T. Hoefler, S. Kumar, E. Lusk and J. L. Traeff:
 MPI at Exascale In Proceedings of SciDAC 2010, presented in Chattanooga, Tennessee, Jun. 2010,
PPoPP'10
[66] T. Hoefler, C. Siebert and A. Lumsdaine:
 Scalable Communication Protocols for Dynamic Sparse Data Exchange In Proceedings of the 2010 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'10), presented in Bangalore, India, pages 159--168, ACM, ISBN: 978-1-60558-708-0, Jan. 2010, (acceptance rate 16.8%, 29/173)
HiPC'09
[67] P. Kambadur, A. Gupta, T. Hoefler and A. Lumsdaine:
 Demand-driven Execution of Static Directed Acyclic Graphs Using Task Parallelism presented in Kochi, India, pages 284-293, ISBN: 978-1-4244-4922-4, Dec. 2009, (acceptance rate 11%, 35/320)
SIMPAT
[68] T. Hoefler, T. Schneider and A. Lumsdaine:
 LogGP in Theory and Practice - An In-depth Analysis of Modern Interconnection Networks and Benchmarking Methods for Collective Operations. Elsevier Journal of Simulation Modelling Practice and Theory (SIMPAT). Vol 17, Nr. 9, pages 1511-1521, Elsevier, ISSN: 1569-190X, Oct. 2009,
EuroMPI'09
[69] T. Hoefler, A. Lumsdaine and J. Dongarra:
 Towards Efficient MapReduce Using MPI In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 16th European PVM/MPI Users' Group Meeting, presented in Helsinki, Finland, Springer, Sep. 2009,
ICPP'09
[70] T. Hoefler, C. Siebert and A. Lumsdaine:
 Group Operation Assembly Language - A Flexible Way to Express Collective Communication In ICPP-2009 - The 38th International Conference on Parallel Processing, presented in Vienna, Austria, IEEE, ISBN: 978-0-7695-3802-0, Sep. 2009, (acceptance rate 32%, 71/220)
HotI'09
[71] T. Hoefler, T. Schneider and A. Lumsdaine:
 Optimized Routing for Large-Scale InfiniBand Networks In 17th Annual IEEE Symposium on High Performance Interconnects (HOTI 2009), presented in New York, NY, Aug. 2009,
PPL
[72] T. Hoefler and T. Schneider and A. Lumsdaine:
 The Effect of Network Noise on Large-Scale Collective Communications Parallel Processing Letters (PPL). Vol 19, Nr. 4, pages 573-593, World Scientific Publishing Company, Aug. 2009,
HIPS'09
[73] T. Hoefler and J. L. Traeff:
 Sparse Collective Operations for MPI In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, HIPS'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,
CAC'09
[74] C. Kaiser, T. Hoefler, B. Bierbaum and T. Bemmerl:
 Implementation and Analysis of Nonblocking Collective Operations on SCI Networks In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, CAC'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,
CAC'09
[75] T. Hoefler, T. Schneider and A. Lumsdaine:
 A Power-Aware, Application-Based, Performance Study Of Modern Commodity Cluster Interconnection Networks In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, CAC'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,
LSPP'09
[76] T. Hoefler, T. Schneider and A. Lumsdaine:
 The Impact of Network Noise at Large-Scale Communication Performance In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, LSPP'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009, In a journal special issue on top picks from LSPP'09.
LCI'09
[77] J. Mueller, T. Schneider, J. Domke, R. Geyer, M. Haesing, T. Hoefler, S. Hoehlig, G. Juckeland, A. Lumsdaine, M. Mueller and W. Nagel:
 Cluster Challenge 2008: Optimizing Cluster Configuration and Applications to Maximize Power Efficiency In Proceedings of the 10th LCI International Conference on High-Performance Clustered Computing, presented in Boulder, CO, Mar. 2009, LCI'09 Best Paper Award
Cluster'08
[78] T. Hoefler, T. Schneider and A. Lumsdaine:
 Multistage Switches are not Crossbars: Effects of Static Routing in High-Performance Networks In Proceedings of the 2008 IEEE International Conference on Cluster Computing, presented in Tsukuba, Japan, IEEE Computer Society, ISSN: 1552-5244, ISBN: 978-1-4244-2640, Oct. 2008, (acceptance rate 30%, 28/92)
Cluster'08
[79] T. Hoefler and A. Lumsdaine:
 Message Progression in Parallel Computing - To Thread or not to Thread? In Proceedings of the 2008 IEEE International Conference on Cluster Computing, presented in Tsukuba, Japan, IEEE Computer Society, ISSN: 1552-5244, ISBN: 978-1-4244-2640, Oct. 2008, (acceptance rate 30%, 28/92)
EuroMPI'08
[80] T. Hoefler, M. Schellmann, S. Gorlatch and A. Lumsdaine:
 Communication Optimization for Medical Image Reconstruction Algorithms Vol LNCS 5205, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, presented in Dublin, Ireland, pages 75-83, Springer, ISSN: 0302-9743, ISBN: 978-3-540-87474-4, Sep. 2008,
EuroMPI'08
[81] T. Hoefler, F. Lorenzen and A. Lumsdaine:
 Sparse Non-Blocking Collectives in Quantum Mechanical Calculations Vol LNCS 5205, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, presented in Dublin, Ireland, pages 55-63, Springer, ISSN: 0302-9743, ISBN: 978-3-540-87474-4, Sep. 2008,
HotI'08
[82] P. Geoffray and T. Hoefler:
 Adaptive Routing Strategies for Modern High Performance Networks In 16th Annual IEEE Symposium on High Performance Interconnects, HOTI'08, presented in Stanford, CA, USA, pages 165-172, IEEE Computer Society, ISBN: 978-0-7695-3380-3, Aug. 2008, (acceptance rate 30%, 14/47)
SPAA'08
[83] T. Hoefler, P. Gottschling and A. Lumsdaine:
 Brief Announcement: Leveraging Non-blocking Collective Communication in High-performance Applications In Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, SPAA'08, presented in Munich, Germany, pages 113-115, Association for Computing Machinery (ACM), ISBN: 978-1-59593-973-9, Jun. 2008, (short paper) (acceptance rate: 28%, 36/128)
CCGrid'08
[84] T. Hoefler and A. Lumsdaine:
 Overlapping Communication and Computation with High Level Communication Routines In Proceedings of the 8th IEEE Symposium on Cluster Computing and the Grid (CCGrid 2008), presented in Lyon, France, May 2008, (acceptance rate: 32%)
PMEO'08
[85] T. Hoefler, T. Schneider and A. Lumsdaine:
 Accurately Measuring Collective Operations at Massive Scale In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, PMEO'08 Workshop, presented in Miami, FL, ISSN: 1530-2075, ISBN: 978-1-4244-1694-3, Apr. 2008, In a journal special issue on top picks from PMEO'08.
CAC'08
[86] T. Hoefler and A. Lumsdaine:
 Optimizing non-blocking Collective Operations for InfiniBand In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, CAC'08 Workshop, presented in Miami, FL, ISSN: 1530-2075, ISBN: 978-1-4244-1694-3, Apr. 2008,
PASA'08
[87] T. Schneider, T. Hoefler, S. Wunderlich, T. Mehlan and W. Rehm:
 An optimized ZGEMM implementation for the Cell BE In Proceedings of the 9th Workshop on Parallel Systems and Algorithms (PASA), presented in Dresden, Germany, ISSN: 1617-5468, ISBN: 978-3-88579-218-5, Feb. 2008,
KiCC'07
[88] A. Friedley, T. Hoefler, M. Leininger, A. Lumsdaine:
 Scalable High Performance Message Passing over InfiniBand for Open MPI In Proceedings of 3rd KiCC Workshop 2007, presented in Aachen, Germany, RWTH Aachen, Dec. 2007,
SC07
[89] T. Hoefler, A. Lumsdaine and W. Rehm:
 Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI In Proceedings of the 2007 International Conference on High Performance Computing, Networking, Storage and Analysis, SC07, presented in Reno, USA, IEEE Computer Society/ACM, Nov. 2007, (acceptance rate 20%, 54/268)
EuroMPI'07
[90] T. Hoefler, P. Kambadur, R. L. Graham, G. Shipman and A. Lumsdaine:
 A Case for Standard Non-Blocking Collective Operations Vol 4757, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, EuroPVM/MPI 2007, presented in Paris, France, pages 125-134, Springer, ISSN: 0302-9743, ISBN: 978-3-540-75415-2, Oct. 2007,
PARCO
[91] T. Hoefler, P. Gottschling, A. Lumsdaine and W. Rehm:
 Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations Elsevier Journal of Parallel Computing (PARCO). Vol 33, Nr. 9, pages 624-633, Elsevier, ISSN: 0167-8191, Sep. 2007,
HPCC'07
[92] T. Hoefler, T. Mehlan, A. Lumsdaine and W. Rehm:
 Netgauge: A Network Performance Measurement Framework Vol 4782, In Proceedings of High Performance Computing and Communications, HPCC'07, presented in Houston, USA, pages 659-671, Springer, ISBN: 978-3-540-75443-5, Sep. 2007,
PMEO'07
[93] T. Hoefler, A. Lichei and W. Rehm:
 Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks TU Chemnitz. In Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium, PMEO'07 Workshop, presented in Long Beach, CA, USA, IEEE Computer Society, ISBN: 1-4244-0909-8, Mar. 2007,
CAC'07
[94] T. Hoefler, C. Siebert and W. Rehm:
 A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast TU Chemnitz. In Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium (CAC'07 Workshop), presented in Long Beach, CA, USA, pages 232, IEEE Computer Society, ISBN: 1-4244-0909-8, Mar. 2007,
KiCC'07
[95] F. Mietke, D. Dunger, T. Mehlan, T. Hoefler and W. Rehm:
 A native InfiniBand Transporter for MySQL Cluster TU Chemnitz. In Proceedings of the 2nd Workshop 'Kommunikation in Clusterrechnern und Clusterverbundsystemen' (KiCC'07), presented in Chemnitz, Germany, Feb. 2007,
FHPCN'06
[96] T. Hoefler, J. Squyres, W. Rehm and A. Lumsdaine:
 A Case for Non-Blocking Collective Operations Vol 4331/2006, In Frontiers of High Performance Computing and Networking - ISPA'06 Workshops, presented in Sorrento, Italy, pages 155-164, Springer Berlin / Heidelberg, ISBN: 978-3-540-49860-5, Dec. 2006,
HPCNano'06
[97] T. Hoefler and R. Janisch and W. Rehm:
 Parallel scaling of Teter's minimization for Ab Initio calculations presented in Tampa, FL, USA, Nov. 2006, Presented at the workshop HPC Nano in conjunction with the IEEE international conference on Supercomputing (SC'06)
EuroMPI'06
[98] T. Hoefler, P. Gottschling, W. Rehm and A. Lumsdaine:
 Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 13th European PVM/MPI User's Group Meeting, Proceedings, LNCS 4192, presented in Bonn, Germany, pages 374-382, Springer, ISSN: 0302-9743, ISBN: 3-540-39110-X, Sep. 2006, In a journal special issue on top picks from EuroMPI'06.
PARELEC'06
[99] T. Hoefler, C. Viertel, T. Mehlan, F. Mietke, W. Rehm:
 Assessing Single-Message and Multi-Node Communication Performance of InfiniBand In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering (PARELEC'06), presented in Bialystok, Poland, pages 227-232, IEEE Computer Society, ISBN: 0-7695-2554-7, Sep. 2006,
PARELEC'06
[100] T. Mehlan, J. Strunk, T. Hoefler, F. Mietke and W. Rehm:
 IRS - A portable Interface for Reconfigurable Systems In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering (PARELEC'06), presented in Bialystok, Poland, pages 187-191, IEEE Computer Society, ISBN: 0-7695-2554-7, Sep. 2006,
DAPSYS'06
[101] T. Hoefler, J. Squyres, G. Fagg, G. Bosilca, W. Rehm and A. Lumsdaine:
 A New Approach to MPI Collective Communication Implementations In Distributed and Parallel Systems - From Cluster to Grid Computing (DAPSYS'06), presented in Innsbruck, Austria, pages 45-54, Springer, ISBN: 978-0-387-69857-1, Sep. 2006,
EuroPar'06
[102] F. Mietke, R. Baumgartl, R. Rex, T. Mehlan, T. Hoefler and W. Rehm:
 Analysis of the Memory Registration Process in the Mellanox InfiniBand Software Stack In Proceedings of Euro-Par 2006 Parallel Processing, presented in Dresden, Germany, pages 124-133, Springer-Verlag Berlin, ISBN: 3-540-37783-2, Aug. 2006, (acceptance rate 37.9%, 110/290)
CAC'06
[103] T. Hoefler, T. Mehlan, F. Mietke and W. Rehm:
 Fast Barrier Synchronization for InfiniBand In Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS), CAC'06 Workshop, presented in Rhodes, Greece, ISBN: 1-4244-0054-6, Apr. 2006,
PMEO'06
[104] T. Hoefler, T. Mehlan, F. Mietke and W. Rehm:
 LogfP - A Model for small Messages in InfiniBand In Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS), PMEO-PDS'06 Workshop, presented in Rhodes, Greece, ISBN: 1-4244-0054-6, Apr. 2006,
ARCS'06
[105] T. Hoefler, T. Mehlan, F. Mietke and W. Rehm:
 Adding Low-Cost Hardware Barrier Support to Small Commodity Clusters In Proceedings of 19th International Conference on Architecture and Computing Systems - ARCS'06, presented in Frankfurt, Germany, pages 343-350, ISBN: 3-88579-175-7, Mar. 2006,
HPCE'05
[106] T. Hoefler, R. Janisch and W. Rehm:
 Improving the parallel scaling of ABINIT CINECA Consorzio Interuniversitario. In Science and Supercomputing in Europe - Report 2005, presented in Casalecchio di Reno, Italy, pages 551-559, CINECA Consorzio Interuniversitario, ISBN: 88-86037-17-1, Dec. 2005,
Book Chapter
[107] T. Hoefler, R. Janisch and W. Rehm:
 A Performance Analysis of ABINIT on a Cluster System TU Chemnitz. In Parallel Algorithms and Cluster Computing, presented in Chemnitz, Germany, pages 37-51, Springer, Lecture Notes in Computational Science and Engineering, ISBN: 3-540-33539-0, Dec. 2005,
ICPP-W'05
[108] T. Hoefler, L. Cerquetti, T. Mehlan, F. Mietke and W. Rehm:
 A practical approach to the rating of barrier algorithms using the LogP model and Open-MPI In Proceedings of the 2005 International Conference on Parallel Processing Workshops, presented in Oslo, Norway, pages 562-569, ISBN: 0-7659-2381-1, Jun. 2005,
PARS'05
[109] T. Hoefler and W. Rehm:
 A Communication Model for Small Messages with InfiniBand PARS. In PARS Mitteilungen, presented in Luebeck, Germany, pages 32-41, PARS, ISSN: 0177-0454, Jun. 2005, PARS Junior Researcher Prize
PARS'05
[110] F. Mietke, M. Steiger, T. Mehlan, T. Hoefler and W. Rehm:
 SHIBA Shared Memory Support for InfiniBand MPICH2 Device In PARS Mitteilungen 2005, presented in Luebeck, Germany, pages 14-23, ISSN: 0177-0454, Jun. 2005,

Invited Talks and Presentations

MSU
[111] T. Hoefler:
 Remote Memory Access Programming - Tools and Fault Tolerance (Presentation) presented in Moscow, Russia, Jul. 2014,
ISC'14
[112] T. Hoefler:
 The Green Graph500 List (Presentation) presented in Leipzig, Germany, Jun. 2014,
ISC'14
[113] T. Hoefler:
 Using Simulation to Evaluate the Performance of Resilience Strategies at Scale (Presentation) In ISC workshop on International Cooperation, presented in Leipzig, Germany, Jun. 2014,
ExaMPI'13
[114] T. Hoefler:
 MPI Beyond 3.0 and Towards Larger-Scale Computing (Presentation) presented in Denver, CO, USA, Nov. 2013, Keynote at ExaMPI 2013 Workshop (in conjunction with SC13)
SC13
[115] T. Hoefler:
 The Green Graph500 List (Presentation) presented in Denver, Colorado, Nov. 2013,
ISC'13
[116] T. Hoefler:
 The Green Graph500 List (Presentation) presented in Leipzig, Germany, Jun. 2013,
EASC'13
[117] T. Hoefler:
 Application-Centric Benchmarking and Modeling for Co-Design (Presentation) presented in Edinburgh, Great Britain, Apr. 2013, Presented at the Exascale Applications and Software Conference (EASC'13)
MCC'12
[118] T. Hoefler:
 MPI-3.0: A Response to New Challenges in Hardware and Software (Presentation) presented in Stuttgart, Germany, Sep. 2012, Keynote at Multicore Challenge 2012
ISC'12
[119] T. Hoefler:
 The Green Graph500 (Presentation) presented in Hamburg, Germany, Jul. 2012,
TiTech'12
[120] T. Hoefler:
 Optimized routing and process mapping for arbitrary network topologies (Presentation) presented in Tokyo, Japan, Jun. 2012, Tokyo Institute of Technology
CUG 2012
[121] G. Bauer, T. Hoefler, W. Kramer and B. Fiedler:
 Analyses and Modeling of Applications Used to Demonstrate Sustained Petascale Performance on Blue Waters (Presentation) presented in Stuttgart, Germany, May 2012, Cray User Group
TUM'12
[122] T. Hoefler:
 New and old Features in MPI-3.0: The Past, the Standard, and the Future (Presentation) University of Illinois at Urbana-Champaign. presented in Munich, Germany, Apr. 2012,
RWTH'12
[123] T. Hoefler:
 Performance Modeling for Systematic Performance Tuning (Presentation) presented in Aachen, Germany, Mar. 2012,
SIAM award
[124] T. Hoefler:
 Performance-oriented Parallel Programming Integrating Hardware, Middleware, and Applications (Presentation) presented in Savannah, GA, USA, Feb. 2012, SIAM SIAG/SC Junior Scientist Award Lecture
Utah'12
[125] T. Hoefler:
 Energy-aware Software Development for Massive-Scale Systems (Presentation) presented in Salt Lake City, Utah, USA, Jan. 2012,
SC11 panel
[126] T. Hoefler:
 Performance Modeling for the Masses (Presentation) presented in Seattle, WA, USA, Nov. 2011,
EnA-HPC'11
[127] T. Hoefler:
 Energy-aware Software Development for Massive-Scale Systems (Presentation) presented in Hamburg, Germany, Sep. 2011, Keynote at the International Conference on Energy-Aware High Performance Computing (EnA-HPC'11)
EuroMPI'11
[128] T. Hoefler:
 Writing Parallel Libraries with MPI - The Good, the Bad, and the Ugly presented in Santorini, Greece, Sep. 2011, Keynote talk at the 18th European PVM/MPI User's Group Meeting (EuroMPI 2011)
Juelich
[129] T. Hoefler:
 Model-Driven, Performance-Centric HPC Software and System Design and Optimization (Presentation) presented in Juelich, Germany, Apr. 2011, Talk at Juelich Supercomputing Center (JSC)
Aachen
[130] T. Hoefler:
 Characterizing the Influence of System Noise on Large-Scale Parallel Applications (Presentation) presented in Aachen, Germany, Apr. 2011, Talk at RWTH Aachen University
CSE'11
[131] W. Gropp, T. Hoefler and M. Snir:
 Performance Modeling for Systematic Performance Tuning (Presentation) In SIAM Conference on Computational Science and Engineering 2011 (Abstracts), presented in Reno, NV, SIAM, Feb. 2011,
NCSA
[132] T. Hoefler:
 Optimizing Communication on Blue Waters (Presentation) In Talk at the Blue Waters PRAC Workshop, presented in Urbana, IL, USA, Oct. 2010,
PROPER'10
[133] T. Hoefler:
 Analytical Performance Modeling and Simulation for Blue Waters (Presentation) In Keynote at Workshop on Productivity and Performance (PROPER 2010), presented in Ischia, Italy, Aug. 2010,
ANL
[134] T. Hoefler:
 Nonblocking and Sparse Collective Operations on Petascale Computers (Presentation) presented at Argonne National Laboratory, Jun. 2010,
NCSA
[135] T. Hoefler:
 2010 Blue Waters Performance Modeling Workshop -- Opening and Introduction (Presentation) In Opening Slides for the Blue Waters Modeling Workshop, presented in Urbana, IL, USA, Mar. 2010,
MPICH BoF
[136] T. Hoefler:
 Selected MPI-2.2 and MPI-3 Features (Presentation) presented in Portland, OR, USA, Nov. 2009, MPICH Birds of a Feather Supercomputing 2009 (SC09), host: Darius Buntinas
TUM
[137] T. Hoefler:
 Improving Parallel Computing Platforms (Presentation) presented in Munich, Germany, Oct. 2009, Presentation at the Technical University of Munich, Host: Prof. M. Gerndt
MPI Forum
[138] T. Hoefler:
 MPI-3 Collective Working Group - December'08 Meeting (Presentation) Indiana University. presented in Menlo Park, CA, USA, Dec. 2008, Activity Report to the MPI Forum
MPI Forum
[139] T. Hoefler:
 MPI-3 Collective Working Group - October'08 Meeting (Presentation) Indiana University. presented in Chicago, IL, USA, Oct. 2008, Activity Report to the MPI Forum
MPI Forum
[140] T. Hoefler:
 MPI-3 Collective Working Group - September'08 Meeting (Presentation) Indiana University. presented in Dublin, Ireland, Sep. 2008, Activity Report to the MPI Forum
Cisco
[141] T. Hoefler:
 The effects of common communication patterns in large-scale networks with switch-based static routing (Presentation) Nerd Lunch at Cisco Systems. presented in San Jose, CA, USA, Aug. 2008,
LBNL
[142] T. Hoefler:
 Multistage Interconnection Networks are not Crossbars (Presentation) Lawrence Berkeley National Lab. presented in Berkeley, CA, USA, Aug. 2008,
LLNL
[143] T. Hoefler:
 Non-blocking Collective Operations for MPI (Presentation) Lawrence Livermore National Lab. presented in Livermore, CA, USA, Aug. 2008,
TUM'08
[144] T. Hoefler:
 Towards coordinated optimization of computation and communication in parallel applications (Presentation) Fakultaet fuer Informatik, Universität Münster. presented in Muenster, Germany, Jun. 2008,
MPI Forum
[145] T. Hoefler, J. L. Traeff, C. Siebert and A. Lumsdaine:
 MPI-3 Collective Working Group - June'08 Meeting (Presentation) Indiana University. presented in Menlo Park, CA, USA, Jun. 2008, Activity Report to the MPI Forum
IWR-TUD
[146] T. Hoefler:
 Non-Blocking Collectives for MPI (Presentation) Institut fuer Wissenschaftliches Rechnen, Technische Universitaet Dresden. presented in Dresden, Germany, May 2008,
MPI Forum
[147] T. Hoefler and A. Lumsdaine:
 MPI-3 Collective Working Group - April'08 Meeting (Presentation) Indiana University. presented in Chicago, IL, USA, Apr. 2008, Slides with proposals to the MPI-3 collective WG, all preliminary, published on request
MPI Forum
[148] T. Hoefler and A. Lumsdaine:
 MPI-3 Collective Working Group - January'08 Meeting (Presentation) Indiana University. presented in Chicago, IL, USA, Mar. 2008, Slides with proposals to the MPI-3 collective WG, all preliminary, published on request
HLRS
[149] T. Hoefler:
 Non-blocking Collectives for MPI-2 (Presentation) High Performance Computing Center Stuttgart (HLRS). presented in Stuttgart, Germany, Dec. 2007,
NEC'07
[150] T. Hoefler:
 Accurately Measuring Collective Operations at Massive Scale (Presentation) C&C Research Laboratories, NEC Europe Ltd. presented in Sankt Augustin, Germany, Dec. 2007,
ZiH-TUD
[151] T. Hoefler:
 Non-blocking Collectives for MPI-2 (Presentation) Dresden University of Technology, Center for Information Services and High Performance Computing (ZIH). presented in Dresden, Germany, Oct. 2007,
Talk
[152] F. Mietke, T. Hoefler, T. Mehlan and W. Rehm:
 Diskless Cluster und Lustre - Erfahrungsbericht zum CHiC (Presentation) TU Chemnitz. presented in Chemnitz, Germany, Apr. 2007,
ZKI'07
[153] F. Mietke, T. Mehlan, T. Hoefler and W. Rehm:
 Stand HPC Cluster CHiC (Presentation) TU Chemnitz. presented in Chemnitz, Germany, Apr. 2007,
CEA'06
[154] T. Hoefler:
 Application Optimization with non-blocking Collectives (Presentation) Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM). presented in Bruyeres-le-chatel, France, Jan. 2007,
CEA'07
[155] T. Hoefler:
 Non-Blocking Collectives for MPI-2 (Presentation) Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM). presented in Bruyeres-le-chatel, France, Jan. 2007,
ABINIT'07
[156] T. Hoefler and G. Zerah:
 Optimization of a parallel 3d-FFT with non-blocking collective operations (Presentation) Invited to the 3rd International ABINIT Developer Workshop. presented in Liege, Belgium, Jan. 2007,
NEC'06
[157] T. Hoefler:
 Non-blocking Collectives for MPI-2 (Presentation) C&C Research Laboratories, NEC Europe Ltd. presented in Sankt Augustin, Germany, Nov. 2006,
OMPI'06
[158] T. Hoefler:
 Open MPI - Collv2 Design (Presentation) Cisco Systems. presented in San Jose, CA, USA, Apr. 2006,
ABINIT'06
[159] T. Hoefler:
 Parallelization Options for the Band-by-Band Minimization of Teter et al. (Presentation) Universite catholique de Louvain. presented in Louvain-la-Neuve, Belgium, Feb. 2006,
SFB'05
[160] T. Hoefler and W. Rehm:
 Communication/Computation Overlap in MPI (Presentation) Technical University of Chemnitz. presented in Chemnitz, Germany, Nov. 2005,
TUM'05
[161] T. Hoefler:
 Fast Barrier Synchronization for InfiniBand (Presentation) Technical University of Chemnitz. presented in Munich, Germany, Sep. 2005,
URZ-WS'04
[162] T. Hoefler:
 Remote network analysis (Presentation) Technical University of Chemnitz, URZ Workshop. presented in Loebsal, Germany, May 2004,
Unix-ST'04
[163] T. Hoefler:
 Remote network analysis (Presentation) Technical University of Chemnitz, Unix Stammtisch. presented in Chemnitz, Germany, May 2004,

Other Publications or Technical Reports

TR
[164] S. Ramos and T. Hoefler:
 Modelling Communications in Cache Coherent Systems Technical Report. SPCL, ETH Zurich. presented in Zurich, Switzerland, Feb. 2013,
MPI-3.0
[165] Message Passing Interface Forum:
 MPI: A Message-Passing Interface Standard Version 3.0 Sep. 2012, Chapter author for Collective Communication, Process Topologies, and One Sided Communications
MPI-2.2
[166] Message Passing Interface Forum:
 MPI: A Message-Passing Interface Standard Version 2.2 Sep. 2009, Chapter author for Collective Communication and Process Topologies
MPI Forum
[167] T. Hoefler on behalf of the MPI Forum:
  MPI: A Message-Passing Interface Standard -- Working-Draft for Nonblocking Collective Operations MPI Forum. MPI Forum, Apr. 2009,
IUCS-TR
[168] T. Schneider, T. Hoefler and A. Lumsdaine:
 ORCS: An Oblivious Routing Congestion Simulator Indiana University. Nr. 675, Indiana University Computer Science, Feb. 2009,
MPI Forum
[169] D. Gregor, T. Hoefler, B. Barrett and A. Lumsdaine:
 Fixing Probe for Multi-Threaded MPI Applications Indiana University. Nr. 674, Indiana University Computer Science, Jan. 2009,
MPI Forum
[170] T. Hoefler, F. Lorenzen, D. Gregor and A. Lumsdaine:
 Topological Collectives for MPI-2 Open Systems Lab, Indiana University. MPI Forum, Feb. 2008,
MPI Forum
[171] D. Gregor, T. Hoefler and A. Lumsdaine:
 Dynamically-Sized Messages in MPI-3 Open Systems Lab, Indiana University. MPI Forum, Feb. 2008,
KiCC'07
[172] T. Hoefler, M. Mosch, T. Mehlan, W. Rehm:
 CollGM - A Myrinet/GM optimized collective component for Open MPI In Proceedings of 3rd KiCC Workshop 2007, presented in Aachen, Germany, RWTH Aachen, Dec. 2007,
KiCC'07
[173] F. Mietke, T. Mehlan, T. Hoefler, W. Rehm:
 Design and Evaluation of a 2048 Core Cluster System In Proceedings of 3rd KiCC Workshop 2007, presented in Aachen, Germany, RWTH Aachen, Dec. 2007,
CASCON'07
[174] T. Schneider, S. Wunderlich, W. Rehm, T. Hoefler and H. Schick:
 Code Optimization for Cell/B.E. - Opportunities for ABINIT In IBM CASCON 2006 Symposium, presented in Dublin, Ireland, IBM, Oct. 2007, Research Poster at the IBM CASCON 2006 Symposium, Dublin, Ireland
CEA-TR
[175] T. Hoefler and G. Zerah:
 Transforming the high-performance 3d-FFT in ABINIT to enable the use of non-blocking collective operations Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM). presented in Bruyeres-le-Chatel, France, Feb. 2007,
IUCS-TR
[176] T. Hoefler, J. Squyres, G. Bosilca, G. Fagg, A. Lumsdaine and W. Rehm:
 Non-Blocking Collective Operations for MPI-2 Open Systems Lab, Indiana University. presented in Bloomington, IN, USA, School of Informatics, Aug. 2006,
IUCS-TR
[177] T. Hoefler and A. Lumsdaine:
 Design, Implementation, and Usage of LibNBC Open Systems Lab, Indiana University. presented in Bloomington, IN, USA, School of Informatics, Aug. 2006,
CIB-06-06
[178] T. Hoefler, M. Reinhardt, F. Mietke, T. Mehlan, W. Rehm:
 Low Overhead Ethernet Communication for Open MPI on Linux Clusters TU Chemnitz. Vol CSR-06, Nr. 06, In Chemnitzer Informatik Berichte, presented in Chemnitz, TU Chemnitz, ISSN: 0947-5125, Jul. 2006,
CUG'06
[179] R. Riesen, C. Vaughan, and T. Hoefler:
 What if MPI Collective Operations Were Instantaneous? Cray Inc.. In Proceedings of the 2006 Cray User Group Meeting, presented in Lugano, Switzerland, May 2006,
Report
[180] R. Kullmann, T. Hoefler:
 A short Performance Analysis of Abinit under different build environments TU Chemnitz. presented in Chemnitz, Germany, Jan. 2006,
22C3
[181] T. Hoefler:
 The Cell Processor 22. Chaos Communication Congress. In 22C3 Proceedings, presented in Berlin, Germany, pages 286-292, ISBN: 3-934636-04-7, Dec. 2005,
KiCC'05
[182] F. Mietke, R. Rex, T. Hoefler, T. Mehlan and W. Rehm:
 Reducing the Impact of Memory Registration in InfiniBand. TU Chemnitz. In Proceedings of 2005 KiCC Workshop, Chemnitzer Informatik Berichte, presented in Chemnitz, Germany, ISSN: 0947-5125, Nov. 2005,
KiCC'05
[183] T. Mehlan, T. Hoefler, F. Mietke and W. Rehm:
 Integration of the SISCI Shared Memory Interface into Open MPI TU Chemnitz. In Proceedings of 2005 KiCC Workshop, Chemnitzer Informatik Berichte, presented in Chemnitz, Germany, ISSN: 0947-5125, Nov. 2005,
KiCC'05
[184] T. Hoefler, J. Squyres, T. Mehlan, F. Mietke and W. Rehm:
 Implementing a Hardware-based Barrier in Open MPI TU Chemnitz. In Proceedings of 2005 KiCC Workshop, Chemnitzer Informatik Berichte, presented in Chemnitz, Germany, ISSN: 0947-5125, Nov. 2005,
CAT-05-02
[185] T. Hoefler and W. Rehm:
 A short Performance Analysis of Abinit on a Cluster System Computer Architecture Technical Report. Technical University of Chemnitz. presented in Chemnitz, Germany, Jul. 2005,
CIB-04-04
[186] T. Hoefler and W. Rehm:
 A Meta Analysis of Gigabit Ethernet over Copper Solutions for Cluster-Networking Chemnitzer Informatik Berichte. Technical University of Chemnitz. Vol 04, Nr. 04, presented in Chemnitz, Germany, ISSN: 0947-5125, Dec. 2004,
CIB-04-03
[187] T. Hoefler, T. Mehlan, F. Mietke and W. Rehm:
 A Survey of Barrier Algorithms for Coarse Grained Supercomputers Chemnitzer Informatik Berichte. Technical University of Chemnitz. Vol 04, Nr. 03, presented in Chemnitz, Germany, ISSN: 0947-5125, Dec. 2004,
21C3
[188] T. Hoefler:
 Remote Network Analysis 21. Chaos Communication Congress. In 21C3 Proceedings, presented in Berlin, Germany, pages 33-37, ISBN: 3-934636-02-0, Dec. 2004,
Report
[189] T. Hoefler, C. Burkert, M. Telzer:
 A Comparative Firewall Study Technical University of Chemnitz. Oct. 2004, Studienarbeit
Report
[190] T. Hoefler, C. Burkert, M. Opitz, D. Roeder, A. Lichei, M. Telzer:
 Betriebssystembackup inhomogener Netze Technical University of Chemnitz. Jan. 2004, Projektarbeit

Theses

Ph.D.'08
[191] T. Hoefler:
 Principles for Coordinated Optimization of Computation and Communication in Large-Scale Parallel Systems Indiana University. presented in Bloomington, IN, USA, Sep. 2008,
M.Sc.'05
[192] T. Hoefler:
 Evaluation of publicly available Barrier-Algorithms and Improvement of the Barrier-Operation for large-scale Cluster-Systems with special Attention on InfiniBand Networks Technical University of Chemnitz. presented in Chemnitz, Germany, Apr. 2005, TU Chemnitz Best Student Award, 2005

© Torsten Hoefler