Non quia difficilia sunt non audemus, sed quia non audemus difficilia sunt
Home -> Publications
all years
    edited volumes
  Full CV [pdf]


  Past Events

Publications of Torsten Hoefler
Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

T. Hoefler and T. Schneider:

 Communication-Centric Optimizations by Dynamically Detecting Collective Operations

(In Proceedings of the 17th ACM symposium on Principles and practice of parallel programming, Feb. 2012, (poster paper) )


The steady increase of parallelism in high-performance computing platforms implies that communication will be most important in large-scale applications. In this work, we tackle the problem of transparent optimization of large-scale communication patterns using online compilation techniques. We utilize the Group Operation Assembly Language (GOAL), an abstract parallel dataflow definition language, to specify our transformations in a device-independent manner. We develop fast schemes that analyze data-flow and synchronization semantics in GOAL and detect if parts of the (or the whole) communication pattern express a known collective communication operation. The detection of collective operations allows us to replace the detected patterns with highly optimized algorithms or low-level hardware calls and thus improve performance significantly. Benchmark results suggest that our technique can lead to a performance improvement of orders of magnitude compared with various optimized algorithms written in Co-Array Fortran. Detecting collective operations also improves the programmability of parallel languages in that the user does not have to understand the detailed semantics of high-level communication operations in order to generate efficient and scalable code.


download article:
download attachment:


  author={T. Hoefler and T. Schneider},
  title={{Communication-Centric Optimizations by Dynamically Detecting Collective Operations}},
  booktitle={Proceedings of the 17th ACM symposium on Principles and practice of parallel programming},
  note={(poster paper)},

serving:© Torsten Hoefler