MPI Derived Datatype (Benchmark) Page

Description

This page advocates the use of MPI Derived Datatypes (DDTs). DDTs are a powerful mechanism for declaring data access patterns to the MPI library, which can then choose the best method for sending or receiving the structure. This mechanism is superior to manually packing and unpacking the data. However, early implementations were suboptimal (and some still are), so many users assume that DDTs are not useful. On this page, I present several examples where DDTs improved the performance of parallel codes. The paper also presents negative examples; the benchmarks are available here so that vendors can optimize for the expected case. Datatypes are also complete in that one can express any arbitrary permutation, which means that the potential optimization space is huge (combinatorial). In practice, however, only some patterns are common to most applications. The application benchmarks on this page try to provide such patterns to implementers and thus steer optimization in a useful direction.
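
To make the contrast between manual packing and a declared access pattern concrete, here is a minimal sketch in plain Python. It makes no MPI calls, so it runs without an MPI installation. The helper names `pack_manual` and `gather_vector` are made up for this illustration and are not part of the benchmarks. The second helper interprets the same three parameters (count, blocklength, stride) that `MPI_Type_vector` takes in the MPI standard.

```python
# Illustrative only: plain Python, no MPI calls.
# pack_manual and gather_vector are hypothetical helper names.

def pack_manual(matrix, rows, cols, col):
    """Hand-written packing loop: copy one column of a row-major matrix
    into a contiguous send buffer, element by element."""
    sendbuf = []
    for r in range(rows):
        sendbuf.append(matrix[r * cols + col])
    return sendbuf


def gather_vector(buf, count, blocklength, stride, offset=0):
    """Interpret a (count, blocklength, stride) description:
    `count` blocks of `blocklength` consecutive elements, with block
    starts `stride` elements apart, beginning at `offset`."""
    out = []
    for b in range(count):
        start = offset + b * stride
        out.extend(buf[start:start + blocklength])
    return out


rows, cols = 3, 4
matrix = list(range(rows * cols))  # row-major 3x4 matrix holding 0..11

# Column 1, once by explicit packing and once by description.
column_1_packed = pack_manual(matrix, rows, cols, col=1)
column_1_described = gather_vector(matrix, count=rows, blocklength=1,
                                   stride=cols, offset=1)

print(column_1_packed)
print(column_1_described)
```

Both calls select the same elements. The difference is that the packing loop fixes how the copy is done, while the three-number description only states which elements belong to the message. With a real datatype, the MPI library receives that description and may, for example, avoid the intermediate buffer altogether.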

Datatype Application Benchmarks

This page hosts two benchmarks for MPI datatypes. The first is a simple parallel two-dimensional Fast Fourier Transform (FFT) using FFTW with a 1-D decomposition. The second is a full application code (MIMD Lattice Computation, MILC) acting on a four-dimensional matrix. See the README and LICENSE files in the top directory of both packages for details.

Download

Datatyped Applications

Robert Gerstenberger extended three applications to use datatypes during his stay at NCSA. The results can be found below:

Download

The results have been summarized in the datatype microbenchmark DDTBench and the publication [3].

Semi-Automatic Datatype Generation

Marc Snir and Torsten Hoefler co-advised Fredrik Kjolstad's Master's work on automatic datatype extraction from source code using refactoring techniques. Fredrik's webpage has additional details. Fredrik converted the packing loops of the NAS parallel benchmarks version 3.2 (Fortran) to straightforward C loop code and applied his tool to convert the C loops to MPI datatypes. The patches can be downloaded from his webpage and are mirrored here: nas-datatype-patches.tgz (59.87 kb).
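
The idea of turning a packing loop into a datatype description can be sketched as follows. This is not Fredrik's tool, which works on source code through refactoring. As a stand-in, the sketch records the buffer indices that a packing loop touches and checks whether they fit a regular (count, blocklength, stride) pattern of the kind `MPI_Type_vector` describes. The function name `vector_from_indices` and the example loop are assumptions made for illustration.

```python
# Illustrative only: infers a vector-style description from the indices
# a packing loop touches. vector_from_indices is a hypothetical name.

def vector_from_indices(indices):
    """Return (count, blocklength, stride) if `indices` form equally sized
    contiguous blocks whose starts are equally spaced, else None.
    The offset of the first element is simply indices[0]."""
    if not indices:
        return None
    # Split the index sequence into runs of consecutive indices.
    blocks = [[indices[0]]]
    for i in indices[1:]:
        if i == blocks[-1][-1] + 1:
            blocks[-1].append(i)
        else:
            blocks.append([i])
    blocklength = len(blocks[0])
    if any(len(b) != blocklength for b in blocks):
        return None
    if len(blocks) == 1:
        # A single contiguous run: one block, stride is irrelevant.
        return (1, blocklength, blocklength)
    stride = blocks[1][0] - blocks[0][0]
    for prev, cur in zip(blocks, blocks[1:]):
        if cur[0] - prev[0] != stride:
            return None
    return (len(blocks), blocklength, stride)


# A packing loop that copies columns 2..3 of every row of a 4x6 matrix.
rows, cols = 4, 6
touched = []
for r in range(rows):
    for c in range(2, 4):
        touched.append(r * cols + c)

print(vector_from_indices(touched))
```

For this loop the indices fall into four blocks of two elements whose starts are six elements apart, so the whole loop nest collapses to a single three-number description plus a starting offset. Irregular index sequences return `None` here; in MPI they would need a more general constructor such as an indexed type.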

References

EuroMPI'12
[3] Timo Schneider, Robert Gerstenberger, Torsten Hoefler:
 Micro-Applications for Communication Data Access Patterns and MPI Datatypes. In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012, Proceedings, Vol. 7490, pages 121-131, Springer, ISBN: 978-3-642-33517-4, Sep. 2012. Invited to a journal special issue on top picks from EuroMPI'12.
EuroMPI'11
[2] William Gropp, Torsten Hoefler, Rajeev Thakur and Jesper Larsson Träff:
 Performance Expectations and Guidelines for MPI Derived Datatypes. In Recent Advances in the Message Passing Interface (EuroMPI'11), Vol. 6960, presented in Santorini, Greece, pages 150-159, Springer, ISBN: 978-3-642-24448-3, Sep. 2011.
EuroMPI'10
[1] Torsten Hoefler and S. Gottlieb:
 Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient using MPI Datatypes. In Recent Advances in the Message Passing Interface (EuroMPI'10), Vol. LNCS 6305, presented in Stuttgart, Germany, pages 132-141, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010.




© Torsten Hoefler