Publications of Torsten Hoefler
Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Felix Thaler, Stefan Moosbrugger, Carlos Osuna, Mauro Bianco, Hannes Vogt, Anton Afanasyev, Lukas Mosimann, Oliver Fuhrer, Thomas Schulthess, Torsten Hoefler:

 Porting the COSMO Weather Model to Intel KNL

(Presented in Zurich, Switzerland, ACM, Jun. 2019. Accepted at the ACM Platform for Advanced Scientific Computing Conference (PASC19).)


Weather and climate simulations are a major application driver in high-performance computing (HPC). With the end of Dennard scaling and Moore's law, the HPC industry increasingly employs specialized computation accelerators to increase computational throughput. Manycore architectures, such as Intel's Knights Landing (KNL), are a representative example of future processing devices. However, software has to be modified to use these devices efficiently. In this work, we demonstrate how an existing domain-specific language (DSL) that has been designed for CPUs and GPUs can be extended to manycore architectures such as KNL. On hand-tuned representative stencils of the dynamical core (dycore) of the COSMO weather model and its radiation code, we achieve performance comparable to the NVIDIA Tesla P100 GPU architecture. Further, for the full DSL-based, GPU-optimized COSMO dycore code, we present performance within a factor of two of the P100. We find that optimizing code to full performance on modern manycore architectures requires effort and hardware knowledge similar to that required for GPUs. Finally, we show limitations of the present approaches and outline our lessons learned and possible principles for the design of future DSLs for accelerators in the weather and climate domain.




  author={Felix Thaler and Stefan Moosbrugger and Carlos Osuna and Mauro Bianco and Hannes Vogt and Anton Afanasyev and Lukas Mosimann and Oliver Fuhrer and Thomas Schulthess and Torsten Hoefler},
  title={{Porting the COSMO Weather Model to Intel KNL}},
  location={Zurich, Switzerland},
  note={Accepted at the ACM Platform for Advanced Scientific Computing Conference (PASC19)},

© Torsten Hoefler