Publications of Torsten Hoefler
Patrik Okanovic, Grzegorz Kwasniewski, Paolo Sylos Labini, Maciej Besta, Flavio Vella, Torsten Hoefler:

 High Performance Unstructured SpMM Computation Using Tensor Cores

(In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24), presented in Atlanta, GA, USA, pages 154:1-154:14, IEEE Press, ISBN: 979-8-3503-5291-7, Nov. 2024)

Publisher Reference

Abstract

High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense data structures. Yet, existing hardware, such as Tensor Cores (TC), is ill-suited for SpMM, as it imposes strict constraints on data structures that cannot be met by unstructured sparsity found in many applications. To address this, we introduce (S)parse (Ma)trix Matrix (T)ensor Core-accelerated (SMaT): a novel SpMM library that utilizes TCs for unstructured sparse matrices. Our block-sparse library leverages the low-level CUDA MMA (matrix-matrix-accumulate) API, maximizing the performance offered by modern GPUs. Algorithmic optimizations, such as sparse matrix permutation, further improve performance by minimizing the number of non-zero blocks. The evaluation on NVIDIA A100 shows that SMaT outperforms SotA libraries (DASP, cuSPARSE, and Magicube) by up to 125x (on average 2.6x). SMaT can be used to accelerate many workloads in scientific computing, large model training, inference, and others.
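
To make the idea of "minimizing the number of non-zero blocks" concrete, here is a minimal Python sketch. It is not SMaT's implementation, and the abstract does not specify the permutation algorithm. The sort-based heuristic, the function names, the block sizes, and the tiny sparsity pattern below are all assumptions chosen for illustration. The sketch counts how many fixed-size blocks contain at least one non-zero, before and after reordering rows so that rows touching the same column blocks sit next to each other.

```python
def count_nonzero_blocks(rows, block_h, block_w):
    """Count block_h x block_w blocks that hold at least one non-zero.

    rows[i] is the set of column indices of the non-zeros in row i;
    only the sparsity pattern matters here, not the values.
    """
    blocks = set()
    for i, cols in enumerate(rows):
        for j in cols:
            blocks.add((i // block_h, j // block_w))
    return len(blocks)


def permute_rows_by_pattern(rows, block_w):
    """Return a row order that groups rows touching the same column blocks.

    This is a simple sort-based heuristic assumed for illustration,
    not the permutation used by the library described above.
    """
    def key(i):
        return sorted({j // block_w for j in rows[i]})
    return sorted(range(len(rows)), key=key)


# Hypothetical 4-row pattern: rows 0 and 2 use columns 0-1,
# rows 1 and 3 use columns 4-5.
pattern = [{0, 1}, {4, 5}, {0, 1}, {4, 5}]

order = permute_rows_by_pattern(pattern, block_w=2)
permuted = [pattern[i] for i in order]

before = count_nonzero_blocks(pattern, block_h=2, block_w=2)
after = count_nonzero_blocks(permuted, block_h=2, block_w=2)
print(before, after)
```

In this toy case the reordering halves the number of blocks a block-sparse kernel would have to process. Permuting the rows of the sparse operand A in C = A·B also permutes the rows of C, so the result has to be mapped back with the inverse permutation.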

Documents

Publisher URL: https://doi.org/10.1109/SC41406.2024.00060

BibTeX

@inproceedings{okanovic2024high,
  author={Patrik Okanovic and Grzegorz Kwasniewski and Paolo Sylos Labini and Maciej Besta and Flavio Vella and Torsten Hoefler},
  title={{High Performance Unstructured SpMM Computation Using Tensor Cores}},
  year={2024},
  month={Nov.},
  pages={154:1--154:14},
  booktitle={Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'24)},
  location={Atlanta, GA, USA},
  publisher={IEEE Press},
  isbn={979-8-3503-5291-7},
  source={http://www.unixer.de/~htor/publications/},
}


© Torsten Hoefler