Brief performance portability analysis of a matrix multiplication kernel on multiple vendor GPUs
The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the performance and portability of the SYCL and CUDA languages for a mat...
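Main authors: | Costanzo, Manuel; Rucci, Enzo; García-Sánchez, Carlos; Naiouf, Marcelo |
---|---|
Format: | Conference object |
Language: | English |
Published: | 2023 |
Subjects: | Computer Science; oneAPI; SYCL; GPU; CUDA; Performance portability |
Online access: | http://sedici.unlp.edu.ar/handle/10915/155420 |
Contributed by: |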
id | I19-R120-10915-155420 |
---|---|
record_format | dspace |
spelling | I19-R120-10915-155420; 2023-07-11T20:01:41Z; http://sedici.unlp.edu.ar/handle/10915/155420; isbn:978-950-34-2271-7; Brief performance portability analysis of a matrix multiplication kernel on multiple vendor GPUs; Costanzo, Manuel; Rucci, Enzo; García-Sánchez, Carlos; Naiouf, Marcelo; 2023-06; 2023; 2023-07-11T17:09:21Z; en; Computer Science; oneAPI; SYCL; GPU; CUDA; Performance portability; The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the performance and portability of the SYCL and CUDA languages for a matrix multiplication (MM) application across different GPU architectures. The experimental work showed that, while the CUDA implementation outperforms the SYCL implementation on NVIDIA devices due to optimizations provided by the nvcc compiler, the latter implementation demonstrated remarkable code portability to other GPU architectures, such as AMD and Intel. Furthermore, the architectural efficiency percentages obtained on AMD and Intel GPUs showed consistency with the results observed on NVIDIA devices.; Facultad de Informática; Conference object; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); application/pdf; 13-18 |
institution | Universidad Nacional de La Plata |
institution_str | I-19 |
repository_str | R-120 |
collection | SEDICI (UNLP) |
language | English |
topic | Computer Science; oneAPI; SYCL; GPU; CUDA; Performance portability |
description | The heterogeneous computing paradigm has led to the need for portable and efficient programming solutions that can leverage the capabilities of various hardware devices, such as NVIDIA, Intel, and AMD GPUs. This study evaluates the performance and portability of the SYCL and CUDA languages for a matrix multiplication (MM) application across different GPU architectures. The experimental work showed that, while the CUDA implementation outperforms the SYCL implementation on NVIDIA devices due to optimizations provided by the nvcc compiler, the latter implementation demonstrated remarkable code portability to other GPU architectures, such as AMD and Intel. Furthermore, the architectural efficiency percentages obtained on AMD and Intel GPUs showed consistency with the results observed on NVIDIA devices. |
format | Conference object |
author | Costanzo, Manuel; Rucci, Enzo; García-Sánchez, Carlos; Naiouf, Marcelo |
author_sort | Costanzo, Manuel |
title | Brief performance portability analysis of a matrix multiplication kernel on multiple vendor GPUs |
publishDate | 2023 |
url | http://sedici.unlp.edu.ar/handle/10915/155420 |