Performance of different GPU data access implementation strategies

This dataset contains the results of experiments benchmarking different memory access implementation strategies for CUDA kernel execution in a multi-GPU environment. The tests were conducted on two NVIDIA V100 GPUs connected via NVLink. The hardware environment was provided by the imec GPULab. Using a simple GPU kernel that computes the element-wise product of two vectors, we compared the GPU execution times obtained with traditional explicit host-GPU data transfers against those of implementations based on Unified Memory and Unified Virtual Addressing (UVA). The resulting performance data are intended as a reference for researchers and practitioners designing multi-GPU applications or algorithms.
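To make the comparison concrete, the following is a minimal single-GPU sketch of two of the strategies named above: explicit host-GPU copies and Unified Memory. It is not the code used to produce this dataset. The kernel name `multiply`, the helper functions `run_explicit` and `run_managed`, the `CUDA_CHECK` macro, the input values, and the block size of 256 are all assumptions chosen for illustration. The UVA-based and multi-GPU variants are not shown.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",          \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Element-wise product of two vectors: c[i] = a[i] * b[i].
__global__ void multiply(const float* a, const float* b, float* c, size_t n) {
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (i < n) c[i] = a[i] * b[i];
}

// Strategy 1: explicit transfers with cudaMalloc and cudaMemcpy.
// Returns the elapsed time in ms for copy-in, kernel, and copy-out.
float run_explicit(size_t n, bool* ok) {
    const size_t bytes = n * sizeof(float);
    std::vector<float> ha(n, 2.0f), hb(n, 3.0f), hc(n, 0.0f);
    float *da = nullptr, *db = nullptr, *dc = nullptr;
    CUDA_CHECK(cudaMalloc(&da, bytes));
    CUDA_CHECK(cudaMalloc(&db, bytes));
    CUDA_CHECK(cudaMalloc(&dc, bytes));

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));
    const unsigned threads = 256;  // assumed block size
    const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);

    CUDA_CHECK(cudaEventRecord(start));
    CUDA_CHECK(cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice));
    multiply<<<blocks, threads>>>(da, db, dc, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));
    *ok = true;
    for (size_t i = 0; i < n; ++i) if (hc[i] != 6.0f) *ok = false;

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(da));
    CUDA_CHECK(cudaFree(db));
    CUDA_CHECK(cudaFree(dc));
    return ms;
}

// Strategy 2: Unified Memory with cudaMallocManaged. The same pointers are
// used on host and device; the runtime migrates the data as needed.
// Returns the elapsed time in ms for the kernel, including any migration
// that happens while it runs.
float run_managed(size_t n, bool* ok) {
    const size_t bytes = n * sizeof(float);
    float *a = nullptr, *b = nullptr, *c = nullptr;
    CUDA_CHECK(cudaMallocManaged(&a, bytes));
    CUDA_CHECK(cudaMallocManaged(&b, bytes));
    CUDA_CHECK(cudaMallocManaged(&c, bytes));
    for (size_t i = 0; i < n; ++i) { a[i] = 2.0f; b[i] = 3.0f; c[i] = 0.0f; }

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));
    const unsigned threads = 256;  // assumed block size
    const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);

    CUDA_CHECK(cudaEventRecord(start));
    multiply<<<blocks, threads>>>(a, b, c, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));
    CUDA_CHECK(cudaDeviceSynchronize());  // required before the host reads c

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));
    *ok = true;
    for (size_t i = 0; i < n; ++i) if (c[i] != 6.0f) *ok = false;

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(a));
    CUDA_CHECK(cudaFree(b));
    CUDA_CHECK(cudaFree(c));
    return ms;
}
```

Calling `run_explicit(1 << 20, &ok)` and `run_managed(1 << 20, &ok)` from a `main` function yields one timing per strategy. The two timed regions cover different work (copies plus kernel versus kernel plus on-demand migration), so the numbers are not directly comparable without deciding which transfers to include in the measurement.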

Data and Resources

Additional Info

Field Value
Author Zoltan Juhasz
Maintainer Zoltan Juhasz
Last Updated April 4, 2023, 10:29 (UTC)
Created April 4, 2023, 10:29 (UTC)