Changes between Version 24 and Version 25 of GEMM_Performance_Cypress
- Timestamp:
- Aug 19, 2010 8:10:49 AM (14 years ago)
GEMM_Performance_Cypress
v24 v25
11 11 Comment(20100812): With the latest Catalyst 10.7, the slow GPU-to-CPU memory transfer speed is gone. We now get roughly 6 GB/sec in both directions. However, this is only the practical maximum transfer speed for the combination of Cypress and the X58 chipset. The actual transfer speed to and from main memory is rather slow, at most ~2.5 GB/sec with pinned memory allocation. This is the problem!
12 12
13 (v24 only) Update(20100819): We have implemented a proper treatment of alpha and beta. In the latest plot, we compare our results with ACML-GPU 1.1 and MAGMA BLAS 0.3 running on Fermi (we plot the numbers in "results_dgemm.txt", which is included in http://icl.cs.utk.edu/projectsfiles/magma/downloads/magmablas_gemm_fermi.tar.gz). Note that the lines without "I/O" show performance that excludes data transfer time between CPU and GPU.
13 (v25 only) Update(20100819): We have implemented a proper treatment of alpha and beta. In the latest plot, we compare our results with ACML-GPU 1.1 and MAGMA BLAS 0.3 running on Fermi (we plot the numbers in "results_dgemm.txt", which is included in http://icl.cs.utk.edu/projectsfiles/magma/downloads/magmablas_gemm_fermi.tar.gz). The ACML-GPU results are also updated with consistent settings. Note that the lines without "I/O" show performance that excludes data transfer time between CPU and GPU.
14 14
15 15 = Results =
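For readers unfamiliar with the terms in the 20100819 update: the standard BLAS GEMM operation is C <- alpha*A*B + beta*C, and curves labelled "I/O" differ from the others only in whether CPU-GPU transfer time is added to the elapsed time. The following is a minimal reference sketch of both ideas, not the GPU kernel described on this page. The function names `gemm` and `gflops` are hypothetical, and the 2*m*n*k flop count is the usual convention, assumed here rather than taken from the page.

```python
def gemm(alpha, a, b, beta, c):
    """Reference C <- alpha*A*B + beta*C on lists of lists (unoptimized)."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def gflops(m, n, k, kernel_sec, transfer_sec=0.0):
    """GEMM rate in GFLOP/s, assuming the conventional 2*m*n*k flop count.

    Leave transfer_sec at 0.0 for a kernel-only figure; pass the measured
    CPU<->GPU transfer time to obtain an "I/O"-style figure.
    """
    return 2.0 * m * n * k / (kernel_sec + transfer_sec) / 1e9
```

With illustrative timings (not measurements from this page), `gflops(1000, 1000, 1000, 1.0)` gives the kernel-only rate, while `gflops(1000, 1000, 1000, 1.0, 1.0)` halves it because the transfer takes as long as the computation.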