Changes between Version 19 and Version 20 of GEMM_Performance_Cypress


Timestamp:
Aug 12, 2010 6:23:48 AM
Author:
nakasato
Comment:

--

  • GEMM_Performance_Cypress

    v19 v20
    9   9   Update(20100808): The previous figure for DGEMM shows wrong results. The correct performance of our DGEMM is slightly slower than the previous data.  "NN" is my initial not matrix transpose options. This result is for At B. We are currently working on other options. And will work on scalar constants.
    10  10
    11      Comment(20100812): With the latest Catalyst 10.7, slow transfer speed for GPU to CPU memory is gone. We have roughly ~ 6 GB/sec for both direction. However, it just represents practical maximum transfer speed with a combination of Cypress and X58 chipset. The actual transfer speed from/to the main memory is rather slow ~ 2.5 GB/sec at most with the pinned memory allocation. This is the problem!
        11  Comment(20100812): With the latest Catalyst 10.7, slow transfer speed for GPU to CPU memory is gone. We have roughly ~ 6 GB/sec for both directions. However, it just represents practical maximum transfer speed with a combination of Cypress and X58 chipset. The actual transfer speed from/to the main memory is rather slow ~ 2.5 GB/sec at most with the pinned memory allocation. This is the problem!
    12  12
    13  13  = Results =