Changes between Version 17 and Version 18 of GEMM_Performance_Cypress


Timestamp: Aug 8, 2010 11:14:49 PM (14 years ago)
Author: nakasato
Comment: --

  • GEMM_Performance_Cypress

    v17 v18  
    7   7   Update(20100803): We have posted a preliminary result for our DGEMM routine with alpha = 1 and beta = 1 (the line labeled "NN"). The performance figure presented here includes the I/O time between the CPU and the GPU. There appears to be considerable room for more aggressive I/O optimization, which should incorporate the transposition of the input matrix A.
    8   8
    9       Update(20100808): The previous DGEMM figure showed incorrect results. The correct performance of our DGEMM is slightly lower than the previously posted data. "NN" was my initial label for the non-transposed options; this result is for A^T B. We are currently working on the other options.
        9   Update(20100808): The previous DGEMM figure showed incorrect results. The correct performance of our DGEMM is slightly lower than the previously posted data. "NN" was my initial label for the non-transposed options; this result is for A^T B. We are currently working on the other options, and we will also work on the scalar constants.
    10  10
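
For readers unfamiliar with the operation being timed, here is a minimal pure-Python sketch of what a DGEMM call computes in the A^T B case mentioned above, C = alpha * A^T * B + beta * C. It is only a reference for the arithmetic, not the GPU kernel discussed on this page. The function names `gemm_tn` and `gemm_gflops` are hypothetical, and the 2*m*n*k operation count is the usual convention for GEMM rates; the page does not state how its own numbers were computed, so treat that as an assumption.

```python
def gemm_tn(alpha, a, b, beta, c):
    """Return alpha * A^T * B + beta * C as a new list of lists.

    a is stored as k x m (so A^T is m x k), b is k x n, c is m x n.
    Hypothetical helper for illustration only.
    """
    k, m, n = len(a), len(a[0]), len(b[0])
    # Start from the scaled C term.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for p in range(k):
        for i in range(m):
            aip = alpha * a[p][i]  # element (i, p) of A^T
            for j in range(n):
                out[i][j] += aip * b[p][j]
    return out


def gemm_gflops(m, n, k, seconds):
    """Rate in GFLOPS, assuming the conventional 2*m*n*k operation count.

    If seconds includes CPU-GPU transfer time, the rate includes I/O cost.
    """
    return 2.0 * m * n * k / seconds / 1e9
```

With alpha = 1 and beta = 1, as in the preliminary result, the call reduces to C = A^T B + C. Passing a wall-clock time that covers the host-device transfers to `gemm_gflops` gives a rate that includes I/O, in the sense described in the update note.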
    11  11  = Results =