Changes between Version 9 and Version 10 of GEMM_Performance_Cypress


Timestamp:
Aug 4, 2010 9:55:47 AM (14 years ago)
Author:
nakasato
Comment:

--

  • GEMM_Performance_Cypress

    v9 v10  
    33 
    44= Introduction =  
    5 We have tested the ACML-GPU version 1.1. We used a Cypress GPU running at 850 MHz for our tests. The operationt system we adopted is Ubuntu 10.04 LTS. Note that we present two lines for each plot below: one is the result obtained with the X58 chipset and the other with the X38 chipset. The PCI transfer speed of Cypress with the X58 chipset is rather limited, roughly 600 MB/sec from GPU to CPU. The X38 chipset shows a reasonable speed of > 6 GB/sec for large data. However, it seems that the transfer speed is not critical for the GEMM benchmark.  
     5We have tested the ACML-GPU version 1.1. We used a Cypress GPU running at 850 MHz for our tests. The operating system we adopted is Ubuntu 10.04 LTS. Note that we present two lines for each plot below: one is the result obtained with the X58 chipset and the other with the X38 chipset. The PCI transfer speed of Cypress with the X58 chipset is rather limited, roughly 600 MB/sec from GPU to CPU. The X38 chipset shows a reasonable speed of > 6 GB/sec for large data. However, it seems that the transfer speed is not critical for the GEMM benchmark.  
    66 
    7 Update(20100803): We put a preliminary result of our DGEMM routine with alpha = 1 and beta = 1. The performance number presented here inclue I/O time between CPU and GPU. It seems there is considerable room for more aggressive I/O optimization, which should incorporate transposing the input matrix A. 
     7Update(20100803): We put a preliminary result of our DGEMM routine with alpha = 1 and beta = 1 (the line with "NN"). The performance number presented here includes I/O time between CPU and GPU. It seems there is considerable room for more aggressive I/O optimization, which should incorporate transposing the input matrix A. 
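
As a rough illustration of what "includes I/O time" means for a reported GEMM rate, the following Python sketch estimates the rate a caller would observe when host/device transfers are added to the kernel time. It is not the measurement code used for this page. The function names, the assumption that transfers are not overlapped with computation, and the kernel rate and CPU-to-GPU bandwidth passed in are all hypothetical; only the 600 MB/sec and 6 GB/sec GPU-to-CPU figures come from the text above.

```python
def gemm_flops(m, n, k):
    """Operation count for C = alpha*A*B + beta*C, using the usual 2*m*n*k convention."""
    return 2.0 * m * n * k


def effective_gflops(m, n, k, kernel_gflops, h2d_gbs, d2h_gbs, bytes_per_elem=8):
    """Estimate the GFLOP/s seen by the caller when transfer time is included.

    Assumptions (not taken from the page): transfers and compute do not
    overlap, and bandwidths are given in GB/sec with 1 GB = 1e9 bytes.
    bytes_per_elem=8 corresponds to double precision (DGEMM).
    """
    flops = gemm_flops(m, n, k)
    t_kernel = flops / (kernel_gflops * 1e9)

    # With beta != 0 the old C has to be uploaded together with A and B.
    bytes_up = (m * k + k * n + m * n) * bytes_per_elem
    # The result matrix C is downloaded once.
    bytes_down = m * n * bytes_per_elem

    t_io = bytes_up / (h2d_gbs * 1e9) + bytes_down / (d2h_gbs * 1e9)
    return flops / (t_kernel + t_io) / 1e9


if __name__ == "__main__":
    # Placeholder kernel rate and upload bandwidth; only the two
    # GPU-to-CPU rates (0.6 and 6.0 GB/sec) are quoted from the page.
    for d2h in (0.6, 6.0):
        rate = effective_gflops(4096, 4096, 4096,
                                kernel_gflops=300.0, h2d_gbs=5.0, d2h_gbs=d2h)
        print(f"GPU-to-CPU {d2h} GB/sec -> {rate:.1f} GFLOP/s including I/O")
```

Because the operation count grows as 2*m*n*k while the transferred data grows only with the matrix areas, the I/O share of the total time shrinks as the matrices get larger; how much it matters in practice depends on the actual kernel rate, which this sketch leaves as an input.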
    88 
    99= Results =