Changes between Version 1 and Version 2 of Fast_GEMM_Implementation_On_Cypress
Timestamp: Nov 8, 2010 10:17:38 AM
Fast_GEMM_Implementation_On_Cypress
v1  v2
21  21    to heavily use shared memory on GPUs, we show that texture cache is very effective on the Cypress architecture.
22  22
    23  + == preprint ==
    24  +
    25  + [attachment:Nakasato_PBMS2010.pdf]
23  26
24  27    == Sample program for DGEMM ==
…   …
29  32    * [wiki:"GEMM_Performance_Cypress"]
30  33    * [wiki:"MatrixMultiply"]
31      -
32      - == preprint ==
33      -
34      - [attachment:Nakasato_PBMS2010.pdf]