Changes between Version 9 and Version 10 of MatrixMultiply
Timestamp: Jun 7, 2010 10:11:16 PM
MatrixMultiply
  = Matrix Multiply on GPU =
- == Our Results ==
+ We have implemented a single- and double-precision matrix multiply program for RV770/Cypress. Our implementation uses two input streams: one is the transposed input matrix A, and the other is the input matrix B in normal (untransposed) format. The output matrix C is also not transposed. We adopted an 8x8 block for single precision and a 4x4 block for double precision. Here are the benchmark results:

+ == Single precision ==
+ [[Image(SMM.png)]]

- [[Image(MM1.png)]]
+ == Double precision ==
+ [[Image(DMM.png)]]

  = Useful forum discussions =
  …
  Meta-programming works in reality.

+ [[Image(MM1.png)]]
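
The layout described in the added paragraph (transposed A, untransposed B and C, one square block of C per work item) can be illustrated with a small CPU reference. This is only a sketch in Python, not the GPU kernel from the page: the function names `matmul_at_b_blocked` and `transpose` are hypothetical, and the assumption that each block is accumulated as a sum of outer products over k is ours. It shows why passing A transposed is convenient: for a fixed k, the values of A needed by a block are a contiguous slice of row k of the transposed matrix, just as the values of B are a contiguous slice of row k of B.

```python
def transpose(mat):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*mat)]


def matmul_at_b_blocked(a_t, b, block=8):
    """Compute C = A * B given a_t = transpose(A), one block x block tile of C at a time.

    a_t is K x M (row k of a_t holds column k of A), b is K x N, and the
    result C is M x N. The tile size is an illustrative parameter: the page
    reports 8x8 for single precision and 4x4 for double precision.
    """
    k_dim = len(a_t)
    m = len(a_t[0])
    n = len(b[0])
    assert len(b) == k_dim, "A^T and B must have the same number of rows (K)"

    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, block):
        for j0 in range(0, n, block):
            i_end = min(i0 + block, m)
            j_end = min(j0 + block, n)
            # Accumulator tile, standing in for the registers of one work item.
            acc = [[0.0] * (j_end - j0) for _ in range(i_end - i0)]
            for k in range(k_dim):
                a_row = a_t[k][i0:i_end]  # contiguous slice of row k of A^T
                b_row = b[k][j0:j_end]    # contiguous slice of row k of B
                # Rank-1 (outer product) update of the tile.
                for di, a_val in enumerate(a_row):
                    acc_row = acc[di]
                    for dj, b_val in enumerate(b_row):
                        acc_row[dj] += a_val * b_val
            # Write the finished tile into the untransposed output C.
            for di in range(i_end - i0):
                c[i0 + di][j0:j_end] = acc[di]
    return c
```

Calling `matmul_at_b_blocked(transpose(a), b, block=4)` would correspond to the 4x4 blocking mentioned for double precision. The block size changes only how the work is partitioned, not the result.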