Version 15 (modified by nakasato, 15 years ago)
Performance comparison of GPU boards from AMD as of Oct 2009
We have tested GPU boards from AMD with our test program, which implements a simple O(N^2) force evaluation algorithm. The nominal performance of each board is shown in the following table.
board | arch | core clock | memory | stream cores | SP mul-add | DP add | DP mul | memory BW |
HD4850 | RV770 | 625 MHz | DDR3 662 MHz 256bit | 800 | 1040 GFLOPS | 208 GFLOPS | 104 GFLOPS | 63.6 GB/sec |
HD4870 | RV770 | 750 MHz | DDR5 900 MHz 256bit | 800 | 1200 GFLOPS | 240 GFLOPS | 120 GFLOPS | 115.2 GB/sec |
HD4770 | RV740 | 750 MHz | DDR5 800 MHz 128bit | 640 | 960 GFLOPS | 192 GFLOPS | 96 GFLOPS | 51.2 GB/sec |
HD5870 | RV870 | 850 MHz | DDR5 1.2 GHz 256bit | 1600 | 2720 GFLOPS | 544 GFLOPS | 272 GFLOPS | 153.6 GB/sec |
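The nominal figures for the three GDDR5 boards can be reproduced from the stream-core count, the core clock, and the memory settings. The sketch below is only an illustration: it assumes one multiply-add (2 flops) per stream core per clock, four transfers per memory clock for GDDR5, and takes the 1/5 and 1/10 double-precision ratios directly from the table. The function names are hypothetical. The HD4850 row is left out because its listed figures do not follow from its listed clocks under these assumptions.

```python
# Inputs copied from the table: (stream cores, core clock MHz,
# memory clock MHz, memory bus width in bits).
BOARDS = {
    "HD4870": (800, 750, 900, 256),
    "HD4770": (640, 750, 800, 128),
    "HD5870": (1600, 850, 1200, 256),
}


def sp_peak_gflops(stream_cores, clock_mhz):
    """Single-precision peak, assuming 1 multiply-add (2 flops) per core per clock."""
    return stream_cores * clock_mhz * 2 / 1000.0


def dp_peaks_gflops(sp_gflops):
    """DP add and DP mul peaks, using the 1/5 and 1/10 ratios seen in the table."""
    return sp_gflops / 5.0, sp_gflops / 10.0


def gddr5_bandwidth_gb_s(mem_clock_mhz, bus_bits):
    """Memory bandwidth, assuming 4 transfers per memory clock (GDDR5)."""
    return mem_clock_mhz * 4 * bus_bits / 8 / 1000.0


if __name__ == "__main__":
    for name, (cores, clock, mem_clock, bus) in BOARDS.items():
        sp = sp_peak_gflops(cores, clock)
        dp_add, dp_mul = dp_peaks_gflops(sp)
        bw = gddr5_bandwidth_gb_s(mem_clock, bus)
        print(f"{name}: SP {sp:.0f}, DP add {dp_add:.0f}, "
              f"DP mul {dp_mul:.0f} GFLOPS, BW {bw:.1f} GB/sec")
```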
The program is basically the same as our demo program posted here. The demo program should work with the 5870, but I have not tested it.
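For readers unfamiliar with the algorithm, a direct O(N^2) force evaluation can be sketched on the CPU as follows. This is not the GPU test program itself: it assumes a softened gravitational interaction with G = 1, and the function name `direct_forces` and the softening parameter `eps` are hypothetical choices for illustration.

```python
import math


def direct_forces(pos, mass, eps=1.0e-2):
    """Accelerations from a direct O(N^2) sum over all pairs.

    pos  : list of (x, y, z) positions
    mass : list of masses, same length as pos
    eps  : softening length (assumed Plummer softening, G = 1)
    """
    n = len(pos)
    eps2 = eps * eps
    acc = []
    for i in range(n):
        xi, yi, zi = pos[i]
        ax = ay = az = 0.0
        for j in range(n):
            if j == i:
                continue  # skip the self-interaction
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            m_over_r3 = mass[j] / (r2 * math.sqrt(r2))
            ax += m_over_r3 * dx
            ay += m_over_r3 * dy
            az += m_over_r3 * dz
        acc.append((ax, ay, az))
    return acc


if __name__ == "__main__":
    # Two unit masses separated by one length unit along x, no softening.
    print(direct_forces([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0], eps=0.0))
```

Every particle i loops over every other particle j, so the work grows as N^2 while the data read per interaction stays small. That makes a kernel of this kind compute bound rather than memory bound.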
Result
Note
- We count one force interaction as 38 floating-point operations. This is the traditional flop count in this field.
- At large N, the 5870 shows 2.2x better performance than the 4870; it reaches ~2.2 Tflops.
- The 4850 and 4770 show identical performance. Memory bandwidth is not critical at all in this test program.
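The flop-count convention in the notes turns a measured wall-clock time into a GFLOPS figure. The sketch below assumes that one force evaluation over N particles is counted as N^2 interactions. The function names and the timing used in the example are hypothetical, not measurements from this page.

```python
FLOPS_PER_INTERACTION = 38  # convention stated in the notes


def achieved_gflops(n_particles, seconds):
    """GFLOPS for one force evaluation, counting N^2 interactions (assumption)."""
    interactions = n_particles * n_particles
    return FLOPS_PER_INTERACTION * interactions / seconds / 1.0e9


def fraction_of_peak(gflops, peak_gflops):
    """Achieved performance as a fraction of the nominal peak."""
    return gflops / peak_gflops


if __name__ == "__main__":
    # Hypothetical timing: N = 100000 particles evaluated in 0.38 s.
    print(achieved_gflops(100_000, 0.38))
    # ~2.2 Tflops on the 5870 against its nominal 2720 GFLOPS SP peak.
    print(fraction_of_peak(2200.0, 2720.0))
```

With the figures quoted on this page, ~2.2 Tflops on the 5870 corresponds to roughly 80% of its nominal single-precision peak.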
Benchmark system
CPU | Core2 E8400 3.0 GHz |
MB | Asus P5EWS |
Memory | DDR2 800 1 GB x 4 |
Power unit | Scythe CorePower3 600W |
OS | Ubuntu 8.04 x86_64 2.6.24-23-generic |
Catalyst | 9.9 |
CAL version | 1.4beta |
This system looks old, but the PCIe bus on the P5EWS is faster than on any other system we have so far.
Attachments (2)
- GFLOPS.png (36.5 KB) - added by nakasato 15 years ago.
- GFLOPS_opt.png (28.3 KB) - added by nakasato 15 years ago.