I compared the performance of N-body simulations on CPU and GPU
using several optimization techniques. Each program is written in
OpenCL, which standardizes APIs for GPUs. An important optimization
technique in OpenCL is vectorization, which lets us treat multiple
variables as a single vector variable. Among the programs compared,
the one that treated four variables as one performed best. I also
optimized the program using the shuffle function and found that the
N-body calculation was about 1.3 times faster with the shuffle
function than without it. In addition, I found that the Intel SDK
was able to vectorize the kernel program efficiently.