I compared the performance of N-body simulations on CPU and GPU
using several optimization techniques. Each program is written in
OpenCL, which provides a standard API for GPU programming. An
important optimization technique in OpenCL is vectorization, which
lets us handle multiple variables as a single vector variable. In my
measurements, the program that packed four variables into one vector
achieved the best performance. I further optimized the program using
the shuffle function and found that the N-body calculation using
shuffle was about 1.3 times faster than the version without it. I
also found that the Intel SDK was able to vectorize the kernel
program efficiently.
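
The idea of packing four values into one vector, and the lane rotation that a shuffle can perform, can be modeled roughly as follows. This is a minimal sketch in plain Python rather than OpenCL, so it illustrates only the data movement and not the speedup. The function names, the softening constant `EPS2`, the unit gravitational constant, and the rotate-by-one shuffle mask are all assumptions made for illustration; they are not taken from the programs measured here.

```python
import math

EPS2 = 1e-3  # softening term (assumed value); keeps r2 > 0 when a body meets itself


def rotate(v):
    """Model of a four-lane shuffle with mask (1, 2, 3, 0): rotate lanes left by one."""
    return [v[1], v[2], v[3], v[0]]


def accel_vec4(xi, yi, zi, xj, yj, zj, mj):
    """Acceleration on four bodies i caused by four bodies j (G = 1 assumed).

    Each argument is a list of four floats, standing in for one four-wide vector.
    """
    ax, ay, az = [0.0] * 4, [0.0] * 4, [0.0] * 4
    for _ in range(4):  # four rotations cover all 4 x 4 pairs
        # On vector hardware the four lanes below would execute together.
        for k in range(4):
            dx = xj[k] - xi[k]
            dy = yj[k] - yi[k]
            dz = zj[k] - zi[k]
            r2 = dx * dx + dy * dy + dz * dz + EPS2
            s = mj[k] / (r2 * math.sqrt(r2))
            ax[k] += dx * s
            ay[k] += dy * s
            az[k] += dz * s
        # Rotate the j-side lanes so every lane meets a new partner next pass.
        xj, yj, zj, mj = rotate(xj), rotate(yj), rotate(zj), rotate(mj)
    return ax, ay, az


def accel_scalar(x, y, z, m):
    """Reference version: one pair at a time, no packing."""
    n = len(x)
    ax, ay, az = [0.0] * n, [0.0] * n, [0.0] * n
    for i in range(n):
        for j in range(n):
            dx, dy, dz = x[j] - x[i], y[j] - y[i], z[j] - z[i]
            r2 = dx * dx + dy * dy + dz * dz + EPS2
            s = m[j] / (r2 * math.sqrt(r2))
            ax[i] += dx * s
            ay[i] += dy * s
            az[i] += dz * s
    return ax, ay, az


if __name__ == "__main__":
    x, y, z = [0.0, 1.0, 0.0, 2.0], [0.0, 0.0, 1.0, 2.0], [0.0, 0.5, -0.5, 1.0]
    m = [1.0, 2.0, 3.0, 4.0]
    print(accel_vec4(x, y, z, x, y, z, m))
    print(accel_scalar(x, y, z, m))
```

In this model a body paired with itself contributes nothing, because the displacement is zero and the softening term keeps the denominator nonzero. The two functions add the pair contributions in a different order, so their results agree only up to floating-point rounding.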