Version 6 (modified by nakasato, 10 years ago)
Y.Suzuki
Fast N-Body Calculation Implemented by OpenCL with Vectorization
I compared the performance of N-body simulations on CPU and GPU using several optimization techniques. Each program is written in OpenCL, which standardizes APIs for GPUs. An important optimization technique in OpenCL is vectorization, which lets us treat multiple variables as a single vector variable. The program that packed 4 variables into one achieved the best performance. I further optimized the program using the shuffle function and found that the N-body calculation with the shuffle function was about 1.3 times faster than without it. I also found that the Intel SDK can vectorize the kernel program efficiently.
file:///home/committee/aac/Thesis2011/s1140123
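The idea of treating 4 variables as one, as described in the abstract above, can be sketched in plain Python. This is only an illustration and not the thesis kernel: a list of 4 lanes stands in for an OpenCL `float4`, and the function names, the softening length `eps`, and the zero-mass padding are assumptions made for this example.

```python
import math

def accel_scalar(pos, mass, i, eps=1e-3):
    """Acceleration on particle i, handling one j-particle per iteration."""
    xi, yi, zi = pos[i]
    ax = ay = az = 0.0
    for (xj, yj, zj), mj in zip(pos, mass):
        dx, dy, dz = xj - xi, yj - yi, zj - zi
        r2 = dx * dx + dy * dy + dz * dz + eps * eps  # softened distance^2 (assumption)
        s = mj / (r2 * math.sqrt(r2))
        ax += dx * s
        ay += dy * s
        az += dz * s
    return ax, ay, az

def accel_vec4(pos, mass, i, eps=1e-3):
    """Same sum, but j-particles are consumed in groups of 4 'lanes'."""
    pad = (-len(pos)) % 4
    # Pad with zero-mass particles so the count is a multiple of 4 (assumption).
    pos_p = list(pos) + [(0.0, 0.0, 0.0)] * pad
    mass_p = list(mass) + [0.0] * pad
    xi, yi, zi = pos[i]
    ax, ay, az = [0.0] * 4, [0.0] * 4, [0.0] * 4  # one partial sum per lane
    for j in range(0, len(pos_p), 4):
        for lane in range(4):  # on a GPU these 4 lanes form one vector operation
            xj, yj, zj = pos_p[j + lane]
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            s = mass_p[j + lane] / (r2 * math.sqrt(r2))
            ax[lane] += dx * s
            ay[lane] += dy * s
            az[lane] += dz * s
    # Horizontal reduction of the 4 lanes at the very end.
    return sum(ax), sum(ay), sum(az)
```

Both functions compute the same acceleration up to rounding, because the lanes only change the order of summation. The softening length must be positive so that the self-interaction and the padded particles contribute zero rather than dividing by zero.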
K.Kamijima
Performance Evaluation of the Octree Method on GPU
The purpose of my research is to examine the performance of a graphics processing unit (GPU) on a numerical algorithm for particle simulations. Specifically, I adopt the Octree method, which requires many branch instructions. In general, GPUs are intrinsically not good at dealing with branch instructions. I have implemented the Octree method in single and double precision on a Cypress GPU. On this GPU, the peak performance of single-precision operations is five times higher than that of double-precision operations. Although the Octree method was expected to be much slower in double precision, I found that this is not the case. The performance of the Octree method on the GPU is constrained not by the computing power of the GPU but by the performance penalty due to branch instructions.
file:///home/committee/aac/Thesis2011/s1150062
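To show where the branch instructions mentioned in the abstract above come from, here is a minimal sketch of an octree walk in Python. It is not the thesis implementation: the dictionary-based node layout, the opening parameter `theta`, the use of the gravitational potential with G = 1, and the requirement that bodies have distinct positions and positive masses are all assumptions made for this example.

```python
import math

def build(bodies, center, half):
    """Recursively build a cubic cell from a list of ((x, y, z), mass) bodies.

    Assumes distinct positions and positive masses (illustrative only).
    """
    total = sum(m for _, m in bodies)
    node = {
        "half": half,  # half of the cell edge length
        "mass": total,
        "com": tuple(sum(p[k] * m for p, m in bodies) / total for k in range(3)),
        "children": [],
    }
    if len(bodies) > 1:
        octants = {}
        for p, m in bodies:
            key = tuple(p[k] >= center[k] for k in range(3))
            octants.setdefault(key, []).append((p, m))
        for key, sub in octants.items():
            c = tuple(center[k] + (half / 2 if key[k] else -half / 2) for k in range(3))
            node["children"].append(build(sub, c, half / 2))
    return node

def potential(node, p, theta):
    """Potential at p; every 'if' below is a data-dependent branch."""
    d = math.dist(p, node["com"])
    if not node["children"]:          # leaf holding a single body
        return 0.0 if d == 0.0 else -node["mass"] / d
    if 2 * node["half"] < theta * d:  # cell is far enough: use its monopole
        return -node["mass"] / d
    # Otherwise open the cell and visit its children.
    return sum(potential(c, p, theta) for c in node["children"])
```

With `theta = 0` no cell is ever accepted, so the walk reduces to the exact direct sum. Larger values accept more cells early. Which branch is taken depends on the position of each target particle, which is why neighbouring GPU threads can diverge during the walk.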
K.Seiwa
GPU Acceleration of Numerical Simulation of Fluid by the Lattice Boltzmann Method
Numerical simulations of fluids are used to analyze the motion of gases and liquids such as air and water. The field has developed drastically thanks to improvements in computer performance. It is regarded as important in designing vehicles that move through a fluid, such as cars, airplanes, and ships.
The lattice Boltzmann method (LBM) is one of the methods for numerically simulating a thin fluid. It simulates the motion of the fluid field, which is represented by a lattice and by particles placed on the lattice points. A more accurate analysis requires more lattice points and particles, so the simulation takes longer.
To simulate the fluid field efficiently, I have implemented the LBM on a graphics processing unit (GPU). A GPU is a processor tuned for data-parallel computation with a large number of computing cores. I accelerate the fluid simulation using OpenCL, a framework for parallel programming. As a result, my LBM simulation runs about 5 times faster on the GPU than on the CPU. I conclude that using a GPU is effective for accelerating LBM simulations.
file:///home/committee/aac/Thesis2011/s1150132
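A single LBM time step can be sketched as follows. The abstract above does not state which lattice or collision model was used, so this example assumes the common D2Q9 lattice with a BGK collision term and periodic boundaries. The function names and the relaxation time `tau` are likewise illustrative.

```python
# D2Q9 lattice (assumption): 9 discrete velocities and their weights.
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (ex, ey), w in zip(E, W):
        eu = ex * ux + ey * uy
        feq.append(w * rho * (1 + 3 * eu + 4.5 * eu * eu - 1.5 * usq))
    return feq

def step(f, nx, ny, tau):
    """One BGK collision followed by streaming; f[y][x] holds 9 populations."""
    new = [[[0.0] * 9 for _ in range(nx)] for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):  # each lattice site is independent within a step
            cell = f[y][x]
            rho = sum(cell)
            ux = sum(fi * ex for fi, (ex, _) in zip(cell, E)) / rho
            uy = sum(fi * ey for fi, (_, ey) in zip(cell, E)) / rho
            feq = equilibrium(rho, ux, uy)
            for i, (ex, ey) in enumerate(E):
                post = cell[i] - (cell[i] - feq[i]) / tau  # relax toward equilibrium
                # Stream to the neighbouring site, with periodic wrap-around (assumption).
                new[(y + ey) % ny][(x + ex) % nx][i] = post
    return new

def total_mass(f):
    """Sum of all populations; conserved by collision and streaming."""
    return sum(sum(cell) for row in f for cell in row)
```

Because the update of each lattice site reads only that site's own populations, the two inner loops map naturally onto one GPU work-item per site, which is the kind of data parallelism the abstract relies on.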
T.Suzuki
OpenCL Implementation of Exact String Matching
Graphics Processing Units (GPUs) have evolved over the past few years from dedicated graphics rendering devices into powerful parallel processors, and they now outperform traditional Central Processing Units (CPUs) in many areas of scientific computing. This paper presents experimental results on the parallel processing of some well-known on-line string matching algorithms using OpenCL, a standard API for writing parallel programs for CPUs and GPUs. I found that the simplest algorithm, with the help of vectorization, is the fastest on the GPU. When simultaneously matching a sufficiently large number of strings, my optimized string matching kernel on the GPU is 10 times faster than the standard utility command “grep”.
file:///home/committee/aac/Thesis2011/s1160119
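The abstract above does not name the simplest algorithm. Assuming it means brute-force comparison at every start position, a vectorized variant can be sketched as below. The function names and the choice of 4 lanes, each testing one start position, are illustrative assumptions and not the thesis kernel.

```python
def match_naive(text, pattern):
    """Brute force: test every start position independently."""
    n, m = len(text), len(pattern)
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]

def match_vec4(text, pattern):
    """Same result, but 4 consecutive start positions are tested together."""
    n, m = len(text), len(pattern)
    last = n - m  # last valid start position
    hits = []
    for base in range(0, last + 1, 4):
        # One flag per lane; lanes past the end of the text start out dead.
        alive = [base + lane <= last for lane in range(4)]
        for k in range(m):
            c = pattern[k]
            for lane in range(4):  # on a GPU: one vector compare per pattern character
                if alive[lane] and text[base + lane + k] != c:
                    alive[lane] = False
        hits.extend(base + lane for lane in range(4) if alive[lane])
    return hits
```

For example, `match_vec4("abracadabra", "abra")` returns `[0, 7]`, the same as `match_naive`. Since every start position is checked independently, the work is spread evenly over the lanes without any preprocessing of the pattern.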