Changes between Version 8 and Version 9 of UGT2011
Timestamp: Nov 5, 2014 9:37:34 AM
UGT2011
v8  v9
23  23  The purpose of my research is to examine the performance of a graphics processing unit (GPU) on a numerical algorithm for particle simulations. Specifically, I adopt the Octree method, which requires many branch instructions. In general, GPUs are intrinsically poor at handling branch instructions. I have implemented the Octree method in single and double precision on a Cypress GPU. On this GPU, the peak performance of single-precision operations is five times that of double precision. Although the Octree method was expected to run much slower in double precision, I found that this is not the case. On the GPU, the performance of the Octree method is limited not by the computing power of the GPU but by the performance penalty of branch instructions.
24  24
25
26  25  file:///home/committee/aac/Thesis2011/s1150062
27  26
…   …
45  44  The GPU is effective at accelerating the LBM simulations.
46  45
47
48  46  file:///home/committee/aac/Thesis2011/s1150132
49  47
…   …
52  50  == OpenCL Implementation of Exact String Matching ==
53  51  Graphics Processing Units (GPUs) have evolved over the past few years from dedicated graphics rendering devices into powerful parallel processors, and they now outperform traditional Central Processing Units (CPUs) in many areas of scientific computing. This paper presents experimental results on parallel processing for some well-known on-line string matching algorithms using OpenCL, a standard API for writing parallel programs for CPUs and GPUs. I found that the simplest algorithm, with the help of vectorization, is the fastest on the GPU. My optimized string matching kernel on the GPU is 10 times faster than the standard utility command “grep” when simultaneously matching a sufficiently large number of strings.
54
55  52
56  53  file:///home/committee/aac/Thesis2011/s1160119
…   …
75  72  The result is about 12 times faster on the GPU and about 9 times faster on a multi-core CPU.
76  73
77
78  74  file:///home/committee/aac/Thesis2011/s1160154
79  75
…   …
82  78  = T.Watanabe =
83  79  == Fluid Simulations in Curved Pipes using Smoothed Particle Hydrodynamics on GPU ==
    80  Simulating an incompressible fluid such as water with Smoothed Particle Hydrodynamics (SPH) takes a lot of time for calculating the interactions between particles. In particular, wall boundaries require many additional particles. In this paper, we introduce a new approach for setting wall boundaries without particles for curved pipes, and we present results for incompressible fluid flow within curved wall boundaries using SPH. In the proposed approach, we calculate the force between the pipe boundary and the particles using the distance between a particle and the centerline of the pipe. We implement the simulation on a GPU using OpenCL. For large numbers of particles, the calculation on the GPU is more than 10 times faster than on a single CPU core.
    81
    82  file:///home/committee/aac/Thesis2011/s1170174
    83
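
The particle-free wall boundary in the SPH abstract above computes a wall force from the distance between a particle and the pipe centerline. The abstract does not give the force law, so the following is only a minimal sketch under stated assumptions: the centerline is approximated by a polyline, the pipe has a constant radius, and the wall pushes a particle back toward the centerline with a linear penalty once it comes within a distance h of the wall. The names closest_point_on_segment, wall_force, radius, h and stiffness are hypothetical and are not taken from the thesis.

```python
import math


def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to p (3-tuples)."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ap = tuple(pi - ai for ai, pi in zip(a, p))
    denom = sum(c * c for c in ab)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = 0.0 if denom == 0.0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    return tuple(ai + t * c for ai, c in zip(a, ab))


def wall_force(p, centerline, radius, h, stiffness):
    """Assumed penalty force on a particle at p from a particle-free pipe wall.

    centerline: list of 3-tuples approximating the pipe axis (assumption).
    radius:     pipe radius, assumed constant along the pipe.
    h:          thickness of the layer next to the wall where the force acts.
    stiffness:  hypothetical linear penalty coefficient.
    """
    # Find the nearest point on the polyline centerline.
    nearest, best_d2 = None, math.inf
    for a, b in zip(centerline, centerline[1:]):
        q = closest_point_on_segment(p, a, b)
        d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
        if d2 < best_d2:
            nearest, best_d2 = q, d2

    d = math.sqrt(best_d2)   # distance from the particle to the centerline
    gap = radius - d         # remaining distance from the particle to the wall
    if gap >= h or d == 0.0:
        return (0.0, 0.0, 0.0)  # far from the wall (or on the axis): no force

    # Linear penalty, directed from the particle toward the centerline.
    magnitude = stiffness * (h - gap)
    return tuple(magnitude * (qi - pi) / d for pi, qi in zip(p, nearest))


if __name__ == "__main__":
    axis = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(wall_force((1.0, 0.9, 0.0), axis, radius=1.0, h=0.2, stiffness=10.0))
```

Because the wall is described only by the centerline and a radius, no boundary particles need to be stored or searched. In an OpenCL kernel the same per-particle computation would run once per work-item; that mapping is likewise an assumption here rather than something stated in the abstract.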