Changes between Version 21 and Version 22 of Compiler_For_High_Performance_Computing_With_Many_Core_Accelerators

Timestamp: Aug 9, 2010 5:50:36 AM
Author: nakasato
Comment: --

Compiler_For_High_Performance_Computing_With_Many_Core_Accelerators

v21	v22
1	1	= A Compiler for High Performance Computing With Many-Core Accelerators =
2		by N.Nakasato and J.Makino (~~to be~~ presented at PPAC Workshop http://www.checs.eng.vt.edu/ppac09 )
	2	by N.Nakasato and J.Makino (presented at PPAC Workshop http://www.checs.eng.vt.edu/ppac09 )
3	3	== abstract ==
4	4	We introduce a newly developed compiler for high performance computing using many-core accelerators. The high peak performance of such accelerators attracts researchers who always demand faster computers. However, it is difficult to create an efficient implementation of an existing serial program for such accelerators, even for massively parallel problems. While existing parallel programming tools force us to program every detail of an implementation, from loop-level parallelism to 4-vector SIMD operations, our novel approach is that, given a compute-intensive problem expressed as a nested loop, the compiler only asks us to define a compute kernel inside the innermost loop. We observe that the input variables appearing in the kernel can be classified into two types: those invariant during the loop and those updated in each iteration. The compiler lets us specify the type of each input so that it can pick a predefined optimal way to process it. The compiler successfully generates the fastest code ever for many-particle simulations, with a performance of 500 GFLOPS (single precision) on an RV770 GPU. Another successful application is the evaluation of a multi-dimensional integral, which runs at a speed of 5 - 7 GFLOPS (quadruple precision) on both GRAPE-DR and GPU.
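The two kinds of kernel input described in the abstract can be illustrated with a many-particle force loop. The following is only a minimal sketch in plain Python, not the compiler's input language (which this page does not show); the names `kernel`, `accelerations`, and `eps2`, and the choice of a softened gravitational force, are assumptions made for illustration.

```python
import math

def kernel(xi, yi, zi, xj, yj, zj, mj, eps2):
    """Body of the innermost loop: softened pull of particle j on particle i."""
    dx, dy, dz = xj - xi, yj - yi, zj - zi
    r2 = dx * dx + dy * dy + dz * dz + eps2
    w = mj / (r2 * math.sqrt(r2))  # m_j / r^3 with softening
    return dx * w, dy * w, dz * w

def accelerations(pos, mass, eps2=1e-4):
    """Nested loop over all pairs; only `kernel` is problem specific."""
    acc = []
    for xi, yi, zi in pos:
        # xi, yi, zi (and eps2) stay invariant during the inner loop.
        ax = ay = az = 0.0
        for (xj, yj, zj), mj in zip(pos, mass):
            # xj, yj, zj, mj are updated in each inner iteration.
            fx, fy, fz = kernel(xi, yi, zi, xj, yj, zj, mj, eps2)
            ax += fx
            ay += fy
            az += fz
        acc.append((ax, ay, az))
    return acc

if __name__ == "__main__":
    print(accelerations([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0]))
```

In this sketch the self-interaction term needs no special case: with a positive `eps2` the separation is zero, so the contribution vanishes. A parallelizing compiler of the kind described would presumably keep the invariant inputs resident on each processing element and stream the per-iteration inputs past them, but that mapping is not shown here.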