Version 63 (modified by nakasato, 12 years ago)

Members

High performance computing and numerical simulations in astronomy and astrophysics.

A development of an accelerator board dedicated for multi-precision arithmetic operations and its application to Feynman loop integrals, S.Motoki, H.Daisaka, N.Nakasato, T.Ishikawa, F.Yuasa, T.Fukushige, A.Kawai and J.Makino, 2014, submitted to the proceedings of the 16th International workshop on Advanced Computing and Analysis Techniques in physics research (ACAT 2014), Prague, preprint http://arxiv.org/abs/1410.3252

GPU accelerated Hybrid Tree Algorithm for Collision-less N-body Simulations, T.Watanabe and N.Nakasato, 2014, Fifth International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART2014), preprint http://arxiv.org/abs/1406.6158

Implementation and Performance Evaluation of Astrophysical Tree-code for GPU Clusters, G.Ogiya, Y.Miki, T.Boku, M.Mori, & N.Nakasato, 2013, http://id.nii.ac.jp/1001/00095272/

Studying the core-cusp problem in cold dark matter halos using N-body simulations on GPU clusters, G.Ogiya, M.Mori, Y.Miki, T.Boku, & N.Nakasato, 2013, Journal of Physics: Conference Series, 454, 012014, http://dx.doi.org/10.1088/1742-6596/454/1/012014

Acceleration of Feynman loop integrals in high-energy physics on many core GPUs, F.Yuasa, T.Ishikawa, N.Hamaguchi, T.Koike and N.Nakasato, 2013, Journal of Physics: Conference Series, 454, 012081, http://dx.doi.org/10.1088/1742-6596/454/1/012081

Blocked United Algorithm for the All-Pairs Shortest Paths Problem on Hybrid CPU-GPU Systems, K.Matsumoto, N.Nakasato, & S.Sedukhin, 2012, IEICE Transactions, Vol.E95-D, No.12, pp. 2759-2768, Dec. 2012, http://search.ieice.org/bin/summary.php?id=e95-d_12_2759

Performance tuning of matrix multiplication in OpenCL on different GPUs and CPUs, Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, In the 3rd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12) - Proceedings of the 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC), IEEE CS's Conference Publishing Service, pp. 396-405, Salt Palace Convention Center, Salt Lake City, Utah, USA, November 12, 2012. DOI:10.1109/SC.Companion.2012.59

GRAPE-MPs: Implementation of an SIMD for quadruple/hexuple/octuple-precision arithmetic operation on a structured ASIC and an FPGA, N.Nakasato, H.Daisaka, T.Fukushige, A.Kawai, J.Makino, F.Yuasa & T.Ishikawa, 2012, IEEE MCSoC 2012, pp.75–83, http://dx.doi.org/10.1109/MCSoC.2012.31

Implementing a Code Generator for Fast Matrix Multiplication in OpenCL on the GPU, K.Matsumoto, N.Nakasato, & S.G.Sedukhin, 2012, IEEE MCSoC 2012, pp.198–204, http://dx.doi.org/10.1109/MCSoC.2012.30

Blocked All-Pairs Shortest Paths Algorithm for Hybrid CPU-GPU System, K.Matsumoto, N.Nakasato, S.G.Sedukhin, HPCC 2011: 145-152, http://dx.doi.org/10.1109/HPCC.2011.28

Multi-level Optimization of Matrix Multiplication for GPU-equipped Systems, K.Matsumoto, N.Nakasato, T.Sakai, H.Yahagi, S.G.Sedukhin, Procedia CS 4: 342-351 (2011), http://dx.doi.org/10.1016/j.procs.2011.04.036

GRAPE-MP: An SIMD Accelerator Board for Multi-precision Arithmetic, H.Daisaka, N.Nakasato, J.Makino, F.Yuasa, T.Ishikawa, 2011, http://dx.doi.org/10.1016/j.procs.2011.04.093

Implementation of a Parallel Tree Method on a GPU, N.Nakasato, Journal of Computational Science, 2011, http://dx.doi.org/10.1016/j.jocs.2011.01.006, Recent Results

A fast GEMM implementation on the cypress GPU, N.Nakasato, ACM SIGMETRICS Performance Evaluation Review, 2011, http://dx.doi.org/10.1145/1964218.1964227

Chemodynamical Simulations of the Milky Way Galaxy, C.Kobayashi, N.Nakasato, Astrophysical Journal, 2011, http://dx.doi.org/10.1088/0004-637X/729/1/16

A fast GEMM implementation on the cypress GPU, N.Nakasato, 1st International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS 10), paper&slide

Application of Many-core Accelerators for Problems in Astronomy and Physics, N.Nakasato, Plenary Talk ACAT2010, 2010, http://adsabs.harvard.edu//abs/2010acat.confE..15N

A compiler for high performance computing with many-core accelerators, N.Nakasato, J.Makino, Cluster Computing and Workshops, 2009. CLUSTER '09, 2009, http://dx.doi.org/10.1109/CLUSTR.2009.5289127