Version 37 (modified by nakasato, 13 years ago)

OpenCL Notes

Disable auto vectorization

Section 6.7.2 (page 187) of the OpenCL Specification, version 1.1 (revision 33), describes "__attribute__((vec_type_hint(<typen>)))". This hint controls the autovectorizer in the OpenCL C compiler. I tested it by dumping the assembly code for a kernel targeting AVX instructions with the Intel SDK (version 1.5).

               lines   performance
no hint        1567
with the hint  333

The hint shrinks the generated kernel code from 1567 lines of assembly to 333, roughly a fifth of the original size. Cool!
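As a rough illustration of where the attribute goes, the following shell sketch writes two variants of a small kernel, one without the hint and one with it. The kernel body, the file names, the write_kernels helper, and the choice of float4 as the hinted type are all assumptions made for this example; these notes do not record which kernel or type was actually measured.

```shell
# Write two variants of a small OpenCL kernel into directory $1.
# Assumption: kernel body, file names and the float4 hint are illustrative
# only; they are not the kernel measured in the notes above.
write_kernels() {
    dir=${1:-.}

    # Variant 1: no hint, so the compiler is free to auto-vectorize.
    cat > "$dir/saxpy_nohint.cl" <<'EOF'
__kernel
void saxpy(__global const float4 *x, __global float4 *y, float a)
{
    size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}
EOF

    # Variant 2: same kernel, with vec_type_hint placed after __kernel.
    cat > "$dir/saxpy_hint.cl" <<'EOF'
__kernel __attribute__((vec_type_hint(float4)))
void saxpy(__global const float4 *x, __global float4 *y, float a)
{
    size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}
EOF
}
```

Running "write_kernels ." and then "diff saxpy_nohint.cl saxpy_hint.cl" should show that the first line is the only difference between the two files, which makes them convenient inputs for comparing assembly dumps.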

How to use the "ioc" command that ships with the Intel SDK

We need to set the environment variable INTELOCLSDKROOT:

export INTELOCLSDKROOT=/usr/lib64/OpenCL/vendors/intel  

To dump the assembly code:

ioc -input=kernel_file.cl -asm 
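The two steps above could be wrapped in a small helper, sketched below. The dump_asm name is hypothetical, and the fallback path is simply the one given in these notes; where ioc writes the assembly is not recorded here, so the sketch only runs the command exactly as shown above.

```shell
# Run the Intel offline compiler on one kernel file with the flags
# used in these notes. dump_asm is a hypothetical helper name.
dump_asm() {
    kernel=${1:-}
    if [ -z "$kernel" ] || [ ! -f "$kernel" ]; then
        echo "usage: dump_asm kernel_file.cl" >&2
        return 2
    fi

    # Keep an existing setting; otherwise fall back to the path from the notes.
    export INTELOCLSDKROOT="${INTELOCLSDKROOT:-/usr/lib64/OpenCL/vendors/intel}"

    # Fail with a clear message if the SDK tool is not installed.
    if ! command -v ioc >/dev/null 2>&1; then
        echo "ioc not found on PATH (is the Intel SDK installed?)" >&2
        return 127
    fi

    ioc -input="$kernel" -asm
}
```

A call such as "dump_asm kernel_file.cl" then behaves like the manual command, but reports a missing file or a missing ioc binary instead of failing obscurely.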

Dump IL/ISA with AMD SDK

Set the following environment variable (AMD APP Programming Guide, August 2011, section 4.2, page 63).

export GPU_DUMP_DEVICE_KERNEL=3         
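If the dump is only wanted for a single run, the variable can be set for just that one command instead of being exported for the whole session. The run_with_dump helper below is a hypothetical name used for this sketch, and ./my_opencl_app in the usage note stands in for whatever OpenCL program is being inspected.

```shell
# Run one command with GPU_DUMP_DEVICE_KERNEL=3 set only for that command,
# so the setting does not leak into the rest of the shell session.
# run_with_dump is a hypothetical helper name.
run_with_dump() {
    if [ "$#" -lt 1 ]; then
        echo "usage: run_with_dump command [args...]" >&2
        return 2
    fi
    env GPU_DUMP_DEVICE_KERNEL=3 "$@"
}
```

For example, "run_with_dump ./my_opencl_app" would run the application with dumping enabled while leaving the surrounding shell environment untouched.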

SDK and driver

Latest SDK

AMD http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx

Intel http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk/

Nvidia

Apple's SDK comes with Mac OS X only

Latest Driver for AMD

http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx?type=2.4.1&product=2.4.1.3.42&lang=English

old info

Standard Compute Layer Library

http://www.browndeertechnology.com/stdcl.html

A wrapper library around the OpenCL API, used in the tutorial below. It seems that libstdcl greatly simplifies the sample program ($OPENCLDIR/samples/opencl/cl/app/NBody) supplied with Stream SDK 2.0beta.

OpenCL Tutorial: N-Body Simulation

http://www.browndeertechnology.com/docs/BDT_OpenCL_Tutorial_NBody.html

A tutorial that modifies the sample NBody program, which is written with the OpenCL API.

local & global

http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=123350&enterthread=y

It's a tricky question. On 4xxx, __local mem is really __global mem (ATI thinks it's too much work to optimize the compiler to use the 48xx LDS, although it's possible).
On 5xxx, __local is LDS, so it's located in the SIMD core.

Catalyst 10.1 with Ubuntu 9.10 workarounds

Change the kernel boot option

Edit "/etc/default/grub" and run update-grub. I added "nopat" to GRUB_CMDLINE_LINUX_DEFAULT for Catalyst 10.1.

GRUB's configuration file is /boot/grub/grub.cfg. It is generated automatically by the update-grub command.

This trick is not necessary for Catalyst 10.2.
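The GRUB edit described above could be scripted roughly as follows. This is a minimal sketch: add_boot_option is a hypothetical helper, it assumes the GRUB_CMDLINE_LINUX_DEFAULT value is enclosed in double quotes on a single line, and it is best tried on a copy of the file first.

```shell
# Append a kernel boot option to GRUB_CMDLINE_LINUX_DEFAULT in file $1,
# unless it is already there. add_boot_option is a hypothetical helper;
# it assumes the value is double-quoted on a single line.
add_boot_option() {
    file=$1
    option=$2

    # Do nothing if the option is already present as a separate word.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?${option}( [^\"]*)?\"" "$file"; then
        return 0
    fi

    # Insert the option just before the closing quote, via a temporary file.
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${option}\"/" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}
```

On the real system this would be "add_boot_option /etc/default/grub nopat" followed by update-grub, both run as root. Calling the helper twice leaves the option in place only once.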

X server setting

The default login manager, gdm, is difficult to configure properly, so I installed xdm instead. The configuration files for xdm are in the /etc/X11/xdm directory.

Edit the "Xservers" file as follows:

:0 local /usr/bin/X :0 vt7 -nolisten tcp -ac

The "-ac" option disables access control, which lets remote applications connect to the local X server. This is generally regarded as bad for security, so be careful.

Random notes

icc

http://software.intel.com/en-us/articles/using-intel-compilers-for-linux-with-ubuntu/

packages

Ubuntu 10.04.1 LTS

sudo aptitude install libgsl0-dev gfortran libnetcdf-dev linux-headers-2.6.32-22-generic g++ libblas-dev rake emacs lv zsh ssh xdm ia32-libs  linux-image-2.6.32-22-generic subversion git-core  openmpi-bin openmpi-dev 

10.04 LTS Server

sudo aptitude install xdm ia32-libs subversion zsh libgsl0-dev gfortran libnetcdf-dev g++  freeglut3-dev xserver-xorg rake emacs lv xterm 

http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=147002

process affinity

From command line: http://www.cyberciti.biz/tips/setting-processor-affinity-certain-task-or-process.html

http://www.open-mpi.org/projects/hwloc/
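For the command-line route, a minimal sketch is shown below. It assumes the taskset utility from util-linux, which these notes do not name explicitly; pin_to_cpus is a hypothetical wrapper and ./my_opencl_app in the usage note is a placeholder.

```shell
# Run a command restricted to the given CPU list, e.g. "0" or "0,2-3".
# Assumes taskset (util-linux) is installed; pin_to_cpus is a
# hypothetical wrapper name.
pin_to_cpus() {
    if [ "$#" -lt 2 ]; then
        echo "usage: pin_to_cpus CPU_LIST command [args...]" >&2
        return 2
    fi
    cpus=$1
    shift

    if ! command -v taskset >/dev/null 2>&1; then
        echo "taskset not found (part of util-linux)" >&2
        return 127
    fi

    # -c takes a CPU list rather than a hexadecimal mask.
    taskset -c "$cpus" "$@"
}
```

For example, "pin_to_cpus 0,1 ./my_opencl_app" would keep the process on the first two logical CPUs, which can help when timing kernels on a CPU device.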

DOUBLE

http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=92
