Channel: Oakdale Software Ltd

Getting Started with OpenCL

Probably the most amazing thing about OpenCL is its heterogeneous nature. An OpenCL kernel can run on just about any compute device in your computer (the CPU, the GPU or even an FPGA), and it can all be orchestrated from the host with ease.

As you may be aware, 3rd generation Intel Core (and later) processors include an integrated graphics component. In the HD 4000 and later chips this compute power is not to be sniffed at and is certainly worth exploiting; however, it's not entirely clear how you access it. If, like me, you have a discrete graphics card, you may be wondering, as I did, why the Intel GPU is not accessible.

Here’s what to do.

Boot your computer into the BIOS settings and look for a section probably entitled something like “System Agent”. Under this menu:

  • “Initiate Graphic Adapter” – set this to PCIe/PCI
  • “iGPU Multi-Monitor” – set this to Enabled

Save your settings and re-boot.

Now visit the Intel website and download the appropriate graphics driver for your CPU. Install it and re-boot once more. When you then open your device panel (Device Manager on Windows) you can see the integrated Intel graphics device like this:

[Screenshot: Graphics Device List]

We’re ready to start programming.

Next you are going to need an OpenCL SDK so that you have the headers and the OpenCL.lib import library you need to build an OpenCL program (the drivers already include a run-time). It doesn’t really matter whose you use; in my case I downloaded the Nvidia tools, which are part of the CUDA SDK. Currently the download is here but may move at a later date.

Once installed you will need to set up your project to access the SDK. In Visual Studio 2013 (2012 is the same) select the Property Manager tab and expand your build target, in my case “Debug | x64”, then double-click “Microsoft.Cpp.x64.user”. Be aware that this is a per-user property sheet, so changes made here apply to every x64 project you build under your account, not only this one. With the property dialog open, select “VC++ Directories” and enter:

  • Include Directories – $(CUDA_PATH)\include;$(IncludePath)
  • Library Directories – $(CUDA_PATH)\lib\x64;$(LibraryPath)

The CUDA installer has conveniently created an environment variable called CUDA_PATH to make this nice and clean.
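
As a quick sanity check on those two entries, the sketch below resolves CUDA_PATH from the environment and reports the directories the property sheet would end up with. The names SdkDirs, sdk_dirs_from_root and find_sdk_dirs are hypothetical helpers invented for this illustration, and the code assumes the include and lib\x64 layout given in the list above.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical holder for the two directories the property sheet entries
// resolve to once $(CUDA_PATH) has been expanded.
struct SdkDirs {
    std::string include_dir;  // $(CUDA_PATH)\include
    std::string lib_dir;      // $(CUDA_PATH)\lib\x64
};

// Builds both paths from an SDK root, mirroring the VC++ Directories entries.
SdkDirs sdk_dirs_from_root(const std::string& cuda_path) {
    SdkDirs dirs;
    dirs.include_dir = cuda_path + "\\include";
    dirs.lib_dir = cuda_path + "\\lib\\x64";
    return dirs;
}

// Returns true and fills `out` when CUDA_PATH is set and non-empty;
// returns false (leaving `out` untouched) otherwise.
bool find_sdk_dirs(SdkDirs& out) {
    const char* root = std::getenv("CUDA_PATH");
    if (root == nullptr || *root == '\0') {
        return false;
    }
    out = sdk_dirs_from_root(root);
    return true;
}
```

If find_sdk_dirs returns false, the environment variable is missing and the $(CUDA_PATH) entries will expand to nothing. With SDL checks enabled, Visual Studio may reject std::getenv as deprecated (C4996); _dupenv_s is the Microsoft-specific alternative.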

Now go to “Linker”, then “General”, and update:

  • Additional Library Directories – $(CUDA_LIB_PATH);%(AdditionalLibraryDirectories)

Then go to “Linker”, then “Input”, and update:

  • Additional Dependencies – OpenCL.lib;%(AdditionalDependencies)

Hit OK and we’re ready to go.

This is a little program to look for compute devices on your system and print out their capabilities:
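
The reporting half of such a program can be sketched without the SDK headers. The version below assumes the values have already been fetched with clGetDeviceInfo and copied into a hypothetical DeviceInfo struct. The struct, the function name describe_device and the two local constants (which mirror the CL_DEVICE_TYPE_CPU and CL_DEVICE_TYPE_GPU bit values from CL/cl.h) are for illustration only and are not the original listing.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Bit values of cl_device_type, mirrored from CL/cl.h so that this sketch
// builds without the SDK headers.
const std::uint64_t kDeviceTypeCpu = 1u << 1;  // CL_DEVICE_TYPE_CPU
const std::uint64_t kDeviceTypeGpu = 1u << 2;  // CL_DEVICE_TYPE_GPU

// Hypothetical holder for values already queried with clGetDeviceInfo.
struct DeviceInfo {
    std::uint64_t type;              // CL_DEVICE_TYPE
    std::string name;                // CL_DEVICE_NAME
    std::string vendor;              // CL_DEVICE_VENDOR
    std::uint64_t max_alloc_bytes;   // CL_DEVICE_MAX_MEM_ALLOC_SIZE
    std::uint64_t global_mem_bytes;  // CL_DEVICE_GLOBAL_MEM_SIZE
    unsigned compute_units;          // CL_DEVICE_MAX_COMPUTE_UNITS
};

// Formats one device as a block of text, converting byte counts to whole
// mega-bytes (integer division, so any remainder is dropped).
std::string describe_device(const DeviceInfo& d) {
    const std::uint64_t bytes_per_megabyte = 1024ull * 1024ull;
    std::ostringstream out;
    if (d.type & kDeviceTypeGpu) {
        out << "GPU detected\n";
    } else if (d.type & kDeviceTypeCpu) {
        out << "CPU detected\n";
    } else {
        out << "Other device detected\n";
    }
    out << "\tDevice name is " << d.name << "\n"
        << "\tDevice vendor is " << d.vendor << "\n"
        << "\tDevice max memory allocation: "
        << d.max_alloc_bytes / bytes_per_megabyte << " mega-bytes\n"
        << "\tDevice global mem: "
        << d.global_mem_bytes / bytes_per_megabyte << " mega-bytes\n"
        << "\tMaximum number of parallel compute units: "
        << d.compute_units << "\n";
    return out.str();
}
```

In a full program the struct would be filled once per device returned by clGetDeviceIDs, for each platform returned by clGetPlatformIDs.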

This gives us output like this:

Number of OpenCL platforms found: 2
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.1 CUDA 6.0.1
CL_PLATFORM_NAME : NVIDIA CUDA
CL_PLATFORM_VENDOR : NVIDIA Corporation
CL_PLATFORM_EXTENSIONS : cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Number of detected OpenCL devices: 2
GPU detected
        Device name is GeForce GTX 680
        Device vendor is NVIDIA Corporation
        VENDOR ID: 0x10de
        Device max memory allocation: 512 mega-bytes
        Device global cacheline size: 128 bytes
        Device global mem: 2048 mega-bytes
        Maximum number of parallel compute units: 8
        Maximum dimensions for global/local work-item IDs: 3
        Maximum number of work-items in each dimension: ( 1024 1024 64  )
        Maximum number of work-items in a work-group: 1024
GPU detected
        Device name is GeForce GTX 680
        Device vendor is NVIDIA Corporation
        VENDOR ID: 0x10de
        Device max memory allocation: 512 mega-bytes
        Device global cacheline size: 128 bytes
        Device global mem: 2048 mega-bytes
        Maximum number of parallel compute units: 8
        Maximum dimensions for global/local work-item IDs: 3
        Maximum number of work-items in each dimension: ( 1024 1024 64  )
        Maximum number of work-items in a work-group: 1024
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.2
CL_PLATFORM_NAME : Intel(R) OpenCL
CL_PLATFORM_VENDOR : Intel(R) Corporation
CL_PLATFORM_EXTENSIONS : cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing
Number of detected OpenCL devices: 1
CPU detected
        Device name is        Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
        Device vendor is Intel(R) Corporation
        VENDOR ID: 0x8086
        Device max memory allocation: 8159 mega-bytes
        Device global cacheline size: 64 bytes
        Device global mem: 32639 mega-bytes
        Maximum number of parallel compute units: 8
        Maximum dimensions for global/local work-item IDs: 3
        Maximum number of work-items in each dimension: ( 1024 1024 1024  )
        Maximum number of work-items in a work-group: 1024

Lovely.
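
The CL_PLATFORM_EXTENSIONS lines are space-separated lists, so checking whether a platform advertises a feature such as cl_khr_fp64 comes down to a whole-token search. A minimal sketch, assuming the string has already been retrieved with clGetPlatformInfo; has_extension is a hypothetical helper name:

```cpp
#include <sstream>
#include <string>

// Returns true if `name` appears as a complete token in a space-separated
// extension list such as the CL_PLATFORM_EXTENSIONS string.
bool has_extension(const std::string& extension_list, const std::string& name) {
    std::istringstream tokens(extension_list);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

Comparing whole tokens rather than using a substring search avoids false positives when one extension name is a prefix of another.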

The post Getting Started with OpenCL appeared first on Oakdale Software Ltd.

