hyPACK-2013 Topics of Interest: Mode-04 - Device GPUs

The Mode-04 programme covers practical approaches to writing heterogeneous programs on GPGPUs. CUDA-enabled NVIDIA GPU programming and heterogeneous programming (OpenCL) on NVIDIA/AMD GPGPUs are used to develop programs for matrix computations. Tuning and performance aspects of codes on GPGPUs will be examined. Programming approaches on heterogeneous HPC GPU cluster platforms will be discussed. Topics of interest are listed below.

Introduction to NVIDIA-PGI Compiler Directives - OpenACC on GPUs; CUDA enabled NVIDIA GPUs
Performance of Matrix Computations - NVIDIA-PGI Compiler Directives OpenACC on GPUs; CUDA enabled NVIDIA GPUs
Performance of Application Kernels - NVIDIA-PGI Compiler Directives OpenACC on GPUs; CUDA enabled NVIDIA GPUs
Simple example programs on Multi-Core Processors with NVIDIA - GPU Computing CUDA 4.1 SDK.
Write programs using the AMD APP Software Development Kit (SDK), which is based on OpenCL (Open Computing Language).
Performance of selected programs on Multi-Core Processors with NVIDIA - GPU Computing CUDA SDK and AMD-APP Tech. SDK based on OpenCL.
Special example programs using CUDA Tool Chain on Multi-Core Processors with NVIDIA - GPU Computing CUDA SDK (CULA Tools, CUBLAS, CUFFT, CUSPARSE)
Special example programs on matrix computations using Concurrent Asynchronous Execution APIs of CUDA 4.1 enabled NVIDIA GPUs (single/Multiple devices).
Special example programs based on Streams (Concurrent Asynchronous Execution) in CUDA 4.1 on NVIDIA GPUs.
LLVM-based CUDA compiler and toolkit technologies for matrix computation and application kernels; GPU Accelerator Programming Model - Compiler Optimizations
Exposure to the NVIDIA Parallel Nsight toolkit.
Codes to understand different memory types of CUDA enabled NVIDIA GPUs for matrix computations.
Example programs based on Numerical Linear Algebra using CUDA enabled NVIDIA GPUs and AMD-APP OpenCL.
Example programs (BLAS, FFTs) based on AMD Accelerated Parallel Processing Math Libraries (APPML) using OpenCL.
Example programs based on special classes of problems - Dense & Sparse Matrix Computations, Fast Search Algorithms, & Partial Differential Equations (PDEs) - using CUDA enabled NVIDIA GPUs & AMD-APP OpenCL on an HPC GPU Cluster.
Example programs on Heterogeneous Programming - OpenCL based on CUDA enabled NVIDIA GPUs and AMD-APP GPUs.
Code walk-through and execution of parallel programs based on a mixed programming environment using TBB, Pthreads, OpenMP on host Multi-Core systems with GPU Accelerator devices.
Selected example programs on numerical and non-numerical computations using NVIDIA - GPU Computing CUDA SDK and AMD - APP SDK OpenCL.
Application & System Benchmarks related to HPC GPU Cluster based on CUDA/OpenCL NVIDIA & OpenCL AMD-APP programming paradigms.
Example programs based on the OpenACC Application Program Interface for matrix computations on NVIDIA GPUs. OpenACC is a collection of compiler directives; the low-level details are implicit in the programming model and are managed by OpenACC API-enabled compilers and runtimes.
Example programs based on AMD APP - Aparapi Data Parallel workloads in Java
Example programs based on CUDA APIs to completely overlap CPU and GPU execution and I/O in HPC GPU Cluster environment.
Performance of memory (pinned/locked) & CUDA shared memory usage on CUDA enabled GPUs for application kernels.
Develop test suites to launch multiple kernels on CUDA enabled NVIDIA single & multiple GPU devices.
Programming exercises for Numerical Computations based on CUDA/OpenCL enabled NVIDIA GPUs & AMD-APP OpenCL Programming for Matrix Computations (Dense & Sparse Matrix Computations)
Implementation of Image Processing applications (Edge Detection, Face Detection & Image inpainting algorithms) on GPGPUs using CUDA/OpenCL enabled NVIDIA GPUs and OpenCL AMD-APP GPUs of HPC GPU Cluster
Implementation of String Search Algorithms - CUDA/OpenCL enabled NVIDIA GPUs and OpenCL AMD-APP GPUs of HPC GPU Cluster
Solution of Partial Differential Equations (Poisson Equation in two-dimensional & three-dimensional regions) by the Finite Element Method (FEM) using CUDA/OpenCL enabled NVIDIA GPUs & OpenCL on HPC GPU Cluster.
Tuning & Performance using CUDA enabled NVIDIA GPU Libraries; Memory Optimisation, Data-access optimization for matrix computations
Demonstration of Integrated Numerical Linear Algebra Kernels for Matrix Computations (using Open Source Software) on CUDA enabled NVIDIA GPUs & OpenCL enabled AMD-APP Tech.
Tuning & Performance - Selected Application Benchmarks on CUDA enabled NVIDIA GPUs & AMD-APP SDK - OpenCL Programming
Demonstration of Application kernels - CUDA enabled NVIDIA GPUs; OpenCL & AMD-APP GPUs - SDKs
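
As an illustration of the directive-based style named in the OpenACC topics above, here is a minimal sketch of a dense matrix multiply in C. It is not taken from the hyPACK-2013 material: the function name matmul_acc, the row-major flat-array layout, and the choice of data clauses are assumptions made for this example. A compiler without OpenACC support ignores the pragmas and runs the same loops on the host.

```c
#include <assert.h>

/* Hypothetical example: C = A * B for square n x n matrices stored
   row-major in flat arrays. The function name and layout are assumptions
   made for illustration only. */
void matmul_acc(int n, const float *restrict a, const float *restrict b,
                float *restrict c)
{
    assert(n > 0 && a && b && c);

    /* Ask an OpenACC compiler to offload the i/j loop nest. copyin moves
       A and B to the device; copyout brings C back to the host. */
    #pragma acc parallel loop collapse(2) copyin(a[0:n*n], b[0:n*n]) copyout(c[0:n*n])
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            /* The dot product over k runs sequentially within each (i, j). */
            #pragma acc loop seq
            for (int k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

The data movement and kernel launch are left to the compiler and runtime, which is the sense in which the details are implicit in the programming model. An explicit CUDA or OpenCL version of the same computation would manage device buffers and kernel launches by hand.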

Centre for Development of Advanced Computing