C-DAC,Pune : High-Perf. Comp. Frontier Technologies Exploration Group and CMSD, University of Hyderabad, Technology Workshop hyPACK (October 15-18), 2013

hyPACK-2013 : OpenCL Programming on Intel Xeon Coprocessor

Intel Xeon host and Xeon Phi coprocessor software harnesses the processing power of many-core processors for high-performance, data-parallel computing in a wide range of applications. The OpenCL environment on Intel Xeon Phi provides a complete heterogeneous OpenCL development platform for both the x86 Xeon host CPU and the Intel Xeon Phi coprocessor (MIC, an x86 SMP-on-a-chip many-core processor). Please refer to the Intel Xeon Phi Coprocessor technical documents to understand OpenCL. The technical content was developed using several technical reports, books, and Intel web sites, as given in the References.

Dense matrices are stored in computer memory as two-dimensional arrays. For example, a matrix with n rows and m columns is stored using an n x m array of real numbers. However, using the same two-dimensional array to store sparse matrices has two important drawbacks. First, since most of the entries in a sparse matrix are zero, this storage scheme wastes a lot of memory. Second, computations involving sparse matrices often need to operate only on the non-zero entries of the matrix, and a dense storage format makes it harder to locate these non-zero entries. For these reasons, sparse matrices are stored using different data structures.

The Compressed Row Storage (CRS) format is a widely used scheme for storing sparse matrices. In the CRS format, a sparse matrix A with n rows and k non-zero entries is stored using three arrays: two integer arrays, rowptr and colind, and one array of real entries, values. The array rowptr is of size n+1, and the other two arrays are each of size k. The array colind stores the column indices of the non-zero entries in A, and the array values stores the corresponding non-zero entries. In particular, the array colind stores the column indices of the first row, followed by the column indices of the second row, followed by the column indices of the third row, and so on.

The array rowptr is used to determine where the storage of each row starts and ends in the arrays colind and values. The column indices of row i are stored starting at colind[rowptr[i]] and ending at (but not including) colind[rowptr[i+1]]. Similarly, the values of the non-zero entries of row i are stored starting at values[rowptr[i]] and ending at (but not including) values[rowptr[i+1]]. Note that the number of non-zero entries of row i is simply rowptr[i+1] - rowptr[i].
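The CRS layout described above can be illustrated with a minimal C sketch. The struct crs_matrix, the functions crs_row_nnz and crs_spmv, and the 4 x 4 example matrix are hypothetical names and data introduced here for illustration only; the array names rowptr, colind and values follow the text, and 0-based indexing is assumed. The sparse matrix-vector product shown is one common use of the format, not necessarily the kernel used in the workshop codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for a CRS matrix; array names follow the text. */
typedef struct {
    int n;                 /* number of rows                        */
    const int *rowptr;     /* size n + 1, row start offsets         */
    const int *colind;     /* size k, column index of each non-zero */
    const double *values;  /* size k, value of each non-zero        */
} crs_matrix;

/* Number of non-zero entries in row i: rowptr[i+1] - rowptr[i]. */
int crs_row_nnz(const crs_matrix *A, int i)
{
    assert(A != NULL && i >= 0 && i < A->n);
    return A->rowptr[i + 1] - A->rowptr[i];
}

/* Compute y = A * x, visiting only the stored non-zero entries. */
void crs_spmv(const crs_matrix *A, const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);
    for (int i = 0; i < A->n; ++i) {
        double sum = 0.0;
        /* Entries of row i occupy positions rowptr[i] .. rowptr[i+1]-1. */
        for (int p = A->rowptr[i]; p < A->rowptr[i + 1]; ++p)
            sum += A->values[p] * x[A->colind[p]];
        y[i] = sum;
    }
}

/* Illustrative 4 x 4 matrix with k = 6 non-zero entries (assumed data):
 *
 *   [ 10  0  0  2 ]
 *   [  0  3  0  0 ]
 *   [  0  0  0  0 ]
 *   [  1  0  4  5 ]
 */
const int ex_rowptr[] = {0, 2, 3, 3, 6};
const int ex_colind[] = {0, 3, 1, 0, 2, 3};
const double ex_values[] = {10.0, 2.0, 3.0, 1.0, 4.0, 5.0};
const crs_matrix ex_A = {4, ex_rowptr, ex_colind, ex_values};
```

In this example the empty third row is encoded by two equal consecutive entries in rowptr (both 3), so crs_row_nnz returns 0 for it. Calling crs_spmv with ex_A and x = (1, 1, 1, 1) gives y = (12, 3, 0, 10), the row sums of the matrix.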