


Programming on Multi-Core Processors Using OpenMP APIs

The OpenMP API is used to write portable multi-threaded applications in Fortran, C, and C++. The OpenMP programming model offers an easy way to thread applications without burdening the programmer with the complications of creating, synchronizing, load balancing, and destroying threads. It provides a platform-independent set of compiler pragmas, directives, function calls, and environment variables that explicitly instruct the compiler how and where to use parallelism in the application. This section discusses example programs that use these constructs, the compilation and execution of OpenMP programs, and programs for numerical and non-numerical computations.
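
The three kinds of constructs mentioned above (directives, runtime function calls, and environment variables) can be illustrated with a minimal sketch in C. The helper names parallel_sum and team_size are hypothetical and are not taken from the benchmark codes; the `_OPENMP` guard only lets the same file build serially when OpenMP is not enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>   /* OpenMP runtime library routines */
#endif

/* Hypothetical helper: sum the integers 1..n with the loop
   iterations shared among the threads of a parallel region. */
long parallel_sum(long n)
{
    assert(n >= 0);
    long sum = 0;
    /* Directive: split the loop across threads. Each thread keeps a
       private partial sum; the partial sums are combined at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* Hypothetical helper: report how many threads a parallel region gets. */
int team_size(void)
{
    int size = 1;  /* value returned when compiled without OpenMP */
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* Runtime function call, executed by one thread of the team. */
        #pragma omp single
        size = omp_get_num_threads();
    }
#endif
    return size;
}
```

With GCC, such a file would typically be compiled with the -fopenmp flag, and the thread count can be chosen at run time through the OMP_NUM_THREADS environment variable (for example OMP_NUM_THREADS=4) without recompiling. The value returned by team_size then reflects that setting, subject to what the runtime actually grants.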

Example 4.1
OpenMP program : Computing Kernels for Matrix Computation
(Source - References : Books     Multi-threading     OpenMP -[MCMTh-01], [MCMTh-02], [MCMTh-I03], [MCMTh-05], [MCMTh-09], [MCMTh-11], [MCMTh-15], [MCMTh-21], [MCBW-44], [MCOMP-01], [MCOMP-02], [MCOMP-04], [MCOMP-12], [MCOMP-19], [MCOMP-25])

Description of OpenMP Computing Kernels /Benchmarks

Example 4.1 : Performance of Matrix Computation Test Kernels on Multi-Cores :
(Download: OpenMP-In-House-Benchmarks-codes, WinRAR ZIP archive)

  • Objective
  • The main objective is to execute computing kernels on multi-core systems and evaluate their performance for various problem sizes and thread counts.

  • Description 
  • This benchmark comprises suites that perform integer and floating-point numerical and non-numerical computations using shared-memory programming (OpenMP).

    These suites measure the execution time of the following kernels: dense matrix computations (square matrix norm by row-wise/column-wise partitioning, matrix-vector multiplication using the checkerboard algorithm, and matrix-matrix multiplication using the self-scheduling algorithm); PI computation using numerical integration and the Monte Carlo method; and solving the linear system Ax = b using the Jacobi method.

    In this program the OpenMP PARALLEL directive and CRITICAL sections are used. The CRITICAL directive specifies a region of the program that must be executed by only one thread at a time. If a thread is executing inside a CRITICAL region and another thread reaches that region and attempts to execute it, the second thread blocks until the first thread exits the region.

  • Input
  • The suites run for problem sizes Class A, B, and C on 1, 2, 4, or 8 threads.

  • Output
  • This multi-core benchmark reports system performance in terms of execution time and memory utilized.
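
As an illustration of the PARALLEL and CRITICAL directives described above, the following is a minimal sketch of a matrix norm kernel with row-wise partitioning. It is not the benchmark's own code. The page does not say which norm the suite computes, so the infinity norm (maximum absolute row sum) is an assumption here, and the names matrix_inf_norm and wall_seconds are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical kernel: infinity norm (maximum absolute row sum) of an
   n x n matrix stored in row-major order, rows partitioned among threads.
   The choice of norm is an assumption, not taken from the benchmark. */
double matrix_inf_norm(const double *a, int n, int num_threads)
{
    assert(a != NULL && n > 0 && num_threads > 0);
    double norm = 0.0;  /* shared result */

    #pragma omp parallel num_threads(num_threads)
    {
        double local_max = 0.0;  /* private to each thread */

        /* Row-wise partitioning: each thread handles a subset of rows. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            double row_sum = 0.0;
            for (int j = 0; j < n; j++) {
                double v = a[(size_t)i * n + j];
                row_sum += (v < 0.0) ? -v : v;
            }
            if (row_sum > local_max)
                local_max = row_sum;
        }

        /* Only one thread at a time may update the shared maximum;
           other threads block here until the region is free. */
        #pragma omp critical
        {
            if (local_max > norm)
                norm = local_max;
        }
    }
    return norm;
}

/* Hypothetical timer: OpenMP wall clock when available, otherwise a
   processor-time fallback for serial builds. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}
```

A timing run in the spirit of the Input and Output items would call wall_seconds before and after matrix_inf_norm for each thread count (1, 2, 4, 8) and report the differences. Each thread enters the critical region only once, after finishing its rows, so contention on the shared maximum stays small.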

Centre for Development of Advanced Computing