• Mode-1 Multi-Core • Memory Allocators • OpenMP • Intel TBB • Pthreads • Java - Threads • Charm++ Prog. • Message Passing (MPI) • MPI - OpenMP • MPI - Intel TBB • MPI - Pthreads • Compiler Opt. Features • Threads-Perf. Math. Lib. • Threads-Prof. & Tools • Threads-I/O Perf. • PGAS : UPC / CAF / GA • Power-Perf. • Home





Tuning and Performance / Benchmarks on Multi-Core Processors


Tuning and Performance of Application Programs using Compiler optimisation techniques, Codre restructuring techniques on Multi-Core Processors is challenging. Understanding Programming Programming Paradigms (MPI, OpenMP, Pthreads), effective use of right Compiler Optimisation flags and obtaining correct results for given application is important. Enhance performance and scalability on multiple core processors for given application with respect to increase in problem size require serious effrots. Several Optmisation techniques are discussed below.


Threads versus OpenMP on Multi-Core Processor Systems

The requirements of threads and OpenMP APIs on Multi-Core Systems require serious study to understand performance aspects of Multi-threaded application. The programming paradigms have advantages and disadvantages from implementation point of view and performance of applications. A programmer must weigh all considerations as explained below before deciding on an API for programming and Performance analysis, A brief summary of considerations is discussed below.

1. OpenMP provides a layer on top of native threads to facilitate a variety of thread-related tasks.
2. Using directives provided by OpenMP, a programmer is rid of the tasks of initializing attributes objects, setting up arguments to threads, partitioning iteration spaces, etc.
3. The above features of OpenMP is especially useful when the underlying problem has a static and/or regular task graph in which the work load can be defined apriority.
4. The overheads associated with automated generation of threaded code from OpenMP directives have been shown to be minimal in the context of a variety of applications.
5. The performance and scalability of application using OpenMP pragmas may not be achieved for certain class of application problems.
6. The drawbacks to use directives is non-explicit nature of threading on Multi Cores and extreme caution must be taken to avoid race condition and parallel overheads associated with synchronization.
7. The Data movement overheads can be handled by Threads but the use of OpenMP pragmas may incur overheads which can be not be estimated. Different Threaded APIs can be used to re-design application to reduce the Overhead.This helps in alleviating some of the overheads from data movement, false sharing and contention.
8. Explicit threading also provides a richer API in the form of condition waits, locks of different types, and increased flexibility for building composite synchronization operations.
9. Explicit threading is used more widely than OpenMP.
10. Tools support for Pthreads programs is easier to find, in comparison to OpenMP
Centre for Development of Advanced Computing