



hyPACK-2013 Mode 1 : Mixed Mode of Programming Using MPI & Pthreads

Mixed-mode programming with MPI and Pthreads plays an important role in understanding and enhancing the performance of applications on multi-core processors and clusters of multi-processor nodes. Hybrid (mixed-mode) programs that use both MPI and Pthreads can execute faster than programs using only MPI on multi-core processors. In a typical hybrid parallel program, one MPI process executes on each multi-processor or compute node, which consists of multiple cores. Inside selected sections of the code, each MPI process forks threads to occupy the cores, and these threads can interact via shared variables.

Example programs using the different APIs are provided. Compilation and execution of the programs, and both numerical and non-numerical computations, are discussed using the MPI and Pthread APIs. The examples include numerical integration to compute the value of "pi" using different algorithms, vector-vector multiplication using block-striped partitioning, matrix-vector multiplication using the self-scheduling algorithm and block-checkerboard partitioning, and computation of the infinity norm of a square matrix using block-striped partitioning.

An Overview Of Pthreads         Mixed Mode of Prog.         Basic Pthread Library Calls

Compilation and Execution       Example Program       FAQ's

References : Multi-threading     OpenMP     Java Threads     Books     MPI   Benchmarks  


List of MPI-Pthreads Programs

Example 1.1

Write an MPI-Pthread program to print "Hello World".

Example 1.2

Write an MPI-Pthread program to compute the value of "pi" by numerical integration of the function f(x) = 4/(1+x^2) between the limits 0 and 1 (a sequential starting-point sketch is given after this list).

Example 1.3

Write an MPI-Pthread program to calculate the infinity norm of a matrix using block-striped partitioning with row-wise data distribution.

Example 1.4

Write an MPI-Pthread program to compute matrix-vector multiplication using the self-scheduling algorithm.

Example 1.5

Write an MPI-Pthread program to compute matrix-matrix multiplication using checkerboard partitioning (assignment).

Example 1.6

Write an MPI-Pthread program to solve a system of linear equations Ax = b using the parallel Jacobi method (assignment).
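
As a starting point for Example 1.2, the sketch below approximates the integral sequentially with the midpoint rule. It is an illustrative helper, not part of the course material: the step count num_steps is an assumed value, and the decomposition of the loop across MPI processes and Pthreads is left as the exercise.

/* Sequential sketch for Example 1.2: pi = integral of 4/(1+x*x) over [0,1]. */
#include <stdio.h>

int main(void)
{
    const long num_steps = 1000000;            /* number of sub-intervals (assumed) */
    const double h = 1.0 / (double) num_steps;
    double sum = 0.0;

    for (long i = 0; i < num_steps; i++) {
        double x = (i + 0.5) * h;              /* midpoint of the i-th sub-interval */
        sum += 4.0 / (1.0 + x * x);
    }
    printf("pi is approximately %.12f\n", sum * h);
    return 0;
}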



Introduction to Pthreads



Shared memory architectures are gradually becoming more prominent in the HPC market, as advances in technology have allowed larger numbers of CPUs to have access to a single memory space. In addition, manufacturers are increasingly clustering these SMP systems together to go beyond the limits of a single system. As clustered SMPs become more prominent, it becomes more important for applications to be portable and efficient on these systems.

Message passing codes written in MPI are obviously portable and should transfer easily to clustered SMP systems. While message passing is required to communicate between boxes, it is not immediately clear that it is the most efficient parallelisation technique within an SMP box. In theory, a shared memory model such as Pthreads should offer a more efficient parallelisation strategy within an SMP box. Hence a combination of shared memory and message passing parallelisation paradigms within the same application (mixed mode programming) may provide a more efficient parallelisation strategy than pure MPI.

While mixed code may involve other programming models such as High Performance Fortran (HPF) and OpenMP, both MPI and POSIX Threads are industry standards: MPI provides distributed-memory programming support, and POSIX Threads provides shared-memory programming support.

While SMP clusters offer the greatest reason for developing mixed mode code, both the Pthreads and MPI paradigms have different advantages and disadvantages and by developing such a model these characteristics might even be exploited to give the best performance on a single SMP system.

Message Passing Interface


The Message Passing Interface (MPI) standard was originally designed for writing applications and libraries for distributed memory environments. The main advantages of establishing a message-passing interface for such environments are portability and ease of use, and a standard message-passing interface is a key component in building a concurrent computing environment in which applications, software libraries, and tools can be transparently ported between different machines.

The Message Passing Interface (MPI) is the most widely used of the newer standards. It is not a new programming language; rather, it is a library of subprograms that can be called from C and Fortran programs. It was developed by an open, international forum consisting of representatives from industry, academia, and government laboratories. It has rapidly received widespread acceptance because it has been carefully designed to permit maximum performance on a wide variety of systems, and because it is based on message passing, one of the most powerful and widely used paradigms for programming parallel systems (MPI Forum, 1994). MPI is a good example of using a few independent (orthogonal) language features. MPI is based on four main concepts that are orthogonal to one another: data types, communication operations, communicators, and virtual topologies. Any combination of the four is valid, and this orthogonal independence brings a multiplicative effect.

The current version of MPI assumes that processes are statically allocated; i.e., the number of processes is set at the beginning of program execution, and no additional processes are created during execution. Each process is assigned a unique integer rank in the range 0, 1, 2, ..., p-1, where p is the number of processes. This approach to programming MIMD systems is called single-program multiple-data (SPMD). In SPMD programs, the effect of running different programs is obtained by the use of conditional branches within the source code.
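
The following minimal sketch illustrates the SPMD style described above: every process runs the same source, and a conditional branch on the rank selects different behaviour (the printed messages are illustrative only).

/* SPMD sketch: one source file, behaviour selected by a branch on the rank. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0)
        printf("Master: %d processes started\n", size);  /* rank 0 takes one branch        */
    else
        printf("Worker %d reporting\n", rank);           /* all other ranks take the other */

    MPI_Finalize();
    return 0;
}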

A nice feature of the MPI design is that MPI provides a powerful functionality based on four orthogonal concepts. These four concepts in MPI are message data types, communicators, communication operations, and virtual topology.

Pthreads:


The Pthreads library is a POSIX C API thread library that has standardized functions for using threads across different platforms. Historically, hardware vendors have implemented their own proprietary versions of threads. These implementations differed substantially from each other making it difficult for programmers to develop portable threaded applications. In order to take full advantage of the capabilities provided by threads, a standardized programming interface was required. For UNIX systems, this interface has been specified by the IEEE POSIX 1003.1c standard (1995). Implementations that adhere to this standard are referred to as POSIX threads, or Pthreads. Most hardware vendors now offer Pthreads in addition to their proprietary API's. Pthreads are defined as a set of C language programming types and procedure calls. Vendors usually provide a Pthreads implementation in the form of a header/include file and a library that you link with your program.


A Thread is a 'Light Weight Process'. A thread is a stream of instructions that can be scheduled as an independent unit. A thread exists within a process, and uses the process resources. Since threads are very small compared with processes, thread creation is relatively cheap in terms of CPU costs. As processes require their own resource bundle, and threads share resources, threads are likewise memory frugal. There can be multiple threads within a process. Multithreaded programs may have several threads running through different code paths "simultaneously".

In shared memory multiprocessor architectures, such as SMPs, threads can be used to implement parallelism. Historically, hardware vendors have implemented their own proprietary versions of threads, making portability a concern for software developers. For UNIX systems, a standardized C language threads programming interface has been specified by the IEEE POSIX 1003.1c standard. Implementations that adhere to this standard are referred to as POSIX threads, or Pthreads.

The tutorial begins with an introduction to concepts, motivations, and design considerations for using Pthreads. Each of the three major classes of routines in the Pthreads API are then covered: Thread Management, Mutex Variables, and Condition Variables. Example codes are used throughout to demonstrate how to use most of the Pthreads routines needed by a new Pthreads programmer.


An Overview of Pthreads

Historically, hardware vendors have implemented their own proprietary versions of threads. These implementations differed substantially from each other making it difficult for programmers to develop portable threaded applications.

In order to take full advantage of the capabilities provided by threads, a standardized programming interface was required. For UNIX systems, this interface has been specified by the IEEE POSIX 1003.1c standard (1995). Implementations which adhere to this standard are referred to as POSIX threads, or Pthreads. Most hardware vendors now offer Pthreads in addition to their proprietary API's.

Pthreads are defined as a set of C language programming types and procedure calls, implemented with a pthread.h header/include file and a thread library - though this library may be part of another library, such as libc.

The character P in Pthreads stands for POSIX. Pthreads is a set of threading interfaces developed by the IEEE committee in charge of specifying a Portable Operating System Interface (POSIX). The POSIX committee defined a basic set of functions and data structures that it hoped would be adopted by numerous vendors so that threaded code could be ported easily across operating systems. The standard specifies an API that handles most of the actions required by threads, provided as a library of standardized functions for using threads across different platforms.

In general though, in order for a program to take advantage of Pthreads, it must be able to be organized into discrete, independent tasks which can execute concurrently. For example, if routine1 and routine2 can be interchanged, interleaved and/or overlapped in real time, they are candidates for threading as shown in Figure 1.

Figure 1. Concurrent execution of Pthreads


Programs having the following characteristics may be well suited for pthreads:

  • Work that can be executed, or data that can be operated on, by multiple tasks simultaneously
  • Block for potentially long I/O waits
  • Use many CPU cycles in some places but not others
  • Must respond to asynchronous events
  • Some work is more important than other work (priority interrupts)
  • Pthreads can also be used for serial applications, to emulate parallel execution. A perfect example is the typical web browser, which for most people, runs on a single cpu desktop/laptop machine.

    Shared Memory Model


    The shared memory model is quite suitable for Pthread programming, and users should understand the concepts of synchronization, critical sections and deadlock conditions. Synchronization is an enforcing mechanism used to impose constraints on the order of execution of threads. The features of the shared memory model are:

  • All threads have access to the same global, shared memory
  • Threads also have their own private data
  • Programmers are responsible for synchronizing access (protecting) globally shared data.
    Figure 2. Typical Shared Memory Model Using Threads
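
    The small sketch below illustrates the points above with a hypothetical example (not from the course code): a global array is shared by all threads, while the variable declared inside the thread routine is private to each thread. Because each thread writes only its own slot, no synchronization is needed here.

    /* Shared vs. private data: hypothetical Pthreads example. */
    #include <stdio.h>
    #include <pthread.h>

    #define NTHREADS 4

    int shared_result[NTHREADS];               /* global array: shared by all threads  */

    void *worker(void *arg)
    {
        int my_id = *(int *) arg;              /* private copy on this thread's stack  */
        shared_result[my_id] = my_id * my_id;  /* each thread writes only its own slot */
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        int ids[NTHREADS];

        for (int i = 0; i < NTHREADS; i++) {
            ids[i] = i;
            pthread_create(&tid[i], NULL, worker, &ids[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);

        for (int i = 0; i < NTHREADS; i++)
            printf("shared_result[%d] = %d\n", i, shared_result[i]);
        return 0;
    }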

    Thread Safeness:


    Thread-safeness, in a nutshell, refers to an application's ability to execute multiple threads simultaneously without "clobbering" shared data or creating race conditions.

    For example, suppose that your application creates several threads, each of which makes a call to the same library routine:


  • This library routine accesses/modifies a global structure or location in memory.

  • As each thread calls this routine it is possible that they may try to modify this global structure/memory location at the same time.

  • If the routine does not employ some sort of synchronization constructs to prevent data corruption, then it is not thread-safe.
    Figure 3. Typical thread-safety situation


    The implication for users of external library routines is that if you aren't 100% certain a routine is thread-safe, then you take your chances with problems that could arise.

    Recommendation: Be careful if your application uses libraries or other objects that don't explicitly guarantee thread-safeness. When in doubt, assume that they are not thread-safe until proven otherwise. This can be done by "serializing" the calls to the uncertain routine, etc.
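
    One way to "serialize" calls to a routine of unknown thread-safety, as recommended above, is to guard every call with a single global mutex. In the hypothetical sketch below, legacy_routine is a stand-in for the uncertain library call, not a function from any real library.

    /* Serializing calls to a possibly non-thread-safe routine with one global mutex. */
    #include <stdio.h>
    #include <pthread.h>

    static int call_count = 0;                 /* unprotected state inside the routine */

    /* Stand-in for a library routine whose thread-safety is unknown (hypothetical). */
    static void legacy_routine(int value)
    {
        call_count++;
        printf("call %d with value %d\n", call_count, value);
    }

    static pthread_mutex_t legacy_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Wrapper that serializes all calls, as recommended above. */
    static void safe_legacy_routine(int value)
    {
        pthread_mutex_lock(&legacy_lock);      /* only one thread inside at a time */
        legacy_routine(value);
        pthread_mutex_unlock(&legacy_lock);
    }

    static void *caller(void *arg)
    {
        safe_legacy_routine(*(int *) arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        int a = 1, b = 2;

        pthread_create(&t1, NULL, caller, &a);
        pthread_create(&t2, NULL, caller, &b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }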


    Mixed Mode of Programming


    Message passing codes written in MPI are portable and should transfer easily to clusters of multi-core processor systems. Message passing is required to communicate between nodes (boxes) over the network, while communication within a node (an SMP or multi-core processor) can use shared memory. Performance therefore depends on the efficiency of the implementation within each multi-core processor box, or node, of a message passing cluster.

    OpenMP is an Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism. It is a specification for a set of compiler directives, library routines and environment variables that can be used to specify shared memory parallelism in Fortran and C/C++ programs. OpenMP is a shared memory standard supported by most hardware and software vendors, and it comprises three primary API components: compiler directives, runtime library routines, and environment variables. OpenMP is portable; the API is specified for C/C++ and Fortran, and implementations exist for multiple platforms, including most Unix platforms and Windows NT. Efforts are ongoing to improve implementations on multi-core processors to enhance performance, and the programming environments available on most multi-core processors address thread affinity to cores and the overheads of the OpenMP programming environment.

    A combination of shared memory and message passing parallelisation paradigms within the same application (mixed mode programming) may provide a more efficient parallelisation strategy than pure MPI. Although mixed code may involve other programming models such as High Performance Fortran (HPF) and POSIX threads, mixed MPI and OpenMP codes are likely to represent the most widespread use of mixed mode programming on SMP clusters, due to their portability and the fact that they represent industry standards for distributed and shared memory systems respectively. While SMP clusters offer the strongest motivation for developing mixed mode code, the OpenMP and MPI paradigms have different advantages and disadvantages, and by developing such a model these characteristics might even be exploited to give the best performance on a single SMP system.

    By utilizing a mixed mode programming model we should be able to take advantage of the benefits of both models. For example, a mixed mode program may allow us to make use of the explicit control of data placement policies of MPI with the finer-grain parallelism of Pthreads. The majority of mixed mode applications involve a hierarchical model: MPI parallelisation occurring at the top level and Pthreads parallelisation occurring below. For example, Figure 1 shows a 2D grid which has been divided between four MPI processes.

    In the mixed mode programming concept, the MPI implementation should be thread safe; if it is not, the program may produce unexpected results. Special care is needed when using MPI library calls in mixed mode programming with Pthreads to avoid race conditions and to obtain correct results.

    As of this writing, mpich is not thread safe. However, you can use blocking MPI calls safely to a certain extent. For details visit: Thread Safety and more.

    The sub-arrays assigned to each process are then further divided between threads. This model maps closely to the architecture of an SMP cluster: the MPI parallelisation occurs between the SMP boxes and the Pthreads parallelisation within the boxes. Message passing can be used within a code where it is relatively simple to implement, and shared memory parallelism where message passing is difficult. Most manufacturers provide extended versions of their communication libraries for clusters of multiprocessors, so existing MPI codes can be used directly with a unified MPI model. The alternative is mixing MPI with a shared memory model such as Pthreads; in that case, different possibilities exist, which must be compared according to the trade-off between performance and programming effort.

    Thread Safety in MPI-Pthreads :

    Although a large number of MPI implementations are thread-safe, this cannot be guaranteed for mixed mode programming. To ensure that the code is portable, all MPI calls should be made within thread-sequential regions of the code. In the mixed mode programming model, the benefits of both models can be combined: a mixed mode program makes use of the explicit control of data placement policies of MPI together with the finer-grain parallelism of Pthreads. The majority of mixed mode applications involve a hierarchical model, with MPI parallelisation occurring at the top level and thread-level parallelisation occurring below.
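
    A portable way to check what an MPI library actually guarantees is the MPI-2 routine MPI_Init_thread, which reports the provided thread-support level. The sketch below (not taken from the course examples, which call MPI_Init) requests MPI_THREAD_FUNNELED, meaning only the main thread makes MPI calls; this matches the advice above to keep MPI calls in thread-sequential regions.

    /* Checking the MPI thread-support level before mixing MPI with Pthreads. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int provided;

        /* Request FUNNELED: only the thread that initialized MPI makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        if (provided < MPI_THREAD_FUNNELED)
            printf("Warning: MPI library provides only thread level %d\n", provided);

        /* ... create Pthreads here; keep all MPI calls in the main thread ... */

        MPI_Finalize();
        return 0;
    }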

    MPI Pthreads

    The MPI-Pthreads mixed programming paradigm performs MPI-related tasks across the nodes (boxes) of a message passing cluster of multi-core processors and facilitates a variety of thread-related tasks within each node. In a typical hybrid parallel program, one MPI process executes on each multi-processor or compute node, which consists of multiple cores. Inside selected sections of the code, each MPI process forks threads to occupy the cores, and these threads can interact via shared variables. The explicit threading model of POSIX threads (Pthreads) provides a richer API in the form of condition waits, locks of different types, and increased flexibility for building different synchronization operations. Explicit threading is more widely used than OpenMP, and the rich set of tools on multi-core processors may help the programmer understand performance issues. Hybrid or mixed mode programs using both MPI and Pthreads execute faster than programs using only MPI on multi-core processors for specific classes of data-intensive applications.



    Why Pthreads?


    Several advantages of Pthread programming on Multi-core processors are listed below.

    1. The primary motivation for using Pthreads is to realize potential program performance gains.

    2. When compared to the cost of creating and managing a process, a thread can be created with much less operating system overhead. Managing threads requires fewer system resources than managing processes.

    3. All threads within a process share the same address space. Inter-thread communication is more efficient and, in many cases, easier to use than inter-process communication.

    4. Threaded applications offer potential performance gains and practical advantages over non-threaded applications in several other ways:

    5. Overlapping CPU work with I/O: a program may have sections where it performs a long I/O operation. While one thread waits for an I/O system call to complete, other threads can perform CPU-intensive work.

    6. Priority/real-time scheduling: tasks that are more important can be scheduled to supersede or interrupt lower priority tasks.

    7. Asynchronous event handling: tasks that service events of indeterminate frequency and duration can be interleaved. For example, a web server can both transfer data from previous requests and manage the arrival of new requests.

    8. Multi-threaded applications work on a uniprocessor system, yet naturally take advantage of a multiprocessor system, without recompiling.

    9. In a multiprocessor environment, the most important reason for using Pthreads is to take advantage of potential parallelism. For a program to take advantage of Pthreads, it must be able to be organized into discrete, independent tasks that can execute concurrently.



    Basic Pthread Library Calls


    1. pthread_create:

    int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

    Creates a new thread, initializes its attributes, and makes it runnable.
    The pthread_create subroutine creates a new thread and initializes its attributes using the thread attributes object specified by the attr parameter. The new thread inherits its creating thread's signal mask; but any pending signal of the creating thread will be cleared for the new thread.

    The new thread is made runnable, and will start executing the start_routine routine, with the parameter specified by the arg parameter. The arg parameter is a void pointer; it can reference any kind of data. It is not recommended to cast this pointer into a scalar data type (int for example), because the casts may not be portable.

    The pthread_create subroutine returns the new thread identifier via the thread argument. The caller can use this thread identifier to perform various operations on the thread. This identifier should be checked to ensure that the thread was successfully created.

    The maximum number of threads that may be created by a process is implementation dependent.

    Figure 4. Typical flow of execution of a number of threads

    Once created, threads are peers, and may create other threads. There is no implied hierarchy or dependency between threads.

    2. pthread_exit:

    void pthread_exit(void *value_ptr)

    Terminates the calling thread.

    The pthread_exit subroutine terminates the calling thread safely, and stores a termination status for any thread that may join the calling thread. The termination status is always a void pointer; it can reference any kind of data. It is not recommended to cast this pointer into a scalar data type (int for example), because the casts may not be portable. This subroutine never returns.

    Unlike the exit subroutine, the pthread_exit subroutine does not close files; thus any file opened and used only by the calling thread must be closed before calling this subroutine. It is also important to note that the pthread_exit subroutine frees any thread-specific data, including the thread's stack. Any data allocated on the stack becomes invalid, since the stack is freed and the corresponding memory may be reused by another thread. Therefore, thread synchronization objects (mutexes and condition variables) allocated on a thread's stack must be destroyed before the thread calls the pthread_exit subroutine.

    Returning from the initial routine of a thread implicitly calls the pthread_exit subroutine, using the return value as parameter.


    3. pthread_self:

    pthread_t pthread_self()

    Returns the calling thread's identifier.

    The pthread_self subroutine returns the calling thread's identifier.
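
    Because pthread_t is an opaque type, thread identifiers returned by pthread_self should be compared with pthread_equal (a standard Pthreads call not listed above) rather than with ==. A small hypothetical sketch:

    /* Comparing thread identifiers with pthread_self() and pthread_equal(). */
    #include <stdio.h>
    #include <pthread.h>

    static pthread_t main_tid;                 /* identifier of the main thread */

    void *report(void *arg)
    {
        (void) arg;
        if (pthread_equal(pthread_self(), main_tid))
            printf("running in the main thread\n");
        else
            printf("running in a worker thread\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;

        main_tid = pthread_self();
        pthread_create(&tid, NULL, report, NULL);
        pthread_join(tid, NULL);
        report(NULL);                          /* called directly from the main thread */
        return 0;
    }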


    4. pthread_join:

    int pthread_join(pthread_t thread, void **value_ptr);

    The pthread_join subroutine blocks the calling thread until the thread specified in the call terminates. The target thread's termination status is returned in the value_ptr parameter.

    If the target thread is already terminated, but not yet detached, the subroutine returns immediately. It is impossible to join a detached thread, even if it is not yet terminated. The target thread is automatically detached after all joined threads have been woken up.

    This subroutine does not itself cause a thread to be terminated. It acts like the pthread_cond_wait subroutine to wait for a special condition.

    "Joining" is one way to accomplish synchronization between threads. For example:


    Figure 6. Typical master and worker - Pthread Programming model
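
    The hypothetical sketch below ties pthread_exit and pthread_join together: the worker passes back a termination status that points to static data (not to its stack, for the reason given under pthread_exit), and the main thread collects it with pthread_join.

    /* pthread_exit() passing a termination status that pthread_join() collects. */
    #include <stdio.h>
    #include <pthread.h>

    static int result = 42;                    /* static: outlives the worker thread */

    void *worker(void *arg)
    {
        (void) arg;
        /* ... do some work and store the outcome in result ... */
        pthread_exit(&result);                 /* equivalent to: return &result; */
    }

    int main(void)
    {
        pthread_t tid;
        void *status;

        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, &status);            /* blocks until the worker terminates */
        printf("worker returned %d\n", *(int *) status);
        return 0;
    }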


    5. pthread_detach:

    int pthread_detach(pthread_t thread);

    Detaches the specified thread.
    The pthread_detach subroutine is used to indicate to the implementation that storage for the thread whose identifier is given by thread can be reclaimed when that thread terminates. The storage is reclaimed on process exit, regardless of whether the thread has been detached or not, and may include storage for the thread's return value. If the thread has not yet terminated, pthread_detach does not cause it to terminate. Multiple pthread_detach calls on the same target thread cause an error.

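
    A small sketch of the detach pattern (hypothetical example): the creating thread detaches the worker immediately after creation, so the worker's storage is reclaimed automatically when it terminates and pthread_join is neither needed nor allowed.

    /* Detaching a thread immediately after creation (hypothetical example). */
    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>

    void *background_task(void *arg)
    {
        (void) arg;
        printf("background task running\n");
        return NULL;                           /* storage reclaimed automatically */
    }

    int main(void)
    {
        pthread_t tid;

        pthread_create(&tid, NULL, background_task, NULL);
        pthread_detach(tid);                   /* no pthread_join() later */

        sleep(1);                              /* crude wait so the detached task can run */
        return 0;
    }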


    6. pthread_mutex_init:

    int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);

    Initializes a mutex and sets its attributes.
    The pthread_mutex_init subroutine initializes a new mutex, and sets its attributes according to the mutex attributes object attr. The mutex is initially unlocked.
    After initialization of the mutex, the mutex attributes object can be reused for another mutex initialization, or deleted.


    7. pthread_mutex_destroy:

    int pthread_mutex_destroy(pthread_mutex_t *mutex);

    Deletes a mutex.
    The pthread_mutex_destroy subroutine deletes the mutex referenced by mutex. After deletion of the mutex, the mutex parameter is no longer a valid identifier until it is initialized again by a call to the pthread_mutex_init subroutine.


    8. pthread_mutex_lock:

    int pthread_mutex_lock (pthread_mutex_t *mutex);

    Locks a mutex.
    The mutex object referenced by mutex is locked by calling pthread_mutex_lock. If the mutex is already locked, the calling thread blocks until the mutex becomes available. This operation returns with the mutex object referenced by mutex in the locked state with the calling thread as its owner.


    9. pthread_mutex_trylock:

    int pthread_mutex_trylock (pthread_mutex_t *mutex);

    Tries to lock a Mutex.
    The function pthread_mutex_trylock is identical to pthread_mutex_lock except that if the mutex object referenced by mutex is currently locked (by any thread, including the current thread), the call returns immediately.


    10. pthread_mutex_unlock:

    int pthread_mutex_unlock (pthread_mutex_t *mutex);

    Unlocks a Mutex.

    The pthread_mutex_unlock function releases the mutex object referenced by mutex. The manner in which a mutex is released is dependent upon the mutex's type attribute. If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock is called, resulting in the mutex becoming available, the scheduling policy is used to determine which thread shall acquire the mutex.
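
    The hypothetical sketch below strings the mutex calls above together around a shared counter: pthread_mutex_init and pthread_mutex_destroy bracket the mutex's lifetime, pthread_mutex_lock and pthread_mutex_unlock protect the update inside the critical section, and pthread_mutex_trylock is used for an optional, non-blocking progress report.

    /* Hypothetical example using the mutex calls above around a shared counter. */
    #include <stdio.h>
    #include <pthread.h>

    static long counter = 0;                       /* shared data protected by the mutex */
    static pthread_mutex_t counter_mutex;

    void *increment(void *arg)
    {
        long n = *(long *) arg;
        for (long i = 0; i < n; i++) {
            pthread_mutex_lock(&counter_mutex);    /* enter the critical section */
            counter++;
            pthread_mutex_unlock(&counter_mutex);  /* leave the critical section */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        long n = 100000;

        pthread_mutex_init(&counter_mutex, NULL);  /* default mutex attributes */

        pthread_create(&t1, NULL, increment, &n);
        pthread_create(&t2, NULL, increment, &n);

        /* Non-blocking peek at the counter while the workers run. */
        if (pthread_mutex_trylock(&counter_mutex) == 0) {
            printf("progress: %ld\n", counter);
            pthread_mutex_unlock(&counter_mutex);
        }

        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        printf("final counter = %ld\n", counter);  /* always 200000 */
        pthread_mutex_destroy(&counter_mutex);
        return 0;
    }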

    Compilation and Execution of MPI-Pthread Programs

    Use #include <pthread.h> in the program for Pthreads support.

    Use #include <mpi.h> in the program for MPI support.

    (A) Using command line arguments:

    The compilation and execution details of MPI-Pthreads programs vary from one system to another, but the essential steps are common to all systems.

    # mpiicc <program name> -o <executable name> -lpthread

    For example, to compile a simple Hello World program the user can type:

    # mpiicc MPI-Pthreads_HelloWorld.c -o MPI-Pthreads_HelloWorld -lpthread

    (B) Using a Makefile:

    For more control over the process of compiling and linking MPI-Pthreads programs, you should use a 'Makefile'. You may also use additional commands in the Makefile, particularly for programs spread over a large number of files. The user has to specify the names of the program files and the appropriate paths to the libraries required for Pthreads programs in the Makefile.


    To compile and link a Pthreads program, you can use the command,

    make

    For the hands-on session programs, users can use the Makefile provided for C programs.

    (C) Executing a Program:

    To execute an MPI-Pthreads program, use

    mpirun -n <number of processes > <Executable >

    For example, to execute a simple HelloWorld Program, user must type:

    #mpirun -n 3 ./HelloWorld

    The output must look similar to the following:

    Hello World! from Thread:1 on Process: 0.
    Hello World! from Thread:2 on Process: 0.
    Hello World! from Thread:1 on Process: 1.
    Hello World! from Thread:2 on Process: 1.


    (D) Executing MPI-Pthread program on Cetus cluster

    To execute the above programs on the IUCAA cluster (Cetus cluster), the user should submit the job through the scheduler, using the following command.

    bsub -q <queue-name> -n[number of processors] [options] mpirun -srun ./<executable name>

    For Example :

    bsub -q normal -n4 -ext"SULRM[nodes=4]" -o mpi-pthread-hello-world.out -e mpi-pthread-hello-world.err mpirun -srun ./mpi-pthread-helloworld

    NOTE : 1) "mpi-pthread-helloworld" is the binary executable of the mpi-pthread-helloworld.c program.
           2) "nodes=4" indicates the number of nodes required to run the executable.
           3) Refer to the man pages of "bsub" for options.
    Example Program : MPI-Pthread

    The simplest MPI-Pthreads program is the "HelloWorld" program, in which the threads on each node print the message "Hello World!". The number of threads, 2, is hard coded into the program. The main thread just creates the child threads.

    The simple MPI-Pthreads program in C, in which the threads print the message "Hello World! from Thread: on Process:", is explained below. We describe the features of the entire code and explain the program in detail. We include the appropriate header files for thread and MPI support. The function Work is the routine that will be executed by the child threads; this routine prints the message.

    The following segment of the code explains these features. The description of program is as follows:

    int MyRank, NumProcs;

    Declarations to hold the rank of the processes and the number of processes.


    void * Work(int MyID){

    printf(" Hello World! from Thread:%d on Process: %d. \n", MyID, MyRank);

    return NULL;

    }

    Function Work is the function executed by each child thread. The function prints the message based on the thread that is executing currently.

    Following the routine is the main function, the starting point for the program. We start by declaring the variables for child threads. pthread_t is the type of the variable to be declared. We declare two child threads.

    pthread_t thread1, thread2;

    /* initialize MPI Environment. */

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &MyRank);

    MPI_Comm_size(MPI_COMM_WORLD, &NumProcs);

    After declaring the variables, we need to initialize the MPI Environment. We do that by calling the MPI_Init routine from MPI Library. We determine the total number of processes in the default MPI_COMM_WORLD Communicator using the MPI_Comm_size routine form MPI Library. We determine the rank of the current process by calling the routine MPI_Comm_rank. We use pthread_create routine provided by the Pthreads library to initialize the threads on each process.

    pthread_create(&thread1, NULL, (void *(*) (void *)) Work, (void *) 1);

    pthread_create(&thread2, NULL, (void *(*) (void *)) Work, (void *) 2);

    pthread_create initializes the thread and its attributes, and takes the address of the routine the thread has to start executing and the parameter for that routine.

    As soon as a thread is created, it will start executing the routine that has been assigned to it by pthread_create.

    From the code it can be seen clearly that both the worker threads are executing the same routine. Each thread has its own copy of the stack variables for the routine. thread1 will execute the routine with "1" as the parameter and thread2 with "2" as the parameter. Depending on the implementation of the standard on your platform, the output will differ.

    After creating the workers and assigning work to them, we have to make the main thread wait for them to finish. If this is not done, the main thread does its work and then terminates without checking whether the worker threads have finished, terminating any worker thread that may still be active at that time. We use the routine pthread_join for this purpose. This is a blocking call: it returns only after the specified thread has terminated.

    /* Join the threads to inform the main thread to wait till they are done. */

    pthread_join(thread1, NULL);

    pthread_join(thread2, NULL);

    Here we inform the main thread to wait for the termination of thread1 and thread2. After both the calls have returned, the main thread will continue and calls MPI_Finalize to signal its readiness to terminate the whole MPI Application.

    MPI_Finalize();

    return;
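
    Putting the fragments above together, a complete version of the MPI-Pthreads "HelloWorld" program might look as follows. The listing is a tidied sketch: it keeps the names from the walkthrough but passes the thread identifier through the standard void * argument instead of the function-pointer cast shown above, and the file and variable names used in the hands-on material may differ.

    /* MPI-Pthreads HelloWorld: two threads per MPI process print a message. */
    #include <stdio.h>
    #include <pthread.h>
    #include <mpi.h>

    int MyRank, NumProcs;                      /* rank of this process, number of processes */

    void *Work(void *MyID)
    {
        /* Casting the pointer-sized argument back to an integer is a common,
           though not strictly portable, way of passing a small thread id.   */
        printf(" Hello World! from Thread:%ld on Process: %d. \n",
               (long) MyID, MyRank);
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        pthread_t thread1, thread2;

        /* Initialize the MPI environment. */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &MyRank);
        MPI_Comm_size(MPI_COMM_WORLD, &NumProcs);

        /* Create two child threads; each executes Work() with its own identifier. */
        pthread_create(&thread1, NULL, Work, (void *) 1L);
        pthread_create(&thread2, NULL, Work, (void *) 2L);

        /* Wait for both child threads before shutting MPI down. */
        pthread_join(thread1, NULL);
        pthread_join(thread2, NULL);

        MPI_Finalize();
        return 0;
    }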

    Pthreads FAQs


    What is Thread?

    A thread is a stream of instructions that can be scheduled as an independent unit. A thread exists within a process, and uses the process resources. Since threads are very small compared with processes, thread creation is relatively cheap in terms of CPU costs.
    A Thread is an encapsulation of the flow of control in a program. Most people are used to writing single-threaded programs - that is, programs that only execute one path through their code "at a time". Multithreaded programs may have several threads running through different code paths "simultaneously".

    Why are threads interesting?

    A context switch between two threads in a single process is considerably cheaper than a context switch between two processes. In addition, the fact that all data except for stack and registers are shared between threads makes them a natural vehicle for expressing tasks that can be broken down into subtasks that can be run cooperatively.

    What are Pthreads?

    Pthreads stands for POSIX Threads. POSIX threads are based on the IEEE POSIX 1003.1c-1995 standard (also known as ISO/IEC 9945-1:1996), part of the ANSI/IEEE 1003.1, 1996 edition, standard.
    The Pthreads library is a POSIX C API thread library that has standardized functions for using threads across different platforms.
    It is a set of threading interfaces developed by the IEEE committee in charge of specifying a Portable Operating System Interface (POSIX). The POSIX committee defined a basic set of functions and data structures that it hoped would be adopted by numerous vendors so that threaded code could be ported easily across operating systems. The standard specifies an API that handles most of the actions required by threads, provided as a library of standardized functions for using threads across different platforms.

    What is Lightweight process?

    A lightweight process (also known in some implementations, confusingly, as a kernel thread) is a schedulable entity that the kernel is aware of. On most systems, it consists of some execution context and some accounting information (i.e. much less than a full-blown process).
    Several operating systems allow lightweight processes to be "bound" to particular CPUs; this guarantees that those threads will only execute on the specified CPUs.

    What does MT safety mean?

    If some piece of code is described as MT-safe, this indicates that it can be used safely within a multithreaded program, and that it supports a "reasonable" level of concurrency. This isn't very interesting; what you, as a programmer using threads, need to worry about is code that is not MT-safe. MT-unsafe code may use global and/or static data. If you need to call MT-unsafe code from within a multithreaded program, you may need to go to some effort to ensure that only one thread calls that code at any time.
    Wrapping a global lock around MT-unsafe code will generally let you call it from within a multithreaded program, but since this does not permit concurrent access to that code, it is not considered to make it MT-safe.

    What is thread Scheduling?

    Scheduling involves deciding what thread should execute next on a particular CPU. It is usually also taken as involving the context switch to that thread.

    What are the different kinds of threads?

    There are two main kinds of threads implementations:

       User-space threads
       Kernel-supported threads.

    There are several sets of differences between these different threads implementations.

    Architectural differences:

    User-space threads live without any support from the kernel; they maintain all of their state in user space. Since the kernel does not know about them, they cannot be scheduled to run on multiple processors in parallel.
    Kernel-supported threads fall into two classes. In a "pure" kernel-supported system, the kernel is responsible for scheduling all threads. Systems in which the kernel cooperates with a user-level library to do scheduling are known as two-level, or hybrid, systems. Typically, the kernel schedules LWPs, and the user-level library schedules threads onto LWPs. Because of its performance problems (caused by the need to cross the user/kernel protection boundary twice for every thread context switch), the former class has fewer members than does the latter (at least on Unix variants). Both classes allow threads to be run across multiple processors in parallel.

    Performance differences:

    In terms of context switch time, user-space threads are the fastest, with two-level threads coming next (all other things being equal). However, if you have a multiprocessor, user-level threads can only be run on a single CPU, while both two-level and pure kernel-supported threads can be run on multiple CPUs simultaneously.

    Potential problems with functionality:

    Because the kernel does not know about user threads, there is a danger that ordinary blocking system calls will block the entire process (this is bad) rather than just the calling thread. This means that user-space threads libraries need to jump through hoops in order to provide "blocking" system calls that don't block the entire process. This problem also exists with two-level kernel-supported threads, though it is not as acute as for user-level threads. What usually happens here is that system calls block entire LWPs. This means that if more threads exist than do LWPs and all of the LWPs are blocked in system calls, then other threads that could potentially make forward progress are prevented from doing so.

    The Solaris threads library provides a reasonable solution to this problem. If the kernel notices that all LWPs in a process are blocked, it sends a signal to the process. This signal is caught by the user-level threads library, which can create another LWP so that the process will continue to make progress.

    Centre for Development of Advanced Computing