Programming on Multi-Core Processors Using Pthreads (POSIX Threads) |
Pthreads are defined as a set of C-language programming types and procedure calls,
implemented with a pthread.h header/include file and a thread library.
Solaris threads are easily
understood by someone familiar with POSIX threads, and while Java threads and the
multi-threading in the Win32 and OS/2 APIs are a little different . The subroutines,
which comprise the Pthreads APIs, can be formally grouped into three classes such as
Thread Management, Mutex Variables and Condition Variables. Threaded applications offer
potential performance gains and practical advantages over non-threaded applications
in several other ways as we can observe from the different programs.
Example programs using different APIs. Compilation and execution
of Pthread programs, programs numerical and non-numerical computations
are discussed using
different thread APIs to understand Performance issues on mutli-core processors.
|
Example 2.1
|
Pthread prorgam to compute Pie value by Numerical Integration method.
|
Example 2.2
|
Write Pthread code to perform Vector-Vector Multiplication using block striped partitioning.
|
Example 2.3
|
Write Pthread code to find Infinity norm of the Matrix using block striped partitioning (Row-wise Partitioning of Matrix)
|
Example 2.4
|
Write Pthread code to find Infinity norm of the Matrix using block striped partitioning (Column-wise Partitioning of Matrix)
|
Example 2.5
|
Write a Pthreads program to solve a system of linear equations AX=b using Parallel Jacobi Method.
|
(Source - References :
Books
Multi-threading
-[MCMTh-01], [MCMTh-02], [MCMTh-I03], [MCMTh-05], [MCMth-09], [MCMth-11],
[MCMTh-15], [MCMTh-21], [MCBW-44] )
|
Description of Pthread Programs |
Example 2.1:
Write a Pthreads program to compute the value of PI function by numerical
integration.
( Download source code :
pthread-numerical-integration.c
)
|
- Objective
Write a Pthreads program to compute the value of PI function by numerical integration.
- Description
This program computes the value of PI over a given interval using Numerical integration. The main thread distributes
the given interval uniformly over the number of threads. Each thread calculates its part of the interval and finally adds
it up to the result variable. Each thread locks a mutex before doing the same to guarantee the atomicity of
the operation.
Threaded APIs provide support for implementing critical sections and atomic operations using
mutex-locks (mutual exclusision locks).
Each thread calculates its part of the interval and finally adds it up to the result variable.
Mutex-Locks have two states : locked and unlocked. At any point of time, only one thread can
lock a mutex lock.
Each thread
locks a mutex before doing the same to guarantee the atomicity of the operation.
The Mutex-lock is an atomic operation generally associated with a piece of code that manipulates
shared data. To access the sared data, a thread must first try to acquire a mutex-lock. If the mutex-lock
is already locked, the process trying to acquire the lock is blocked.
- Input
Number of Threads.
- Output
Calculated Value of PI
|
Example 2.2:
write a Pthreads program to compute the vector-vector multiplication with
Pthreads using block striped partitioning for uniform data distribution.
( Download source code :
pthread-vectorvector-multi.c
)
|
- Objective
To write a Pthreads program to compute the vector-vector multiplication with Pthreads using block striped partitioning
for uniform data distribution.
- Description
This is an implementation of Vector-Vector multiplication using the block striped partitioning algorithm.
Each thread multiplies the corresponding elements and writes the product into the result vector.
A Mutex is used on the result vector to guarantee atomicity. The thread accesses the elements based on its id which
is allocated by the main thread in the order of their creation. As the number of threads and the number of elements is
known, the corresponding elements to be accessed can easily be computed.
- Input
Vector Size and Number Threads. Number of threads should be a factor of Vector size.
- Output
Dot Product of the given vectors.
|
Example 2.3:
Write Pthread code to find Infinity norm of the Matrix using block striped partitioning
- Row Wise distribution.
( Download source code :
pthread-infinitynorm-rowwise.c
)
|
- Objective
Write Pthread code to find Infinity norm of the Matrix using block striped
partitioning - Row Wise distribution.
- Description
Infinity Norm of a Matrix: The Row-Wise infinity norm of a matrix is defined to be the maximum of sums of absolute
values of elements in a row, over all rows.In the row wise distribution, each thread finds the sum of the elements of that
row and compares it with the result variable which is initialized to zero. If the sum is greater than the current value of
the result, it updates the result variable to reflect the new result. A Mutex is used on the result variable. Distribution
of rows is done knowing the id of each thread assigned by the main thread in the order of their creation and the total
number of threads.
- Input
Number of Threads and the input file. Number of threads must be a factor of number of rows.
- Output
Infinity Norm of the given Matrix.
|
Example 2.4:
Write Pthread code to find Infinity norm of the Matrix using block striped
partitioning - column Wise distribution.
( Download source code :
pthread-infinitynorm-colwise.c
)
|
- Objective
Write Pthread code to find Infinity norm of the Matrix using block
striped partitioning - column Wise distribution.
- Description
In this method, the total columns are distributed among the child threads. Each thread adds the value to the
corresponding element in a vector whose length is equal to the number of rows of the matrix. Each element of the
vector that holds the row-wise sum of the matrix is protected by a mutex to guarantee granularity of the operation.
When all the threads are done, the vector holds the sum of each row of the matrix. The main thread now picks the
maximum of the sums and prints it as the Infinity Norm.
- Input
Number of Threads and the input file. Number of threads must be a factor of number of columns.
- Output
Infinity Norm of the given Matrix.
|
Example 2.5:
Write a Pthreads program to solve a system of linear equations AX=B using
Parallel Jacobi Method.
( Download source code :
pthread-jacobi.c
)
|
- Objective
Write a Pthreads program to solve a system of linear equations AX=B using Parallel Jacobi Method.
- Description
Write a Pthread program to solve the system of linear equations [A]{x} = {b} using Gaussian elimination without
pivoting and a back-substitution. Assume that A is symmetric positive definite dense matrix of size n. You may assume
that n is evenly divisible by p.
- Input
Number of Threads, Size of Real Symmetric Positive definite Matrix in terms of Class where
Class A : 1024
Class B : 2048
Class C : 4096
- Output
The solution of Ax=b and the number of iterations for convergence of the method and also the
execution time.
|
|
| |
|