hyPACK-2013 Hands On Session
hyPACK-2013 Hands-on Sessions (HoS) will be conducted on an HPC cluster with coprocessors and accelerators,
on an ARM processor cluster, on an ARM processor system with CUDA-enabled NVIDIA CARMA, and on
DSP multi-core processor systems, for the Mode-1, Mode-2, Mode-3, Mode-4 & Mode-5 modules.
The approach adopted to heterogeneous programming for application
kernels and numerical linear algebra on hybrid computing systems
(HPC GPU Cluster) is discussed
in the Mode-1, Mode-2, Mode-3 & Mode-4 modules of
hyPACK-2013. The systems used for the Mode-3 laboratory sessions are given below.
Mode-3 : Systems with Coprocessors
-
System 1 : Intel Xeon Phi Coprocessor :
Using the pragma-based offload model, or treating the Intel Xeon Phi as an SMP processor, is among
the easiest ways to write a program, because both resemble programming on existing x86 systems.
The Intel Xeon Phi (Knights Corner) processor is a 61-core SMP chip where each core has a
dedicated 512-bit wide SIMD vector unit. All the cores are
connected via a 512-bit bidirectional ring interconnect. Currently, the Phi
coprocessor is packaged as a separate PCIe device, external to the host processor.
Each Phi contains 8 GB of RAM that provides all the memory and file-system storage that
every user process, the Linux operating system, and ancillary daemon processes will use.
The theoretical maximum bandwidth of the Intel Xeon Phi memory system is 352 GB/s
(5.5 GTransfers/s * 16 channels * 4 B/Transfer).
Each Intel Xeon Phi core is based on a modified Pentium processor design that supports
hyperthreading and some new x86 instructions created for the wide vector unit.
Good performance depends on the parallel threads issuing instructions to the wide vector units
quickly enough to keep the vector pipeline full. Each core of the current generation of
coprocessors supports up to four concurrent threads of execution via hyperthreading.
The coprocessor is integrated with an Intel x86 Xeon (Sandy Bridge) host system for the laboratory session.
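The figures quoted for System 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 32-bit and 64-bit element sizes (single and double precision) are assumptions that the description above does not state.

```python
# Illustrative arithmetic for the System 1 figures quoted above.
# All function names are hypothetical helpers, not part of any Intel tool.


def peak_bandwidth_gb_per_s(gigatransfers_per_s, channels, bytes_per_transfer):
    """Theoretical peak memory bandwidth: transfer rate x channels x bytes per transfer."""
    return gigatransfers_per_s * channels * bytes_per_transfer


def hardware_threads(cores, threads_per_core):
    """Number of hardware threads the chip can run concurrently."""
    return cores * threads_per_core


def simd_lanes(vector_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits


if __name__ == "__main__":
    # 5.5 GTransfers/s * 16 channels * 4 B/Transfer
    print(peak_bandwidth_gb_per_s(5.5, 16, 4))  # 352.0
    # 61 cores, up to four threads per core
    print(hardware_threads(61, 4))              # 244
    # Assumed element sizes: 32-bit single and 64-bit double precision
    print(simd_lanes(512, 32))                  # 16
    print(simd_lanes(512, 64))                  # 8
```

These are theoretical upper bounds derived only from the quoted numbers; measured bandwidth and thread utilisation in the laboratory will typically be lower.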
-
System 2 : PARAM YUVA-II :
PARAM YUVA-II is a hybrid computing platform built as a message-passing cluster. The configuration
of a compute node with coprocessors is given below.
Compute Node : Two-socket, eight-core system (16 CPU cores, Intel(R)
Xeon(R) CPU E5-2670 @ 2.60GHz, Sandy Bridge architecture); RAM - 64 GB; cache - 20 MB; GCC 4.4.6;
interconnects - PARAMNet-II and InfiniBand. Each node has two Intel Xeon Phi
coprocessors.
Intel Xeon Phi Coprocessor : 60 cores; 8 GB GDDR5 RAM; 32 kB L1 cache per
core; 512 kB L2 cache per core
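As a rough aid for sizing laboratory runs, the per-node totals implied by the System 2 configuration can be tallied as follows. This is a minimal sketch: the `Device` class and `node_totals` function are hypothetical names, and the four-threads-per-core value is carried over from the System 1 description on the assumption that it also applies to these coprocessors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical record of the per-device figures listed above."""
    cores: int
    ram_gb: int


HOST = Device(cores=16, ram_gb=64)        # compute node host CPUs
COPROCESSOR = Device(cores=60, ram_gb=8)  # one Intel Xeon Phi
COPROCESSORS_PER_NODE = 2
THREADS_PER_COPROCESSOR_CORE = 4          # assumption, taken from System 1
L2_KB_PER_CORE = 512


def node_totals(host, coprocessor, count):
    """Sum the cores, memory, threads and L2 cache available on one node."""
    return {
        "cores": host.cores + count * coprocessor.cores,
        "ram_gb": host.ram_gb + count * coprocessor.ram_gb,
        "coprocessor_threads": count * coprocessor.cores * THREADS_PER_COPROCESSOR_CORE,
        "l2_mb_per_coprocessor": coprocessor.cores * L2_KB_PER_CORE // 1024,
    }


if __name__ == "__main__":
    for key, value in node_totals(HOST, COPROCESSOR, COPROCESSORS_PER_NODE).items():
        print(f"{key}: {value}")
```

The RAM total is a sum over separate memories: the host and each coprocessor have their own address space, so data must be moved between them explicitly.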