hyPACK-2013 Hands On Session

hyPACK-2013 Hands-on Sessions (HoS) will be conducted on an HPC cluster with coprocessors and accelerators. The sessions also cover programming on an ARM processor cluster, on an ARM processor system with the CUDA-enabled NVIDIA CARMA, and on DSP multi-core processor systems, for the Mode-1, Mode-2, Mode-3, Mode-4 and Mode-5 modules. The approach adopted to heterogeneous programming for application kernels and numerical linear algebra on hybrid computing systems (HPC GPU cluster), as discussed in the Mode-1, Mode-2, Mode-3 and Mode-4 modules of hyPACK-2013, is given below.

Mode-1 : Multi-Core Processor Systems
  • Mode-1 (Host-CPU : Multi-Core Processor) : Tuning & Performance of programs on Multi-Core Processors & Partitioned Global Address Space (PGAS) memory models

Multi-Core Processor Systems (Xeon)

Peak performance (in double precision) of the HPC GPU cluster, with one node having a single CUDA-enabled NVIDIA GPU, is 615 Gflop/s

  • Intel Xeon 64-bit Quad Core (X5450 processor) : One Intel Xeon 64-bit quad-core X5450 series processor system (Harpertown) with two PCI-e 2.0 x16 slots; RAM : 16 GB; Clock Speed : 3.0 GHz; CentOS 5.2; GCC version 4.1.2; dual-socket quad-core (8 cores)

    Intel MKL version 10.2, CUBLAS version 3.2, Intel icc 11.1; Peak Performance : CPU : 96 Gflops (1 Node - 8 Cores)

  • x86 Processor (Intel Xeon Processor E5-2643, Sandy Bridge architecture) : Super Micro SYS-7047GR-TPRF server [ Intel Xeon Processor E5-2643 (10M cache, 3.30 GHz, 8.00 GT/s Intel QPI); Chipset : Intel C602; Motherboard : Super X9DRG-QF; CPU : Intel Xeon processor E5-2600 (up to 150W TDP); support for Xeon Phi 5110P; 32 GB DDR3 ECC registered memory (1600 MHz ECC-supported DDR3 SDRAM, 72-bit, 240-pin gold-plated DIMMs); 4x PCI-E 3.0 x16 (double-width), 2x PCI-E 3.0 x8 (1 in x16 slot), 1x PCI-E 2.0 x4 (in x8); IPMI support (Intelligent Platform Management Interface v2.0, with virtual media over LAN and KVM-over-LAN); 4U rackmountable / tower chassis (Model : CSE-747BTQ-R1K62B); 1620W high-efficiency redundant power supply with PMBus; SATA 3.0 6 Gbps with RAID 0, 1 support; 1 TB SATA hard disk; Intel i350 dual-port Gigabit Ethernet supporting 10BASE-T, 100BASE-TX and 1000BASE-T (RJ45 output); 1x Realtek RTL8201N PHY (dedicated IPMI port) ]

    Intel Xeon Phi Coprocessor 5110P (code name Knights Corner) : 8 GB, 1.053 GHz, 60 cores, max TDP 225 W

  • PARAM YUVA Compute Node : Intel-R2208GZ; Intel Xeon E5-2670 (Sandy Bridge architecture); Co-Processor : Intel Xeon Phi; Cores/Node : 16; Core Frequency : 2.6 GHz; Peak Performance/Node : 2.35 TF; Memory : 64 GB; OS : Linux

    Intel Development Tools : Intel MPI (public domain : MVAPICH2); MKL, NAG & CDAC KSHIPRA; and Varda Prog. Env - RCS
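
The peak figures quoted for the Xeon systems above can be cross-checked with the usual estimate: peak = cores x clock x double-precision flops per core per cycle. The following is a minimal sketch, not the method used by the hyPACK organisers. The function name `peak_gflops` is hypothetical, and the flops-per-cycle values (4 for the SSE-based X5450, 8 for the AVX-based E5-2670, 16 for the Xeon Phi 5110P) are assumptions that do not appear in the listing. The last calculation further assumes two 5110P coprocessors per PARAM YUVA node, which the listing does not state; it is shown only because the result lands close to the quoted 2.35 TF.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in Gflop/s: cores x clock (GHz) x DP flops per core per cycle."""
    return cores * clock_ghz * flops_per_cycle


# ASSUMPTION: double-precision flops per core per cycle are not given in the
# listing; 4 (SSE), 8 (AVX) and 16 (512-bit FMA) are typical values for these parts.
xeon_x5450_node = peak_gflops(cores=8, clock_ghz=3.0, flops_per_cycle=4)
xeon_e5_2670_node = peak_gflops(cores=16, clock_ghz=2.6, flops_per_cycle=8)
xeon_phi_5110p = peak_gflops(cores=60, clock_ghz=1.053, flops_per_cycle=16)

# ASSUMPTION: two Xeon Phi 5110P cards per PARAM YUVA node (not stated in the listing).
param_yuva_node = xeon_e5_2670_node + 2 * xeon_phi_5110p

print(f"X5450 node (8 cores)      : {xeon_x5450_node:.1f} Gflop/s")
print(f"E5-2670 node (16 cores)   : {xeon_e5_2670_node:.1f} Gflop/s")
print(f"Xeon Phi 5110P (60 cores) : {xeon_phi_5110p:.1f} Gflop/s")
print(f"PARAM YUVA node estimate  : {param_yuva_node / 1000:.2f} Tflop/s")
```

Under these assumptions the X5450 node evaluates to 96 Gflop/s, which agrees with the figure listed for that system.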


Multi-Core Processor Systems (AMD)

Peak performance (double precision) of the HPC GPU cluster, with one node having a single AMD FireStream 9305, is 415 Gflop/s

  • One AMD Opteron x86 24-core multi-core processor system with two PCI-e 2.0 x16 slots; RAM : 48 GB; Clock Speed : 3.0 GHz; CentOS 5.2; GCC version 4.1.2; dual-socket 12-core (24 cores)

  • ACML, OpenCL and BLAS libraries; AMD-APP with OpenCL Prog. Env.; Peak Performance : CPU : 144 Gflops (1 Node - 12 Cores)
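
The AMD entry lists 24 cores for the node but quotes its peak for 12 cores. One way to see which core count the 144 Gflops figure fits is to invert the peak estimate and look at the implied double-precision flops per core per cycle. This is only a sketch: the helper name `implied_flops_per_cycle` is hypothetical, and the listing itself does not say how the peak was derived.

```python
def implied_flops_per_cycle(peak_gflops, cores, clock_ghz):
    """Flops per core per cycle implied by a quoted peak (Gflop/s), core count and clock (GHz)."""
    return peak_gflops / (cores * clock_ghz)


# Values taken from the listing: 144 Gflops peak, 3.0 GHz clock.
# Both core counts that appear in the listing are tried.
for cores in (12, 24):
    rate = implied_flops_per_cycle(peak_gflops=144.0, cores=cores, clock_ghz=3.0)
    print(f"{cores} cores -> {rate:.1f} DP flops per core per cycle")
```

With 12 cores the implied rate is 4 flops per core per cycle, and with 24 cores it is 2, so the quoted peak appears to correspond to a single 12-core socket if an SSE-class rate of 4 is assumed.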


Three message-passing GPU clusters, i.e., the Intel Xeon Phi coprocessor cluster, the CUDA-enabled NVIDIA GPU cluster, and the OpenCL-enabled AMD GPU cluster, are used for the laboratory sessions of hyPACK-2013.


Centre for Development of Advanced Computing