The HPC3 cluster is an in-house-designed high-performance computing (HPC) facility at HKUST, set up in May 2020. As of September 2021, it has 165 CPU compute nodes and 25 GPU compute nodes, interconnected by InfiniBand (IB) at 100 Gbit/s, with 2PB of raw disk storage. In total, the compute nodes provide 7,412 CPU cores and 230 GPU cards.
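These totals follow directly from the per-node figures in the hardware tables below; a minimal tally as a sketch (compute nodes only, master and login nodes excluded):

```python
# Per-node figures taken from the hardware tables below:
# (node type, number of nodes, CPU cores per node, GPU cards per node)
nodes = [
    ("CPU node",               160, 2 * 20,  0),
    ("Large memory node",        5, 2 * 20,  0),
    ("GPU node (RTX 2080 Ti)",  10, 2 * 8,   8),
    ("GPU node (RTX 2080 Ti)",   3, 2 * 10, 10),
    ("GPU node (RTX 6000)",      1, 2 * 10, 10),
    ("GPU node (RTX 3090)",     11, 2 * 26, 10),
]

total_cores = sum(count * cores for _, count, cores, _ in nodes)
total_gpus = sum(count * gpus for _, count, _, gpus in nodes)
print(total_cores, total_gpus)  # 7412 230
```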
The HPC3 cluster was designed to maximize performance and available computing resources within the allocated funding. Accordingly, the design emphasizes performance, the number of CPU/GPU nodes, and maximum raw disk storage, while redundancy is provided only for essential equipment.
The HPC3 cluster consists of the following equipment:
Master & login node: The master node is set up with the OpenHPC cluster management system for cluster management, job scheduling and monitoring. The login node is the entry point where users log in to compile and submit their jobs.
| | Master node | Login node |
| --- | --- | --- |
| Number of nodes | 1 | 1 |
| CPU | 2x Intel Xeon Gold 6230 (20-core/2.1 GHz/28MB cache) | 2x Intel Xeon Gold 5217 (8-core/3.0 GHz/11MB cache) |
| RAM | 6x 16GB DDR4-2933 | 6x 16GB DDR4-2933 |
| Storage | 2x 2.4TB 12Gb/s 10K rpm hot-swap SAS HDD; 1x 4TB 12Gb/s 7.2K rpm SAS HDD; Inspur SAS3008 (IMR) 12Gb/s RAID adapter | 2x 2.4TB 12Gb/s 10K rpm hot-swap SAS HDD; 1x 4TB 12Gb/s 7.2K rpm SAS HDD; Inspur SAS3008 (IMR) 12Gb/s RAID adapter |
| Network | Dual-port EDR (100Gbps) InfiniBand (IB) network card; dual-port 10Gbps Ethernet network card with SR SFP+ connector; dual 1Gb Ethernet adapter | Dual-port EDR (100Gbps) InfiniBand (IB) network card; dual-port 10Gbps Ethernet network card with SR SFP+ connector; dual 1Gb Ethernet adapter |
Compute node: 160 CPU compute nodes, each with two 20-core Intel Xeon Gold 6230 processors and 192GB of physical memory
| | CPU node |
| --- | --- |
| Number of nodes | 160 |
| CPU | 2x Intel Xeon Gold 6230 (20-core/2.1 GHz/28MB cache) |
| RAM | 12x 16GB DDR4-2933 |
| Storage | 2x 2.4TB 12Gb/s 10K rpm hot-swap SAS HDD; Inspur SAS3008 (IMR) 12Gb/s RAID adapter |
| Network | Single-port EDR (100Gbps) InfiniBand (IB) network card; dual 1Gb Ethernet adapter |
Large memory node: 5 compute nodes with the same CPUs as above and 1.5TB of physical memory
| | Large memory node |
| --- | --- |
| Number of nodes | 5 |
| CPU | 2x Intel Xeon Gold 6230 (20-core/2.1 GHz/28MB cache) |
| RAM | 24x 64GB DDR4-2933 |
| Storage | 2x 2.4TB 12Gb/s 10K rpm hot-swap SAS HDD; Inspur SAS3008 (IMR) 12Gb/s RAID adapter |
| Network | Single-port EDR (100Gbps) InfiniBand (IB) network card; dual 1Gb Ethernet adapter |
GPU node: 25 GPU nodes with different CPU and GPU models
| | GPU node (RTX 2080 Ti) | GPU node (RTX 2080 Ti) | GPU node (RTX 6000) | GPU node (RTX 3090) |
| --- | --- | --- | --- | --- |
| Number of nodes | 10 | 3 | 1 | 11 |
| CPU | 2x Intel Xeon Gold 6244 (8-core/3.6GHz/25MB cache) | 2x Intel Xeon Silver 4210 (10-core/2.2GHz/13.75MB cache) | 2x Intel Xeon Silver 4210 (10-core/2.2GHz/13.75MB cache) | 2x Intel Xeon Gold 6230R (26-core/2.1GHz/35.75MB cache) |
| RAM | 12x 32GB DDR4-2933 | 8x 32GB DDR4-2933 | 16x 32GB DDR4-2933 | 8x 32GB DDR4-2933 |
| GPU | 8x Nvidia GeForce RTX 2080 Ti | 10x Nvidia GeForce RTX 2080 Ti | 10x Nvidia Quadro RTX 6000 | 10x Nvidia GeForce RTX 3090 |
| Storage | 2x 960GB SATA 6Gb/s SSD; Inspur SAS3008 (IMR, no cache) 12Gb/s RAID adapter | 2x 960GB SATA 6Gb/s SSD; Inspur SAS3008 (IMR, no cache) 12Gb/s RAID adapter | 2x 960GB SATA 6Gb/s SSD; Inspur SAS3008 (IMR, no cache) 12Gb/s RAID adapter | 2x 2TB SATA 6Gb/s 7.2K rpm HDD |
| Network | 1x single-port EDR (100Gbps) InfiniBand (IB) network card; 1x dual 1Gb Ethernet adapter | 1x single-port EDR (100Gbps) InfiniBand (IB) network card; 1x dual 1Gb Ethernet adapter | 1x single-port EDR (100Gbps) InfiniBand (IB) network card; 1x dual 1Gb Ethernet adapter | AOC-MCX555A-ECAT ConnectX-5, 100GbE single-port QSFP28 |
File Systems: i) a parallel file system with 2PB of raw storage running the BeeGFS parallel cluster file system; ii) an archive (NFS) file system with 2PB of storage serving as secondary storage for files
Interconnect: All servers are interconnected with Mellanox EDR InfiniBand in a fat-tree topology (with a blocking factor of 2).
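A blocking factor of 2 means each leaf switch has twice as much node-facing (downlink) capacity as spine-facing (uplink) capacity, i.e. 2:1 oversubscription, so traffic that must cross the spine sees roughly half the 100 Gbit/s line rate per node in the worst case. A toy sketch of that relationship, with hypothetical port counts (the actual switch configuration is not given here):

```python
# Toy illustration of a fat-tree blocking factor (oversubscription ratio).
# Port counts below are hypothetical; only the 100 Gbit/s EDR line rate and
# the blocking factor of 2 come from the description above.
LINE_RATE_GBPS = 100  # EDR InfiniBand per-port rate

def worst_case_per_node_bw(down_ports: int, up_ports: int) -> float:
    """Per-node bandwidth when every node on a leaf sends across the spine."""
    blocking_factor = down_ports / up_ports
    return LINE_RATE_GBPS / blocking_factor

# Example: a leaf switch with 24 node-facing ports and 12 uplinks -> factor 2
print(worst_case_per_node_bw(down_ports=24, up_ports=12))  # 50.0 Gbit/s
```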