HPC Cluster

Welcome to the high-performance computing (HPC) community site at CCHMC! We operate as the Research Computing group under Information Systems for Research (IS4R).

Currently, we maintain one Red Hat (RHEL 9) Linux-based HPC cluster and one AI/ML GPU cluster for research. Our primary HPC cluster has 2,000+ cores across 80 heterogeneous nodes, including large-memory SMP systems, with 30TB of RAM in total. Cluster nodes are connected via high-speed Ethernet (10-25Gbps), and the scheduler/resource manager is IBM LSF. This environment also contains NVIDIA GPU nodes, with a combination of dual V100s (32GB) and quad A100s (40GB and 80GB). The AI/ML cluster is tailored for containerized GPU workloads and currently has 32x H100 GPUs and 4x V100 GPUs.
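As a rough illustration of submitting work through IBM LSF, a batch job script might look like the sketch below. The queue name, module name, resource values, and GPU request string are all assumptions for illustration; check `bqueues` and our local documentation for the actual values to use.

```shell
#!/bin/bash
# Hypothetical LSF batch script -- all names and limits are illustrative.
#BSUB -J my_analysis        # job name
#BSUB -n 4                  # request 4 cores
#BSUB -W 2:00               # 2-hour wall-clock limit
#BSUB -M 8000               # memory limit (units are site-dependent)
#BSUB -gpu "num=1"          # request one GPU (GPU-capable nodes only)
#BSUB -o my_analysis.%J.out # stdout file (%J expands to the job ID)

# Load software via environment modules (module name is hypothetical)
module load python

python my_script.py
```

A script like this would typically be submitted with `bsub < my_analysis.sh` and monitored with `bjobs`.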

Software on the cluster is installed upon request and managed via TCL environment modules, conda environments, and containers. Examples include multiple versions of R/RStudio, Python, Nextflow, Picard, and Samtools. There is also a web interface, "HPC OnDemand," from which most tools and desktops can be launched easily in any web browser. The AI/ML cluster uses Run:AI on top of Kubernetes and offers a rich web interface along with CLI- and API-based access.
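A typical session with the module system might look like the following sketch; the version string shown is an assumption, and `module avail` will list what is actually installed.

```shell
# List all software modules available on the cluster
module avail

# Load a specific tool (the version shown here is illustrative)
module load R/4.3.0

# Show which modules are currently loaded in this session
module list

# Unload everything when done
module purge
```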

The file system is NFS (Network File System); each user is allocated 100GB for their home directory and 5TB for scratch. Additional storage can be requested as data shares, which can be shared with multiple users and with other institutions using Globus or Active MFT.
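To keep an eye on the home-directory quota, standard Linux tools are usually enough; this is a minimal sketch assuming ordinary `df`/`du`, and any site-specific quota command would be more precise.

```shell
# Show free space on the filesystem backing your home directory
df -h "$HOME"

# Summarize your total home-directory usage (compare against the 100GB quota)
du -sh "$HOME"
```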

The clusters are open to all CCHMC employees and collaborators with a valid use case.

If you have a question that is unanswered after reviewing this information, please email our support system at help-cluster@bmi.cchmc.org.