How to set up High Performance Computing

AHPCC from University of Arkansas

Tutorial
AHPCC
HPC
Author

Jihong Zhang

Published

January 14, 2024

1 General Information

Arkansas High Performance Computing Center (AHPCC, official website) is available for research and instructional use to faculty and students of any Arkansas university and their research collaborators. There is no charge for use of its computing resources.

To use the HPC, an AHPCC account must be requested through the Internal Account Request Form. Please see here for more information about the AHPCC inventory.

2 Connect to HPC

  • You can use either the online dashboard or a terminal to access AHPCC nodes.

2.1 For terminal

As long as you have an AHPCC account, you can connect to the HPC through SSH. Windows users can use PuTTY or PowerShell; Mac and Linux users can use the terminal. The command is:

ssh [username]@hpc-portal2.hpc.uark.edu

Replace [username] with your AHPCC account username. You will be prompted for your password. After you enter it, you will be connected to the HPC and your terminal/PowerShell will look like this.

Login Screenshot

Note: Pinnacle is a resource added at the University of Arkansas in 2019. It consists of 100 Intel-based nodes, with 20 NVIDIA V100 GPU nodes enabling data science and machine learning and 8 big-memory nodes with 768 GB of RAM each for projects requiring a large memory footprint.

2.2 SSH login without password

  1. Generate a pair of authentication keys on your local machine with the following command, and do not enter a passphrase:
ssh-keygen -t rsa   

Note that the passphrase should be left empty:

Generating public/private rsa key pair.
Enter file in which to save the key (/Users/[username]/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /Users/[username]/.ssh/id_rsa
Your public key has been saved in /Users/[username]/.ssh/id_rsa.pub
  2. On your local machine, type the following command to copy your local public key to the HPC server.
scp ~/.ssh/id_rsa.pub [loginname]@hpc-portal2.hpc.uark.edu:/home/[loginname]/.ssh/authorized_keys
  3. Now you should be able to log in to the HPC login node without a password:
ssh [loginname]@hpc-portal2.hpc.uark.edu
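
Optionally, you can shorten the login command with an SSH client configuration file. Below is a minimal sketch of a ~/.ssh/config entry; the host alias ahpcc is an arbitrary name chosen for this example, not anything required by AHPCC:

# ~/.ssh/config -- "ahpcc" is a local alias of your choosing
Host ahpcc
    HostName hpc-portal2.hpc.uark.edu
    User [loginname]
    IdentityFile ~/.ssh/id_rsa

With this entry in place, ssh ahpcc is equivalent to the full command above.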

3 Upload or Download Data

There are two ways to upload or download data: (1) the dashboard and (2) the terminal.

3.1 Dashboard

On the dashboard page, find the Files tab in the navigation bar:

Files > /karpinski/[username]

You should then be able to upload or download files from your local machine to the server or from the server to your local machine.

3.2 Terminal

  • To upload data files from your local machine to the HPC, type the following command on your local machine:
scp program.c [username]@hpc-portal2.hpc.uark.edu:/home/username/

where program.c is an example file you want to upload. If your target file is located in the Downloads folder, use ~/Downloads/program.c instead.

To copy an entire folder and its subfolders using SCP, add the -r flag after scp for a recursive copy (here src is a folder):

scp -r src [username]@hpc-portal2.hpc.uark.edu:/home/username/
  • To download data files from the HPC to your local machine, type the following command on your local machine:
scp -r [username]@hpc-portal2.hpc.uark.edu:/home/username/src ./
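
scp is fine for one-off copies. For large or repeated transfers, rsync (if it is installed on both your local machine and the HPC) can show progress, resume interrupted transfers, and skip files that have not changed. A minimal sketch using the same paths as above:

# upload the src folder, preserving permissions and skipping unchanged files
rsync -avP src [username]@hpc-portal2.hpc.uark.edu:/home/username/
# download it back into the current directory
rsync -avP [username]@hpc-portal2.hpc.uark.edu:/home/username/src ./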

4 Job Submission

4.1 Workflow

There are multiple steps to submit an R file to the cluster to run.

  1. First, determine the computing nodes you want to use. Please refer to this link for detailed information about HPC equipment. A general job submission command looks like this:
sbatch -q q06h32c -l walltime=1:00 -l nodes=1:ppn=32 example.sh
Note

The sbatch command (a SLURM command) submits a job that is saved in a job file such as example.sh or example.slurm. The command above submits the job to the q06h32c queue with a wall time of 1 minute, requesting all 32 cores on 1 node.

  2. Next, create a job file with a .sh or .slurm extension that bundles the aforementioned job submission options. Here is a simple example of a job file example.sh that tells the HPC how to run your R code (each #SBATCH line is a parameter passed to the scheduler):
#!/bin/bash
#SBATCH --job-name=mpi
#SBATCH --output=zzz.slurm
#SBATCH --partition comp06
#SBATCH --nodes=2
#SBATCH --tasks-per-node=32
#SBATCH --time=6:00:00
module load gcc/11.2.1 mkl/19.0.5 R/4.2.2
# module load gcc/11.2.1 intel/21.2.0 mkl/21.3.0 R/4.3.0

Rscript HelloWorld/example.R
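
Once the job file is saved, it can be submitted and monitored from the login node with standard SLURM commands. A short sketch (the job ID 123456 is illustrative):

sbatch example.sh    # submit the job file; SLURM prints the assigned job ID
squeue -u $USER      # list your pending and running jobs
scancel 123456       # cancel a job by its ID if needed
cat zzz.slurm        # inspect the output file named by --output above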

R relevant modules

Line 8 of the job file loads all required modules:

  • gcc and mkl are required for R package installation (Note: to date, gcc/11.2.1 is the latest version of gcc that can compile cmdstanr successfully). Please see here for more details.

  • Rscript is the shell command that executes an R file on the HPC. HelloWorld/example.R is the path to your R script.

  • Anything following #SBATCH is an option for the SLURM scheduler. Please see the following summary or view it online: