We provide a simple script that can be used to submit a job to a CentOS Linux cluster with the specifications below:
- LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
- Distributor ID: CentOS
- Description: CentOS release 6.10 (Final)
- Release: 6.10
- Codename: Final
Here, we show one way to run a job on the HPC cluster by setting up a job script. This script requests cluster resources and lists, in sequence, the commands that we want to execute.
Let's call our job script "job_script". It is a plain text file that we can edit with a UNIX editor such as vi/vim, nano, or emacs. We can create an empty script with the vim editor as
- $ vim job_script
and copy and paste the following contents into it:
- #!/bin/sh
- #SBATCH --nodes=3
- #SBATCH --ntasks-per-node=8
- #SBATCH --mem=64000
- #SBATCH --time=01:00:00
- #Mail alert at start, end and failure of execution
- #SBATCH --mail-type=ALL
- #Send mail to this address
- #SBATCH --mail-user=name@tssfl.com
- #Keep every SBATCH directive above the first command; later ones are ignored
- #If your job requires python, load Python modules here
- source /home/user/setup.sh
- #Commands that you actually want to run go here.
- cd /home/tssfl
- ipython code.py
Then save and close the script (press Esc, type :wq or :x, and hit Enter).
The first four lines that start with #SBATCH specify, respectively, the number of compute nodes, the number of tasks per node, the memory per node (in megabytes by default), and the time limit (1 hour). Never remove the leading # from these lines: Slurm reads them as directives, while the shell treats them as comments. The line source /home/user/setup.sh loads modules (in this case Python modules) from another HPC cluster user; that is, there is no need to re-install resources in your home directory if they are already available from another user. We then move to the directory /home/tssfl, where we assume the code "code.py" we want to run is placed, and issue the command ipython code.py. We can also set mail alerts at the start, end, and failure of execution to monitor the progress of our computations. All these instructions are defined in the script.
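One pitfall is worth checking before submission. According to the sbatch documentation, Slurm stops processing #SBATCH directives at the first non-comment, non-blank line, so directives placed after a command (for example, mail options at the end of a script) are silently ignored. The following is a minimal sketch of such a check. The function name late_directives is hypothetical and is not part of Slurm.

```shell
# Print every #SBATCH line that appears after the first executable command.
# (late_directives is a hypothetical helper name, not a Slurm tool.)
late_directives() {
    awk '
        /^#SBATCH/ { if (seen_cmd) print FILENAME ":" NR ": " $0; next }
        /^[ \t]*#/ { next }    # skip the shebang and ordinary comments
        /^[ \t]*$/ { next }    # skip blank lines
        { seen_cmd = 1 }       # first real command reached
    ' "$1"
}

# Check the job script if it exists in the current directory.
if [ -f job_script ]; then
    late_directives job_script
fi
```

If the function prints nothing, all directives sit above the first command. Any line it prints should be moved up into the directive block at the top of the script.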
Finally, we can submit our job from the terminal as
- $ sbatch job_script
To cancel the job or to see it in the queue, you can issue the commands scancel Job_ID and squeue, respectively.
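On submission, sbatch prints a confirmation of the form "Submitted batch job <ID>", and that ID is what scancel expects. As a small sketch, the ID can be captured in a shell variable instead of being copied by hand. The helper name job_id_from_output is hypothetical, and the submission step only runs where sbatch and job_script are actually available.

```shell
# Extract the job ID from sbatch's confirmation line,
# e.g. "Submitted batch job 12345" gives 12345.
# (job_id_from_output is a hypothetical helper, not a Slurm command.)
job_id_from_output() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $4 }'
}

# Only attempt a submission where Slurm and the job script are present.
if command -v sbatch >/dev/null 2>&1 && [ -f job_script ]; then
    job_id=$(job_id_from_output "$(sbatch job_script)")
    squeue -j "$job_id"      # show only this job in the queue
    # scancel "$job_id"      # remove the leading "# " to cancel the job
fi
```

Running squeue without arguments lists all jobs on the cluster. Adding -u "$USER" restricts the list to your own jobs.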