Last updated on Mar 19, 2019.
These instructions assume Linux/Mac OS and Python 3.x. They do not cover how to run jobs on GPUs.
Contact me (hshan at g dot harvard dot edu) for suggestions, corrections and comments.
ssh [your RC username]@login.rc.fas.harvard.edu
You will be prompted to enter your RC account password and a “verification code”, the six-digit code from the Duo Mobile app used for two-step verification.
To run Python scripts, you need to create a conda environment with all the packages your script uses. To do this, use a command like this
conda create -n [env_name] python=3.6 numpy scipy
Here, I created an environment called [env_name], using Python 3.6, and requested the additional packages numpy and scipy.
You need to do this even if your script already contains the corresponding import statements; this is like installing the packages on your own computer. Note that the packages you can install this way are fairly basic ones. For less common packages (e.g. pytorch), don’t install them at this step.
Once you finish this step, type
source activate [env_name]
Now you can install additional packages as you normally would on your own computer. For example, you can use
conda install pytorch torchvision -c pytorch
to install pytorch.
With all the packages your script needs installed, the environment is now ready.
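As a quick sanity check, you can try importing the packages from inside the activated environment; the package list below just mirrors the examples above (note that pytorch is imported as torch), so adjust it to what you actually installed.
python -c "import numpy, scipy, torch; print(numpy.__version__, torch.__version__)"
If this prints version numbers without an ImportError, the environment is set up correctly.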
The way to run performance-intensive code on Odyssey is to submit it with a batch file. On your computer, create a plain text file with this content:
#!/bin/bash
#SBATCH -p general
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --mem 64000
#SBATCH -t 0-6:00
module load python
source activate [env_name]
python C3v_cluster.py
Let’s go through the lines one by one.
-p is the partition you submit the job to.
-N is the number of nodes you request. In general it should be 1.
-n is the number of CPU cores you request.
--mem is the amount of memory you request, in MB.
-t is the amount of runtime you request.
module load python and source activate [env_name] load Python and activate the environment you created, and the last line runs your script.
Upload the text file (see above) and the Python script to your directory on the cluster. Then simply run
sbatch [text_file_name]
and your job should be submitted.
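If you want to check on the job after submitting, you can use the standard Slurm commands; the username and job ID below are placeholders.
squeue -u [your RC username]
scancel [job ID]
squeue lists your pending and running jobs (including the job number), and scancel removes a job you no longer want.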
Outputs of your job (e.g. print() outputs in Python) are written to a file named slurm-1234567.out, where the number is the number of your job. A note on module load python: you can also load a specific version, e.g. module load python/3.6.3-fasrc02.
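For example, to see which Python modules are available on the cluster, and to read the output file once the job has run, you can use commands like these (the job number is just the example above)
module avail python
cat slurm-1234567.out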
Running the same script many times with different parameter values is a commonly encountered scenario when running simulations. On Odyssey this involves three pieces: a loop in the terminal that submits one job per parameter value, a batch file that passes the parameters to your script, and a Python script that reads the parameters from the command line.
Let’s set these up one by one, in reverse order.
At the beginning of your Python script, do this
import sys
args = sys.argv # pull the command-line arguments
What does it do? When you run a Python script in the terminal, typically you would use something like
python script.py
It turns out that you can add arguments to it. For example, you can do
python script.py 1 2 3
If script.py includes the Python code we mentioned above, then args would be a list with entries
['script.py', '1', '2', '3']
This way, you can pull parameters from the command line by converting entries of this list to integers, floats, etc.
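For example, a script expecting an integer and a float as its two parameters could start like this; the parameter names n_steps and temperature are just illustrative.
import sys

args = sys.argv                  # e.g. ['script.py', '100', '2.5'] for: python script.py 100 2.5
n_steps = int(args[1])           # first argument, converted to an integer
temperature = float(args[2])     # second argument, converted to a float
print(n_steps, temperature)      # echo the values; use them in your simulation instead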
You need to create a plain text file; here we call it [sbatch_file_name]. Inside, type
#!/bin/bash
# [sbatch_file_name]
#
#SBATCH -p general
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --mem 64000
#SBATCH -t 0-24:00
# load Python and activate your environment, as before
module load python
source activate [env_name]
python script.py ${PARAM1} ${PARAM2}
# Batch control file
This is the batch file that gets submitted (with the sbatch command, as before); each submitted job runs script.py with arguments ${PARAM1} and ${PARAM2}. You could add more parameters. Values for these parameters will be assigned by the next part.
When you have uploaded your Python script and the special batch file, run this in the terminal
for PARAM1 in $(seq 1 5); do
    #
    echo "${PARAM1}"
    export PARAM1
    #
    sbatch -o job_no_${PARAM1}.stdout.txt \
        --job-name=something_${PARAM1} \
        [sbatch_file_name]
    #
    sleep 1
done
With this code, we create a for loop where the value of PARAM1 goes from 1 to 5. For each value, we submit the special batch file [sbatch_file_name], name the job something_${PARAM1}, and ask Slurm to name the output file job_no_${PARAM1}.stdout.txt.
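The batch file above also references ${PARAM2}, which this loop does not set. A minimal sketch of a nested loop that exports both parameters (the ranges are just examples) would be
for PARAM1 in $(seq 1 5); do
    for PARAM2 in $(seq 1 3); do
        export PARAM1 PARAM2
        sbatch -o job_no_${PARAM1}_${PARAM2}.stdout.txt \
            --job-name=something_${PARAM1}_${PARAM2} \
            [sbatch_file_name]
        sleep 1
    done
done
This submits one job per combination of PARAM1 and PARAM2, and each job runs script.py with the two values as its command-line arguments.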