This is a simple example of running a basic Python program under HTCondor. It uses a single CPU and can serve as a template for Python programs that may require some specialized packages but do not need a GPU. For GPU examples, please see the tensorflow and PyTorch directories.
For most Python users we recommend installing Conda and using that to manage your environment. To install Conda:
wget https://github.com/conda-forge/miniforge/releases/download/24.7.1-0/Miniforge-pypy3-24.7.1-0-Linux-x86_64.sh
bash Miniforge-pypy3-24.7.1-0-Linux-x86_64.sh -b -p $HOME/miniconda3
eval "$(${HOME}/miniconda3/bin/conda shell.bash hook)"
conda init
To make Conda available automatically when you log into the cluster,
you will also need to add the following to your ~/.bash_profile:
if [ -e ${HOME}/.bashrc ]
then
    source ${HOME}/.bashrc
fi
Here is some information on the difference between bashrc and bash_profile.
After making these changes, log out and log back in.
You can now use the conda command to install any additional packages you need.
For example, to install numpy:
conda install numpy
It's worth reading through the Conda user guide. Some useful commands are:
conda list: lists all installed packages
conda search: finds available packages that match the provided name; for example, conda search torch will find all available versions of torch, pytorch, etc.
conda update: updates packages
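After installing packages, it can be useful to confirm that the Python you are running is the one from your Conda install and that a package is visible to it. The following is a minimal sketch of such a check; the helper name package_available is hypothetical and not part of this directory, and numpy is used only because it is the package installed in the example above.

```python
import importlib.util
import sys


def package_available(name):
    """Return True if this interpreter can find the named top-level package."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # With Conda active, this path should point inside your Conda install
    # (for example, somewhere under $HOME/miniconda3).
    print("Python executable:", sys.executable)
    for pkg in ("numpy",):
        status = "installed" if package_available(pkg) else "missing"
        print(pkg, status)
```

If the executable path does not point into your Conda install, the shell hook from conda init has probably not been loaded in the current session.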
This directory contains a sample program python_demo.py which simply adds the
numbers from 1 to 100 and prints the result. To submit this to the cluster the command is
condor_submit python_demo.sub
After submitting, you can check on the progress with
condor_q netid
or monitor it with
watch -n 5 condor_q netid
In both cases, replace netid with your SU Net ID.
When it completes you can check the output with
cat output/python_demo.out
Note that python_demo.sub does not call python_demo.py directly. This is because the job needs to be
set up so that it runs inside the Conda environment, which is not enabled by default. The submit
file therefore calls a wrapper script, conda_wrapper.sh, which sets up the environment and then runs
the Python code. For most simple Python applications you should be able to modify conda_wrapper.sh
without modifying the submit file.
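For reference, a program of the kind described above, one that adds the numbers from 1 to 100 and prints the result, could be as small as the sketch below. This is an illustration only and is not necessarily identical to the python_demo.py shipped in this directory; the function name sum_range is hypothetical.

```python
def sum_range(first, last):
    """Add the integers from first to last, inclusive."""
    total = 0
    for n in range(first, last + 1):
        total += n
    return total


if __name__ == "__main__":
    # Text printed to standard output is what you would expect to find in the
    # job's output file, assuming the submit file captures stdout there.
    print("The sum of 1 to 100 is", sum_range(1, 100))
```

A script like this needs no packages beyond the standard library, so it is a convenient first test that the wrapper and the Conda environment are working before you move on to code with heavier dependencies.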
There are also documents on how to parallelize code to make optimal use of the cluster and how to use specialized file formats to optimize data storage and access.
Please email any questions or comments about this document to Research Computing at researchcomputing@syr.edu.