Python

This is a simple example of running a basic Python program under HTCondor. This example uses a single CPU and can serve as a template for Python programs that may require some specialized packages but do not need a GPU. For GPU examples, please see the tensorflow and PyTorch directories.

Installing Conda

For most Python users we recommend installing Conda and using that to manage your environment. To install Conda:

wget https://github.com/conda-forge/miniforge/releases/download/24.7.1-0/Miniforge-pypy3-24.7.1-0-Linux-x86_64.sh

bash Miniforge-pypy3-24.7.1-0-Linux-x86_64.sh  -b -p $HOME/miniconda3
eval "$(${HOME}/miniconda3/bin/conda shell.bash hook)"
conda init

To make Conda available automatically when you log into the cluster, you will also need to add the following to your ~/.bash_profile:

if [ -e ${HOME}/.bashrc ]
then
    source ${HOME}/.bashrc
fi

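Here is some information on the difference between .bashrc and .bash_profile. In short, Bash reads ~/.bash_profile for login shells and ~/.bashrc for interactive non-login shells, so sourcing ~/.bashrc from ~/.bash_profile makes the setup written by conda init apply when you log in.

After making these changes, log out and log back in.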

Install additional packages

You can now use the conda command to install any additional packages you need. For example, to install numpy:

conda install numpy

It's worth reading through the Conda user guide. Some useful commands are:

  • conda list lists all installed packages
  • conda search finds available packages that match the provided name; for example, conda search torch will find all available versions of torch, pytorch, etc.
  • conda update updates packages
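After installing a package, it can be useful to confirm that the Python interpreter in your environment can actually see it. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of this repository.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter.

    Hypothetical helper for illustration. Pass top-level names only
    (for example "numpy", not "numpy.linalg").
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "numpy" is the package installed in the example above;
    # "json" ships with Python and should always be available.
    for name in ("numpy", "json"):
        status = "available" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If numpy is reported as missing, the interpreter being run is probably not the one from the Conda installation.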

Running the sample program

This directory contains a sample program, python_demo.py, which simply adds the numbers from 1 to 100 and prints the result. To submit it to the cluster, run

condor_submit python_demo.sub

After submitting you can check on the progress with

condor_q netid

or monitor it with

watch -n 5 condor_q netid

In both cases replace netid with your SU Net ID.

When it completes you can check the output with

cat output/python_demo.out
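For orientation, a program of the kind described above (adding the numbers from 1 to 100 and printing the result) could look like the sketch below. This is an illustration only and is not necessarily identical to the python_demo.py shipped in this directory; the function name sum_range is an assumption.

```python
def sum_range(first: int, last: int) -> int:
    """Return the sum of the integers from first to last, inclusive."""
    total = 0
    for n in range(first, last + 1):
        total += n
    return total


if __name__ == "__main__":
    # Under HTCondor, anything printed to standard output ends up in the
    # file named by the submit file's output setting.
    print(sum_range(1, 100))  # prints 5050
```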

The wrapper script

Note that python_demo.sub does not call python_demo.py directly. This is because the job needs to be set up so that it runs inside the Conda environment, which is not enabled by default. The submit file therefore calls a wrapper script, conda_wrapper.sh, which sets up the environment and then runs the Python code. For most simple Python applications you should be able to modify conda_wrapper.sh without modifying the submit file.
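When adapting the wrapper script, one way to check that a job really runs inside the Conda environment is to have the Python program report which interpreter it is using. The sketch below is a hypothetical diagnostic, not part of this repository; it relies on the CONDA_DEFAULT_ENV variable that Conda sets when an environment is activated.

```python
import os
import sys


def describe_interpreter() -> dict:
    """Collect basic facts about the interpreter running this script.

    Hypothetical diagnostic helper for illustration.
    """
    return {
        # Full path of the python binary that is executing this code.
        "executable": sys.executable,
        # Installation prefix; for a Conda install under $HOME/miniconda3
        # this would be expected to point inside that directory.
        "prefix": sys.prefix,
        # Set by Conda on activation; absent if no environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(not set)"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the reported executable is the system Python rather than the one under your Conda installation, the wrapper script has likely not activated the environment before starting the program.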

What to read next

There are also documents on how to parallelize code to make optimal use of the cluster and how to use specialized file formats to optimize data storage and access.


Please email any questions or comments about this document to Research Computing at researchcomputing@syr.edu.