Software Setup

Miniconda Installation

Miniconda is a minimal installer for conda that includes Python and a small set of essential packages. Because it installs far less than the full Anaconda distribution, it is the recommended way to manage Python environments on Beehive.

Project-Based Installation (Recommended)

For better reproducibility and isolation, we recommend creating a separate Miniconda installation for each project in your scratch space:

Create a project directory in scratch:

export PROJECT_DIR=/scratch/$USER/my_project
mkdir -p $PROJECT_DIR && cd $PROJECT_DIR

Download and install Miniconda for this project:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $PROJECT_DIR/miniconda
rm Miniconda3-latest-Linux-x86_64.sh

Create a project environment file:

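# \$ defers the command substitution until project.env is sourced;
# $PROJECT_DIR is expanded now, so the file records the absolute path.
cat > project.env << EOF
eval "\$($PROJECT_DIR/miniconda/bin/conda 'shell.bash' 'hook')"
EOF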

Using Your Project Environment

Source the environment file to make conda available in your current shell:

source project.env

Create or update your conda environment:

# Create a new environment from an environment.yml file
conda env create -f environment.yml

# Or update an existing environment
conda env update -f environment.yml --prune

The --prune option removes dependencies that are no longer specified in the environment.yml file.

This approach has several advantages:

- Each project has its own isolated Python environment
- Dependencies are explicitly defined in environment.yml
- You can easily share your environment configuration with collaborators
- No conflicts between different projects' dependencies
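The commands above assume an environment.yml file already exists in the project directory. As a minimal sketch, the following writes one; the environment name, channel, and package list are placeholders chosen for illustration, not Beehive requirements. Note that it overwrites any environment.yml in the current directory.

```shell
# Write a minimal environment.yml (hypothetical name, channel, and packages).
cat > environment.yml << 'EOF'
name: my_project
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pip
EOF

# Show what was written before passing it to "conda env create".
cat environment.yml
```

After conda env create -f environment.yml finishes, conda activate my_project would select the environment, where my_project is whatever name the file declares.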

PyTorch Installation

PyTorch is a popular deep learning framework with excellent GPU support. This guide will help you install PyTorch on the Beehive cluster.

Installation Steps

The following steps will install PyTorch in a Python virtual environment. We recommend using /scratch for the installation to avoid consuming your home directory quota.

Get an interactive job on a GPU node:

srun -p gpu -G1 --pty bash

Create a Python virtual environment:

python3 -m venv /scratch/$USER/venv

Activate the virtual environment:

. /scratch/$USER/venv/bin/activate

Install PyTorch:

pip3 install torch torchvision torchaudio

Test your installation:

python -c "import torch; print(torch.cuda.is_available())"

This should print True if PyTorch can access the GPU.

Using PyTorch in Future Sessions

In future sessions, you'll just need to activate your virtual environment:

. /scratch/$USER/venv/bin/activate

Running PyTorch Jobs

When submitting batch jobs that use PyTorch, make sure to include the activation command in your job script:

#!/bin/bash
#SBATCH --job-name=pytorch_job
#SBATCH --output=pytorch_%j.log
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4

# Activate the virtual environment with PyTorch
. /scratch/$USER/venv/bin/activate

# Run your PyTorch script
python train_model.py
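If the virtual environment is missing, for example because scratch was cleaned up, the activation line fails but the script may still go on to run python from the system installation. One possible guard is sketched below; activate_venv is a hypothetical helper name, not something provided on the cluster.

```shell
#!/bin/bash
# Hypothetical helper: activate a virtual environment, or fail with a clear message.
activate_venv() {
    local venv_dir="$1"
    if [ ! -f "$venv_dir/bin/activate" ]; then
        echo "error: no virtual environment at $venv_dir" >&2
        return 1
    fi
    . "$venv_dir/bin/activate"
}

# In a job script, stop early instead of running with the wrong Python:
#   activate_venv "/scratch/$USER/venv" || exit 1
```

Placing the call before python train_model.py makes a missing environment show up as a one-line error in the job log.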

Jupyter Notebook Usage

Jupyter Notebooks provide an interactive environment for code development, data exploration, and visualization. Here's how to run Jupyter on the Beehive cluster.

Setup and Installation

First, you'll need to install Jupyter Notebook. Follow these steps:

Get an interactive job on a node:

srun -p dev-cpu --pty bash

Create and set up a Python virtual environment:

python3 -m venv ~/myenv          # Create the virtualenv
source ~/myenv/bin/activate      # Activate the env
pip install --upgrade pip        # Update pip
pip install notebook             # Install jupyter-notebook
pip cache purge                  # Empty cache to save home directory space

Running Jupyter Notebook

You can run Jupyter Notebook as either an interactive or batch job.

Interactive Method

Start an interactive job:

srun --pty bash                  # Run an interactive job

Set up your environment:

source ~/myenv/bin/activate      # Activate virtual env
export NODEIP=$(hostname -i)     # Get the IP address of your node
export NODEPORT=$(( $RANDOM + 1024 ))  # Pick a random port (1024-33791)
echo $NODEIP:$NODEPORT           # Note these values for the SSH tunnel
jupyter-notebook --ip=$NODEIP --port=$NODEPORT --no-browser

In a new terminal on your local machine, create an SSH tunnel:

ssh -N -L 8888:$NODEIP:$NODEPORT username@beehive.ttic.edu

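(Replace $NODEIP and $NODEPORT with the values printed on the compute node, and username with your cluster username. These variables are not set on your local machine.)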

Open your local browser and visit: http://localhost:8888

The login token is displayed in the output of the jupyter-notebook command.
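Substituting the values by hand is easy to get wrong. As a rough sketch, a small function can print the complete tunnel command on the compute node so it can be copied to the local terminal. tunnel_cmd is a hypothetical helper, and the address, port, and username in the example call are placeholders.

```shell
# Hypothetical helper: print the SSH tunnel command for a given node address.
# Arguments: node IP, node port, cluster username, optional local port (default 8888).
tunnel_cmd() {
    local node_ip="$1" node_port="$2" user="$3" local_port="${4:-8888}"
    printf 'ssh -N -L %s:%s:%s %s@beehive.ttic.edu\n' \
        "$local_port" "$node_ip" "$node_port" "$user"
}

# Example with placeholder values (not a real node address):
tunnel_cmd 10.0.0.5 12345 alice
```

On the compute node, the call would be tunnel_cmd "$NODEIP" "$NODEPORT" "$USER". If port 8888 is already in use on the local machine, pass a different local port as the fourth argument and visit that port in the browser instead.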

Batch Method

Create a batch job script named jupyter-notebook.sbatch:

#!/bin/bash
#SBATCH --job-name=jupyter
#SBATCH --output=jupyter_%j.log
#SBATCH --partition=cpu
#SBATCH --cpus-per-task=2

NODEIP=$(hostname -i)
NODEPORT=$(( $RANDOM + 1024))
echo "ssh command: ssh -N -L 8888:$NODEIP:$NODEPORT $(whoami)@beehive.ttic.edu"

source ~/myenv/bin/activate
jupyter-notebook --ip=$NODEIP --port=$NODEPORT --no-browser

Submit the batch job:

sbatch jupyter-notebook.sbatch

Check the job output file to find the SSH command to use when accessing your notebook.
Create an SSH tunnel as instructed in the output:

ssh -N -L 8888:###.###.###.###:#### username@beehive.ttic.edu

Open your local browser and visit: http://localhost:8888
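The job script writes the tunnel command to the log with the prefix "ssh command: ". A minimal sketch for pulling that line out of the log follows; ssh_cmd_from_log is a hypothetical helper, and the log file name depends on the job ID that sbatch reports.

```shell
# Hypothetical helper: print the SSH command recorded in a Jupyter job log.
# Argument: path to the log file written by the batch job.
ssh_cmd_from_log() {
    # Keep only lines starting with the prefix, strip it, and take the first match.
    sed -n 's/^ssh command: //p' "$1" | head -n 1
}

# Usage (replace <jobid> with the ID printed by sbatch):
#   ssh_cmd_from_log jupyter_<jobid>.log
```

The printed command is meant to be run on the local machine, not on the cluster.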

Troubleshooting

If you're having problems with the token or password:

Stop the notebook server
Remove the runtime files:

rm -rf ~/.local/share/jupyter/runtime

Restart the server