Interactive GPU nodes
Aim: Describe how to access and use the interactive GPU nodes on the Stoomboot cluster.
Target audience: Users of the Stoomboot cluster and its GPUs.
There are different generations and brands of GPUs in our data center. Depending on the software, you may see different performance figures on these nodes, and you may need to tweak and/or recompile your sources for different types.
To make this easier, an interactive GPU node is available for each GPU type. These nodes can be used for compilation and tests, as well as for computing. To scale up your computing, use the batch system GPU nodes instead.
Please keep GPU consumption and testing time to a minimum, and run your real jobs on the batch system.
These machines are intended for the following purposes:
- Running interactive jobs, such as analysis work (making plots, etc.).
- Testing GPU jobs with a short runtime.
- Interaction with the batch system (see below for the relevant commands).
Access and Use
Four interactive GPU nodes are available via ssh for interactive use and GPU testing:
|Node name||Node manufacturer||Node type name||GPU manufacturer||GPU type||Number of GPUs|
| ||Fujitsu||CELSIUS C740||NVIDIA||GeForce GTX 1080||1|
| ||Fujitsu||CELSIUS C740||NVIDIA||Quadro GV100||1|
| ||Lenovo||ThinkSystem SR655||AMD||Radeon Instinct MI50||2|
| ||Lenovo||ThinkSystem SR655||NVIDIA||Tesla V100||2|
If you access the nodes from home via ssh, use eduVPN and/or log in through
The drivers for the GPUs and the following versions of the NVIDIA CUDA libraries are installed:
The relevant version of the CUDA Deep Neural Network (cuDNN) library is also installed.
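To see which GPU and driver version a node actually exposes, the `nvidia-smi` tool that ships with the NVIDIA driver can be queried from Python. The following is a minimal sketch, not an official site tool: the function names `list_nvidia_gpus` and `parse_gpu_lines` are made up for this example, and it assumes `nvidia-smi` is on the `PATH` of the NVIDIA nodes (it is not expected to exist on the AMD node, in which case the sketch returns an empty list).

```python
import shutil
import subprocess


def parse_gpu_lines(text):
    """Turn 'name, driver_version' CSV lines into (name, driver_version) tuples."""
    gpus = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The driver version is the last comma-separated field.
        name, _, driver = line.rpartition(",")
        gpus.append((name.strip(), driver.strip()))
    return gpus


def list_nvidia_gpus():
    """Return one (name, driver_version) tuple per NVIDIA GPU, or [] if none can be queried."""
    # Assumption: nvidia-smi is installed together with the NVIDIA driver.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    for name, driver in list_nvidia_gpus():
        print(f"{name}: driver {driver}")
```

Running the script on a node without NVIDIA hardware prints nothing, which makes it safe to use as a quick first check after logging in.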
Python + GPU
To get access to Python software in an environment that supports using the GPUs, it is recommended to use conda to create a virtual environment, activate it and install the software you need.
Conda virtual environment
Create and activate a new virtual environment using:
> conda create --prefix /data/your_project/your_username/gpu_venv python=3.9
> conda activate /data/your_project/your_username/gpu_venv
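A quick way to confirm that the environment is really active is to ask the Python interpreter where it lives. The sketch below is illustrative only: `running_in_env` is a hypothetical helper name, and the path is the placeholder from the commands above, to be replaced with your own. It relies on `conda activate` setting the `CONDA_PREFIX` environment variable.

```python
import os
import sys


def running_in_env(expected_prefix):
    """Return True if this interpreter lives in the environment at expected_prefix."""
    actual = os.path.realpath(sys.prefix)
    expected = os.path.realpath(os.path.expanduser(expected_prefix))
    return actual == expected


if __name__ == "__main__":
    # Placeholder path taken from the conda commands; replace it with your own.
    env = "/data/your_project/your_username/gpu_venv"
    print("interpreter prefix:", sys.prefix)
    print("CONDA_PREFIX:", os.environ.get("CONDA_PREFIX"))
    print("running in expected environment:", running_in_env(env))
```

If the last line reports `False`, the `python` on your `PATH` is probably not the one from the new environment, and packages installed with conda will not be visible to it.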
Installing Python packages inside the virtualenv
After activating the virtual environment, use conda to install additional software in it, e.g.:
> conda install tensorflow=2.11.0
> conda install pytorch=1.13.1
To select a specific build, for example a CUDA-enabled one, list the available builds with conda search and give the build string explicitly:
> conda search tensorflow
tensorflow 2.11.0 cpu_py310hd1aba9c_0 conda-forge
tensorflow 2.11.0 cpu_py38h66f0ec1_0 conda-forge
tensorflow 2.11.0 cpu_py39h4655687_0 conda-forge
tensorflow 2.11.0 cuda112py310he87a039_0 conda-forge
tensorflow 2.11.0 cuda112py38hded6998_0 conda-forge
tensorflow 2.11.0 cuda112py39h01bd6f0_0 conda-forge
> conda install tensorflow=2.11.0=cuda112py39h01bd6f0_0
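Picking the right build string by eye becomes tedious when the list is long. The following sketch filters `conda search` output for CUDA builds that match a given Python version. It is an illustration rather than a conda feature: `cuda_builds` is a hypothetical helper, and it assumes the four-column layout shown above (name, version, build, channel) and the usual convention that the Python tag in a build string is followed by a hash starting with `h`.

```python
def cuda_builds(search_output, python_tag):
    """Return (version, build) pairs for CUDA builds matching python_tag, e.g. "py39"."""
    matches = []
    for line in search_output.splitlines():
        fields = line.split()
        # Expect exactly: name, version, build string, channel.
        # Header or status lines with a different shape are skipped.
        if len(fields) != 4:
            continue
        _name, version, build, _channel = fields
        # Require the tag to be followed by the build hash ("h...") so that
        # "py31" does not accidentally match "py310".
        if build.startswith("cuda") and f"{python_tag}h" in build:
            matches.append((version, build))
    return matches


# Sample lines copied from the conda search output shown above.
SAMPLE = """\
tensorflow 2.11.0 cpu_py39h4655687_0 conda-forge
tensorflow 2.11.0 cuda112py310he87a039_0 conda-forge
tensorflow 2.11.0 cuda112py38hded6998_0 conda-forge
tensorflow 2.11.0 cuda112py39h01bd6f0_0 conda-forge
"""

if __name__ == "__main__":
    for version, build in cuda_builds(SAMPLE, "py39"):
        # Prints: tensorflow=2.11.0=cuda112py39h01bd6f0_0
        print(f"tensorflow={version}={build}")
```

The printed string can be passed directly to `conda install`. The Python tag should match the version the environment was created with (`python=3.9` corresponds to `py39`).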
Using the software
Once things are installed, they can be used directly:
> python
Python 3.9.16 | packaged by conda-forge | (main, Feb 1 2023, 21:39:03)
[GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2023-02-23 14:47:18.909239: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
Email firstname.lastname@example.org for questions about the GPUs.