Storage Overview
Aim: Help you choose the right storage for your data and use it correctly.
Target audience: All users of Nikhef computing infrastructure.
Quick decision guide
(Click a storage location in the flowchart for further information about that storage type.)
flowchart TB
START([Where does my data go?]):::start
Q1{"Temporary or intermediate data?"}:::question
Q2{"Personal or private files?"}:::question
Q3{"Precious or unique? Needs backup?"}:::question
Q4{"Large data files, over 1 GB?"}:::question
TMPDIR["$TMPDIR — local to node, ephemeral, fast"]:::tmpdir
HOME["$HOME — backed up, ~2 GB quota, private"]:::home
CONDA(["ℹ Conda: set pkgs_dirs to /data"]):::conda
PROJECT["/project — backed up, group quota, NFS"]:::project
DCACHE["/dcache — no backup, write-once, petabytes"]:::dcache
DATA["/data — no backup, POSIX, 1–100 GB"]:::data
START --> Q1
Q1 -->|Yes| TMPDIR
Q1 -->|No| Q2
Q2 -->|Yes| HOME
HOME -.-> CONDA
Q2 -->|No| Q3
Q3 -->|Yes| PROJECT
Q3 -->|No| Q4
Q4 -->|Yes| DCACHE
Q4 -->|No| DATA
click TMPDIR href "https://kb.nikhef.nl/ct/Node_scratch_space.html" _blank
click HOME href "https://kb.nikhef.nl/ct/Directory_home.html" _blank
click CONDA href "https://kb.nikhef.nl/ct/Conda_environments.html#creating-a-virtual-environment-with-conda" _blank
click PROJECT href "https://kb.nikhef.nl/ct/Directory_project.html" _blank
click DCACHE href "https://kb.nikhef.nl/ct/Directory_dcache_stoomboot.html" _blank
click DATA href "https://kb.nikhef.nl/ct/Directory_data.html" _blank
classDef start fill:#d3d1c7,stroke:#5f5e5a,color:#2c2c2a
classDef question fill:#f5f5f4,stroke:#b4b2a9,color:#2c2c2a
classDef tmpdir fill:#faeeda,stroke:#ba7517,color:#412402
classDef home fill:#e1f5ee,stroke:#0f6e56,color:#04342c
classDef project fill:#eeedfe,stroke:#534ab7,color:#26215c
classDef dcache fill:#e6f1fb,stroke:#185fa5,color:#042c53
classDef data fill:#faece7,stroke:#993c1d,color:#4a1b0c
classDef conda fill:#faece7,stroke:#993c1d,color:#712b13
Conda environments and $HOME
By default, Conda installs packages and environments into $HOME, which has a small quota (~2 GB) and will fill up quickly. Create your virtual environment in /data instead.
To redirect package downloads away from $HOME, set pkgs_dirs in ~/.condarc to a directory under /data. In the paths below, replace your_project and your_username with your actual group directory and username.
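A minimal sketch of a matching ~/.condarc fragment, printed by a small shell helper so it can be inspected before use. The helper name condarc_fragment and the conda_pkgs subdirectory are illustrative assumptions; only the pkgs_dirs key and the /data/your_project/your_username layout come from this page.

```shell
# Print a .condarc fragment that points Conda's package cache at /data.
# Assumption: "conda_pkgs" is a hypothetical subdirectory name; choose your own.
condarc_fragment() {
    local project="$1"   # your group directory under /data
    local user="$2"      # your username
    printf 'pkgs_dirs:\n  - /data/%s/%s/conda_pkgs\n' "$project" "$user"
}

# Show the fragment for the placeholder names used on this page.
condarc_fragment your_project your_username
```

If ~/.condarc does not already define pkgs_dirs, appending this output to the file is one way to apply it. Running `conda config --show pkgs_dirs` afterwards reports the value Conda actually uses.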
Then create your environment directly in /data:
conda create --prefix /data/your_project/your_username/my_venv python=3.11
conda activate /data/your_project/your_username/my_venv
See the full Conda environments documentation for further guidance on installing packages and adding custom kernels to JupyterLab. For help setting up environments, contact stbc-admin@nikhef.nl.
Using storage with Stoomboot jobs
Data products for Stoomboot batch jobs should be written to /dcache (not $HOME, /project, or /data).
Use HTCondor file transfer in your job description file to help write job output to storage. See the HTCondor documentation.
Quick reference
| Location | Use for | Backed up | Modifiable | Quota | Notes |
|---|---|---|---|---|---|
| $TMPDIR | Temporary / intermediate job data | No | Yes | Node-local | Auto-cleaned when job ends |
| $HOME | Personal files, config, dot files | Yes | Yes | ~2 GB | Shared between Linux and Windows |
| /project | Code, thesis, conditions, unique plots | Yes | Yes | Group quota | Expensive; keep it tidy |
| /data | Software environments, containers, log files, modifiable results | No | Yes | Several TB/user | POSIX compliant; NFS mounted |
| /dcache | Large data files > 1 GB: ntuples, ROOT files, MC samples | No | No (write-once) | Petabytes (group) | Also accessible via xrootd / WebDAV |
Warning
/data, /dcache, and $TMPDIR are not backed up. If data is lost, it cannot be recovered. Store anything precious or irreplaceable in /project or $HOME.
Storage locations
$TMPDIR — Node scratch space
Use for: Temporary and intermediate results during a running job — for example, intermediate MC output that will be merged into larger files before being written to /dcache or /data.
What goes here:
- Intermediate files and logs produced during a job
- MC output files before merging
- Any data that is only needed within the lifetime of a single job
What does NOT go here:
- Final analysis results or ntuples (put those in /dcache or /data)
- Private data (put that in $HOME)
- Code and scripts (put those in /project or $HOME)
Warning
$TMPDIR is automatically cleaned up when the job ends. Do not store anything here that you need after job completion.
Always use $TMPDIR (not /tmp/) to find your scratch directory, as the actual path may vary by node. For scripts that also run on your laptop, use ${TMPDIR:-/tmp} for portability.
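The portability advice above can be sketched as follows. The directory prefix myjob and the file name intermediate.txt are illustrative assumptions; the cleanup trap matters mainly on a laptop, where nothing removes /tmp contents when the script ends.

```shell
# Use $TMPDIR on a worker node and fall back to /tmp elsewhere.
scratch_root="${TMPDIR:-/tmp}"

# Create a private working directory inside the scratch area.
workdir="$(mktemp -d "${scratch_root}/myjob.XXXXXX")"

# Remove the working directory when the script exits.
trap 'rm -rf "$workdir"' EXIT

# Intermediate files go into the working directory, never into /tmp directly.
echo "intermediate result" > "$workdir/intermediate.txt"
```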
$HOME — Home directory
Use for: Personal files, configuration settings, and data used only by yourself.
What goes here:
- "Dot" files and shell configuration (.bashrc, .profile, etc.)
- Personal analysis results and draft versions of documents
- Simple personal scripts
- Private files (emails, personal data)
What does NOT go here:
- Ntuples or large data files (use /dcache or /data)
- Scripts and frameworks shared with colleagues (use /project)
- Conda environments or package caches (use /data; see Conda environments and $HOME)
- Intermediate files (use $TMPDIR)
The home directory is backed up with 3 replicas (disk, local backup, and external TSM). The quota is typically 2 GB for new users. Do not use it for high-throughput I/O.
Tip
Your home directory is not readable by other users. However, your public_html/ directory is publicly accessible from the web unless you add .htaccess controls.
/project — Project storage
Use for: Unique, precious data that must outlive your personal stay at Nikhef.
What goes here:
- Unique software and analysis frameworks used by a group
- Conditions data, calibration data, and settings files
- Final thesis chapters and supporting materials
- Precious plots, histograms, and tabular data that feed into publications
- Jupyter notebooks and scripts used to produce those plots
What does NOT go here:
- Personal private files (use $HOME)
- Analysis results that can be reproduced (use /dcache or /data)
- Large files that can be replicated to the Grid
- Intermediate results (use $TMPDIR)
Warning
Storage is limited and shared across your group. If you fill the group quota, your colleagues will be affected. Keep /project tidy and remove stale files regularly.
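To find candidates for cleanup, a sketch like the following lists files that have not been modified for a given number of days. The function name list_stale and the 365-day default are illustrative assumptions, not Nikhef policy. The function only lists files and never deletes anything.

```shell
# List regular files under a directory that were last modified more than
# a given number of days ago (default: 365).
list_stale() {
    local dir="$1"
    local days="${2:-365}"
    find "$dir" -type f -mtime +"$days" -print
}

# Example call (hypothetical group directory):
#   list_stale /project/your_group 365
```

Review the list with your colleagues before removing anything, since /project holds data shared across the group.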
Tip
Remember to also deposit notebooks, tabular data, and results in a FAIR data repository such as HEPdata, CERN Open Data, or Zenodo.
/data — Data directory
Use for: Modestly-sized working data that could in principle be reproduced or re-downloaded.
What goes here:
- Software environments (Conda environments, virtual environments)
- Apptainer / Singularity container images (.sif files)
- Private ntuples and results where files need to be rewritten or modified
- Log files you need to collect and review later
- Analysis results for ongoing work
What does NOT go here:
- Your software and scripts (use /project or $HOME)
- Intermediate log files produced during jobs (use $TMPDIR)
Tip
/data/tunnel is specifically designated for sharing data between the Nikhef local environment and the public Grid environment.
Warning
/data is not backed up. If the system fails catastrophically, data cannot be recovered. The NFS server has limited transaction throughput — heavy use will impact both desktop and Stoomboot users.
/dcache — dCache storage
Use for: Large-scale data files that are written once and read many times.
What goes here:
- ROOT files and ntuples
- MC samples and simulation output
- Large analysis results (> 1 GB)
- Any data that needs to be accessed from many Stoomboot nodes simultaneously
What does NOT go here:
- Software and scripts (use /project or $HOME)
- Files you need to actively edit (files cannot be modified once written; see below)
- Intermediate log files (use $TMPDIR)
Files in /dcache are immutable
Once written to dCache, a file cannot be overwritten or appended to. Attempting to do so fails with a "Permission denied" error. dCache is designed for files that are written once and read many times.
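Because existing files cannot be replaced, a job script can check for the destination first and stop with a clear message instead of hitting the error mid-copy. The sketch below assumes a hypothetical helper named copy_once. The check is not atomic, so it is a convenience and not a guarantee.

```shell
# Copy a file to a write-once destination, refusing to touch an existing file.
copy_once() {
    local src="$1"
    local dest="$2"
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing file: $dest" >&2
        return 1
    fi
    cp "$src" "$dest"
}

# Example call (hypothetical paths):
#   copy_once "$TMPDIR/merged.root" /dcache/your_group/merged_v2.root
```

To publish a corrected version of a file, write it under a new name (for example with a version suffix) and leave the existing file in place.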
Accessing dCache remotely: dCache can also be accessed from outside Stoomboot using the xrootd, WebDAV, and GridFTP protocols (certificate required). To link your certificate to your Nikhef account, go to sso.nikhef.nl and select "Connect your certificate". The xrootd door is at dcache.nikhef.nl.
Tip
Monitor your group's dCache usage via the dCache usage dashboard (requires Nikhef network or eduVPN).
Using storage with batch jobs
When submitting jobs to Stoomboot via HTCondor, follow these conventions:
- Read input data from /dcache or /data: both are accessible from all Stoomboot nodes via NFS.
- Write intermediate output to $TMPDIR: this is the fastest option and avoids NFS load during the job.
- Write final output to /dcache (files > 1 GB) or /data (smaller modifiable files).
- Avoid writing directly to /project or $HOME or /data from batch jobs: the NFS servers hosting these are shared with critical Nikhef desktop services and are not designed for high-throughput batch I/O.
Typical batch job storage pattern:
Input: /dcache/[group]/[your_data] ← read ntuples / MC samples
Working: $TMPDIR ← intermediate files during job
Final output: /dcache/[group]/[your_output] ← large results (> 1 GB)
/data/[group]/[username]/ ← smaller modifiable results
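The pattern above can be sketched as a job wrapper function. The name run_job is hypothetical, and sort stands in for the real analysis command. The function reads the input once, keeps intermediate files in scratch space, and writes the final result in a single copy.

```shell
# Read input, work in scratch space, write the final result once.
run_job() {
    local input="$1"    # input file, e.g. under /dcache/[group]/
    local output="$2"   # final output file, e.g. under /dcache/[group]/
    local work status=0

    # Working area: $TMPDIR on a worker node, /tmp elsewhere.
    work="$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX")" || return 1

    # Placeholder processing step: replace sort with the real analysis command.
    sort "$input" > "$work/result.tmp" && cp "$work/result.tmp" "$output" || status=1

    # Clean up scratch space whether or not the job succeeded.
    rm -rf "$work"
    return "$status"
}
```

Writing the output in one final copy keeps NFS traffic low during the job and suits write-once storage, where a partially written file cannot be corrected afterwards.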
File recovery
If you accidentally delete or overwrite a file in $HOME or /project, recovery may be possible as these filesystems are backed up. Contact the CT helpdesk at helpdesk@nikhef.nl as soon as possible.
Recovery is not available for /data, /dcache, or $TMPDIR.
Read more about file recovery.
Getting help
- General helpdesk: helpdesk@nikhef.nl
- Stoomboot storage questions: stbc-users@nikhef.nl or stbc-admin@nikhef.nl
- Mattermost: #stbc-users
- dCache usage dashboard: steker.nikhef.nl (Nikhef network / eduVPN required)