Grid jobs
Aim: Provide the basics of working with Grid jobs in Distributed Computing.
Target audience: Users of Distributed Computing.
Introduction
These days, most experiments provide users with instructions on how to submit jobs to the grid. If you have access to your experiment's facilities, please use them. If you are curious how things work or would like to submit jobs directly, read on.
To submit jobs to the grid cluster, you submit them to a Compute Element or Compute Entrypoint (also called a CE). At Nikhef, for example, you can follow the instructions below. Nikhef has three CEs for accepting jobs from the grid:
- dissel.nikhef.nl
- brug.nikhef.nl
- klomp.nikhef.nl
Usage
Getting started with X509 certificates
You will need to request a grid certificate before submitting any jobs to the Grid.
- Follow the detailed step-by-step Nikhef instructions to request a certificate: http://ca.dutchgrid.nl/tcs/
- The next step is to contact your experiment, or look on their computing pages, to find out if you need to register with their Virtual Organization Management Server (VOMS).
- Experiments with VOMS servers are listed below with a link to their registration pages:
- ALICE https://voms24.cern.ch:8443/voms/alice/register/start.action
- ATLAS https://voms24.cern.ch:8443/voms/atlas/register/start.action
- KM3NeT https://voms02.scope.unina.it:8443/voms/km3net.org/register/start.action
- LHCb https://voms24.cern.ch:8443/voms/lhcb/register/start.action
- Virgo https://voms.cnaf.infn.it:8443/voms/virgo/register/start.action
- Xenon https://voms.grid.sara.nl:8443/voms/xenon.biggrid.nl/register/start.action
Once you have your certificate, you should be able to follow the instructions below for generating X509 certificate proxies.
Getting an X509 proxy
Once you have sourced the grid middleware tools, the voms-proxy-* commands are available. Type voms-proxy-init -voms <YOUR-VO> to create a VOMS proxy.
Sample output:
$ voms-proxy-init -voms pvier
Enter GRID pass phrase:
Your identity: /O=dutchgrid/O=users/O=nikhef/CN=Some User
Creating temporary proxy .......................................... Done
Contacting voms.grid.sara.nl:30000
[/O=dutchgrid/O=hosts/OU=sara.nl/CN=voms.grid.sara.nl] "pvier" Done
Creating proxy
................................................................... Done
Your proxy is valid until Fri Dec 7 00:08:49 2007
Alternatively, you can use the arcproxy tool to generate a proxy:
$ arcproxy --voms <YOUR-VO>
Enter Password for PKCS12 certificate:
Your identity: /DC=org/DC=terena/DC=tcs/C=NL/O=Nikhef/CN=<Some User>
Contacting VOMS server (named pvier): voms.grid.sara.nl on port: 30000
Proxy generation succeeded
Your proxy is valid until: 2020-11-18 04:32:41
Congratulations! You are now ready to use grid middleware tools.
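Before submitting, it can help to check that the proxy will outlive the job. The following is a minimal sketch, not an official procedure: the helper name proxy_covers and the 4-hour threshold are hypothetical, and it assumes that voms-proxy-info -timeleft prints the remaining lifetime in seconds.

```shell
#!/bin/sh
# proxy_covers LEFT NEEDED: succeed if a proxy with LEFT seconds remaining
# outlives a job that needs NEEDED seconds. (Hypothetical helper.)
proxy_covers() {
    [ "$1" -gt "$2" ]
}

# Hypothetical threshold: a job expected to run for up to 4 hours.
needed=$((4 * 60 * 60))

# Only query the proxy when the VOMS client tools are on the PATH.
if command -v voms-proxy-info >/dev/null 2>&1; then
    # Assumption: -timeleft prints the remaining lifetime in seconds.
    left=$(voms-proxy-info -timeleft 2>/dev/null)
    # Treat anything that is not a plain number as "no valid proxy".
    case "$left" in
        ''|*[!0-9]*) left=0 ;;
    esac
    if proxy_covers "$left" "$needed"; then
        echo "proxy is valid long enough"
    else
        echo "proxy expires too soon; create a new one"
    fi
fi
```

If the check fails, creating a fresh proxy as described above resets the lifetime.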
Submitting your job
To submit a job to the Nikhef Grid you will need a job description file. This file describes the kind of resources you need to request from the site to run your jobs (see an example job description file).
Please be sure to specify a queue in your job description with the xRSL queue attribute, for example (queue="short").
General queues and walltimes available:
Queue Name | Max. Walltime (hh:mm:ss) | Allowed VOs |
---|---|---|
short | 04:00:00 | alice atlas dans projects.nl pvier virgo dune lsgrid lofar tutor enmr.eu bbmri.nl xenon.biggrid.nl chem.biggrid.nl drihm.eu |
medium | 36:00:00 | alice atlas dans projects.nl pvier virgo dune lsgrid lofar tutor enmr.eu bbmri.nl xenon.biggrid.nl chem.biggrid.nl drihm.eu |
long | 96:00:00 | alice atlas dans projects.nl pvier virgo dune lsgrid lofar tutor bbmri.nl xenon.biggrid.nl chem.biggrid.nl drihm.eu |
For more information or to find other queues, use lcg-info or lcg-infosites, which will give you more information about what is available to your VO.
More information is also available on the SURF wiki: http://doc.grid.surfsara.nl/en/latest/Pages/Service/system_specifications/gina_specs.html#queues
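The walltime limits in the table above can be turned into a small helper that picks the shortest queue covering a job. This is only a sketch: the function name pick_queue is hypothetical, it works in whole hours, and it ignores which VOs are allowed on each queue.

```shell
#!/bin/sh
# pick_queue HOURS: print the shortest general queue whose maximum walltime
# (short 4h, medium 36h, long 96h) covers a job expected to run HOURS hours.
pick_queue() {
    hours=$1
    if [ "$hours" -le 4 ]; then
        echo short
    elif [ "$hours" -le 36 ]; then
        echo medium
    elif [ "$hours" -le 96 ]; then
        echo long
    else
        echo "no general queue allows ${hours}h" >&2
        return 1
    fi
}

# Print the queue line for a hypothetical 10-hour job.
printf '(queue="%s")\n' "$(pick_queue 10)"
```

Choosing the shortest queue that fits keeps the request modest; a job that exceeds the walltime of its queue is likely to be stopped by the batch system.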
Specifying Job Requirements
The default values for memory, nodes, CPUs and local scratch space may not be adequate for your use case. It is possible to specify the requirements for your jobs in the XRSL file which will then be translated into requirements on the grid batch system. This will either match suitable resources, or match nothing at all if your requirements exceed what is available. If this is the first time you need to specify additional requirements, please ask the site administrators for advice.
Memory requirements
The amount of main memory (RAM) required for the job can be passed by adding this line to the XRSL file:
This example requests 8 GB of RAM (the unit is megabytes). Be aware that exceeding the requested amount in the actual job may result in termination of the job by the batch system.
Multi-core jobs
XRSL parameters:
These examples request 4 cores on 1 node.
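The memory and multi-core requirements described above could look as follows in two separate job descriptions. This is a sketch under assumptions: it uses the xRSL attributes memory (in MB), count and countpernode, takes 8 GB to be 8192 MB, and the file names memory.xrsl and multicore.xrsl are hypothetical. Check the values with the site administrators before relying on them.

```shell
#!/bin/sh
# Job that asks for 8 GB of RAM. Assumption: 8 GB is written as 8192 MB.
cat > memory.xrsl <<'EOF'
&( executable = "test.sh" )
( stdout = "stdout" )( stderr = "stderr" )
( memory = 8192 )
EOF

# Job that asks for 4 cores on 1 node. Assumption: count is the total
# number of cores and countpernode the number of cores on each node.
cat > multicore.xrsl <<'EOF'
&( executable = "test.sh" )
( stdout = "stdout" )( stderr = "stderr" )
( count = 4 )
( countpernode = 4 )
EOF

# Show the requirement lines that were written.
grep -E 'memory|count' memory.xrsl multicore.xrsl
```

The two requirements are kept in separate files here because a memory request may be interpreted per core for multi-core jobs, which would multiply the total.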
Example job description file
An example job description file to submit to an ARC-CE looks something like:
&( executable = "test.sh" )
( stdout = "stdout" )( stderr = "stderr" )
( gmlog = "gmlog" )
(count=1)
(runtimeenvironment=ENV/GLITE)
(inputFiles=("things.txt" ""))
Information about how to create your job description files for an ARC-CE can be found at http://www.nordugrid.org/arc/arc6/users/xrsl.html.
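The example job description refers to test.sh and things.txt without showing them. The sketch below creates hypothetical stand-ins for both, so the example can be tried locally before anything is submitted; the file contents are illustrative assumptions, not the files used at Nikhef.

```shell
#!/bin/sh
# Create a hypothetical input file with three lines.
printf 'one\ntwo\nthree\n' > things.txt

# Create a hypothetical job script that reports the host name and the
# number of lines in the input file.
cat > test.sh <<'EOF'
#!/bin/sh
echo "Running on $(uname -n)"
echo "things.txt has $(grep -c '' things.txt) lines"
EOF
chmod +x test.sh

# Dry run on the local machine before submitting to the grid.
./test.sh
```

Running the script locally first catches simple mistakes, such as a missing input file, without waiting for a grid job to finish.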
After creating a proxy and preparing your job description file, you can run your job with commands like the following:
# Submit your job to an ARC endpoint with your xrsl or adl file specified
arcsub -c brug.nikhef.nl [YOUR XRSL OR ADL FILE]
# Check the status of all your jobs.
arcstat -a
# Adding -l gives a long description of each of your jobs.
arcstat -a -l
# Or check a single job by giving its unique ID:
arcstat [gsiftp|https]://brug.nikhef.nl:443/[jobs|arex]/[UNIQUE JOB ID]
# Fetch the output, logs, etc. of all your jobs.
arcget -a
# (or use arcget with a single job ID)
Links
- Grid jobs
- Nikhef's Grid cluster
- Nikhef Grid Cluster Graphs
- More information about submitting jobs to an ARC-CE
Contact
- Email grid.sysadmin@nikhef.nl for questions about anything Grid related.