NorduGrid ARC job submission and management
To submit jobs via the NorduGrid ARC middleware, you need:
- a SiGNET certificate
- the ARC client
- membership in the national virtual organization gen.vo.sling.si.
You can join with a valid digital certificate at voms.sling.si.
Follow the ARC client installation instructions for your specific operating system; details about setup and installation can be found in the documentation.
The ARC client should always be run as a non-root user, since the certificate is issued to a specific user.
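arcproxy typically looks for the user certificate and private key under ~/.globus. The exact paths can differ between installations, so treat the following as a sketch of the usual layout:
# Usual grid-client certificate layout (assumed paths; adjust to your setup)
mkdir -p ~/.globus
cp usercert.pem userkey.pem ~/.globus/
chmod 644 ~/.globus/usercert.pem   # the certificate may be world-readable
chmod 400 ~/.globus/userkey.pem    # the private key must be owner-readable only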
ARCINFO
First, verify that the user interface is installed:
$ arcinfo --version
Check the cluster information (if the cluster is down or you are not allowed to submit jobs, arcinfo will not return the cluster endpoints):
$ arcinfo -c arc01.vega.izum.si
$ arcinfo -c arc02.vega.izum.si
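For more detail about a cluster (supported interfaces, queues, health state), arcinfo also accepts the long-listing option -l:
$ arcinfo -c arc01.vega.izum.si -l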
ARCPROXY
Generate a proxy with your credentials:
$ arcproxy
View generated proxy information:
$ arcproxy -I
Verify that your membership in the national general virtual organization is active:
$ arcproxy -S <virtual organization>
$ arcproxy -I
To submit jobs to Vega via the ARC middleware, you need to be a member of the vega subgroup of gen.vo.sling.si. To generate the ARC proxy, run:
$ arcproxy -S gen.vo.sling.si:/gen.vo.sling.si/vega
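Proxies have a limited lifetime. If a longer validity is needed, arcproxy accepts constraints via the -c option; the exact constraint names can vary between client versions, so check arcproxy -h. A possible invocation:
$ arcproxy -S gen.vo.sling.si:/gen.vo.sling.si/vega -c validityPeriod=24h -c vomsACvalidityPeriod=24h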
JOB SUBMISSION
Check that you can submit a trivial test job, test.xrsl:
$ vi test.xrsl
&
(executable = /usr/bin/env)   (* trivial test job: print the environment *)
(jobname = "test")
(stdout = test.log)
(join = yes)                  (* merge stderr into stdout *)
(gridtime = 1000)             (* requested time, in minutes *)
(gmlog = log)                 (* directory for grid-manager debug logs *)
(memory = 2000)               (* requested memory, in MB *)
Submit it (the -o option stores the returned job IDs in joblist.xml, which can be reused later, e.g. with arcget -i):
$ arcsub -c arc01.vega.izum.si -S org.nordugrid.gridftpjob -o joblist.xml test.xrsl -d DEBUG
...
Job submitted with jobid: <job ID>
ARCSTAT (Cluster job status)
Prints the job status and some additional information from the cluster, such as the job ID, name, and state:
$ arcstat -c arc01.vega.izum.si
$ arcstat -c arc02.vega.izum.si
$ arcstat <job id>
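All of your jobs can be listed at once with the -a option:
$ arcstat -a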
ARCKILL
Kill all jobs (the -a option selects all jobs known to the client):
$ arckill -a
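A single job can be killed by its ID:
$ arckill <job id>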
ARCCLEAN
Clean all jobs (remove finished jobs from the remote clusters):
$ arcclean -a
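A single job can likewise be cleaned by its ID:
$ arcclean <job id>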
ARCGET
When the status of a job is FINISHED, you can download the results of that specific job:
$ arcget <job id>
Results stored at: 4fQLDmY3BxjnmmR0Xox1SiGmABFKDmABFKDmvxHKDmABFKDmiPhU9m
Jobs processed: 1, successfully retrieved: 1, successfully cleaned: 1
or the results of all completed jobs:
$ arcget -a
or all jobs listed in the joblist.xml file (created by arcsub -o above):
$ arcget -i joblist.xml
ARC and MPI jobs
Prepare an xRSL description file with the job requirements:
&
(count = 4)                        (* number of parallel MPI processes *)
(jobname = "test-mpi")
(inputfiles =
  ("hellompi.sh" "")
  ("hellompi.c" "")
)
(outputfiles = ("result-mpi.txt" ""))
(executable = "hellompi.sh")
(stdout = "hellompi.log")
(join = yes)
(walltime = "10 minutes")
(gmlog = log)
(memory = 100)                     (* requested memory, in MB *)
(runtimeenvironment = "APPS/BASE/OPENMPI-DEFAULT")
The job definition lists the input and output files, the log files, and the script that starts the job (hellompi.sh). The local LRMS in SLING (typically SLURM, driven by a generated batch script) copies all the files to the job's working directory and runs the script.
The setting (runtimeenvironment = "APPS/BASE/OPENMPI-DEFAULT") is resolved to the correct value on each cluster: it triggers the hardware selection suitable for MPI (low-latency InfiniBand interconnects) and the settings for the default MPI library. The environment has to be prepared this way so that the job can be started in a standard manner, since we do not know in advance whether srun or something else is available.
The script compiles the C program and runs it with mpirun:
#!/bin/bash
echo "Compiling example"
mpicc -o hello hellompi.c
echo "Done."
echo "Running example:"
mpirun --mca btl openib,self -np 4 ${PWD}/hello > result-mpi.txt
echo "Done."
ARCRUNNER
Arcrunner is a tool that makes managing and submitting ARC jobs easier. It is available in a Git repository:
Gitlab repo: link
- arcrunner -R <list of clusters, separated by commas>
- arcrunner -X <xrsl file>
- arcrunner -P <pass file>
- arcrunner -h for additional help (see the example below)
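Combining the options above, a typical invocation might look like this (cluster names taken from the earlier examples; the exact behaviour depends on the arcrunner version):
$ arcrunner -R arc01.vega.izum.si,arc02.vega.izum.si -X test.xrsl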
There are also additional Python programs in the repository:
- clusterJobTest.py: useful for testing whether scripts and xrsl files work properly before they are sent to a cluster.
- runLocally.py: runs jobs on the local computer instead of on the cluster.
- arc_manipulator.py
Arcrunner – good practice
- Add arcrunner to your PATH with the command:
export PATH="$PATH:/home/$USER/arcrunnerhacks"
- Make an alias:
alias ar="arcrunner -R ~/.arcrunner/clusters -P ~/.arcrunner/pass -X"
and then call it as: ar nameDat.xrsl.
- Store the lines above in ~/.bashrc so that they are set each time a bash shell starts.
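Put together, the corresponding ~/.bashrc additions would be:
# Make arcrunner available from anywhere and define a short submission alias
# (paths as used in the examples above)
export PATH="$PATH:/home/$USER/arcrunnerhacks"
alias ar="arcrunner -R ~/.arcrunner/clusters -P ~/.arcrunner/pass -X"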