The following table shows SLURM commands on the SOE cluster.
Command | Description |
---|---|
sbatch | Submit batch scripts to the cluster |
scancel | Cancel or signal jobs or job steps that are under the control of SLURM |
sinfo | View information about SLURM nodes and partitions |
squeue | View information about jobs located in the SLURM scheduling queue |
smap | Graphically view information about SLURM jobs and partitions, and set configuration parameters |
sqlog | View information about running and finished jobs |
sacct | View resource accounting information for finished and running jobs |
sstat | View resource accounting information for running jobs |
sinfo |
scontrol show partition=SOE_main |
scontrol show node=<nodename> |
scontrol show node=soenode[05-06,35-36] |

    #!/bin/bash
    #SBATCH --job-name=OMP_run       # job name, "OMP_run"
    #SBATCH --partition=SOE_main     # partition (queue)
    #SBATCH -t 0-2:00                # time limit: (D-HH:MM)
    #SBATCH --mem=32000              # memory per node in MB
    #SBATCH --nodes=1                # number of nodes
    #SBATCH --ntasks-per-node=16     # number of cores
    #SBATCH --output=slurm.out       # file to collect standard output
    #SBATCH --error=slurm.err        # file to collect standard errors

sbatch myscript.sh |
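
The monitoring and cancellation commands all take a job ID, so it can help to capture it at submission time. The following is a minimal sketch, assuming `sbatch` prints its usual confirmation line of the form `Submitted batch job <jobid>`; the helper name `extract_job_id` is hypothetical and not part of SLURM.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of sbatch's confirmation
# line, which is assumed to look like "Submitted batch job 706".
extract_job_id() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Canned input, so the sketch runs without a scheduler:
echo "Submitted batch job 706" | extract_job_id
```

On the cluster this would be used as `jobid=$(sbatch myscript.sh | extract_job_id)`. Alternatively, `sbatch --parsable myscript.sh` prints the job ID without the surrounding text.
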
squeue -u <username> |
squeue -j 706 --format="%S" |

    START_TIME
    2015-04-30T09:54:32

squeue -j 706 --format="%i %P %j %u %T %l %C %S" |

    JOBID PARTITION NAME      USER STATE   TIMELIMIT  CPUS START_TIME
    706   SOE_main  Par_job_3 mike PENDING 3-00:00:00 64   2015-04-30T09:54:32

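
Because the `--format` string fixes the column order, the output is easy to post-process. The sketch below lists the expected start time of every pending job. It assumes the exact format string `"%i %P %j %u %T %l %C %S"` used above and job names without spaces; the function name `pending_start_times` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read squeue output produced with
#   --format="%i %P %j %u %T %l %C %S"
# on stdin and print "JOBID START_TIME" for every PENDING job.
pending_start_times() {
    # NR > 1 skips the header line; column 5 is STATE, column 8 is START_TIME.
    awk 'NR > 1 && $5 == "PENDING" { print $1, $8 }'
}

# Canned sample copied from the squeue output above, so this runs
# without a scheduler.
sample='JOBID PARTITION NAME USER STATE TIMELIMIT CPUS START_TIME
706 SOE_main Par_job_3 mike PENDING 3-00:00:00 64 2015-04-30T09:54:32'

printf '%s\n' "$sample" | pending_start_times
```

On the cluster, the sample would be replaced by a pipe from the real command, for example `squeue -u <username> --format="%i %P %j %u %T %l %C %S" | pending_start_times`.
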
squeue --start |
sqlog -u <username> |
sqlog -j <JobID> |

Code | State | Description |
---|---|---|
CA | CANCELLED | Job was cancelled. |
CD | COMPLETED | Job completed normally. |
CG | COMPLETING | Job is in the process of completing. |
F | FAILED | Job terminated abnormally. |
NF | NODE_FAIL | Job terminated due to node failure. |
PD | PENDING | Job is pending allocation. |
R | RUNNING | Job currently has an allocation. |
S | SUSPENDED | Job is suspended. |
TO | TIMEOUT | Job terminated upon reaching its time limit. |

You can specify the fields you would like to see in the output of sqlog:

sqlog --format=list |
sqlog -j 2831 --format=jid,user,state,start,end |
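
When the `state` field is printed in its short form, a script may want the full state name. The following sketch maps the state codes listed above to their names. The function name `state_name` is hypothetical, and the mapping covers only the codes given in this document.

```shell
#!/bin/bash
# Hypothetical helper: expand a job state code into its full state name,
# using only the codes listed in this document.
state_name() {
    case "$1" in
        CA) echo "CANCELLED" ;;
        CD) echo "COMPLETED" ;;
        CG) echo "COMPLETING" ;;
        F)  echo "FAILED" ;;
        NF) echo "NODE_FAIL" ;;
        PD) echo "PENDING" ;;
        R)  echo "RUNNING" ;;
        S)  echo "SUSPENDED" ;;
        TO) echo "TIMEOUT" ;;
        *)  echo "UNKNOWN" ;;   # any code not covered above
    esac
}

state_name PD
```
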
sstat -j <jobid> |
sstat --format="JobID,MaxRSS" -j <jobid> |
sacct --format="JobID,JobName,MaxRSS,Elapsed" -j <jobid> |
sacct --format="JobID,JobName,MaxRSS,Elapsed" -u <username> |
sacct --helpformat |
sacct -j 2831 --format="JobID,JobName,State,Start,End" |
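
The `MaxRSS` field is useful for checking a job's peak memory use against its `--mem` request, which is given in MB. The sketch below converts a `MaxRSS` value to MB. It assumes that values carry a `K`, `M`, or `G` suffix (for example `204800K`) and, as a further assumption, treats a value without a suffix as bytes; the function name `rss_to_mb` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: convert a MaxRSS value such as "204800K" to MB.
# Assumption: suffix K, M, or G; a value without a suffix is taken as bytes.
rss_to_mb() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                           # leading numeric part of the value
        u = toupper(substr(v, length(v)))   # last character, the unit suffix
        if (u == "K")      mb = n / 1024
        else if (u == "M") mb = n
        else if (u == "G") mb = n * 1024
        else               mb = n / (1024 * 1024)
        printf "%.1f\n", mb
    }'
}

rss_to_mb 204800K
```

A job that reports a peak close to the requested `--mem` value is at risk of being terminated for exceeding its memory limit, so the request should leave some headroom.
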
scontrol show job=<jobid> |
scancel <jobid> |
sdel <jobid> |
scancel -u <username> |
scancel --name <myJobName> |
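
Cancelling by user or by name can affect many jobs at once, so a dry run is a reasonable precaution. The following sketch combines the `-u` and `--name` filters and, by default, only prints the command it would run. The function name `cancel_named_jobs` and the `DRY_RUN` variable are hypothetical conventions, not part of SLURM.

```shell
#!/bin/bash
# Hypothetical wrapper: cancel all jobs with a given name for a given user.
# With DRY_RUN=1 (the default here) it only prints the scancel command.
cancel_named_jobs() {
    local name="$1" user="${2:-$USER}"
    if [ -z "$name" ]; then
        echo "usage: cancel_named_jobs <jobname> [username]" >&2
        return 2
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "scancel -u $user --name $name"
    else
        scancel -u "$user" --name "$name"
    fi
}

# Dry run: prints the command instead of cancelling anything.
DRY_RUN=1 cancel_named_jobs OMP_run mike
```

Setting `DRY_RUN=0` makes the wrapper call `scancel` for real. Checking the matching jobs first with `squeue -u <username>` is a sensible extra step.
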