In Slurm, an environment variable set for each job gives the number of CPUs allocated to it:
SLURM_JOB_CPUS_PER_NODE
Count of CPUs available to the job on the nodes in the allocation, using the format CPU_count[(xnumber_of_nodes)][,CPU_count[(xnumber_of_nodes)] …]. For example: SLURM_JOB_CPUS_PER_NODE='72(x2),36' indicates that on the first and second nodes (as listed by SLURM_JOB_NODELIST) the allocation has 72 CPUs, while the third node has 36 CPUs. NOTE: The select/linear plugin allocates entire nodes to jobs, so the value indicates the total count of CPUs on allocated nodes. The select/cons_res and select/cons_tres plugins allocate individual CPUs to jobs, so this number indicates the number of CPUs allocated to the job.
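To make the compressed format concrete, here is a minimal sketch that expands a value such as '72(x2),36' into one CPU count per node. The function name expand_cpus_per_node is made up for this illustration and is not part of Slurm.

```python
import re

def expand_cpus_per_node(value):
    """Expand a SLURM_JOB_CPUS_PER_NODE string into one CPU count per node."""
    counts = []
    for item in value.split(","):
        # Each entry is CPU_count, optionally followed by (xN) for N nodes.
        match = re.fullmatch(r"\s*(\d+)(?:\(x(\d+)\))?\s*", item)
        if match is None:
            raise ValueError(f"unrecognised entry: {item!r}")
        cpus = int(match.group(1))
        repeat = int(match.group(2)) if match.group(2) else 1
        counts.extend([cpus] * repeat)
    return counts
```

Inside a job, sum(expand_cpus_per_node(os.environ["SLURM_JOB_CPUS_PER_NODE"])) would then give the total across all nodes of the allocation.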
Do you specifically need an environment variable? There are alternative ways.
In Python, for example, I use:
import psutil
number_of_cpus = len(psutil.Process().cpu_affinity())
Note that cpu_affinity() also shows which CPU IDs are assigned.
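If psutil is not installed, a similar check can be sketched with the standard library only. This assumes a platform where os.sched_getaffinity exists (Linux); the fallback branch and the helper name allowed_cpu_ids are assumptions for illustration.

```python
import os

def allowed_cpu_ids():
    """Return the sorted CPU IDs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # 0 means "the calling process".
        return sorted(os.sched_getaffinity(0))
    # Assumption: without affinity support, treat every CPU as usable.
    return list(range(os.cpu_count() or 1))

number_of_cpus = len(allowed_cpu_ids())
```

Whether the affinity mask reflects the job's allocation depends on how the scheduler confines jobs (for example with cgroups or cpusets), so it is worth checking on your own cluster.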
I think there is no such PBS environment variable.
Thanks for your replies. I also found ${NCPUS}, so I am now setting export OMP_NUM_THREADS=${NCPUS}. Is there a standard way to set a memory limit as well? Thanks.
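The same idea can be sketched from inside a Python job script. This assumes NCPUS is present in the job environment as described above; the helper name thread_count_from_env and the default of 1 are illustrative choices.

```python
import os

def thread_count_from_env(env, default=1):
    """Read NCPUS from a mapping, falling back to default if missing or invalid."""
    raw = env.get("NCPUS", "")
    try:
        n = int(raw)
    except ValueError:
        return default
    return n if n > 0 else default

# Only set OMP_NUM_THREADS if the user has not already chosen a value.
os.environ.setdefault("OMP_NUM_THREADS", str(thread_count_from_env(os.environ)))
```

This should run before importing libraries that start an OpenMP runtime, since the variable is typically read when that runtime initialises.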
You can use a hook to modify the job’s environment at submission time.
This is what I used to set values such as memory and CPUs:
import pbs

E = pbs.event()
J = E.job
R = J.Resource_List
V = J.Variable_List

if R["select"] is not None:
    sel = repr(R["select"])
    ncpus = 0
    mem = pbs.size(0)
    # A select statement is a "+"-separated list of chunks,
    # e.g. 2:ncpus=4:mem=8gb+1:ncpus=2:mem=4gb
    chunks = sel.split("+")
    for chunk in chunks:
        mult = 1
        for c in chunk.split(":"):
            kv = c.split("=")
            if len(kv) == 1:
                # A bare number is the chunk multiplier.
                mult = int(kv[0])
            elif len(kv) == 2:
                if kv[0] == "ncpus":
                    ncpus += mult * int(kv[1])
                elif kv[0] == "mem":
                    # Convert to bytes before scaling by the multiplier.
                    size = pbs.size_to_kbytes(pbs.size(kv[1])) * 1024
                    mem += pbs.size(mult * int(size))
    # Fall back to job-wide ncpus/mem if the chunks did not specify them.
    if ncpus == 0:
        if R["ncpus"] is not None:
            ncpus = int(R["ncpus"])
    if pbs.size_to_kbytes(mem) * 1024 == 0:
        if R["mem"] is not None:
            m = pbs.size(R["mem"])
            mkb = pbs.size_to_kbytes(m) * 1024
            mem = pbs.size(mkb)
    V["PBS_MEM"] = mem
    V["PBS_NPROCS"] = ncpus
    print(V)

E.accept()
V is the list of variables exported to the job’s env.
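To try the select-parsing logic outside of PBS, here is a standalone sketch of the same accumulation that does not need the pbs module. The helpers size_to_bytes and total_resources are hypothetical names for this illustration, and the size parser is a simplification that only handles the b, kb, mb, gb and tb suffixes with 1024-based units.

```python
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def size_to_bytes(text):
    """Convert a size string such as '8gb' to bytes (simplified parser)."""
    text = text.strip().lower()
    # Check two-letter suffixes first so '8gb' is not matched as '8g' + 'b'.
    for suffix in ("kb", "mb", "gb", "tb", "b"):
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * _UNITS[suffix]
    return int(text)  # assumption: a bare number means bytes

def total_resources(select):
    """Sum ncpus and mem (in bytes) over all chunks of a select string."""
    ncpus, mem = 0, 0
    for chunk in select.split("+"):
        mult = 1
        for part in chunk.split(":"):
            key, sep, value = part.partition("=")
            if not sep:
                mult = int(key)          # bare number: chunk multiplier
            elif key == "ncpus":
                ncpus += mult * int(value)
            elif key == "mem":
                mem += mult * size_to_bytes(value)
    return ncpus, mem
```

For example, total_resources("2:ncpus=4:mem=8gb+1:ncpus=2:mem=4gb") adds up two chunks of 4 CPUs and 8 GB plus one chunk of 2 CPUs and 4 GB.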