Multiple Job IDs listed

PBSPro-Me · September 23, 2024, 7:39pm

Hi

Could someone explain why duplicate job IDs are listed in this output, and why they are counted as jobs under ‘njobs’? TIA

Command used: pbsnodes -aSvj

                                       mem       ncpus   nmics   ngpus
vnode           njobs   run   susp      f/t        f/t     f/t     f/t   jobs
--------------- ------ ----- ------ ------------ ------- ------- ------- -------
Node1                8     8      0    336gb/1tb   15/88     0/0     0/0 1017603.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020970[6].hpc-pbs
Node2                3     3      0    300gb/1tb   17/88     0/0     0/0 1011621.hpc-pbs,1020542.hpc-pbs,1021079.hpc-pbs
Node3                9     9      0    158gb/1tb   46/88     0/0     0/0 1021230.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1021230.hpc-pbs,1021092.hpc-pbs

dtalcott · September 23, 2024, 9:50pm

Is there anything interesting about how the duplicate jobs requested resources or how they ran?

Perhaps a tracejob 1020539 would shed some light?

The output of pbsnodes Node1 would help too.

Source · September 24, 2024, 3:40am

pbsnodes lists jobs by chunk, so a job appears once for each chunk it has on the node. For example, select=1:ncpus=32 gives 1 entry, select=2:ncpus=16 gives 2 entries, and select=32:ncpus=1 gives 32 entries.
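To see how many entries each job contributes on a node, the jobs column can be tallied. This is only a rough sketch: it assumes the jobs field is a plain comma-separated string like the Node1 line in the first post, and chunk_entries is a made-up helper name, not part of PBS.

```python
from collections import Counter


def chunk_entries(jobs_field: str) -> Counter:
    """Count how often each job ID appears in a pbsnodes 'jobs' column.

    Assumes a comma-separated string, as in the output posted above.
    """
    return Counter(j.strip() for j in jobs_field.split(",") if j.strip())


# Jobs column for Node1, copied from the first post.
node1 = (
    "1017603.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,"
    "1020539.hpc-pbs,1020539.hpc-pbs,1020539.hpc-pbs,1020970[6].hpc-pbs"
)

counts = chunk_entries(node1)
for job_id, n in counts.items():
    print(f"{job_id}: {n} entries")
```

If the chunk explanation applies here, job 1020539 would have six chunks on Node1, which tracejob or qstat -f should be able to confirm.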

dtalcott · September 24, 2024, 8:12pm

When you add the -Sjv options, pbsnodes tries to consolidate entries for the same job into one entry. For some reason, this is not working here. I was hoping the exact pbsnodes Node1 output would give a clue as to why.

I noticed one issue. The entries have ‘1020539.hpc-pbs’ instead of just ‘1020539’. This suggests there is some confusion about the default server name.

PBSPro-Me · September 25, 2024, 1:13pm

Thanks.
There’s definitely a disconnect between the njobs value displayed by ‘pbsnodes -aSjv’ and the number of running jobs per node (counting the R’s) captured with ‘qstat -a -n1’. This is ultimately what I need to figure out.
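As a possible workaround while the cause is unclear, the jobs column could be de-duplicated outside pbsnodes to get a per-node count of distinct jobs. The sketch below assumes the comma-separated format shown in the first post and assumes the server suffix is everything after the first dot; distinct_jobs is a hypothetical helper, not a PBS command. Whether this count lines up with the qstat figure is something to check, not a given.

```python
def distinct_jobs(jobs_field: str) -> set[str]:
    """Return the distinct job IDs in a pbsnodes 'jobs' column.

    Assumes comma-separated entries such as '1019593.hpc-pbs' and drops
    everything after the first dot (taken here to be the server name).
    """
    ids = set()
    for entry in jobs_field.split(","):
        entry = entry.strip()
        if entry:
            ids.add(entry.split(".", 1)[0])
    return ids


# Jobs column for Node3, copied from the first post.
node3 = (
    "1021230.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,"
    "1019593.hpc-pbs,1019593.hpc-pbs,1019593.hpc-pbs,1021230.hpc-pbs,"
    "1021092.hpc-pbs"
)

jobs = distinct_jobs(node3)
print(len(jobs), sorted(jobs))
```

For Node3 this gives three distinct jobs against the njobs value of 9 that pbsnodes reports.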
