Max running jobs

Hi Expert,

I have a cluster with 30 nodes, and each node has 30 cores, so in the PBS setup each node has 30 slots.

I run an array job with 600 sub-jobs, but when I check qstat -t jobid[], some sub-jobs stay queued even though the cluster still has a lot of empty slots.

How can I configure PBS so that all 600 sub-jobs run at once?

Please try this:
qsub -J 1-600 -l select=1:ncpus=1:mem=10mb -- /bin/sleep 60
qstat -t
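
To see at a glance how many of the 600 ended up running vs. queued, something like this should work (a rough sketch assuming the default six-column qstat layout; 1234[] is a placeholder for your own array job ID):

qstat -t 1234[] | awk 'NR>2 {states[$5]++} END {for (s in states) print s, states[s]}'   # tally sub-jobs by state, e.g. "R 480" / "Q 120"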

Hi @adarsh,

Thank you, but unfortunately the result is still the same: about 200 sub-jobs stay queued even though I have a lot of empty slots.

What does the job comment say? "qstat -tf | grep comment"

Hi Agra,
Below is the output:

comment = Job run at Fri Aug 06 at 11:16 on (node16:ncpus=1:mem=10240kb)
comment = Job run at Fri Aug 06 at 11:16 on (node16:ncpus=1:mem=10240kb)
comment = Job run at Fri Aug 06 at 11:16 on (node16:ncpus=1:mem=10240kb)
comment = Job run at Fri Aug 06 at 11:16 on (node16:ncpus=1:mem=10240kb)
comment = Job run at Fri Aug 06 at 11:16 on (node16:ncpus=1:mem=10240kb)
comment = Job run at Fri Aug 06 at 11:16 on (node16:ncpus=1:mem=10240kb)
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16
comment = Job Array Began at Fri Aug 06 at 11:16

Please share the output of:
qmgr -c 'p s'
pbsnodes -av
qstat -tansw1

That looks like either all jobs ran, or the scheduler didn't see some of the jobs at all. Can you try triggering another scheduling cycle (qmgr -c 's s scheduling=t') and check the scheduler logs to see what happened to the jobs that didn't run? The sched logs are under $PBS_HOME/sched_logs/.
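
In case it helps, a minimal sketch of that check; the daily YYYYMMDD log file name is an assumption, and <jobid> is a placeholder for your array ID:

qmgr -c 's s scheduling=t'                              # kick the scheduler into another cycle
grep '<jobid>\[' $PBS_HOME/sched_logs/$(date +%Y%m%d)   # see what the scheduler decided for each sub-job ($PBS_HOME is typically /var/spool/pbs)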

Server config:

Qmgr: p s
#
# Create resources and set their properties.
#
#
# Create and define resource ngpus
#
create resource ngpus
set resource ngpus type = long
set resource ngpus flag = hn
#
# Create and define resource gpu_id
#
create resource gpu_id
set resource gpu_id type = string
set resource gpu_id flag = h
#
# Create queues and set their attributes.
#
#
# Create and define queue workq
#
create queue workq
set queue workq queue_type = Execution
set queue workq max_user_run = 1000
set queue workq enabled = True
set queue workq started = True
#
# Create and define queue cpu
#
create queue cpu
set queue cpu queue_type = Execution
set queue cpu enabled = True
set queue cpu started = True
#
# Create and define queue gpu
#
create queue gpu
set queue gpu queue_type = Execution
set queue gpu enabled = True
set queue gpu started = True
#
# Create and define queue testq
#
create queue testq
set queue testq queue_type = Execution
set queue testq enabled = True
set queue testq started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = root@node01
set server managers += root@*
set server default_queue = workq
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server resources_default.place = pack
set server default_chunk.ncpus = 1
set server scheduler_iteration = 600
set server flatuid = True
set server resv_enable = True
set server node_fail_requeue = 310
set server max_array_size = 500000
set server default_qsub_arguments = -V
set server pbs_license_min = 0
set server pbs_license_max = 2147483647
set server pbs_license_linger_time = 31536000
set server eligible_time_enable = False
set server job_history_enable = True
set server max_concurrent_provision = 5
set server max_job_sequence_id = 9999999

pbsnodes -av output; it looks like this for all of the nodes:

Mom = nodegraph01.hpcc.local
     Port = 15002
     pbs_version = 19.1.3
     ntype = PBS
     state = free
     pcpus = 12
     resources_available.arch = linux
     resources_available.host = nodegraph01
     resources_available.mem = 131768012kb
     resources_available.ncpus = 12
     resources_available.vnode = nodegraph01
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     queue = cpu
     resv_enable = True
     sharing = default_shared
     last_state_change_time = Wed Jul 21 21:20:03 2021
     last_used_time = Wed Jul 21 21:29:08 2021

Below is an example of the qstat -tansw1 output:

   Job run at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[171].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/2
   Job run at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[172].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/3
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[173].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/4
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[174].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/5
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[175].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/6
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[176].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/7
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[177].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/8
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[178].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/9
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[179].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  R  --  node15/10
   Job was sent for execution at Sun Aug 08 at 18:54 on (node15:ncpus=1:mem=10240kb)
4057[180].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54
4057[181].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54
4057[182].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54
4057[183].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54
4057[184].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54
4057[185].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54
4057[186].nodemgr01            testhpc         workq           Test                 --     1     1   10mb   --  Q  --   -- 
   Job Array Began at Sun Aug 08 at 18:54

This is from the sched logs:

08/08/2021 18:54:29;0040;pbs_sched;Job;4057[252].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Req;;Leaving Scheduling Cycle
08/08/2021 18:54:29;0080;pbs_sched;Req;;Starting Scheduling Cycle
08/08/2021 18:54:29;0004;pbs_sched;Fil;holidays;The holiday file is out of date; please update it.
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[225].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[226].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[227].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[228].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[229].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[230].nodemgr01;Job run
08/08/2021 18:54:29;0080;pbs_sched;Job;4057[].nodemgr01;Considering job to run
08/08/2021 18:54:29;0040;pbs_sched;Job;4057[231].nodemgr01;Job run
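
For reference, a quick cross-check is to count how many sub-jobs actually got a "Job run" entry in that day's log (the 20210808 file name is taken from the timestamps above; the count is rough since it includes every run logged that day):

grep -c 'Job run' $PBS_HOME/sched_logs/20210808   # number of jobs the scheduler started that day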

Thank you for sharing these details, @jxdn, much appreciated.

Please check this node: Mom = nodegraph01.hpcc.local is mapped to a queue called cpu. The admin might have mapped nodes to specific queues, in which case the queue you submitted to might not have enough nodes assigned to it.

You submitted your jobs to workq, so they cannot run on nodes that are assigned to the cpu queue. Run this command to see which queue each node belongs to:

pbsnodes -av | grep -e Mom -e queue
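
If a node does turn out to be tied to the wrong queue, the association can be inspected or changed with qmgr, roughly like this (nodegraph01 is just the example from your output; treat this as a sketch and confirm with your admin before changing anything):

qmgr -c 'list node nodegraph01 queue'          # show the node's queue association, if any
qmgr -c 'unset node nodegraph01 queue'         # detach the node so jobs from any queue can use it
qmgr -c 'set node nodegraph01 queue = workq'   # or bind it to workq instead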

Hi, apologies,

node21       free            --       --       node21       workq         189gb      28       0       0 --
node22       free            --       --       node22       workq         189gb      28       0       0 --
node23       free            --       --       node23       workq         189gb      28       0       0 --
node24       free            --       --       node24       workq         189gb      28       0       0 --
node31       free            --       --       node31       workq         189gb      28       0       0 --
node32       free            --       --       node32       workq         189gb      28       0       0 --
node33       free            --       --       node33       workq         189gb      28       0       0 --
nodegraph01     free            --       --       nodegraph01     cpu           126gb      12       0       0 --

These are the properties of each workq node:

node34
     Mom = node34
     Port = 15002
     pbs_version = 19.1.3
     ntype = PBS
     state = free
     pcpus = 28
     resources_available.arch = linux
     resources_available.host = node34
     resources_available.mem = 197690636kb
     resources_available.ncpus = 28
     resources_available.vnode = node34
     resources_assigned.accelerator_memory = 0kb
     resources_assigned.hbmem = 0kb
     resources_assigned.mem = 0kb
     resources_assigned.naccelerators = 0
     resources_assigned.ncpus = 0
     resources_assigned.vmem = 0kb
     queue = workq
     resv_enable = True
     sharing = default_shared
     last_state_change_time = Tue Jul 27 12:34:04 2021

It's correct that the jobs go to workq, and workq has more than 600 slots, but not all 600 array sub-jobs executed. 100+ are still queued even when the cluster is empty (no jobs running).

Thank you

Please unset these attributes:
Qmgr: unset server resources_default.place    # or else request -l place=free in your job
Qmgr: unset queue workq max_user_run
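
For reference, the same fix as one-line commands, plus the per-job alternative mentioned in the comment above:

qmgr -c 'unset server resources_default.place'
qmgr -c 'unset queue workq max_user_run'
# or keep the server defaults and request free placement per job instead:
qsub -J 1-600 -l select=1:ncpus=1:mem=10mb -l place=free -- /bin/sleep 60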

Could you please share your job script?

Hi Mr. adarsh,

Thanks, it now runs all at once.
