Route queue to workq

Hello everyone,

I have a small environment (1 head node and 5 compute nodes) and I need to implement queue routing based on CPU request. Depending on the CPU request, a job must be sent to nodes 1, 2, and 3, and other jobs to nodes 4 and 5.

I created a hook to classify jobs according to their CPU request. The hook script works as expected, setting a designated queue based on the CPU request rules. I also created two routing queues and set the default workq as the destination queue, but when I specify the host in the routing destination, the job stays in the queue and does not start.
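
For reference, the hook does roughly the following (a minimal sketch, not my exact script; the queue names and the 64-CPU threshold are illustrative):

import pbs

e = pbs.event()                                # queuejob event
j = e.job
# total requested ncpus; may be unset if only -l select was given
ncpus = int(j.Resource_List["ncpus"] or 1)
if ncpus < 64:
    j.queue = pbs.server().queue("execq_s")    # small jobs
else:
    j.queue = pbs.server().queue("execq_l")    # large jobs (illustrative name)
e.accept()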

Below are the queue configurations:

create queue workq
set queue workq queue_type = Execution
set queue workq acl_host_enable = False
set queue workq from_route_only = False
set queue workq resources_max.walltime = 240:00:00
set queue workq resources_min.walltime = 00:00:00
set queue workq enabled = True
set queue workq started = True

# Create and define queue execq_s

create queue execq_s
set queue execq_s queue_type = Route
set queue execq_s acl_host_enable = False
set queue execq_s route_destinations = workq@node4
set queue execq_s enabled = True
set queue execq_s started = True

qstat -Qf:
Queue: workq
queue_type = Execution
total_jobs = 4
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:4 Exiting:0 Begun:0
acl_host_enable = False
from_route_only = False
resources_max.walltime = 240:00:00
resources_min.walltime = 00:00:00
resources_assigned.mem = 0mb
resources_assigned.mpiprocs = 124
resources_assigned.ncpus = 124
resources_assigned.nodect = 4
hasnodes = True
enabled = True
started = True

Queue: execq_s
queue_type = Route
total_jobs = 0
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0
acl_host_enable = False
route_destinations = workq@node4
enabled = True
started = True

and the tracejob output:
07/17/2024 14:24:26 S enqueuing into execq_s, state Q hop 1
07/17/2024 14:24:26 S Job Queued at request of guereta@lsmchnode01.cm.cluster, owner = guereta@lsmchnode01.cm.cluster, job name = md_npt, queue = execq_s
07/17/2024 14:24:26 S dequeuing from execq_s, state T
07/17/2024 14:24:26 A user=guereta group=starccm project=_pbs_project_default jobname=md_npt queue=execq_s ctime=1721237066 qtime=1721237066 etime=0 Resource_List.mem=10000mb Resource_List.mpiprocs=1 Resource_List.ncpus=1 Resource_List.nodect=1
Resource_List.place=free Resource_List.select=1:arch=linux:ncpus=1:mem=10000mb:mpiprocs=1 Resource_List.software=NAMD
07/17/2024 14:24:56 A Job rejected by all possible destinations
07/17/2024 14:24:56 S Job rejected by all possible destinations

Using only workq as the route destination (without @node4), the routing works well and the job starts, but on the first available host.

I am sure that the specified node4 has enough resources.

I checked the PBS Admin Guide, and there is no further information about using this kind of routing destination. According to the guide, it should work.

Could you please help me?
Thanks in advance,

While submitting a job using qsub, did you request a walltime less than 240:00:00? For example:

qsub -l select=1:ncpus=1:mem=1gb -l walltime=20:00:00 -- /bin/sleep 1000

If you do not request a walltime, then the default walltime assigned to the job is 5 years, hence this might be one of the reasons it is rejected by all the destination execution queue(s) of the routing queue.

Hi,
Thank you for quick reply.

I submitted a job with a walltime set, and the result was the same:

Job: 3656.lsmchnode01

07/17/2024 18:34:26 S enqueuing into execq_s, state Q hop 1
07/17/2024 18:34:26 S Job Queued at request of guereta@lsmchnode01.cm.cluster, owner = guereta@lsmchnode01.cm.cluster, job name = nwchem_input, queue = execq_s
07/17/2024 18:34:26 S dequeuing from execq_s, state T
07/17/2024 18:34:26 A user=guereta group=starccm project=_pbs_project_default jobname=nwchem_input queue=execq_s ctime=1721252066 qtime=1721252066 etime=0 Resource_List.mem=512mb Resource_List.mpiprocs=2 Resource_List.ncpus=2 Resource_List.nodect=1
Resource_List.place=pack Resource_List.select=1:arch=linux:ncpus=2:mem=512mb:mpiprocs=2 Resource_List.software=NWCHEM Resource_List.walltime=12:34:56
07/17/2024 18:34:56 A Job rejected by all possible destinations
07/17/2024 18:34:56 S Job rejected by all possible destinations

Thanks,

This seems to be an issue with your routing queue configuration.

The configuration below worked for me.

[pbsdata@hn ~]$ qmgr -c 'p q @default'
#
# Create queues and set their attributes.
#
#
# Create and define queue workq
#
create queue workq
set queue workq queue_type = Execution
set queue workq acl_host_enable = False
set queue workq from_route_only = False
set queue workq resources_max.walltime = 240:00:00
set queue workq resources_min.walltime = 00:00:00
set queue workq enabled = True
set queue workq started = True

#
# Create and define queue execq_s
#
create queue execq_s
set queue execq_s queue_type = Route
set queue execq_s acl_host_enable = False
set queue execq_s route_destinations = workq
set queue execq_s enabled = True
set queue execq_s started = True


[pbsdata@hn ~]$ qsub -q execq_s -l select=1:ncpus=1 -l walltime=10:00:00 -- /bin/sleep 100
138.hn
[pbsdata@hn ~]$ qstat -answ1

hn: 
                                                                                                   Req'd  Req'd   Elap
Job ID                         Username        Queue           Jobname         SessID   NDS  TSK   Memory Time  S Time
------------------------------ --------------- --------------- --------------- -------- ---- ----- ------ ----- - -----
138.hn                         pbsdata         execq_s         STDIN                --     1     1    --  10:00 Q  --   -- 
    -- 

hn: 
                                                                                                   Req'd  Req'd   Elap
Job ID                         Username        Queue           Jobname         SessID   NDS  TSK   Memory Time  S Time
------------------------------ --------------- --------------- --------------- -------- ---- ----- ------ ----- - -----
138.hn                         pbsdata         workq           STDIN              10733    1     1    --  10:00 R 00:00:00 hn/0
   Job run at Wed Jul 17 at 23:46 on (hn:ncpus=1)
   
   
[pbsdata@hn ~]$ tracejob 138.hn

Job: 138.hn

07/17/2024 23:45:54  S    enqueuing into execq_s, state Q hop 1
07/17/2024 23:45:54  S    Job Queued at request of pbsdata@hn, owner = pbsdata@hn, job name = STDIN, queue = execq_s
07/17/2024 23:46:37  L    Considering job to run
07/17/2024 23:46:37  S    Job Run at request of Scheduler@hn on exec_vnode (hn:ncpus=1)
07/17/2024 23:46:37  L    Job run
07/17/2024 23:46:37  M    Started, pid = 10733
07/17/2024 23:46:37  S    dequeuing from execq_s, state Q
07/17/2024 23:46:37  S    enqueuing into workq, state Q hop 1

Hi,

This kind of route queue configuration worked for me as well, when only the execution queue is set as the destination. But, according to the PBS guide, it is possible to address both a queue and a host in the routing destination. Below is the information I got from the PBS Admin Guide:

Destinations can be specified in the following ways:
route_destinations = Q1
route_destinations = Q1@Server1
route_destinations = "Q1, Q2@Server1, Q3@Server2"
route_destinations += Q1
route_destinations += "Q4, Q5@Server3"

I need to direct execution to specific hosts, according to the routing queue designated by the hook script.

Thank you,

Could you please check whether you can submit jobs this way (assuming you are on server0):

qsub -q q1@server1 -- /bin/sleep 10
qsub -q q2@server2  -- /bin/sleep 10
  • the details of q1 and q2 are required (qmgr -c 'p q q1' and qmgr -c 'p q q2' from servers 1 and 2)
  • whether the uid/gid is the same
  • whether you can run jobs from either of the servers targeting queues of the other server

Thank you

This is your problem: the host in a route queue's queue@host specification is a pbs_server host, not an execution host.

Could you supply more details about which jobs you want to go to nodes 1-3 and which to nodes 4-5? How are 1-3 different from 4-5?

I suspect you can get what you want by defining a custom boolean resource missing from 1-3 but present on 4-5. Your hook then adds the appropriate resource request to the job.
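
Untested, but something like this (the resource name "bignode" is just an example):

qmgr -c "create resource bignode type=boolean, flag=h"
qmgr -c "set node node4 resources_available.bignode = True"
qmgr -c "set node node5 resources_available.bignode = True"

I believe you also need to add bignode to the "resources:" line in sched_config so the scheduler matches on it; the hook would then append ":bignode=True" to the job's select specification.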

Hi,

I guess we found the problem:

qsub -q workq@lsmccnode04 -- /bin/sleep 10
Connection refused
qsub: cannot connect to server lsmccno (errno=15010)

but if I submit to the head node, the job is accepted and dispatched to a compute node:

qsub -q workq@lsmchnode01 -- /bin/sleep 10
3663.lsmchnode01

3663.lsmchnode* usrpam-* workq STDIN 215318 1 1 – 240:0 R 00:00 lsmccnode01/0

I guess some configuration is missing in the cluster/PBS.

Thank you,

Hi,

Summing up, there are two conditions based on the job's CPU request:
If less than 64 CPUs, then route to node1 or node2.
If more than 64 CPUs, then route to node3, node4, or node5.

The hook script can check the job's CPU request and set the max walltime. The issue is routing to the designated hosts.

I appreciate your help,

I haven’t tested this, but something like the following should work:

Create a queue “small” with resources_max.ncpus=63
Set the queue for node1 and node2 to small.

Create a queue “large” with resources_min.ncpus=64
Set the queue for nodes3-5 to large.

Create a routing queue “normal” with route_destinations=small,large

Make normal the default queue.

No hook is needed.

With this, jobs will get routed to the small or large queues based on how many CPUs they ask for. Once in the correct queue, they will run only on the nodes assigned to that queue.
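
In qmgr terms, something like this (again untested; the node names are placeholders for your vnode names):

create queue small
set queue small queue_type = Execution
set queue small resources_max.ncpus = 63
set queue small enabled = True
set queue small started = True
set node node1 queue = small
set node node2 queue = small

create queue large
set queue large queue_type = Execution
set queue large resources_min.ncpus = 64
set queue large enabled = True
set queue large started = True
set node node3 queue = large
set node node4 queue = large
set node node5 queue = large

create queue normal
set queue normal queue_type = Route
set queue normal route_destinations = "small,large"
set queue normal enabled = True
set queue normal started = True
set server default_queue = normal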

Hi, sorry for the late reply.

I followed your suggestion and changed it a bit to fit my scenario.
Instead of creating a routing queue, I set the queue in the hook, since I need to set the walltime as well. The hook works as expected, but I am still facing a problem with the queue.

To test, I created a queue (execq_s) for small jobs and assigned just one node to this queue:

create queue execq_s
set queue execq_s queue_type = Execution
set queue execq_s from_route_only = False
set queue execq_s resources_default.ncpus = 1
set queue execq_s resources_default.nodect = 1
set queue execq_s resources_default.nodes = lsmccnode05
set queue execq_s enabled = True
set queue execq_s started = True

When I submit a job, the hook sets the queue properly, but the job gets stuck in the queued state:

qstat output:

lsmchnode01: 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4097.lsmchnode* guereta  execq_s  md_npt        --    1   1   10gb   --  Q   --   --

tracejob result:

Job: 4097.lsmchnode01

08/05/2024 18:37:24  L    Considering job to run
08/05/2024 18:37:24  S    Job Queued at request of guereta@lsmchnode01.cm.cluster, owner = guereta@lsmchnode01.cm.cluster, job name = md_npt, queue = execq_s
08/05/2024 18:37:24  S    Job Modified at request of Scheduler@lsmchnode01.cm.cluster
08/05/2024 18:37:24  L    Not enough total nodes available
08/05/2024 18:37:24  L    Job will never run with the resources currently configured in the complex
08/05/2024 18:37:24  S    enqueuing into execq_s, state Q hop 1
08/05/2024 18:37:24  A    user=guereta group=starccm project=_pbs_project_default jobname=md_npt queue=execq_s ctime=1722893844 qtime=1722893844 etime=1722893844 Resource_List.max_walltime=00:01:00 Resource_List.mem=10000mb
                          Resource_List.min_walltime=00:00:01 Resource_List.mpiprocs=1 Resource_List.ncpus=1 Resource_List.nodect=1 Resource_List.nodes=lsmccnode05 Resource_List.place=free
                          Resource_List.select=1:arch=linux:ncpus=1:mem=10000mb:mpiprocs=1 Resource_List.software=NAMD 
08/05/2024 18:39:40  L    Considering job to run
08/05/2024 18:39:40  L    Not enough total nodes available
08/05/2024 18:39:40  L    Job will never run with the resources currently configured in the complex
08/05/2024 18:40:20  L    Considering job to run
08/05/2024 18:40:20  L    Not enough total nodes available
08/05/2024 18:40:20  L    Job will never run with the resources currently configured in the complex

BTW, there are enough resources available on the lsmccnode05 node. During the tests there were more than 10 ncpus available and the job asked for only 1.

I guess there is some configuration missing on the execq_s queue, but I do not know what.

Thank you,

Guys,

I found the error in the Bright Cluster Manager configuration. It did not allow execution queues other than the default workq. After adjusting it, I could submit jobs using the new queue.

I can move forward with hook + queue development.

Thank you so much,