Following an upgrade from 18 to 20.0.1, preemption is not taking place as expected. I suspect something odd is happening due to custom resources. Jobs in the express queues are not preempting jobs in lower-priority queues, and jobs in the higher-priority of the two low-priority queues are not preempting jobs in the lowest-priority queue.
When I submit the following, there are at least 20 nodes that meet the requirements of the qsub and are running jobs from lower-priority queues:
qsub -I -lselect=9 -q user_a
but the jobs running on those nodes don't seem to get preempted by the job submitted to queue user_a.
From a tracejob:
10/15/2021 13:27:56 L Insufficient amount of resource: xeon6148_sockets
10/15/2021 13:29:56 L Considering job to run
10/15/2021 13:29:56 L Failed to satisfy subchunk: 9:mpiprocs=40:ncpus=40:xeon6148_sockets=2
10/15/2021 13:29:56 L Employing preemption to try and run high priority job.
10/15/2021 13:29:56 L Allocated one subchunk: mpiprocs=40:ncpus=40:xeon6148_sockets=2
10/15/2021 13:29:56 L Evaluating subchunk: mpiprocs=40:ncpus=40:xeon6148_sockets=2
10/15/2021 13:29:56 L Failed to satisfy subchunk: 9:mpiprocs=40:ncpus=40:xeon6148_sockets=2
10/15/2021 13:29:56 L Limited running jobs used for preemption from 56 to 17
10/15/2021 13:29:56 L Found no preemptable candidates
10/15/2021 13:29:56 L Insufficient amount of resource: xeon6148_sockets
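To make the sequence above easier to compare across scheduling cycles, here is a minimal Python sketch that splits tracejob lines into timestamp, source tag and message, and pulls out the "from 56 to 17" narrowing of preemption candidates. The function names and the SAMPLE text are hypothetical helpers written for this post, not part of PBS, and the regular expressions only assume the line shape quoted above.

```python
import re
from datetime import datetime

# Shape of the lines quoted above: date, time, one-letter source tag, message.
LINE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(\S)\s+(.*)$")
LIMITED = re.compile(r"Limited running jobs used for preemption from (\d+) to (\d+)")

# A few lines copied from the tracejob output in this post.
SAMPLE = """\
10/15/2021 13:29:56 L Employing preemption to try and run high priority job.
10/15/2021 13:29:56 L Limited running jobs used for preemption from 56 to 17
10/15/2021 13:29:56 L Found no preemptable candidates
"""


def parse_tracejob(text):
    """Return a list of (datetime, tag, message) for lines that match LINE."""
    events = []
    for raw in text.splitlines():
        m = LINE.match(raw.strip())
        if m:
            when = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S")
            events.append((when, m.group(2), m.group(3)))
    return events


def candidate_narrowing(events):
    """Return (before, after) from the first 'Limited running jobs' message, or None."""
    for _, _, message in events:
        m = LIMITED.search(message)
        if m:
            return int(m.group(1)), int(m.group(2))
    return None
```

Running candidate_narrowing(parse_tracejob(SAMPLE)) on the lines above gives the pair (56, 17), i.e. the scheduler started from 56 running jobs, kept 17 as possible candidates, and then reported that none of them were preemptable.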
To run through the setup:
We have two types of nodes, defined following these patterns:
create node compute001
set node compute001 state = job-busy
set node compute001 resources_available.arch = linux
set node compute001 resources_available.host = compute001
set node compute001 resources_available.mem = 196498720kb
set node compute001 resources_available.ncpus = 40
set node compute001 resources_available.vnode = compute001
set node compute001 resources_available.xeon6148_sockets = 2
set node compute001 resv_enable = True
and:
create node other001
set node other001 state = job-busy
set node other001 resources_available.arch = linux
set node other001 resources_available.host = other001
set node other001 resources_available.mem = 196498720kb
set node other001 resources_available.ncpus = 40
set node other001 resources_available.vnode = other001
set node other001 resources_available.xeon8268_sockets = 2
set node other001 resv_enable = True
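The practical difference between the two node types is which custom socket resource they advertise. As a rough illustration, the Python sketch below reads qmgr-style "set node" lines and lists the nodes that offer at least a given amount of a resource. The helper names and SAMPLE are hypothetical, and the sketch assumes that a node with no value set for a resource offers none of it, and that the resource being checked is integer-valued.

```python
import re
from collections import defaultdict

# Matches lines such as: set node compute001 resources_available.ncpus = 40
SET_NODE = re.compile(r"^set node (\S+) resources_available\.(\S+) = (.+)$")

# Trimmed copy of the two node patterns from this post.
SAMPLE = """\
set node compute001 resources_available.ncpus = 40
set node compute001 resources_available.xeon6148_sockets = 2
set node other001 resources_available.ncpus = 40
set node other001 resources_available.xeon8268_sockets = 2
"""


def nodes_by_resource(qmgr_text):
    """Build {node: {resource: value}} from 'set node' lines; other lines are skipped."""
    nodes = defaultdict(dict)
    for line in qmgr_text.splitlines():
        m = SET_NODE.match(line.strip())
        if m:
            name, resource, value = m.groups()
            nodes[name][resource] = value
    return nodes


def nodes_offering(nodes, resource, minimum):
    """Names of nodes with at least `minimum` of an integer resource (unset counts as 0)."""
    return sorted(
        name for name, res in nodes.items()
        if int(res.get(resource, 0)) >= minimum
    )
```

With the sample above, nodes_offering(nodes_by_resource(SAMPLE), "xeon6148_sockets", 2) returns only compute001, so a chunk that asks for xeon6148_sockets=2 can only be placed on nodes of the first type.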
We have a collection of express queues, which we'll call user_a, user_b and user_c, of the pattern:
create queue user_a
set queue user_a Priority = 1000
set queue user_a resources_default.mpiprocs = 40
set queue user_a resources_default.ncpus = 40
set queue user_a default_chunk.mpiprocs = 40
set queue user_a default_chunk.ncpus = 40
set queue user_a default_chunk.xeon6148_sockets = 2
set queue user_a resources_available.ncpus = 360
set queue user_a max_user_res.ncpus = 360
set queue user_a enabled = True
set queue user_a started = True
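For clarity on what the bare -lselect=9 turns into, here is a small Python sketch of how I understand default_chunk to be applied: each chunk that does not name a resource picks up the queue's default_chunk value for it. The functions expand_select and total are hypothetical illustrations, not PBS code, and they only handle integer-valued resources.

```python
# default_chunk values copied from queue user_a above.
USER_A_DEFAULT_CHUNK = {"mpiprocs": "40", "ncpus": "40", "xeon6148_sockets": "2"}


def expand_select(select, default_chunk):
    """Return a list of (count, resources) with queue defaults filled in per chunk."""
    chunks = []
    for part in select.split("+"):
        fields = part.split(":")
        # A leading integer is the chunk count; if absent, the count is 1.
        if fields[0].isdigit():
            count = int(fields[0])
            fields = fields[1:]
        else:
            count = 1
        resources = dict(default_chunk)      # start from the queue defaults
        for field in fields:
            key, value = field.split("=", 1)
            resources[key] = value           # an explicit request overrides the default
        chunks.append((count, resources))
    return chunks


def total(chunks, resource):
    """Sum an integer resource over all chunks."""
    return sum(count * int(res.get(resource, 0)) for count, res in chunks)
```

Under that assumption, expand_select("9", USER_A_DEFAULT_CHUNK) yields nine chunks of mpiprocs=40:ncpus=40:xeon6148_sockets=2, which matches the subchunk in the tracejob output. The totals come to 360 ncpus and 18 xeon6148_sockets, so this one job uses the whole of the queue's resources_available.ncpus and max_user_res.ncpus allowance of 360.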
and then two lower priority queues, s1 and s2, are defined:
create queue s1
set queue s1 queue_type = Execution
set queue s1 Priority = 100
set queue s1 acl_host_enable = False
set queue s1 acl_user_enable = True
set queue s1 resources_max.walltime = 72:00:00
set queue s1 resources_min.walltime = 00:00:00
set queue s1 resources_default.mpiprocs = 40
set queue s1 resources_default.ncpus = 40
set queue s1 resources_default.preempt_targets = QUEUE=s2
set queue s1 default_chunk.mpiprocs = 40
set queue s1 default_chunk.ncpus = 40
set queue s1 resources_available.ncpus = 6040
set queue s1 max_user_res.ncpus = 720
set queue s1 max_user_run_soft = 0
set queue s1 enabled = True
set queue s1 started = True
and:
create queue s2
set queue s2 queue_type = Execution
set queue s2 Priority = -1000
set queue s2 acl_host_enable = False
set queue s2 acl_user_enable = True
set queue s2 resources_max.walltime = 72:00:00
set queue s2 resources_min.walltime = 00:00:00
set queue s2 resources_default.mpiprocs = 40
set queue s2 resources_default.ncpus = 40
set queue s2 resources_default.preempt_targets = NONE
set queue s2 default_chunk.mpiprocs = 40
set queue s2 default_chunk.ncpus = 40
set queue s2 resources_available.ncpus = 8040
set queue s2 max_user_res.ncpus = 8040
set queue s2 max_user_run_soft = 0
set queue s2 enabled = True
set queue s2 started = True
The server is set:
set server scheduling = True
set server default_queue = s2
set server log_events = 2047
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server resources_default.preempt_targets = QUEUE=s1
set server resources_default.preempt_targets += QUEUE=s2
set server default_chunk.ncpus = 1
set server scheduler_iteration = 120
set server resv_enable = True
set server node_fail_requeue = 310
set server max_array_size = 10000
set server default_qsub_arguments = -keod
set server rpp_highwater = 16000
set server pbs_license_min = 0
set server pbs_license_max = 2147483647
set server pbs_license_linger_time = 31536000
set server eligible_time_enable = False
set server job_history_enable = True
set server job_history_duration = 00:30:00
set server max_concurrent_provision = 5
set server max_job_sequence_id = 9999999
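Since preempt_targets is set at both the server and the queue level, it may help to spell out which value a job ends up with. The Python sketch below assumes the usual precedence for resources_default as I understand it: an explicit job request first, then the queue default, then the server default. The function names are hypothetical, and the tables simply restate the settings shown in this post.

```python
# Server-level default from "set server resources_default.preempt_targets".
SERVER_DEFAULT = ["QUEUE=s1", "QUEUE=s2"]

# Queue-level defaults; None means the queue does not set preempt_targets.
QUEUE_DEFAULTS = {
    "user_a": None,
    "s1": ["QUEUE=s2"],
    "s2": ["NONE"],
}


def effective_preempt_targets(job_request, queue_default, server_default):
    """Return the first value that is set, in job, queue, server order (assumed)."""
    for value in (job_request, queue_default, server_default):
        if value is not None:
            return value
    return None


def targets_for(queue, job_request=None):
    """Effective preempt_targets for a job submitted to `queue`."""
    return effective_preempt_targets(job_request, QUEUE_DEFAULTS[queue], SERVER_DEFAULT)
```

If that precedence holds, a job in user_a inherits QUEUE=s1 and QUEUE=s2 from the server, a job in s1 targets only QUEUE=s2, and a job in s2 has NONE.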
The scheduler is set:
create sched default
set sched sched_host = deluge-hn1.cm.cl7.hpc.lle.rochester.edu
set sched sched_cycle_length = 00:20:00
set sched sched_preempt_enforce_resumption = False
set sched preempt_targets_enable = True
set sched sched_port = 15004
set sched sched_priv = /var/spool/pbs/sched_priv
set sched sched_log = /var/spool/pbs/sched_logs
set sched scheduling = True
set sched scheduler_iteration = 120
set sched state = idle
set sched preempt_queue_prio = 150
set sched preempt_prio = "express_queue, normal_jobs"
set sched preempt_order = R
set sched preempt_sort = min_time_since_start
set sched log_events = 4095
set sched server_dyn_res_alarm = 30
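To show how I read preempt_queue_prio and preempt_prio together, here is a simplified Python sketch. It assumes a queue counts as an express queue when its Priority is at least preempt_queue_prio, and that a job can only preempt jobs in a strictly lower preemption level. It deliberately ignores preempt_targets and any levels not listed in preempt_prio, and the function names are hypothetical.

```python
# Values copied from the scheduler and queue settings in this post.
PREEMPT_QUEUE_PRIO = 150
PREEMPT_PRIO = ["express_queue", "normal_jobs"]   # highest level first
QUEUE_PRIORITY = {"user_a": 1000, "s1": 100, "s2": -1000}


def preempt_level(queue):
    """Classify a queue's jobs by comparing queue Priority to the threshold (assumed >=)."""
    if QUEUE_PRIORITY[queue] >= PREEMPT_QUEUE_PRIO:
        return "express_queue"
    return "normal_jobs"


def may_preempt(high_queue, low_queue):
    """True only if high_queue's level is listed strictly before low_queue's level."""
    high = PREEMPT_PRIO.index(preempt_level(high_queue))
    low = PREEMPT_PRIO.index(preempt_level(low_queue))
    return high < low
```

Under those assumptions, user_a (Priority 1000) is an express queue and may preempt both s1 and s2, while s1 (Priority 100) and s2 (Priority -1000) both fall into normal_jobs, so this simplified model gives s1 no level advantage over s2.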