Dynamic resource configuration

Hi team, we are trying to set up a dynamic resource to track scratch space on the compute nodes, but we seem to be missing something: jobs that request the resource stay queued.

We followed the instructions in section 5.14.4.1.i of the Admin Guide. Our guess is that the PBS scheduler never gets the value from the script: it reports an availability of 0, so it leaves jobs that request the resource in the queue. Are we missing a step, or doing something incorrectly? Your feedback is, as always, much appreciated. Below is the procedure we followed; hopefully you will notice what is wrong:

5.14.4.1.i Example of Configuring Dynamic Host-level Resource
Our commands and their output are shown indented under each step.

  1. Define the resource
    qmgr -c "create resource dynscratch type=size, flag=h"

  2. Write a script, for example hostdyn.pl, that returns the available amount of the resource via stdout. The script must return the value in a single line, ending with a newline. Place the script on each host where it will be used. For example, it could be placed in /usr/local/bin/hostdyn.pl.
    $ cat /usr/local/bin/get_total_quota.sh
    #!/bin/bash
    # To rule out other problems, we simply echo a fixed value
    echo 527756500

  3. Configure each MoM to use the script by adding the resource and the path to the script in PBS_HOME/mom_priv/config. Linux: dynscratch !/usr/local/bin/hostdyn.pl
    Line added to /var/spool/pbs/mom_priv/config
    dynscratch !/usr/local/bin/get_total_quota.sh

  4. Reinitialize the MoMs. For Linux, see “Restarting and Reinitializing MoM” on page 167 in the PBS Professional Installation & Upgrade Guide, and for Windows, see “Restarting MoMs” on page 173 in the PBS Professional Installation & Upgrade Guide.
    $ ps -ef | grep pbs_mom
    root 2079 1 0 Oct12 ? 00:03:43 /opt/pbs/sbin/pbs_mom
    root 142266 142093 0 09:25 pts/0 00:00:00 grep --color=auto pbs_mom
    $ kill -HUP 2079

  5. You may optionally specify any limits on that resource via qmgr, such as the maximum amount available, or the maximum that a single user can request. For example: Qmgr: set server resources_max.scratchspace=1gb
    NA

  6. Add the new resource to the resources: line in /sched_config: resources: "ncpus, mem, arch, […], dynscratch"
    Line added to /var/spool/pbs/sched_priv/sched_config
    resources: "ncpus, mem, arch ……, dynscratch"

  7. Add the new resource to the mom_resources: line (deprecated as of 18.2.1) in / sched_config. Create the line if necessary: mom_resources: "dynscratch"
    Also added to the same file:
    mom_resources: "dynscratch"

  8. Restart the scheduler. See “Restarting and Reinitializing Scheduler or Multisched” on page 166 in the PBS Professional Installation & Upgrade Guide.
    # ps -ef | grep pbs_sched
    root 7295 1 0 Oct13 ? 00:00:34 pbs_sched
    root 20773 20642 0 09:31 pts/2 00:00:00 grep --color=auto pbs_sched
    # kill 7295
    # pbs_sched
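Once the plumbing works, the fixed echo in step 2 would presumably be replaced by a real measurement. The following is only a minimal sketch, assuming the scratch area is an ordinary mounted filesystem. SCRATCH_DIR is a hypothetical variable (defaulting to /tmp here), and the kb suffix is added so that the value reads as a PBS size.

```shell
#!/bin/bash
# Hypothetical scratch location; set SCRATCH_DIR to your site's scratch path.
SCRATCH_DIR="${SCRATCH_DIR:-/tmp}"

# Print the free space of the filesystem holding $1 as a single line,
# in kilobytes with a "kb" suffix (e.g. 527756500kb).
avail_kb() {
    # -P keeps each filesystem on one line; -k reports 1024-byte blocks.
    # Column 4 of the second line is the "Available" figure.
    df -Pk "$1" | awk 'NR==2 {print $4 "kb"}'
}

avail_kb "$SCRATCH_DIR"
```

Like the original script, it writes exactly one line to stdout, which is what step 2 asks for.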

To request this resource, the resource request would include
-l select=1:ncpus=N:dynscratch=10MB

We did:
$ qsub -l select=1:ncpus=1:dynscratch=10MB -I
qsub: waiting for job 533147.test to start
(the job stays queued even though the whole cluster is idle)
We observed that the reported availability is 0kb:
$ qstat -xf 533147
comment = Can Never Run: Insufficient amount of resource: dynscratch (R: 10
mb A: 0kb T: 0)
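With A: 0kb and T: 0, one thing worth ruling out is the sched_config edit from step 6. The sketch below checks that a resource name appears on the resources: line. It also flags typographic quotes, which can sneak in when a line is pasted from a PDF; whether the scheduler tolerates them is an assumption we have not verified. The function name check_sched_config is ours, not a PBS tool.

```shell
#!/bin/bash
# Usage: check_sched_config <sched_config file> <resource name>
# Prints "listed", "missing", or "typographic quotes found".
check_sched_config() {
    local file="$1" resource="$2"
    # Curly quotes are matched literally; plain ASCII quotes are fine.
    if grep -q -e '“' -e '”' "$file"; then
        echo "typographic quotes found"
        return 1
    fi
    # Look only at the resources: line and match the name as a whole word.
    if grep -E '^[[:space:]]*resources:' "$file" | grep -qw -- "$resource"; then
        echo "listed"
    else
        echo "missing"
        return 1
    fi
}
```

In our setup it would be called as check_sched_config /var/spool/pbs/sched_priv/sched_config dynscratch.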

LOGS
From the compute node
$ cat 20231018 | grep get_total
10/18/2023 09:59:19;0080;pbs_mom;n/a;add_static;config[17] add name dynscratch value !/usr/local/bin/get_total_quota.sh

Also,
$ cat 20231018 | grep 533147
$
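The MoM log shows the config line being read, but not whether the script itself runs cleanly. A small sanity check could be run by hand on a compute node, ideally as the same user the MoM runs as. This is a sketch only, and check_dyn_script is a hypothetical helper name.

```shell
#!/bin/bash
# Usage: check_dyn_script <path to dynamic-resource script>
# Verifies the script is executable, exits 0, and prints exactly one line.
check_dyn_script() {
    local script="$1" output lines
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    output="$("$script")" || { echo "script exited non-zero: $script"; return 1; }
    # Count output lines; an empty result is also treated as a failure.
    lines="$(printf '%s\n' "$output" | awk 'END {print NR}')"
    if [ -z "$output" ] || [ "$lines" -ne 1 ]; then
        echo "expected exactly one line of output, got: $output"
        return 1
    fi
    echo "ok: $output"
}
```

For the script in step 2 the call would be check_dyn_script /usr/local/bin/get_total_quota.sh.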

From the scheduler
# cat 20231018 | grep 533147
10/18/2023 10:32:01;0100;pbs_sched;Job;533147.test;Formula Evaluation = 17
10/18/2023 10:32:01;0080;pbs_sched;Job;533147.test;Considering job to run
10/18/2023 10:32:01;0040;pbs_sched;Job;533147.test;Insufficient amount of resource: dynscratch (R: 10mb A: 0kb T: 0)
10/18/2023 10:32:01;0040;pbs_sched;Job;533147.test;Job will never run with the resources currently configured in the complex
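To see at a glance which jobs are blocked on which resource, the "Insufficient amount of resource" lines can be condensed with sed. This is a sketch that assumes the log format shown above (semicolon-separated fields, with R, A and T in parentheses); summarize_insufficient is a name we made up.

```shell
#!/bin/bash
# Reads a scheduler log on stdin and prints one summary line per
# "Insufficient amount of resource" entry:
#   <job id> <resource> requested=<R> available=<A> total=<T>
summarize_insufficient() {
    sed -n 's/.*;\([^;]*\);Insufficient amount of resource: \([^ ]*\) (R: \([^ ]*\) A: \([^ ]*\) T: \([^)]*\)).*/\1 \2 requested=\3 available=\4 total=\5/p'
}
```

Fed the scheduler log for the day, the entry for our job would come out as 533147.test dynscratch requested=10mb available=0kb total=0.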

From the server
# cat …/server_logs/20231018 | grep 533147
10/18/2023 10:32:01;0100;Server@test;Job;533147.test;enqueuing into cse12, state Q hop 1
10/18/2023 10:32:01;0008;Server@test;Job;533147.test;Job Queued at request of user@nlogin1, owner = user@nlogin1, job name = STDIN, queue = cse12
10/18/2023 10:32:01;0008;Server@test;Job;533147.test;Job Modified at request of Scheduler@test

Could you please check the section below in this guide:

Example 9-17: Periodically update resources on vnodes
HG-306 PBS Professional 2022.1 Hooks Guide

Thank you very much for this, Adarsh.