pbs_release_nodes: No jobid given

I cannot use the pbs_release_nodes -a command in my job script.
Is this a bug?
Here is my job script.

[test@ohpc137pbsib-sms ~]$ pbs_release_nodes --version

pbs_version = 19.1.1

[test@ohpc137pbsib-sms ~]$

[test@ohpc137pbsib-sms ~]$ cat sleep_10.sh
#!/bin/bash
#PBS -N sleep_10
#PBS -j oe
sleep 10
ssh ohpc137pbsib-sms pbs_release_nodes -a
sleep 10

[test@ohpc137pbsib-sms ~]$ qsub -l select=3 sleep_10.sh

927.ohpc137pbsib-sms

[test@ohpc137pbsib-sms ~]$ cat sleep_10.o927

pbs_release_nodes: No jobid given
[test@ohpc137pbsib-sms ~]$

The man page for pbs_release_nodes states the following:
SYNOPSIS
pbs_release_nodes [-j <job ID>] -a
(no options)
Without the -j option, pbs_release_nodes uses the value of the PBS_JOBID environment variable as the job ID of the job whose vnodes are to be released.

The following script works correctly.

[test@ohpc137pbsib-sms ~]$ cat sleep_10.sh
#!/bin/bash
#PBS -N sleep_10
#PBS -j oe
sleep 10
ssh ohpc137pbsib-sms pbs_release_nodes -j $PBS_JOBID -a
sleep 10

[test@ohpc137pbsib-sms ~]$ qstat -an

ohpc137pbsib-sms:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
932.ohpc137pbsi test     workq    sleep_10    13925   3   3    --    --  R 00:00
   ohpc137pbsib-c001/0+ohpc137pbsib-c002/0+ohpc137pbsib-c003/0
[test@ohpc137pbsib-sms ~]$


[test@ohpc137pbsib-sms ~]$ qstat -an

ohpc137pbsib-sms:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
932.ohpc137pbsi test     workq    sleep_10    13925   1   1    --    --  R 00:00
   ohpc137pbsib-c001/0

When you ssh to ohpc137pbsib-sms and run pbs_release_nodes there, the command is no longer “in” the PBS job: the PBS_JOBID environment variable is very likely unset in the shell on ohpc137pbsib-sms that the ssh connection starts. The error message is therefore correct. You should be able to run pbs_release_nodes directly from within the job script (no ssh) without supplying -j.
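
The effect can be reproduced without PBS or ssh. This is a minimal sketch: it assumes `env -i` is a fair stand-in for the fresh environment a new ssh session starts with, and it sets PBS_JOBID by hand to the job ID from the output above.

```shell
#!/bin/sh
# Inside a running job, PBS exports PBS_JOBID; set it by hand for this demo.
export PBS_JOBID="927.ohpc137pbsib-sms"

# A child process started from the job script inherits the variable.
inside=$(/bin/sh -c 'echo "${PBS_JOBID:-unset}"')

# env -i starts the child with an empty environment (assumed stand-in for
# the new login shell that ssh creates on the remote host).
remote=$(env -i /bin/sh -c 'echo "${PBS_JOBID:-unset}"')

echo "inside job script: $inside"
echo "fresh environment: $remote"
```

The second line reports the variable as unset, which is the situation pbs_release_nodes finds itself in when it is started through ssh without -j.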

Dear scc

Thanks for your comment.
“pbs_release_nodes” could not be run on the primary execution node.
See the following.

[test@ohpc137pbsib-sms ~]$ cat sleep_10.sh
#!/bin/bash
#PBS -N sleep_10
sleep 10
pbs_release_nodes -a
sleep 10

[test@ohpc137pbsib-sms ~]$

[test@ohpc137pbsib-sms ~]$ qsub -l select=3 sleep_10.sh
153.ohpc137pbsib-sms
[test@ohpc137pbsib-sms ~]$ qstat -an

ohpc137pbsib-sms:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
153.ohpc137pbsi test     workq    sleep_10    63361   3   3    --    --  R 00:00
   ohpc137pbsib-c001/0+ohpc137pbsib-c002/0+ohpc137pbsib-c003/0

[test@ohpc137pbsib-sms ~]$ cat sleep_10.e153
pbs_release_nodes: Unauthorized Request
[test@ohpc137pbsib-sms ~]$

That is why I used ssh to ohpc137pbsib-sms to run pbs_release_nodes.

Can “pbs_release_nodes” be run on the primary execution node?

If so, please tell me the reason for “Unauthorized Request”.

I cannot see why it results in “Unauthorized Request”.

Is flatuid set to True in your server configuration?
You can check in the output of qstat -Bf.

  • Can you please run pbs_release_nodes outside the script (as the job owner) and check whether it works?
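
The flatuid check can also be scripted. This is a rough sketch: flatuid_status is a hypothetical helper name, and the sample text is a made-up fragment in the `name = value` layout that qstat -Bf prints. It assumes an attribute that is missing from the output has simply not been set.

```shell
#!/bin/sh
# Hypothetical helper: reads `qstat -Bf` output on stdin and prints the
# value of the flatuid attribute, or "unset" if the attribute is not listed.
flatuid_status() {
    # Attribute lines look like "    flatuid = True".
    value=$(awk -F' = ' '$1 ~ /^[[:space:]]*flatuid$/ { print $2; exit }')
    echo "${value:-unset}"
}

# Made-up sample standing in for real server output.
sample='Server: ohpc137pbsib-sms
    server_state = Active
    flatuid = True'

printf '%s\n' "$sample" | flatuid_status
```

On a real server the call would be `qstat -Bf | flatuid_status`.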

Thank you for pointing it out.
Your point was correct.
I confirmed as follows.

[root@ohpc137pbsib-sms ~]# qstat -Bf |grep flatuid
[root@ohpc137pbsib-sms ~]# qmgr -c "set server flatuid = True"
[root@ohpc137pbsib-sms ~]# qstat -Bf |grep flatuid
    flatuid = True
[root@ohpc137pbsib-sms ~]#



[test@ohpc137pbsib-sms ~]$ qsub -l select=3 sleep_10.sh
154.ohpc137pbsib-sms
[test@ohpc137pbsib-sms ~]$ qstat -an

ohpc137pbsib-sms:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
154.ohpc137pbsi test     workq    sleep_10    89065   3   3    --    --  R 00:00
   ohpc137pbsib-c001/0+ohpc137pbsib-c002/0+ohpc137pbsib-c003/0

[test@ohpc137pbsib-sms ~]$ qstat -an

ohpc137pbsib-sms:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
154.ohpc137pbsi test     workq    sleep_10    89065   1   1    --    --  R 00:00
   ohpc137pbsib-c001/0
[test@ohpc137pbsib-sms ~]$
[test@ohpc137pbsib-sms ~]$ cat sleep_10.o154
[test@ohpc137pbsib-sms ~]$ cat sleep_10.e154
[test@ohpc137pbsib-sms ~]$
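
For reference, the release step in the job script could be wrapped a little more defensively. This is only a sketch: release_sister_nodes is a hypothetical helper name, not part of PBS, and it assumes pbs_release_nodes is on the PATH of the job's shell.

```shell
#!/bin/bash
# Hypothetical wrapper around the release step of the job script.
release_sister_nodes() {
    if [ -z "${PBS_JOBID:-}" ]; then
        echo "PBS_JOBID is not set; not running inside a PBS job" >&2
        return 1
    fi
    # Without -j, pbs_release_nodes takes the job ID from PBS_JOBID.
    pbs_release_nodes -a
}

# Intended use inside the job script, between the two work phases:
#   sleep 10
#   release_sister_nodes || echo "releasing nodes failed" >&2
#   sleep 10
```

If the command exits non-zero, the message lands in the job's error file next to whatever pbs_release_nodes itself printed.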

Setting flatuid=True is one way to allow this, but it introduces a security risk: any user who can connect a system to the network and control the user namespace on that system can then submit a job as any other user. Everything I said in my post at “Where to submit a job instead of pbs server” also applies here, because the server performs the same check of the remote user's authorization whether the request is to submit a job (as discussed in that post) or to act on an existing job (such as releasing nodes from it).
