Hi,
I am working on a hook that takes nodes offline. Section 9.7 of the PBS hook documentation (https://www.pbsworks.com/pdfs/PBSHooks14.2.pdf) shows one way to do this, but my requirement is different: I need to offline nodes based on the job's exit code.
The code below writes a file, pqr.txt, which contains the names of the nodes on which the job was submitted. Sample output: node1, node2 …
My objective is to offline all the nodes listed in pqr.txt.
If I use the following approach in my code:
if ((current_state == pbs.ND_OFFLINE) == 0):
    vnl[new_list[y]].state = pbs.ND_OFFLINE
    vnl[new_list[y]].comment = "offlined node as it is heavily loaded"
    print >> fd_out1, 'Ele is = ' + new_list[y]
only the node on which the job was submitted goes offline. I had requested 3 nodes, so all 3 should have gone offline.
Another approach I tried is to call a subprocess, i.e. a bash script, from the code below:
import pbs
import os
import re
import sys
import subprocess
e = pbs.event()
try:
    if e.job.in_ms_mom():
        exit_code = str(e.job.Exit_status)
        execution_vnode = str(e.job.exec_vnode)
        execution_vnode1 = execution_vnode.split('+')
        vnl = e.vnode_list
        new_list = []
        if int(exit_code) != 0:
            report_file3 = str("/home/centos/pqr.txt")
            pbs.logmsg(pbs.LOG_DEBUG, "report_usage file1 is %s" % report_file3)
            fd_out1 = open(report_file3, 'w+')
            print >> fd_out1, 'To: id@gmail.com'
            print >> fd_out1, 'From: id@ndsu.edu'
            print >> fd_out1, 'Subject: Node Taken offline'
            for x in range(len(execution_vnode1)):
                data = execution_vnode1[x].split(':')
                new_list.append(data[0][1:])
            for y in range(len(new_list)):
                current_state = pbs.server().vnode(new_list[y]).state
                if ((current_state == pbs.ND_OFFLINE) == 0):
                    vnl[new_list[y]].comment = "offlined node as it is heavily loaded"
                    print >> fd_out1, '' + new_list[y]
                else:
                    vnl[new_list[y]].comment = "exit status of the job is not negative"
            fd_out1.close()
            mail_cmd="/usr/sbin/sendmail -t \"PBS OSS\" < /home/centos/pqr.txt"
            pbs.logmsg(pbs.LOG_DEBUG, "mail_command is %s" % mail_cmd)
            os.popen(mail_cmd)
            os.system('sh /home/centos/offline.sh')
except SystemExit:
    pass
except:
    pbs.logmsg(pbs.LOG_DEBUG, "report_usage: failed with %s" % str(sys.exc_info()))
    pass
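For reference, the node-name extraction step of the hook can be tried on its own, outside PBS. This is only a minimal sketch: it assumes exec_vnode has the form "(node1:ncpus=1)+(node2:ncpus=1)", and the helper name vnode_names is hypothetical, not part of the pbs module.

```python
def vnode_names(exec_vnode):
    """Return the vnode names found in an exec_vnode string, in order."""
    names = []
    # Assumption: chunks are joined with '+', and each one looks like
    # "(name:resource=value)", possibly without one of the parentheses.
    for chunk in exec_vnode.split('+'):
        # Strip parentheses instead of dropping the first character, so a
        # chunk that has no leading '(' keeps its full name.
        name = chunk.strip().strip('()').split(':')[0]
        if name and name not in names:
            names.append(name)
    return names


print(vnode_names("(node1:ncpus=1)+(node2:ncpus=1)+(node3:ncpus=1)"))
```

Unlike data[0][1:] in the hook, this does not assume that every chunk starts with an opening parenthesis.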
The bash script, offline.sh, reads its input from a file:
#!/bin/bash
input="/home/centos/input.txt"
while IFS= read -r line
do
    if [[ "$line" = node ]];
    then
        pbsnodes -o $line
    fi
done < input.txt
The problem I am facing is that the subprocess, i.e. the bash script, is unable to perform any action on the vnodes. I don't know whether the way I call the bash script from my code is correct. I have tried subprocess.call() and other methods, but none of them work. If I run the bash script on its own, it performs the desired action. This suggests the bash script itself is correct, so why does it not work when called from the hook?
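One way to narrow this down might be to capture the script's exit code and error output, which os.system() does not hand back to the hook. The following is a minimal sketch, not a fix: run_and_capture is a hypothetical helper, and the /bin/sh -c command is only a stand-in for the real script.

```python
import subprocess


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (exit_code, stdout, stderr)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()  # waits for the command to finish
    return proc.returncode, out.decode(), err.decode()


# Stand-in command for illustration only. In the hook this would be the
# script itself, e.g. ['/bin/bash', '/home/centos/offline.sh'].
code, out, err = run_and_capture(['/bin/sh', '-c', 'echo ok; echo oops >&2; exit 3'])
print("exit=%d stdout=%r stderr=%r" % (code, out, err))
```

Inside the hook, the three values could be passed to pbs.logmsg(pbs.LOG_DEBUG, ...) in the same way as the existing log lines, which may show, for example, whether pbsnodes and the relative input.txt path are being found when the script runs from the hook.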