My machine’s hostname is ‘d003’. Why does $PBS_NODEFILE show ‘d003.lan’ when I run a PBS script?
My job runs abnormally because of that.
Check the entries in /etc/hosts.
Check the DNS settings; verify that nslookup <hostname> gives correct and consistent (across the nodes) answers. Also check the output of hostname -f. Furthermore, I believe MoM remembers the hostname it had at startup and does not pick up a later change, so you may need to restart the service.
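The checks above can be gathered into one small helper and run on every node, so the answers can be compared side by side. This is only a sketch: the function name report_names is made up for illustration, and it assumes a Linux node where getent is available (getent consults /etc/hosts and DNS in the order set by nsswitch.conf, which nslookup does not).

```shell
# Print the names this node reports for itself and how each one resolves.
# report_names is a hypothetical helper name, not part of PBS.
report_names() {
    short=$(hostname 2>/dev/null || echo "?")
    full=$(hostname -f 2>/dev/null || echo "?")
    echo "hostname:    $short"
    echo "hostname -f: $full"
    for name in "$short" "$full"; do
        # Falls back to a message when the name cannot be resolved.
        getent hosts "$name" || echo "$name: not resolvable"
    done
}
```

If hostname -f prints d003.lan, or the resolver returns d003.lan as the first name for the node's address, that is a likely source of the name that ends up in $PBS_NODEFILE.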
If you are using it for MPI jobs (as a hosts file), you can read the contents of $PBS_NODEFILE and rewrite the file with the required hostnames.
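A minimal sketch of that workaround, assuming the only difference is a domain suffix such as .lan and that the node file holds names rather than IP addresses (the sed expression would mangle an address). The function name strip_domains and the file name hosts.short are placeholders, and the result is written to a new file rather than over $PBS_NODEFILE itself.

```shell
# Copy a node file, cutting each line at the first dot,
# e.g. "d003.lan" becomes "d003". The input file is left untouched.
# strip_domains is a hypothetical helper name.
strip_domains() {
    sed 's/\..*$//' "$1" > "$2"
}

# Inside a job script this might be used as (file name is a placeholder):
#   strip_domains "$PBS_NODEFILE" hosts.short
```

The MPI launcher would then be pointed at hosts.short instead of $PBS_NODEFILE; the option for that depends on the MPI implementation in use.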
- Is the machine part of a domain?
- What does the domainname command output?
Try creating the node as below:
qmgr -c "create node nodename Mom=nodename"
Yes, correct. If the node already exists, you need to delete it first and add it again with the required Mom name configuration.
- Delete all the nodes and create them in the order you want them to be listed.
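As a rough illustration of the delete-then-recreate sequence, the helper below only prints the qmgr commands for a list of node names, in the order given, so they can be reviewed before anything is changed. The function name recreate_cmds is made up, and setting Mom to the short name is an assumption based on the command suggested above. Note that deleting a node discards any attributes set on it, so those would have to be set again afterwards.

```shell
# Print (do not run) the qmgr commands that delete every listed node and
# then recreate each one with Mom set to the same short name.
# recreate_cmds is a hypothetical helper name.
recreate_cmds() {
    for node in "$@"; do
        echo "qmgr -c \"delete node $node\""
    done
    for node in "$@"; do
        echo "qmgr -c \"create node $node Mom=$node\""
    done
}

# Example: recreate_cmds d003 d004
```

Once the printed commands look right, they can be run by hand as a PBS manager on the server host.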