I thought $TMPDIR was supposed to be created automatically on all execution hosts, not just the lead node, but it isn’t. We have $tmpdir set to /local_scratch in the MOM config file. Do we need to create the directory via a prologue script?
The way I interpret the documentation (section 9.1.15 in the Admin Guide), you set $tmpdir in the MOM config on each node to be the root of your temp location (/local_scratch in your case). If you’ve done that, then I would expect the MOM to create a unique directory in /local_scratch for the job; probably /local_scratch/$PBS_JOBID.
Yes, I understood it that way too. However, the TMPDIR is being created on the lead node only.
If you are setting $tmpdir in the MOM config on each host, but the job-specific temp dir is not being created on each host, then it could be a bug/regression. If you’ve just recently added $tmpdir to each MOM config, be sure that you restarted each of the MOMs.
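A quick way to confirm the setting on a given host is to read it back from the MOM config. This is only a minimal sketch: the function name mom_tmpdir_root is made up for illustration, and the default path assumes PBS_HOME is /var/spool/pbs, which may not match your install.

```shell
# Print the directory configured by the $tmpdir directive in a MOM config file.
# Usage: mom_tmpdir_root [config_file]
# mom_tmpdir_root is a hypothetical helper name, not part of PBS.
mom_tmpdir_root() {
    # Assumed default location; adjust if PBS_HOME differs on your hosts.
    config="${1:-/var/spool/pbs/mom_priv/config}"
    # Match lines such as "$tmpdir /local_scratch" and print the second field.
    # Exit non-zero if the directive is not present at all.
    awk '$1 == "$tmpdir" { print $2; found = 1 } END { exit !found }' "$config"
}
```

Running it on each execution host (for example over ssh) should print /local_scratch everywhere; a non-zero exit status means the directive is missing from that host’s config.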
We’ve had $tmpdir set to /local_scratch for a long time now. However, I’ve never checked whether the job directory was being created on all execution hosts. We are running open source 17.1.0 and don’t have the opportunity to update in the near future. Is creating the directory manually via a prologue acceptable?
I can confirm in PBS Pro 14.2.6 that when I set $tmpdir in the MOM config on several hosts and then run a job on those hosts, the temporary directory is only created by the Mother Superior and not by the sister MOMs. This is contrary to the documentation (section 10.14.3 in the Administrator Guide for 14.2.1). Additionally, PBS_TMPDIR is not defined in my job environment, neither by the Mother Superior nor the sister MOMs, also contrary to the documentation. @scc, do you know if there have been any bugs or regressions filed about this?
Either there is a problem with PBS Pro or there is a problem with the documentation.
Hi @gabe and @coreyferrier! The tmpdir on the secondary execution hosts is not created until pbs_mom actually spawns a task for the job on that node. Using pbs_attach to attach an existing process to the node does not create the job-specific temp directory, since by then it would be too late to set $TMPDIR in that process’s environment anyway.
[user1@centos7-2 ~]$ qsub -lselect=2:ncpus=1 -lplace=scatter -I
qsub: waiting for job 30.centos7-2 to start
qsub: job 30.centos7-2 ready
[user1@centos7-2 ~]$ cat $PBS_NODEFILE
centos7-2.prog.altair.com
centos7.prog.altair.com
[root@centos7 ~]# ls -ld /var/tmp/pbs.*
ls: cannot access /var/tmp/pbs.*: No such file or directory
[user1@centos7-2 ~]$ pbs_tmrsh centos7.prog.altair.com /bin/true
[user1@centos7-2 ~]$
[root@centos7 ~]# ls -ld /var/tmp/pbs.*
drwx------. 2 user1 user1 6 Jul 11 12:57 /var/tmp/pbs.30.centos7-2
[root@centos7 ~]#
I hope this helps!
Thanks Scott! I’ve confirmed you are correct. In my earlier test I had not spawned a task on the other nodes as you state.
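For anyone who wants the directory to exist on every node before the real work starts, the same pbs_tmrsh trick can be looped over $PBS_NODEFILE from the job script. This is a rough sketch, not a tested recipe: touch_all_nodes and the SPAWN_CMD override are names made up here (the override only exists so the loop can be dry-run outside a job), and it assumes pbs_tmrsh is on the PATH inside the job.

```shell
# Spawn a trivial task on every distinct host in a node file so that each
# MOM creates its job-specific temporary directory.
# Usage: touch_all_nodes [nodefile]
# touch_all_nodes and SPAWN_CMD are hypothetical names used for illustration.
touch_all_nodes() {
    nodefile="${1:-$PBS_NODEFILE}"
    # Default to pbs_tmrsh; set SPAWN_CMD=echo to see what would be run.
    spawn="${SPAWN_CMD:-pbs_tmrsh}"
    # A host can appear several times in the node file, so de-duplicate first.
    sort -u "$nodefile" | while read -r host; do
        # Redirect stdin so the spawned command cannot swallow the host list.
        $spawn "$host" /bin/true < /dev/null
    done
}
```

Calling touch_all_nodes near the top of a job script runs /bin/true once per host, including the lead node, where it should be harmless since the directory already exists there.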