Interactive jobs submitted with qsub do not behave the way I would expect on our OpenPBS installation.
When I do qsub -I -X qsub_interactive.sh, it just drops to the shell of the queue compute node rather than staying in current shell. From there, I have to launch the GUI app such as emacs.
qsub_interactive.sh just contains PBS variables such as node name, project name, memory required etc.
I am looking for an equivalent of the qrsh utility for OpenPBS, if one exists. I see it exists in Sun Grid Engine, but I couldn’t find any qrsh package for OpenPBS.
Has anyone used qrsh or a similar utility with OpenPBS before, or does anyone know whether an alternative exists?
Thanks a lot!
The terminology you are using is not clear to me, so let’s make sure I understand the issue first.
Could you give a more detailed description of your environment? Include things like the PBS version, the OS, and what you have for login nodes, server nodes, compute nodes, etc.
You say “it just drops to the shell of the queue compute node rather than staying in current shell”
First, a successful interactive job will drop you into a shell on the head node (the first one listed in your $PBS_NODEFILE). It does not stay in the shell from which you submitted the qsub. When your job ends, either by you exiting or by running out of time, you will be returned to the shell from which you did the qsub.
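One quick way to confirm where an interactive job has put you is to compare the current hostname with the first entry in the node file. This is only a minimal sketch: the function name job_head_node is made up for illustration, and it assumes $PBS_NODEFILE is set the way PBS normally sets it inside a job.

```shell
# Print the first host listed in a PBS node file (the job's head node).
# The path may be given as an argument; otherwise $PBS_NODEFILE is used.
# job_head_node is a hypothetical helper name, not a PBS command.
job_head_node() {
    nodefile="${1:-${PBS_NODEFILE:-}}"
    if [ -z "$nodefile" ] || [ ! -r "$nodefile" ]; then
        echo "no readable node file (are you inside a PBS job?)" >&2
        return 1
    fi
    head -n 1 "$nodefile"
}

# Inside the interactive job, compare the two by eye:
#   hostname
#   job_head_node
```

The two names may differ in form (short name versus fully qualified), so this is a visual check rather than a strict string match.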
What do you mean by “queue compute node”? Is one of your nodes acting as login, server, and compute node? In other words, do any of your pbs.conf files have a 1 for both
- If yes, then if that node meets the resource specification in qsub_interactive.sh, it is functioning exactly as it should. If you have more than one compute node, try adding vnode=<some other vnode>, and that should force it onto another node.
- If no, then if you are on a login node that has no MOM, i.e. pbs.conf does not have PBS_START_MOM=1, and it is dropping into a shell on that node, then yes, something is wrong. If you sshed into a compute node and are running the qsub from a shell there, then again, if that node matches the resource specification, it is working as intended.
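To check which role a given node plays, you can look at the PBS_START_* entries in its pbs.conf. The sketch below assumes the default location /etc/pbs.conf (or whatever $PBS_CONF_FILE points to). The function name pbs_start_flags and the vnode name node02 are placeholders for illustration, not anything from your site.

```shell
# List the daemon start flags (PBS_START_MOM and friends) from a pbs.conf.
# pbs_start_flags is a hypothetical helper name; the file defaults to
# $PBS_CONF_FILE if set, otherwise /etc/pbs.conf.
pbs_start_flags() {
    conf="${1:-${PBS_CONF_FILE:-/etc/pbs.conf}}"
    grep -E '^[[:space:]]*PBS_START_[A-Z]+[[:space:]]*=' "$conf"
}

# On each node you are unsure about:
#   pbs_start_flags
#
# To ask for a specific vnode in the interactive job (node02 is a placeholder):
#   qsub -I -X -l select=1:vnode=node02 qsub_interactive.sh
```

A node whose output shows PBS_START_MOM=1 can run jobs, so landing in a shell there is expected if it satisfies the request.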
I have never used qrsh, so I can’t talk about that. Is the issue that your compute nodes don’t have internet access? Is this an ssh tunneling issue? I am trying to figure out why you would run emacs from a compute node rather than a login node, unless they are one and the same.
You used emacs as an example. If you are editing code, in my experience that is usually done outside of a job, on the login node. As an aside, emacs (like most editors these days) can run on your local machine and reach the remote files over ssh. That might be a better way to do it. Just a thought.
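For emacs specifically, remote editing over ssh is done through TRAMP, which takes a path of the form /ssh:user@host:/path. Here is a small sketch of opening a remote file that way from a local shell. The helper name tramp_path, the user alice, the host login01, and the file path are all made-up placeholders.

```shell
# Build an Emacs TRAMP path for a file reached over ssh.
# tramp_path is a hypothetical helper; arguments are user, host, remote path.
tramp_path() {
    printf '/ssh:%s@%s:%s\n' "$1" "$2" "$3"
}

# On your local machine (placeholder user, host, and file):
#   emacs "$(tramp_path alice login01 /home/alice/job.sh)"
```

This runs emacs locally, so no X forwarding through the job is needed for editing.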