Disallow ssh access to the node where a job is running in PBS (even if you are the owner of the job)

A user with a running PBS job can open an ssh connection to the execution node. Is there any way to prohibit this?
We currently restrict access by setting $restrict_user and $restrict_user_exceptions in the mom config file, but users with running jobs can still log in. How can I prevent this?

You can use the /etc/ssh/sshd_config file to control who can ssh to the node. How you use it depends on what you are trying to accomplish:

  • If you never want anyone but root or the sys admins to log in to compute nodes, whether a job is running or not, just configure the file permanently.
  • If the rules are different when a job is running, use a hook to configure it appropriately, probably execjob_[begin|end].

In either case you need to consider interactive jobs, unless you don’t allow those.
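As a rough illustration of the second option, here is a minimal Python sketch of the piece a hook could use to rebuild the sshd_config AllowUsers line whenever a job starts or ends. The function name and ADMIN_USERS are hypothetical, and the PBS hook wiring is left out on purpose; only the AllowUsers directive itself is real sshd_config syntax.

```python
# Sketch only: build the sshd_config "AllowUsers" line a hook could write.
# ADMIN_USERS and allow_users_line are hypothetical names, not part of PBS.

ADMIN_USERS = ("root",)  # assumption: accounts that must always be able to log in


def allow_users_line(job_owners, admins=ADMIN_USERS):
    """Return an AllowUsers directive listing admins first, then job owners."""
    seen = []
    for name in list(admins) + sorted(job_owners):
        if name not in seen:  # drop duplicates while keeping a stable order
            seen.append(name)
    return "AllowUsers " + " ".join(seen)


if __name__ == "__main__":
    # No job on the node: only the admin accounts are listed.
    print(allow_users_line([]))
    # Two jobs running: their owners are appended.
    print(allow_users_line(["userb", "usera"]))
```

A hook would still have to write this line into the config and get sshd to reload it, and passing an empty owner list at all times gives the "admins only" variant from the first option.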

I guess I would also ask why you want to prevent a user from sshing to their own nodes? They have been allocated those nodes. They are theirs to use. They may want to ssh in to check performance counters, attach a debugger, grab a core file, any number of things.

If this is for security, I don’t see how it helps. Even if you prevent the users from logging into the nodes, you can’t use that to assume the nodes are in a known good state; you have to be able to prove that via some form of attestation (comparing hashes on files or whatever).

Anyway, I hope that helps.

Bill

I assume this would break MPI jobs? Otherwise you would have to modify the sshd config quite often.

First, I had the wrong file. /etc/ssh/sshd_config should work, but I checked with our admins this morning and we use /etc/security/access.conf. And yes, as you say, that would break MPI. I hadn’t thought about that aspect; I was just thinking about how we control ssh access.

I am not sure how to implement this, but perhaps you could use hooks to set up local firewall rules on the compute nodes that isolate the nodes in each job? At its most basic, you could just block the logins. Or you could try to separate them from everything else, but then you have to worry about where various services are running. Just be careful that you don’t end up locking yourself completely out of the node if mom fails and the epilogue doesn’t run to open the firewall back up. Again, that is just an idea; I have never seen it done and I don’t know exactly how to implement it.

Would you mind sharing the use case for this?

If you have a version of MPI that is built with PBS TM (Task Manager) support, you don’t need to use ssh to launch processes. I mention the acronym because that is how Open MPI refers to it internally (its tm launcher component sits in the PLM, process lifecycle management, framework), and it will help you determine the correct arguments to provide the configure script should you try to build MPI on your own. IIRC, it is “./configure --with-tm=<PBS install directory>”.

Thank you for your help.

The reasons for wanting to prohibit ssh are as follows:

  • I have a node with two GPUs. I have defined GPUs as resources, and I use the cgroups hook to control GPU allocation.
  • User A and User B have requested one GPU each and are running jobs.
  • User A logs in interactively to the node where the job is running and runs a program on the GPU allocated to User B.
  • User B’s job execution time is slower than expected.

With the above background, we are wondering if it is possible to prohibit ssh login.

Administrators must still be able to log in via ssh.
Our MPI implementations are Intel MPI and Open MPI, the latter built with TM support (--with-tm).
Also, ssh must be allowed between compute nodes in the cluster.
(I believe this can be done by defining access.conf.)
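To make those requirements concrete, here is a minimal Python sketch that renders a pam_access style access.conf allowing root, an admin group, and any login originating from the cluster network, while denying everything else. The group name and subnet in the example are made up, and it assumes pam_access is already enabled in the sshd PAM stack.

```python
# Sketch only: render /etc/security/access.conf rules for a compute node.
# The admin group and cluster network passed in are hypothetical examples.


def render_access_conf(admin_group, cluster_net):
    """Return access.conf text: admins from anywhere, anyone from the cluster."""
    lines = [
        "+ : root : ALL",                       # root may log in from anywhere
        "+ : ({}) : ALL".format(admin_group),   # (name) denotes a group
        "+ : ALL : {}".format(cluster_net),     # node-to-node ssh inside the cluster
        "- : ALL : ALL",                        # everyone else is refused
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_access_conf("hpcadmin", "10.1.0.0/16"), end="")
```

Note that with rules like these a user can still reach a node by hopping through another compute node, so on their own they would not stop the GPU scenario described above.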

I was also wondering whether an ssh session to a node could be placed in the cgroup of the user’s job that is already running there. Is this not a feature in PBS? (I have heard that Slurm can control this.)

If users stay within the GPU resources they were allocated, there is no problem, but we are looking for a way to defend against the cases where they do not.

Best Regards,