Jobs fail with more than 1 per node

I’m encountering a problem I haven’t seen before.
Running PBS Pro CE on Ubuntu 18.04.
When I submit large array jobs, the jobs fail to start (or fail to finish) whenever more than one job per node is running at a time. If I configure the jobs so that each node runs only one at a time, they run fine.
Typically, one of two things happens:

  1. multiple pbs_mom processes are spawned, but fail to su to my user account, or
  2. the pbs_mom processes su to my account, then become defunct

Occasionally these failures are accompanied by mom_log messages stating that stdout/stderr files couldn’t be opened in the working directory.
Has anyone else experienced this error? Any clues on how to diagnose it?

I found the solution. The nodes authenticate to an LDAP server which is part of the cluster. For an unknown reason, non-root logins to the nodes would hang for 30 seconds after logging in. I say “unknown” because, as near as we could tell, the compute nodes’ LDAP configuration was identical to the head node’s. I stumbled across the fix: install libnss-ldapd, and then reboot the server. Logins now complete in the expected amount of time, and array jobs are running as expected on all nodes.
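
For anyone who lands here with similar symptoms, one way to check whether user lookups are the slow part is to time them directly on a compute node. The following is only a rough sketch, not something taken from PBS itself: the function names, the 1-second threshold, and the `USERNAME` placeholder are my own assumptions. It relies on Python's `pwd.getpwnam`, which goes through the system's NSS configuration (the same path `su` and logins use to resolve an account).

```python
import pwd
import time

# Placeholder: replace with an account that lives in LDAP.
# "root" normally resolves from local files, so it makes a fast baseline.
USERNAME = "root"


def time_user_lookup(username, repeats=3):
    """Return the wall-clock seconds taken by each passwd lookup via NSS."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            pwd.getpwnam(username)
        except KeyError:
            # Unknown user: the lookup was still attempted, so the
            # elapsed time is still worth recording.
            pass
        timings.append(time.perf_counter() - start)
    return timings


def is_slow(timings, threshold_seconds=1.0):
    """True if any single lookup exceeded the (arbitrary) threshold."""
    return any(t > threshold_seconds for t in timings)


if __name__ == "__main__":
    results = time_user_lookup(USERNAME)
    for i, seconds in enumerate(results, start=1):
        print(f"lookup {i}: {seconds:.3f} s")
    if is_slow(results):
        print("At least one lookup was slow; name resolution may be the bottleneck.")
```

Running this on both the head node and a compute node, with a local account and then with an LDAP account, should show whether the delay is specific to LDAP lookups on the compute nodes. If a caching daemon is running, repeated lookups may come back quickly even when the first one is slow, so the first timing is the most telling.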