We are running into a strange problem when submitting interactive jobs.
A job that can start quickly:
qsub -X -I -l select=1:ncpus=1:mem=10gb
qsub: waiting for job 23913.pbs59 to start
qsub: job 23913.pbs59 ready
will just run normally,
but when the job has to wait in the queue for a while:
qsub -X -I -l select=100:ncpus=72:mem=10gb
qsub: waiting for job 23917.pbs59 to start
qsub: SIGPIPE received, job submission interrupted.: Connection reset by peer
qsub crashes with the above message, but the job stays queued and there is no further error message anywhere. Is this a bug, or some system setting we are running into?