Job has egroup == "-default-"

I have a Python script that periodically queries all the jobs on the server through a PTL testlib Server() connection. Once in a while, as it loops over the jobs returned by Server.status(JOB), it sees the egroup of a new job set to “-default-”, whereas most of the time egroup is set to a valid value based on the user’s Unix group.

I have tried waiting a few seconds and looping over jobs again but egroup is still not set properly.
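
For reference, the check described above could be sketched roughly as follows. This assumes the job list is a list of attribute dictionaries with "id" and "egroup" keys, which is how PTL's Server.status(JOB) is understood to return results; the helper name jobs_with_default_egroup and the sample data are made up for illustration, and the PTL call is left in a comment so the snippet runs without a PBS server.

```python
DEFAULT_EGROUP = "-default-"


def jobs_with_default_egroup(jobs):
    """Return the ids of jobs whose egroup is the "-default-" placeholder."""
    return [job.get("id") for job in jobs if job.get("egroup") == DEFAULT_EGROUP]


# With PTL the list would come from something like (assumption, not run here):
#   from ptl.lib.pbs_testlib import Server, JOB
#   jobs = Server().status(JOB)
SAMPLE_JOBS = [
    {"id": "101.server", "egroup": "users"},
    {"id": "102.server", "egroup": "-default-"},
]

print(jobs_with_default_egroup(SAMPLE_JOBS))
```

Jobs without an egroup attribute at all are skipped rather than reported, since only the placeholder value is of interest here.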

Does anyone know why a job would occasionally start out life right after a qsub with a bogus egroup?



When the server can’t find the user in its password file, and flatuid is set to true, the server sets the egroup so that the MoM will use the login group.

Relevant places in the code:

Interesting, thanks.
Now on to determining why the pwd entry for the user on the PBS server occasionally isn't seen.
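
One way to look for intermittent lookup failures is to poll the password database from the server host. The sketch below uses only Python's standard pwd and grp modules; it assumes the PBS server resolves users through the same system lookup that pwd.getpwnam uses, which may not hold exactly. The function names and the attempts/delay parameters are hypothetical.

```python
import grp
import pwd
import time


def lookup_user(username):
    """Return (uid, primary group name), or None if the user is not found."""
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        return None
    try:
        group = grp.getgrgid(entry.pw_gid).gr_name
    except KeyError:
        group = None  # user resolves, but the primary gid has no group entry
    return entry.pw_uid, group


def count_failed_lookups(username, attempts=5, delay=0.0):
    """Look the user up repeatedly and count how many attempts fail."""
    failures = 0
    for _ in range(attempts):
        if lookup_user(username) is None:
            failures += 1
        time.sleep(delay)
    return failures
```

Running count_failed_lookups with a real username, a large attempts value and a delay of a second or so on the server host would show whether the entry disappears from time to time, for example when a directory service is slow to answer.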

When I run a job from clienthost (for testing this is node1), the job ends up in the H state in the queue. The tracejob command tells me this:

11/04/2021 13:49:24  M    node1 cput=00:00:00 mem=0kb
11/04/2021 13:49:47  M    No Group Entry for Group -default-
11/04/2021 13:49:47  M    Obit sent
11/04/2021 13:49:48  M    delete job request received
11/04/2021 13:49:48  M    kill_job
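
To spot this message across many jobs, the tracejob output can be scanned with a small script. This is only a sketch: it assumes every line has the "date time source message" layout shown above, and the function name find_group_errors is made up for illustration.

```python
import re

# Matches "MM/DD/YYYY HH:MM:SS  <source>  <message>" as in the output above.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+(\S+)\s+(.*)$")
GROUP_RE = re.compile(r"No Group Entry for Group (\S+)")


def find_group_errors(text):
    """Return (timestamp, source, group) for each missing-group message."""
    hits = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, source, message = match.groups()
        group = GROUP_RE.search(message)
        if group:
            hits.append((timestamp, source, group.group(1)))
    return hits


SAMPLE_TRACE = """\
11/04/2021 13:49:24  M    node1 cput=00:00:00 mem=0kb
11/04/2021 13:49:47  M    No Group Entry for Group -default-
11/04/2021 13:49:47  M    Obit sent
"""

print(find_group_errors(SAMPLE_TRACE))
```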

I submit the job from node1 with the command:
echo "sleep 60" | qsub

I have flatuid set to true on serverhost.
I submit the job as a user that does not exist on serverhost. According to the reference guide:

User without account on server can submit jobs.

I have an /etc/hosts.equiv file on serverhost that contains the name node1.

Why does the job end up in the held state?