Hello guys,
I am new at my company and recently tried to make some changes to our PBS Pro V12.0.1 (commercial) installation.
We have an SGI cluster that runs Fluent and CFD++ and, as I was told, PBS recently stopped working.
Every time I submitted a new job to any queue (we have workq, cfd and fluent), we got this response:
qsub: Bad UID for job execution.
At that point, qmgr and qstat were still working.
I tried to solve the problem by changing some settings in qmgr (set server acl_host, set server acl_user and so on). This made no difference, although we could still access qstat and qmgr.
Then we tried entering set server acl_hosts_enable in qmgr, and after that we lost our connection to PBS.
I can still use pbs_probe and pbs_mom, and the server is running; however, we can no longer even run qstat or qmgr. We get this output:
$: qstat
pbs_iff: error returned: 15031
No Permission.
qstat: cannot connect to server host (errno+15007)
$: qmgr
pbs_iff: error returned: 15031
No Permission.
qstat: cannot connect to server host (errno+15007)
I even tried reconnecting to the server, but without success:
$: pbs_iff -t host 15001
pbs_iff: error returned: 15031
Can anyone help us?
Thanks for your attention and your time.
Kind regards,
Alexandre Medina