Sometimes when PBS is restarted, it starts up with:
Cannot enable queue, incomplete definition (15030) in decode_attr_db, Action function failed for enabled attr, errn 15030
for a number of queues, which are then not recovered.
Going into qmgr to try to rebuild them:
Qmgr: delete queue blah
qmgr obj=blah svr=default: Unknown queue
qmgr: Error (15018) returned from server
Qmgr: create queue blah
qmgr obj=blah svr=default: End of File
At this point the server has panicked with:
que_save_db, que_save failed Execution of Prepared statement insert_que failed: ERROR: duplicate key value violates unique constraint "queue_pk"
DETAIL: Key (qu_name)=(blah) already exists.
panic_stop_db, Panic shutdown of Server on database error. Please check PBS_HOME file system for no space condition.
Stopping PBS dataservice
df shows plenty of free space on all filesystems.
At this point the only option is to delete the database directory and rebuild everything,
which, after the third time this happened, is now scripted and not difficult.
But it does mean that any restart of PBS can leave the system broken until I rebuild.
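
For anyone hitting the same thing, here is a minimal sketch of the save/replay half of such a rebuild. It is not my exact script. It assumes `qmgr -c "print server"` dumps the queue and server definitions as replayable qmgr commands and that qmgr accepts those commands on stdin. BACKUP_DIR and the two function names are made up for illustration, and wiping and re-initialising the database directory is deliberately left out since that part is site specific.

```shell
#!/bin/sh
# Sketch only: keep a replayable dump of the queue and server definitions.
# BACKUP_DIR is a hypothetical location, change it to suit your site.
BACKUP_DIR="${BACKUP_DIR:-/var/tmp/pbs_conf_backup}"

# Dump the current definitions to a timestamped file and print its path.
save_pbs_config() {
    mkdir -p "$BACKUP_DIR" || return 1
    out="$BACKUP_DIR/qmgr_server.$(date +%Y%m%d%H%M%S)"
    # "print server" emits qmgr commands (create queue ..., set queue ..., set server ...)
    qmgr -c "print server" > "$out" || return 1
    echo "$out"
}

# Replay a saved dump into a freshly initialised server.
restore_pbs_config() {
    if [ ! -r "$1" ]; then
        echo "usage: restore_pbs_config <dump-file>" >&2
        return 1
    fi
    qmgr < "$1"
}
```

Run save_pbs_config while the server is healthy (for example from cron, or just before a planned restart), then run restore_pbs_config on the newest dump once the database has been recreated. Node definitions would need a separate dump and are not covered here.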