I am new to OpenPBS and have run into a problem.
I searched the internet but couldn't find a solution.
I have a cluster with 30 nodes and two queues, named Big and Batch.
Sometimes when I check the status of running jobs, I see that all jobs from the Batch queue are stuck showing “Not Running: Strict fifo order”. However, these jobs have already started and are using all available nodes, so new jobs can't start. The only way I have found to fix this is to qdel these jobs; after that, new jobs start normally.
My scheduler settings are:
round_robin: False all
by_queue: True prime
by_queue: True non_prime
strict_fifo: True ALL
fair_share: false ALL
help_starving_jobs false ALL
sort_queues true ALL
load_balancing: false ALL
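In case it helps when comparing the settings above, here is a minimal Python sketch (not part of PBS) that parses lines of this form into name, value, and prime/non-prime fields. The regular expression and the `parse_settings` helper are my own assumptions for illustration. It tolerates a missing colon only so that the lines can be listed exactly as pasted, and it makes no claim about what the scheduler itself accepts.

```python
import re

# Settings copied verbatim from the post above.
SETTINGS = """\
round_robin: False all
by_queue: True prime
by_queue: True non_prime
strict_fifo: True ALL
fair_share: false ALL
help_starving_jobs false ALL
sort_queues true ALL
load_balancing: false ALL
"""

# Hypothetical pattern: option name, optional colon, boolean value, prime-time field.
LINE_RE = re.compile(r"^(\w+)(:?)\s+(\w+)\s+(\w+)$")


def parse_settings(text):
    """Return one dict per non-empty, non-comment line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized line: {line!r}")
        name, colon, value, when = match.groups()
        entries.append(
            {
                "name": name,
                "value": value.lower() == "true",
                "when": when.lower(),
                "has_colon": colon == ":",
            }
        )
    return entries


if __name__ == "__main__":
    for entry in parse_settings(SETTINGS):
        note = "" if entry["has_colon"] else "  (no colon as pasted)"
        print(f"{entry['name']}: {entry['value']} {entry['when']}{note}")
```

Running it lists every option with its value and marks the lines that were pasted without a colon after the option name, which makes it easier to check them against the real configuration file.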
Please help me solve this issue!