We have a cluster whose nodes have different numbers of CPUs, all mixed together.
With the current settings, we found that the smaller jobs (those requesting fewer CPUs) end up occupying the nodes.
So when a larger job is queued, the nodes do have free CPUs, but no single node has enough of them free to run the larger job, so it keeps waiting in the queue.
Is there a general recommendation for reducing waiting time and improving efficiency in such cases?
For example, giving the nodes with fewer CPUs higher priority, so that the larger nodes are left free to run the larger jobs?
Or perhaps giving the larger jobs higher priority?
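
To make the first idea concrete, here is a toy Python sketch of the effect I have in mind. It is not tied to any particular scheduler: the function `place_jobs`, the node sizes, and the job sizes are all made up for illustration, and the "large nodes first" ordering is only an assumed worst case, not necessarily what our current settings do.

```python
def place_jobs(node_cpus, jobs, prefer_small_nodes):
    """Greedily place each job on the first node (in preference order)
    that has enough free CPUs.

    Returns (waiting, free): the jobs that could not be placed, and the
    free CPUs left on each node.
    """
    free = list(node_cpus)
    # Order in which nodes are tried: by total CPU count, smallest first
    # if prefer_small_nodes is True, largest first otherwise.
    order = sorted(
        range(len(free)),
        key=lambda i: node_cpus[i],
        reverse=not prefer_small_nodes,
    )
    waiting = []
    for cpus in jobs:
        for i in order:
            if free[i] >= cpus:
                free[i] -= cpus
                break
        else:
            # No single node has enough free CPUs for this job.
            waiting.append(cpus)
    return waiting, free


# Hypothetical cluster: two 8-CPU nodes and one 32-CPU node.
nodes = [8, 8, 32]
# Four small jobs arrive first, then one job that needs a whole large node.
jobs = [4, 4, 4, 4, 32]

print(place_jobs(nodes, jobs, prefer_small_nodes=False))
# ([32], [8, 8, 16])  -> 32 CPUs are free in total, but the large job waits

print(place_jobs(nodes, jobs, prefer_small_nodes=True))
# ([], [0, 0, 0])     -> small jobs fill the small nodes, the large job runs
```

In this toy case, filling the smaller nodes first is enough to let the larger job start, which is the behaviour I am hoping the scheduler can be configured to approximate.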