Balance array jobs of one user (fairshare?)

Hello,

I want PBS to distribute compute nodes among all array jobs of one user in a balanced manner. Is this possible?

For example, I have 12 identical compute nodes. A user submits an array job, Job1, with 10k subjobs. Each subjob uses one compute node, so 12 subjobs run at once. Then the same user submits another similar array job, Job2. I want Job2 to start as soon as possible and not wait for all subjobs in Job1 to finish. That is, as soon as a subjob in Job1 finishes, PBS should start a subjob from Job2, and eventually PBS should allocate 6 compute nodes to Job1 and 6 compute nodes to Job2 (a 50% fairshare).

Similarly, when the same user submits a third array job, Job3, PBS should eventually allocate 4 compute nodes to each array job (a 33% fairshare), and so on. For simplicity, let’s assume that there are no other users in that queue.
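
To make the request concrete, here is a small Python simulation of the behaviour I am after. It is only a model of the desired policy, not something PBS does today; the function names `rebalance` and `tally` are made up for this sketch, and it assumes every array job always has queued subjobs left (as with 10k subjobs).

```python
from collections import deque


def rebalance(running, jobs, steps):
    """Simulate `steps` subjob completions on a fully busy cluster.

    `running` lists the owning array job of each busy node, oldest first.
    Each time the oldest subjob finishes, the freed node goes to the array
    job that currently has the fewest running subjobs (ties go to the job
    listed first in `jobs`).
    """
    running = deque(running)
    for _ in range(steps):
        running.popleft()  # oldest running subjob finishes, freeing a node
        counts = {job: 0 for job in jobs}
        for job in running:
            counts[job] += 1
        neediest = min(jobs, key=lambda job: counts[job])
        running.append(neediest)  # start a subjob of the neediest array job
    return list(running)


def tally(running, jobs):
    """Count busy nodes per array job, in the order given by `jobs`."""
    return {job: running.count(job) for job in jobs}


# Job2 arrives while Job1 occupies all 12 nodes.
two = rebalance(["Job1"] * 12, ["Job1", "Job2"], steps=12)
print(tally(two, ["Job1", "Job2"]))  # {'Job1': 6, 'Job2': 6}

# Job3 arrives once Job1 and Job2 hold 6 nodes each.
three = rebalance(two, ["Job1", "Job2", "Job3"], steps=24)
print(tally(three, ["Job1", "Job2", "Job3"]))  # {'Job1': 4, 'Job2': 4, 'Job3': 4}
```

Nothing is preempted here: the split only shifts as running subjobs finish, which is the "eventually" in my description.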

Is it possible to do it?

Hi,

Please note that users request resources (ncpus, mem, walltime, gpus, etc.) via qsub when they submit jobs, and the PBS scheduler makes sure these jobs are scheduled onto compute nodes that can satisfy the resource request. The fairshare policy is based on the historical usage of the cluster by a user (or other entity): if a user has historically used the cluster a lot, that user's jobs are less likely to get resources than jobs from users who have less historical usage. Fairshare is not about partitioning the cluster to keep a fair resource allocation among one user's jobs. The scheduler is also not aware of users' future job submissions, so it cannot foresee them and decide now to set resources aside.
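
As a rough illustration of why fairshare does not separate Job1 from Job2, here is a deliberately simplified Python model. It is not the scheduler's actual algorithm: the user names, usage numbers, and the helpers `fairshare_key` and `most_deserving` are assumptions made up for this sketch. The only point is that usage is tracked per entity (here, per user), so all array jobs of one user rank identically.

```python
# Hypothetical accumulated usage per user and equal shares for everyone.
usage = {"alice": 1000.0, "bob": 100.0}
shares = {"alice": 1, "bob": 1}

# Every job is charged to its owner; Job1 and Job2 belong to the same user.
job_owner = {"Job1": "alice", "Job2": "alice", "Job9": "bob"}


def fairshare_key(job):
    """Usage per share of the job's owner; a lower value is more deserving."""
    owner = job_owner[job]
    return usage[owner] / shares[owner]


def most_deserving(jobs):
    """Return the job whose owner has the lowest usage per share."""
    return min(jobs, key=fairshare_key)


print(most_deserving(["Job1", "Job2", "Job9"]))        # Job9
print(fairshare_key("Job1") == fairshare_key("Job2"))  # True
```

In this model the less-used user's job wins, but Job1 and Job2 are indistinguishable, so something other than fairshare has to decide between them.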

Please check the documentation on:

  • user limits, project limits, and group limits, which might help in limiting usage
  • queuejob hooks, which might be helpful for rewriting the resource request after checking the current state of the cluster (or an external method that records fairshare usage according to your requirements and helps rewrite the select statement)

Thank you