Submitting Array Job with -t vs -J

I’m writing a job array script and have noticed there seem to be two ways to submit an array job:

#PBS -t <first_index-last_index>%number_of_active_jobs
or
#PBS -J <first_index-last_index>:index_step
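For my case (10 subjobs, with up to 10 running at once) I’d expect that to look something like one of these, if I’m reading the syntax right:

#PBS -t 1-10%10
or
#PBS -J 1-10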

In my case I want 10 subjobs running at the same time, and I’m hoping I don’t need to set up some form of MPI to make them run concurrently. I have openpbs 20.0.1 built and configured in some VMs for testing. The documentation suggests -J, but -t seems more appropriate for my case.

Can I use #PBS -t to do this?

#PBS -J (or qsub -J 1-10 on the command line) is the correct one to use.

The correct documentation is at this link: https://www.altair.com/pdfs/pbsworks/PBSAdminGuide2020.1.pdf

Job arrays are used for embarrassingly parallel jobs: the subjobs do not depend on each other and often run for only a few minutes, or even just seconds in some cases.
Examples: Monte Carlo simulations, rendering of movie frames, SIMD (single instruction, multiple data) applications, etc.

qsub -J 1-10 -- /bin/sleep 10    # 10 subjobs are created
qsub -J 1-10:2 -- /bin/sleep 10  # odd-indexed subjobs (1, 3, 5, 7, 9) are created
qsub -J 2-10:2 -- /bin/sleep 10  # even-indexed subjobs (2, 4, 6, 8, 10) are created
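If you want to confirm what got created, qstat should show the array and its subjobs (if I recall the PBS Pro qstat options correctly, -J lists job arrays and -t expands subjobs; the job ID below is just a placeholder):

qstat -J           # list job arrays
qstat -t 1234[]    # expand the subjobs of array job 1234[]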

Sample script:
#!/bin/sh
#PBS -N my10subjobs
#PBS -J 1-10
echo "My job indexes " $PBS_ARRAY_INDEX
/path/to/montecarlosimulationbinary -i /project/cases/scriptlet_$PBS_ARRAY_INDEX

$ qsub my10subjobs.sh
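When you submit it, qsub should print an array-style job identifier (the numeric ID and server suffix below are just placeholders), and each subjob is addressed by appending its index:

1234[].pbsserver     # the array job itself
1234[1].pbsserver    # subjob 1, usable with qstat, qdel, etc.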

Hope this helps

Thanks. I’ve seen some examples like what you showed.

In my use case I have users running scripts over time-series data. It’s true that those sub-jobs typically don’t take more than a few minutes. I’m moving to PBS from a custom-built scheduler that runs the sub-jobs in parallel in batches of 10, and my users have become used to that and expect the subjobs to run in parallel.
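To make it concrete, each user job would end up looking something like this (the script name, data layout, and chunk naming here are just hypothetical stand-ins for what my users actually run):

#!/bin/sh
#PBS -N timeseries_batch
#PBS -J 1-10
# each subjob processes one chunk of the time-series data
/path/to/analyse_timeseries.sh /data/timeseries/chunk_$PBS_ARRAY_INDEX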

The PBS Admin Guide is the documentation I’ve been referring to. However, it seems to be common practice for people to run the sub-jobs of a job array in parallel with #PBS -t.

What’s the reason against submitting an array job with #PBS -t ?

The subjobs run in parallel by default, as long as the cluster has free resources to run them.
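A quick way to see this for yourself (the job ID below is just a placeholder):

qsub -J 1-10 -- /bin/sleep 60
qstat -t 1234[]    # several subjobs should show state R at the same time, resources permitting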

Could you please quote the section and share the snippet from the admin guide that points to #PBS -t?

Please check the workload manager whose documentation suggests #PBS -t; it might not be openPBS.
The openPBS admin manual can be found at this link:
https://www.altair.com/pdfs/pbsworks/PBSAdminGuide2020.1.pdf

I was not aware that the subjobs run in parallel by default. My current scheduler has additional functionality built in to enable this and requires an additional argument to be passed. Given that, I don’t even need to consider using the directive #PBS -t.

The guide at the link you shared is what I’ve been referring to. Like you said, it does state that array jobs are submitted with the directive #PBS -J. I was referring to a few examples/tutorials on university websites that explained how to specify the number of sub-jobs to run in parallel. I think you must be right about #PBS -t being for a different scheduler forked from PBS, likely TORQUE or SLURM.

Here’s one example from Georgia Tech
https://docs.pace.gatech.edu/software/arrayGuide/

I did actually find TORQUE documentation referring to the directive #PBS -t for array jobs:
http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php
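For comparison, the TORQUE-style equivalent of the earlier sample script would look roughly like this (as far as I can tell from those docs, TORQUE uses -t and exposes the index as PBS_ARRAYID rather than PBS_ARRAY_INDEX):

#!/bin/sh
#PBS -N my10subjobs
#PBS -t 1-10
echo "My subjob index is $PBS_ARRAYID"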

In the PACE docs, it says:
What technology does PACE use for its scheduler and resource manager?

  • Currently PACE uses the Moab scheduler along with the Torque resource manager.

I saw that as well afterwards. Looking over other people’s examples (not just from PACE but from other clusters as well) has misled me a bit. It seems people often refer to their job scripts as “PBS scripts” regardless of whether they’re using PBS Pro/openPBS or some other forked-off version like TORQUE or SLURM.

Thanks!
