PBS Implementation on AWS

Hi,

I am looking into building PBS fully on AWS. Is there a guide or a reference architecture for doing that?

Thanks.

You would need to build on a supported OS / version / architecture.

Details are found here:

Thanks for sharing that, but it does not include a reference architecture covering component placement, how the queues should be laid out when they use different instance sizes, or how PBS should interact with AWS services.

For large deployments, what are the best practices for placing the solution components, and what are the networking best practices?

Thank you. Please refer to the documents below:

Make sure you have the prerequisites ready before deploying PBS Pro:

  • static IP address / hostname
  • SELinux disabled
  • firewall allowing ports 15001-15007, 17001, and 22
  • passwordless SSH
  • UIDs/GIDs of users consistent across all participating systems
  • applications deployed
  • the application's batch command line tested and working
  • once tested, integrate it into a batch script and run it with PBS Pro.
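
As a rough illustration of the firewall item above, the port list can be expanded and probed from another node with a few lines of Python. This is only a sketch: `expand_ports`, `port_open`, and `check_host` are hypothetical helper names, not part of PBS, and any hostname you pass in is your own.

```python
import socket

# Port spec taken from the prerequisite list above.
PBS_PORT_SPEC = "15001-15007, 17001, 22"


def expand_ports(spec):
    """Expand a spec such as '15001-15007, 17001, 22' into a sorted list of ints."""
    ports = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
            ports.update(range(low, high + 1))
        else:
            ports.add(int(part))
    return sorted(ports)


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_host(host, spec=PBS_PORT_SPEC):
    """Map each required port to whether it is reachable on the given host."""
    return {port: port_open(host, port) for port in expand_ports(spec)}
```

Calling `check_host` with the head node's hostname from a compute node gives a quick view of which ports answer. Keep in mind that a port also shows as closed when the corresponding daemon is simply not running yet, so a `False` does not necessarily mean the firewall is at fault.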

PBS Professional needs to be deployed as follows:

  1. PBS server/scheduler on the head node (also called the master node or server node)
  2. PBS execution component on the compute instances (compute nodes) running your applications
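
To make the split concrete, here is a hedged sketch of how `/etc/pbs.conf` typically differs between the two roles. The keys are the usual `pbs.conf` start flags. The `/opt/pbs` and `/var/spool/pbs` paths are assumptions (common defaults, but check your own installation), and `render_pbs_conf` is a hypothetical helper, not something shipped with PBS.

```python
def render_pbs_conf(role, server_host, pbs_exec="/opt/pbs", pbs_home="/var/spool/pbs"):
    """Return pbs.conf text for a 'server' (head node) or 'execution' (compute node) role."""
    if role not in ("server", "execution"):
        raise ValueError("role must be 'server' or 'execution'")
    is_server = role == "server"
    settings = {
        "PBS_SERVER": server_host,            # every node points at the head node
        "PBS_START_SERVER": int(is_server),   # server daemon: head node only
        "PBS_START_SCHED": int(is_server),    # scheduler: head node only
        "PBS_START_COMM": int(is_server),     # communication daemon: head node only
        "PBS_START_MOM": int(not is_server),  # execution daemon (MoM): compute nodes only
        "PBS_EXEC": pbs_exec,                 # assumed install path
        "PBS_HOME": pbs_home,                 # assumed spool path
    }
    return "".join(f"{key}={value}\n" for key, value in settings.items())
```

For example, `render_pbs_conf("execution", "pbs-head")` (with `pbs-head` as a placeholder hostname) yields a file with `PBS_START_MOM=1` and the server-side daemons switched off. Once the execution daemon is running on a compute node, the node is normally registered on the server side with `qmgr -c "create node <hostname>"`.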

Cloud instance sizing depends on the specific requirements of the application. The PBS Pro execution component itself can run on a basic 1 vCore instance. However, the appropriate instance size should be chosen in consultation with the application specialist and backed by benchmarking to identify the best option. Whether the application is CPU-, memory-, GPU-, or I/O-intensive will influence the decision. Please note that sizing recommendations fall outside the scope of a workload manager like OpenPBS.

Network and storage requirements also depend on the type of jobs being run. For example, SMP jobs, MPP jobs (which may require a high-speed interconnect such as InfiniBand, as well as scale sets, instance pools, or instance groups), and OpenMP jobs each have different needs. Similarly, if the application is I/O-intensive, the choice between local storage and shared/global storage should be carefully considered.
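
The difference between these job types also shows up in how resources are requested in the job script. The sketch below builds the `#PBS -l select=...` lines for a single-node SMP/OpenMP job and for a multi-node MPI job. The chunk and core counts are made-up example values, `build_select` is a hypothetical helper, and what you can actually request depends on how the nodes are configured at your site.

```python
def build_select(chunks, ncpus, mpiprocs=None, ompthreads=None, place=None):
    """Build PBS Pro resource-request directives for a job script."""
    chunk = f"select={chunks}:ncpus={ncpus}"
    if mpiprocs is not None:
        chunk += f":mpiprocs={mpiprocs}"      # MPI ranks per chunk
    if ompthreads is not None:
        chunk += f":ompthreads={ompthreads}"  # OpenMP threads per chunk
    lines = [f"#PBS -l {chunk}"]
    if place is not None:
        lines.append(f"#PBS -l place={place}")  # e.g. scatter chunks across hosts
    return "\n".join(lines)


# SMP/OpenMP job: one chunk, all threads on a single node.
smp_job = build_select(chunks=1, ncpus=16, ompthreads=16)

# MPP/MPI job: several chunks spread over different nodes.
mpi_job = build_select(chunks=4, ncpus=16, mpiprocs=16, place="scatter")
```

The SMP request stays inside one instance, so the interconnect matters little, while the MPI request spans four instances and is where a high-speed, low-latency network starts to pay off.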


Thank you for all the details. I will check them out and see how things go.
