I have installed OpenPBS 20.0.0 on Ubuntu 18.04, with three GPU servers configured as nodes.
When a user submits a job, PBS assigns it to any one of the three GPU servers. The same set of users exists on all three nodes. Users need data (usually 10 GB or more) in their home directories to run a model training job. The problem is keeping the same data on each of the three nodes so that PBS can run the job on whichever node it picks.
Would you suggest mounting a single NFS share on all three nodes, so that the data needs to be copied only once and is available on every node? Or is there a solution that uses only the local hard disks of each node?