How to configure PBS to stage data to the node
You can do this from within the job script
For example:
#PBS directives
< main application batch command line >
< scp, cp, or rsync the files from the job directory to your source location >
< clean up the files in the job directory once the copy above has succeeded, or keep both copies >
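The outline above could be fleshed out roughly as follows. This is only a sketch, assuming bash and a destination that is reachable with plain `cp` (for a remote host, swap in `scp` or `rsync`). The helper name `copy_back`, the application command `./my_app`, the `results` subdirectory, and `/path/to/results` are all placeholders, not anything PBS defines.

```shell
#!/bin/bash
#PBS -N stage_by_hand
#PBS -l select=1:ncpus=1:mem=1gb

# copy_back SRC_DIR DEST_DIR
# Copies the contents of SRC_DIR into DEST_DIR and removes the originals
# only if the copy succeeded. Hypothetical helper, not part of PBS.
copy_back() {
    local src="$1" dest="$2"
    mkdir -p "$dest" || return 1
    if cp -r "$src"/. "$dest"/; then
        # Copy succeeded: remove everything inside SRC_DIR.
        find "$src" -mindepth 1 -delete
    else
        echo "copy to $dest failed; leaving files in $src" >&2
        return 1
    fi
}

# Run the job body only when PBS has started the script (PBS_JOBID is set).
if [ -n "${PBS_JOBID:-}" ]; then
    cd "${PBS_JOBDIR:-$PBS_O_WORKDIR}" || exit 1
    mkdir -p results
    ./my_app > results/my_app.out 2>&1     # placeholder application command
    copy_back results /path/to/results     # placeholder destination
fi
```

Writing results into a dedicated subdirectory keeps the cleanup step away from the job's own stdout and stderr files, which may live in the job directory when `sandbox=PRIVATE` is used.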
Alternatively, use an execjob_epilogue hook that runs scp, cp, rsync, or any other remote file copy to the source location and then deletes the files from the sandbox / job directory (or keeps both copies).
Or write a hook that alters the job's stageout attribute and redirects it to the source directory.
Thank you
PBS does have a built-in staging mechanism triggered with the stagein and stageout options to the -W (additional job attributes) directive. If you consult the qsub manpage and search for ‘stagein’, you will find the documentation. In my opinion, the documentation for this feature has always been rather confusing and does not provide an example, so here is an example:
#PBS -W stagein=/path/to/local/dir@remote.host:/path/to/remote/dir,stageout=/path/to/local/dir@remote.host:/path/to/remote/dir
Importantly, and also not mentioned in the manpage, are:
If you need more sophisticated staging, then I recommend either writing your own script and setting PBS_SCP to point at it, or using the scenarios that @adarsh mentioned.
Hopefully this will help.
The file transfer protocol depends on what you have configured in /etc/pbs.conf on the PBS server for stagein (server to compute nodes) and in $PBS_HOME/mom_priv/config on the PBS MoMs for stageout. If $usecp does not exist in mom_priv/config, the MoM follows what is configured in its own /etc/pbs.conf.
Keyword to search for in the PBS Pro admin guide: $usecp
The default copy mechanism is rcp; scp and cp are used instead if they are configured in pbs.conf / mom_priv/config.
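To see which of these settings are actually present on a host, something like the following could be used. It is a sketch only: `show_copy_settings` is a made-up helper name, and it simply greps for `PBS_SCP` / `PBS_RCP` lines in pbs.conf and `$usecp` lines in the MoM config. It does not tell you which mechanism PBS will finally pick.

```shell
#!/bin/bash
# show_copy_settings PBS_CONF MOM_CONFIG
# Prints the copy-related lines found in the two files (hypothetical helper).
show_copy_settings() {
    local pbs_conf="$1" mom_config="$2"
    # PBS_SCP / PBS_RCP in pbs.conf name the remote copy programs.
    grep -E '^PBS_(SCP|RCP)=' "$pbs_conf" 2>/dev/null || true
    # $usecp lines in mom_priv/config tell the MoM when to use a local cp.
    grep -E '^\$usecp[[:space:]]' "$mom_config" 2>/dev/null || true
}

# Typical call on an execution host (reading mom_priv usually needs root):
#   . /etc/pbs.conf
#   show_copy_settings /etc/pbs.conf "$PBS_HOME/mom_priv/config"
```

If nothing is printed, neither file sets any of these, and the default described above applies.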
#PBS -W stagein=<execution_path>@<hostname>:<storage_path>
#PBS -W stageout=<execution_path>@<hostname>:<storage_path>
stagein: location of input files (copies input files to the execution directory or job directory)
stageout: location of output files (copies results from the job directory or execution directory back to your intended location)
execution_path: path in the execution directory on the compute node
storage_path: file name on host hostname
The ‘@’ character separates the execution path specification from the storage path specification.
The @ character is just a separator; it does not represent a username@hostname kind of specification.
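To make the role of the separators concrete, here is a small bash sketch that splits one staging entry into its three parts. `split_stage_spec` is a hypothetical helper for illustration only; PBS does its own parsing.

```shell
#!/bin/bash
# split_stage_spec SPEC
# Splits one "execution_path@hostname:storage_path" entry into its parts.
split_stage_spec() {
    local spec="$1"
    local exec_path="${spec%%@*}"   # text before the first '@'
    local rest="${spec#*@}"         # text after the first '@'
    local host="${rest%%:*}"        # text before the first ':'
    local storage="${rest#*:}"      # text after the first ':'
    printf 'execution_path=%s\nhostname=%s\nstorage_path=%s\n' \
        "$exec_path" "$host" "$storage"
}

split_stage_spec 'box.fem@headnode:/home/pbsdata/optistruct/box.fem'
```

For the entry shown, the part before the @ is the file name in the execution directory (box.fem), and everything after it is the host and path on the storage side (headnode and /home/pbsdata/optistruct/box.fem).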
#PBS -N pbsproapplication
#PBS -l select=1:ncpus=1:mem=1gb
#PBS -W sandbox=PRIVATE
#copy box.fem from headnode:/home/pbsdata/optistruct into the job's execution directory, keeping the file name box.fem
#PBS -W stagein=box.fem@headnode:/home/pbsdata/optistruct/box.fem
#copy all the results from the sandbox or jobdir to headnode:/home/pbsdata/output
#PBS -W stageout=*@headnode:/home/pbsdata/output
#PBS -N pbsproapplication
#PBS -l select=1:ncpus=1:mem=1gb
#PBS -W sandbox=PRIVATE
#copy box.fem and box.inc from headnode:/home/pbsdata/optistruct into the job's execution directory, keeping the same file names
#PBS -W stagein=box.fem@headnode:/home/pbsdata/optistruct/box.fem,box.inc@headnode:/home/pbsdata/optistruct/box.inc
#copy *.out and *.log result files from the sandbox or jobdir to headnode:/home/pbsdata/output
#PBS -W stageout=*.out@headnode:/home/pbsdata/output,*.log@headnode:/home/pbsdata/output
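Comma-separated lists like these are easy to mistype by hand, so one option is to generate them. The sketch below assumes each file keeps its name on both sides, as in the stagein examples above. `build_stage_list` is a hypothetical helper name, not a PBS command.

```shell
#!/bin/bash
# build_stage_list HOST REMOTE_DIR FILE...
# Prints "FILE@HOST:REMOTE_DIR/FILE" entries joined by commas.
build_stage_list() {
    local host="$1" dir="$2"
    shift 2
    local list="" f
    for f in "$@"; do
        # Add a comma before every entry except the first.
        list="${list:+$list,}${f}@${host}:${dir}/${f}"
    done
    printf '%s\n' "$list"
}

build_stage_list headnode /home/pbsdata/optistruct box.fem box.inc
```

The printed string could then be passed at submission time instead of being written into the script, for example `qsub -W stagein="$(build_stage_list headnode /home/pbsdata/optistruct box.fem box.inc)" job.sh`, where `job.sh` stands in for your own job script.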
Please see the PBS Pro User Guide, section 3.2, “Input/Output File Staging”, page UG-33.