Direct write is requested for job: 29769.hn1, but the destination is not usecp-able from cn006

I got the error below on a job in which a Python script tries to write to the user's home directory, which is bound into a Singularity container.

Direct write is requested for job: 29769.hn1, but the destination is not usecp-able from cn006

Basic Info:

  • openpbs 20.0.1
  • Ubuntu 18.04
  • Singularity 3.9.9

PBS directives (some values are passed in):
#!/bin/bash -x
#PBS -N %(name)s
#PBS -l walltime=%(walltime)s
#PBS -q workq
#PBS -k eod
#PBS -o %(context_dir)s
#PBS -e %(context_dir)s
#PBS -l %(processors)s
#PBS -l mem=%(memory)s
#PBS -W umask=66


tracejob 29769:

Job: 29769.hn1

04/17/2023 14:43:09 L Considering job to run
04/17/2023 14:43:09 S enqueuing into workq, state 1 hop 1
04/17/2023 14:43:09 S Job Queued at request of schuec1@hn1, owner = schuec1@hn1,
job name = utilscriptsinfrastructurelistAllTxIndexFiles2py, queue = workq
04/17/2023 14:43:09 S Job Run at request of Scheduler@hn1 on exec_vnode
(cn006.cm.cluster:ncpus=1:mem=4194304kb)
04/17/2023 14:43:09 L Job run
04/17/2023 14:43:10 S Obit received momhop:1 serverhop:1 state:4 substate:42
04/17/2023 14:43:11 S Exit_status=0 resources_used.cpupercent=0 resources_used.cput=00:00:00
resources_used.mem=0kb resources_used.ncpus=1 resources_used.vmem=0kb
resources_used.walltime=00:00:01


mom_log:

04/17/2023 15:43:09;0100;pbs_mom;Req;;Type 1 request received from root@172.16.0.1:15001, sock=1
04/17/2023 15:43:09;0100;pbs_mom;Req;;Type 3 request received from root@172.16.0.1:15001, sock=1
04/17/2023 15:43:09;0100;pbs_mom;Req;;Type 5 request received from root@172.16.0.1:15001, sock=1
04/17/2023 15:43:09;0008;pbs_mom;Job;29769.hn1;Started, pid = 26477
04/17/2023 15:43:10;0080;pbs_mom;Job;29769.hn1;task 00000001 terminated
04/17/2023 15:43:10;0008;pbs_mom;Job;29769.hn1;Terminated
04/17/2023 15:43:10;0100;pbs_mom;Job;29769.hn1;task 00000001 cput=00:00:00
04/17/2023 15:43:10;0008;pbs_mom;Job;29769.hn1;kill_job
04/17/2023 15:43:10;0100;pbs_mom;Job;29769.hn1;n006 cput=00:00:00 mem=0kb
04/17/2023 15:43:10;0100;pbs_mom;Job;29769.hn1;Obit sent
04/17/2023 15:43:10;0100;pbs_mom;Req;;Type 54 request received from root@172.16.0.1:15001, sock=1
04/17/2023 15:43:10;0080;pbs_mom;Job;29769.hn1;copy file request received
04/17/2023 15:43:11;0100;pbs_mom;Job;29769.hn1;staged 2 items out over 0:00:01
04/17/2023 15:43:11;0008;pbs_mom;Job;29769.hn1;no active tasks
04/17/2023 15:43:11;0100;pbs_mom;Req;;Type 6 request received from root@172.16.0.1:15001, sock=1
04/17/2023 15:43:11;0080;pbs_mom;Job;29769.hn1;delete job request received
04/17/2023 15:43:11;0008;pbs_mom;Job;29769.hn1;kill_job


Singularity Verbose stderr:

Direct write is requested for job: 29769.hn1, but the destination is not usecp-able from cn006
/var/spool/pbs/mom_priv/jobs/29769.hn1.SC: line 34: shopt: varredir_close: invalid shell option name
VERBOSE: Forwarding SINGULARITYENV_LD_LIBRARY_PATH as LD_LIBRARY_PATH environment variable
VERBOSE: Setting HOME=/home/ahs/schuec1
VERBOSE: Setting PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
VERBOSE: Set messagelevel to: 4
VERBOSE: Starter initialization
VERBOSE: Check if we are running as setuid
VERBOSE: Drop root privileges
VERBOSE: Drop root privileges permanently
VERBOSE: Spawn stage 1
VERBOSE: Execute stage 1
VERBOSE: stage 1 exited with status 0
VERBOSE: Get root privileges
VERBOSE: Change filesystem uid to 2089
VERBOSE: Spawn master process
VERBOSE: Create mount namespace
VERBOSE: Entering in mount namespace
VERBOSE: Create mount namespace
VERBOSE: Spawn RPC server
VERBOSE: Execute master process
VERBOSE: Serve RPC requests
VERBOSE: Default mount: /proc:/proc
VERBOSE: Default mount: /sys:/sys
VERBOSE: Found 'bind path' = /etc/localtime, /etc/localtime
VERBOSE: Found 'bind path' = /etc/hosts, /etc/hosts
VERBOSE: Default mount: /tmp:/tmp
VERBOSE: Default mount: /var/tmp:/var/tmp
VERBOSE: Default mount: /etc/resolv.conf:/etc/resolv.conf
VERBOSE: Checking for template passwd file: /apps/singularity/3.9.9/var/singularity/mnt/session/rootfs/etc/passwd
VERBOSE: Creating passwd content
VERBOSE: Creating template passwd file and appending user data: /apps/singularity/3.9.9/var/singularity/mnt/session/rootfs/etc/passwd
VERBOSE: Default mount: /etc/passwd:/etc/passwd
VERBOSE: Checking for template group file: /apps/singularity/3.9.9/var/singularity/mnt/session/rootfs/etc/group
VERBOSE: Creating group content
VERBOSE: Default mount: /etc/group:/etc/group
VERBOSE: /mnt/pathfinder/contexts/chase/software/bin found within container
VERBOSE: rpc server exited with status 0
VERBOSE: Execute stage 2
FATAL: permission denied

I believe you also have to set the appropriate configuration on the compute nodes themselves. For instance, this is an example from a node in one of our clusters:

x3006c0s13b0n0 20230418-154119 pbs> sudo cat ./mom_priv/config
[sudo] password for allcock:
$clienthost polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov
$restrict_user_maxsysid 999
$usecp *:/home/ /home/
$usecp *:/lus/eagle/ /lus/eagle/
$usecp *:/lus/grand/ /lus/grand/
$usecp *:/eagle/ /eagle/
$usecp *:/grand/ /grand/
$usecp *:/dev/null /dev/null

That lets jobs write directly to any path under the listed prefixes. A destination under any other path will produce the same error you are seeing.
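
To make the prefix rule concrete, here is a rough Python sketch of how a `$usecp` entry might be matched against an output destination. It is a simplified model for illustration, not PBS's actual code: the function names are made up, the shell-style wildcard matching on the host is an assumption, and `job.out` is a hypothetical file name. The host and directories are taken from the logs above.

```python
from fnmatch import fnmatch


def parse_usecp(config_text):
    """Collect (host_pattern, source_prefix, local_prefix) from $usecp lines."""
    rules = []
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "$usecp" and ":" in parts[1]:
            host_pattern, source_prefix = parts[1].split(":", 1)
            rules.append((host_pattern, source_prefix, parts[2]))
    return rules


def map_destination(rules, host, path):
    """Return the local path for host:path, or None if no rule covers it."""
    for host_pattern, source_prefix, local_prefix in rules:
        # Assumption: host patterns behave like shell wildcards.
        if fnmatch(host, host_pattern) and path.startswith(source_prefix):
            return local_prefix + path[len(source_prefix):]
    return None


# A shortened copy of the mom_priv/config shown above.
CONFIG = """\
$clienthost polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov
$usecp *:/home/ /home/
$usecp *:/dev/null /dev/null
"""

rules = parse_usecp(CONFIG)
# Covered by the /home/ rule, so a local copy (or direct write) is possible.
print(map_destination(rules, "hn1", "/home/ahs/schuec1/job.out"))
# Not covered by any rule: this is the "not usecp-able" case.
print(map_destination(rules, "hn1", "/mnt/pathfinder/contexts/job.out"))
```

In this model, a `-o`/`-e` directory that is not under one of the prefixes in the compute node's config has no mapping, which corresponds to the message in the job's stderr.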
