Hi,
I’m running PBS Pro community edition 20.0.1 on Rocky Linux 8.6 (RHEL-equivalent).
The system has a login node with the PBS server on it, but the login node does not run jobs itself.
A compute node receives jobs from the login node.
The problem is that the software works by “resubmitting” itself in iterations, so the subsequent jobs are submitted from the compute node, not the login node.
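Schematically, the job script ends each iteration by submitting the next one from whatever node it is running on, something like this simplified sketch (the convergence check and the done.flag file are just placeholders, the real software is more involved):

#!/bin/bash
#PBS -q workq
# ... run one iteration of the actual computation ...
# if not converged yet, resubmit this script for the next iteration
if [ ! -f "$PBS_O_WORKDIR/done.flag" ]; then
    cd "$PBS_O_WORKDIR"
    qsub test.sh
fi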
But I get this error:
[mike@forest ~]$ qsub test.sh
2993.forest
[mike@forest ~]$ ssh node01
Last login: Fri Aug 26 14:52:24 2022 from 192.168.1.1
[mike@node01 ~]$ qsub test.sh
qsub: Bad UID for job execution
I’ve set up password-less SSH and home-folder sharing between the two machines.
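Concretely, checks like the following run from the login node without a password prompt and, as far as I can tell, show the same user and the same shared home directory on both machines (illustrative commands only, output omitted):

ssh node01 id
ssh node01 'ls ~'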
The qmgr output follows:
create queue workq
set queue workq queue_type = Execution
set queue workq enabled = True
set queue workq started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = False
set server default_queue = workq
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.ncpus = 1
set server default_chunk.ncpus = 1
set server scheduler_iteration = 600
set server resv_enable = True
set server node_fail_requeue = 310
set server max_array_size = 10000
set server pbs_license_min = 0
set server pbs_license_max = 2147483647
set server pbs_license_linger_time = 31536000
set server eligible_time_enable = False
set server max_concurrent_provision = 5
set server max_job_sequence_id = 9999999
There’s an allow_node_submit server parameter in Torque to do this, but I don’t think PBS Pro has an equivalent. Any ideas?
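For reference, the Torque setting I have in mind is the one below; as far as I can tell, PBS Pro’s qmgr does not accept that attribute:

qmgr -c "set server allow_node_submit = True"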
Mike