I'm seeing sporadic DIS errors in the server log when a job is submitted with -W block=True:
./20250103:01/03/2025 14:19:32;0001;Server@sched;Svr;Server@sched;check_block_wt, DIS error while replying to client gitrun3 for job 164004.sched
The job exits cleanly (status 0), but the submitting terminal hangs because it never received a response from the Scheduler.
I see this in the MOM logs:
01/28/2025 11:02:39;0080;pbs_mom;Job;179807.sched;task 00000001 terminated
01/28/2025 11:02:39;0800;pbs_mom;n/a;mom_get_sample;nprocs: 615, cantstat: 2, nomem: 0, skipped: 492, cached: 0
01/28/2025 11:02:39;0008;pbs_mom;Job;179807.sched;Terminated
01/28/2025 11:02:39;0100;pbs_mom;Job;179807.sched;task 00000001 cput=00:13:00
01/28/2025 11:02:39;0008;pbs_mom;Job;179807.sched;kill_job
01/28/2025 11:02:39;0100;pbs_mom;Job;179807.sched;node9 cput=00:13:00 mem=1275332kb
01/28/2025 11:02:39;0100;pbs_mom;Job;179807.sched;Obit sent
01/28/2025 11:02:39;0100;pbs_mom;Req;;Type 54 request received from root@10.10.38.18:15001, sock=5
01/28/2025 11:02:39;0080;pbs_mom;Job;179807.sched;copy file request received
01/28/2025 11:02:39;0800;pbs_mom;Job;stage_file;Skipping directly written/absent spool file /var/spool/pbs/spool/179807.sched.OU
01/28/2025 11:02:39;0800;pbs_mom;Job;stage_file;Skipping directly written/absent spool file /var/spool/pbs/spool/179807.sched.ER
01/28/2025 11:02:39;0100;pbs_mom;Job;179807.sched;staged 2 items out over 0:00:00
01/28/2025 11:02:39;0800;pbs_mom;n/a;mom_get_sample;nprocs: 616, cantstat: 2, nomem: 0, skipped: 492, cached: 0
01/28/2025 11:02:39;0008;pbs_mom;Job;179807.sched;no active tasks
01/28/2025 11:02:39;0100;pbs_mom;Req;;Type 6 request received from root@10.10.38.18:15001, sock=5
01/28/2025 11:02:39;0080;pbs_mom;Job;179807.sched;delete job request received
01/28/2025 11:02:39;0008;pbs_mom;Job;179807.sched;kill_job
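
In case it helps with lining up the server and MOM logs, here is a rough Python sketch that splits these semicolon-delimited log lines into fields and pulls out the entries for one job. The field names (timestamp, event_code, daemon, obj_type, obj_name, message) are my own labels based on the lines pasted above, and parse_line / entries_for_job are just hypothetical helper names, not part of PBS. It also skips a leading "filename:" prefix like the one grep added to the server log line.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class PbsLogEntry(NamedTuple):
    # Field names are assumed labels for the six semicolon-separated columns.
    timestamp: str
    event_code: str
    daemon: str
    obj_type: str
    obj_name: str
    message: str


# Matches the "MM/DD/YYYY HH:MM:SS;" stamp that starts each log record.
_TIMESTAMP = re.compile(r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2};")


def parse_line(line: str) -> Optional[PbsLogEntry]:
    """Parse one log line; return None if it does not look like a log record."""
    match = _TIMESTAMP.search(line)
    if match is None:
        return None
    # Start at the timestamp so a grep-style "file:" prefix is ignored.
    # Split on the first five semicolons only; the message may contain more.
    parts = line[match.start():].rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None
    return PbsLogEntry(*parts)


def entries_for_job(lines: Iterable[str], job_id: str) -> List[PbsLogEntry]:
    """Collect entries whose object name or message mentions job_id."""
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if entry.obj_name == job_id or job_id in entry.message:
            found.append(entry)
    return found


# Sample lines copied from the logs above.
sample = [
    "./20250103:01/03/2025 14:19:32;0001;Server@sched;Svr;Server@sched;"
    "check_block_wt, DIS error while replying to client gitrun3 for job 164004.sched",
    "01/28/2025 11:02:39;0100;pbs_mom;Job;179807.sched;Obit sent",
    "01/28/2025 11:02:39;0100;pbs_mom;Req;;Type 54 request received from "
    "root@10.10.38.18:15001, sock=5",
]

for entry in entries_for_job(sample, "179807.sched"):
    print(entry.timestamp, entry.daemon, entry.message)
```

Running the same filter over both the server log and the MOM log for one job ID should make it easier to see whether the DIS error lands before or after the obit and delete-job requests.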