Hi All,
My system seems to be running OK, but the mom_log output below makes me think that something may be wrong. All of the slave nodes show similar output.
I would appreciate it if someone could take a look and tell me whether this output is normal or whether there is something to worry about.
03/31/2022 23:46:31;0004;pbs_mom;Fil;2906.hep-node0.ER;lost connection
03/31/2022 23:46:31;0001;pbs_mom;Svr;pbs_mom;No such file or directory (2) in is_child_path, Failed to allocate memory
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;Job files not copied:---->>>>
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;Unable to copy file /var/spool/pbs/spool/2906.hep-node0.ER to comcomproxy1.com.com:/dev/null
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;>>> error from copy
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;comcomproxy1.com.com: Connection timed out
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;proxy1.com.com, user ali_0, command scp -v -r -p -t /dev/null
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;debug1: Reading configuration data /etc/ssh/ssh_config
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;debug1: /etc/ssh/ssh_config line 61: Applying options for *
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;debug1: Connecting to comcomproxy1.com.com [45.11.57.36] port 22.
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;debug1: connect to address 45.11.57.36 port 22: Connection timed out
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;ssh: connect to host comcomproxy1.com.com port 22: Connection timed out
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;lost connection
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;>>> end error output
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;Output retained on that host in: /var/spool/pbs/undelivered/2906.hep-node0.ER
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;---->>>>
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;staged 2 items out over 0:09:02
03/31/2022 23:46:31;0008;pbs_mom;Job;2906.hep-node0;no active tasks
03/31/2022 23:46:31;0008;pbs_mom;Job;2921.hep-node0;no active tasks
03/31/2022 23:46:31;0080;pbs_mom;Req;req_reject;Reject reply code=15051, aux=0, type=54, from root@192.168.1.1:15001
03/31/2022 23:46:31;0100;pbs_mom;Job;2906.hep-node0;Obit sent
03/31/2022 23:46:31;0100;pbs_mom;Req;;Type 6 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:46:31;0080;pbs_mom;Job;2906.hep-node0;delete job request received
03/31/2022 23:46:31;0008;pbs_mom;Job;2906.hep-node0;kill_job
03/31/2022 23:46:31;0100;pbs_mom;Req;;Type 6 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:46:31;0080;pbs_mom;Job;2906.hep-node0;delete job request received
03/31/2022 23:46:31;0080;pbs_mom;Req;req_reject;Reject reply code=15001, aux=0, type=6, from root@192.168.1.1:15001
03/31/2022 23:46:31;0008;pbs_mom;Job;2921.hep-node0;no active tasks
03/31/2022 23:46:31;0100;pbs_mom;Req;;Type 1 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:46:31;0100;pbs_mom;Req;;Type 3 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:46:31;0100;pbs_mom;Req;;Type 5 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:46:31;0008;pbs_mom;Job;2956.hep-node0;Started, pid = 26231
03/31/2022 23:47:58;0080;pbs_mom;Fil;sys_copy;command: /opt/pbs/sbin/pbs_rcp -rp /var/spool/pbs/spool/2921.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=2
03/31/2022 23:50:16;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2921.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=3
03/31/2022 23:50:42;0008;pbs_mom;Job;2921.hep-node0;no active tasks
03/31/2022 23:50:42;0080;pbs_mom;Job;2956.hep-node0;task 00000001 terminated
03/31/2022 23:50:42;0008;pbs_mom;Job;2956.hep-node0;Terminated
03/31/2022 23:50:42;0100;pbs_mom;Job;2956.hep-node0;task 00000001 cput=00:04:11
03/31/2022 23:50:42;0008;pbs_mom;Job;2956.hep-node0;kill_job
03/31/2022 23:50:42;0100;pbs_mom;Job;2956.hep-node0;hep-node0 cput=00:04:11 mem=12708kb
03/31/2022 23:50:42;0100;pbs_mom;Job;2956.hep-node0;Obit sent
03/31/2022 23:50:42;0100;pbs_mom;Req;;Type 54 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:50:42;0080;pbs_mom;Job;2956.hep-node0;copy file request received
03/31/2022 23:52:24;0080;pbs_mom;Fil;sys_copy;command: /opt/pbs/sbin/pbs_rcp -rp /var/spool/pbs/spool/2921.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=4
03/31/2022 23:52:45;0001;pbs_mom;Fil;copy_file;sys_copy failed with status=1
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;Unable to copy file /var/spool/pbs/spool/2921.hep-node0.OU to comcomproxy1.com.com:/dev/null
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;comcomproxy1.com.com: Connection timed out
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;proxy1.com.com, user ali_0, command scp -v -r -p -t /dev/null
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;debug1: Reading configuration data /etc/ssh/ssh_config
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;debug1: /etc/ssh/ssh_config line 61: Applying options for *
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;debug1: Connecting to comcomproxy1.com.com [45.11.57.36] port 22.
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;debug1: connect to address 45.11.57.36 port 22: Connection timed out
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;ssh: connect to host comcomproxy1.com.com port 22: Connection timed out
03/31/2022 23:52:45;0004;pbs_mom;Fil;2921.hep-node0.OU;lost connection
03/31/2022 23:52:45;0001;pbs_mom;Svr;pbs_mom;No such file or directory (2) in is_child_path, Failed to allocate memory
03/31/2022 23:52:50;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2956.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=1
03/31/2022 23:54:52;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2921.hep-node0.ER ali_0@comcomproxy1.com.com:/dev/null status=1, try=1
03/31/2022 23:54:57;0080;pbs_mom;Fil;sys_copy;command: /opt/pbs/sbin/pbs_rcp -rp /var/spool/pbs/spool/2956.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=2
03/31/2022 23:56:59;0080;pbs_mom;Fil;sys_copy;command: /opt/pbs/sbin/pbs_rcp -rp /var/spool/pbs/spool/2921.hep-node0.ER ali_0@comcomproxy1.com.com:/dev/null status=1, try=2
03/31/2022 23:57:15;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2956.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=3
03/31/2022 23:57:23;0008;pbs_mom;Job;2921.hep-node0;no active tasks
03/31/2022 23:57:23;0008;pbs_mom;Job;2956.hep-node0;no active tasks
03/31/2022 23:57:23;0080;pbs_mom;Job;2937.hep-node0;task 00000001 terminated
03/31/2022 23:57:23;0008;pbs_mom;Job;2937.hep-node0;Terminated
03/31/2022 23:57:23;0100;pbs_mom;Job;2937.hep-node0;task 00000001 cput=00:23:19
03/31/2022 23:57:23;0008;pbs_mom;Job;2937.hep-node0;kill_job
03/31/2022 23:57:23;0100;pbs_mom;Job;2937.hep-node0;hep-node0 cput=00:23:19 mem=12944kb
03/31/2022 23:57:23;0100;pbs_mom;Job;2937.hep-node0;Obit sent
03/31/2022 23:57:23;0100;pbs_mom;Req;;Type 54 request received from root@192.168.1.1:15001, sock=1
03/31/2022 23:57:23;0080;pbs_mom;Job;2937.hep-node0;copy file request received
03/31/2022 23:59:17;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2921.hep-node0.ER ali_0@comcomproxy1.com.com:/dev/null status=1, try=3
03/31/2022 23:59:23;0080;pbs_mom;Fil;sys_copy;command: /opt/pbs/sbin/pbs_rcp -rp /var/spool/pbs/spool/2956.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=4
03/31/2022 23:59:30;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2937.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=1
03/31/2022 23:59:44;0001;pbs_mom;Fil;copy_file;sys_copy failed with status=1
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;Unable to copy file /var/spool/pbs/spool/2956.hep-node0.OU to comcomproxy1.com.com:/dev/null
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;comcomproxy1.com.com: Connection timed out
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;proxy1.com.com, user ali_0, command scp -v -r -p -t /dev/null
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;debug1: Reading configuration data /etc/ssh/ssh_config
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;debug1: /etc/ssh/ssh_config line 61: Applying options for *
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;debug1: Connecting to comcomproxy1.com.com [45.11.57.36] port 22.
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;debug1: connect to address 45.11.57.36 port 22: Connection timed out
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;ssh: connect to host comcomproxy1.com.com port 22: Connection timed out
03/31/2022 23:59:44;0004;pbs_mom;Fil;2956.hep-node0.OU;lost connection
03/31/2022 23:59:44;0001;pbs_mom;Svr;pbs_mom;No such file or directory (2) in is_child_path, Failed to allocate memory
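
In case it helps anyone reading the log, here is a minimal Python sketch of how the sys_copy records above could be tallied to see which spool files failed to copy and how many tries were made. It assumes the semicolon-separated layout seen in the lines above (timestamp;code;daemon;type;id;message). The function name failed_copies and the sample list are only illustrative and are not part of PBS.

```python
import re

# Matches the tail of a sys_copy message: "<src> <dest> status=N, try=M".
SYS_COPY_TAIL = re.compile(r"(\S+) (\S+) status=(\d+), try=(\d+)$")


def failed_copies(lines):
    """Return {source_path: highest failed try number} for sys_copy records."""
    worst = {}
    for line in lines:
        # Assumed record layout: timestamp;code;daemon;type;id;message
        fields = line.rstrip("\n").split(";", 5)
        if len(fields) != 6 or fields[4] != "sys_copy":
            continue
        match = SYS_COPY_TAIL.search(fields[5])
        if match is None:
            continue
        src, _dest, status, attempt = match.groups()
        # A non-zero status means the copy command failed on that try.
        if int(status) != 0:
            worst[src] = max(worst.get(src, 0), int(attempt))
    return worst


# Two records copied from the log above, used as sample input.
sample = [
    "03/31/2022 23:47:58;0080;pbs_mom;Fil;sys_copy;command: /opt/pbs/sbin/pbs_rcp -rp /var/spool/pbs/spool/2921.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=2",
    "03/31/2022 23:52:50;0080;pbs_mom;Fil;sys_copy;command: /bin/scp -Brvp /var/spool/pbs/spool/2956.hep-node0.OU ali_0@comcomproxy1.com.com:/dev/null status=1, try=1",
]

for path, tries in failed_copies(sample).items():
    print(path, "failed, highest try seen:", tries)
```

To run it against a real file, the lines of an open mom_log file could be passed in place of the sample list.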