Hello,
I recently spent some time reviewing the mom_logs and would like to understand what exactly is going on in the following example output:
10/28/2018 10:27:13;0100;pbs_mom;Req;;Type 1 request received from root@10.30.255.254:15001, sock=8
10/28/2018 10:27:13;0100;pbs_mom;Req;;Type 3 request received from root@10.30.255.254:15001, sock=8
10/28/2018 10:27:13;0100;pbs_mom;Req;;Type 5 request received from root@10.30.255.254:15001, sock=8
10/28/2018 10:27:13;0080;pbs_mom;Job;76042.bright01-thx;running prologue
10/28/2018 10:27:13;0008;pbs_mom;Job;76042.bright01-thx;Started, pid = 97236
10/28/2018 11:27:46;0080;pbs_mom;Job;76042.bright01-thx;task 00000001 terminated
10/28/2018 11:27:46;0008;pbs_mom;Job;76042.bright01-thx;Terminated
10/28/2018 11:27:46;0100;pbs_mom;Job;76042.bright01-thx;task 00000001 cput=20:09:57
10/28/2018 11:27:46;0008;pbs_mom;Job;76042.bright01-thx;kill_job
10/28/2018 11:27:46;0100;pbs_mom;Job;76042.bright01-thx;node0109 cput=20:09:57 mem=63093660kb
10/28/2018 11:27:46;0008;pbs_mom;Job;76042.bright01-thx;no active tasks
10/28/2018 11:27:46;0100;pbs_mom;Job;76042.bright01-thx;Obit sent
10/28/2018 11:27:46;0100;pbs_mom;Req;;Type 54 request received from root@10.30.255.254:15001, sock=8
10/28/2018 11:27:46;0080;pbs_mom;Job;76042.bright01-thx;copy file request received
10/28/2018 11:28:46;0100;pbs_mom;Job;76042.bright01-thx;staged 1 items out over 0:01:00
10/28/2018 11:28:46;0008;pbs_mom;Job;76042.bright01-thx;no active tasks
10/28/2018 11:28:49;0100;pbs_mom;Req;;Type 6 request received from root@10.30.255.254:15001, sock=8
10/28/2018 11:28:49;0080;pbs_mom;Job;76042.bright01-thx;delete job request received
10/28/2018 11:28:49;0008;pbs_mom;Job;76042.bright01-thx;kill_job
A few additional questions: why are there multiple kill_job entries? What does "task 00000001 terminated" mean?
In another example output I notice there are four kill_job entries. Is there a reason why?
10/28/2018 06:23:28;0002;pbs_mom;Svr;Log;Log opened
10/28/2018 06:23:28;0002;pbs_mom;Svr;pbs_mom;pbs_version=14.1.2
10/28/2018 06:23:28;0002;pbs_mom;Svr;pbs_mom;pbs_build=mach=N/A:security=N/A:configure_args=N/A
10/28/2018 06:23:28;0008;pbs_mom;Job;73337.bright01-thx;walltime 86410 exceeded limit 86340
10/28/2018 06:23:28;0008;pbs_mom;Job;73337.bright01-thx;kill_job
10/28/2018 06:23:28;0080;pbs_mom;Job;73337.bright01-thx;task 00000001 terminated
10/28/2018 06:23:38;0008;pbs_mom;Job;73337.bright01-thx;kill_job
10/28/2018 06:23:38;0080;pbs_mom;Job;73337.bright01-thx;task 00000001 force exited
10/28/2018 06:23:38;0008;pbs_mom;Job;73337.bright01-thx;Terminated
10/28/2018 06:23:38;0100;pbs_mom;Job;73337.bright01-thx;task 00000001 cput=479:53:01
10/28/2018 06:23:38;0008;pbs_mom;Job;73337.bright01-thx;kill_job
10/28/2018 06:23:38;0100;pbs_mom;Job;73337.bright01-thx;node0109 cput=479:53:01 mem=9697960kb
10/28/2018 06:23:38;0008;pbs_mom;Job;73337.bright01-thx;no active tasks
10/28/2018 06:23:38;0100;pbs_mom;Job;73337.bright01-thx;Obit sent
10/28/2018 06:23:38;0100;pbs_mom;Req;;Type 54 request received from root@10.30.255.254:15001, sock=8
10/28/2018 06:23:38;0080;pbs_mom;Job;73337.bright01-thx;copy file request received
10/28/2018 06:23:42;0100;pbs_mom;Job;73337.bright01-thx;staged 2 items out over 0:00:04
10/28/2018 06:23:42;0008;pbs_mom;Job;73337.bright01-thx;no active tasks
10/28/2018 06:23:44;0100;pbs_mom;Req;;Type 6 request received from root@10.30.255.254:15001, sock=8
10/28/2018 06:23:44;0080;pbs_mom;Job;73337.bright01-thx;delete job request received
10/28/2018 06:23:44;0008;pbs_mom;Job;73337.bright01-thx;kill_job
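
To compare jobs, the kill_job entries can be tallied per job ID. The following is only a minimal sketch, assuming every line follows the semicolon-separated layout seen in the excerpts above (timestamp;event code;daemon;object type;object name;message). The function name count_kill_job and the sample list are made up for illustration and are not part of PBS.

```python
from collections import Counter


def count_kill_job(lines):
    """Count 'kill_job' messages per job ID in pbs_mom log lines."""
    counts = Counter()
    for line in lines:
        # Assumed layout: timestamp;event code;daemon;object type;object name;message
        # Split at most 5 times so semicolons inside the message are preserved.
        fields = line.rstrip("\n").split(";", 5)
        if len(fields) != 6:
            continue  # skip lines that do not match the assumed layout
        _timestamp, _code, _daemon, obj_type, obj_name, message = fields
        if obj_type == "Job" and message == "kill_job":
            counts[obj_name] += 1
    return counts


# A few lines copied from the second excerpt above.
sample = [
    "10/28/2018 06:23:28;0008;pbs_mom;Job;73337.bright01-thx;walltime 86410 exceeded limit 86340",
    "10/28/2018 06:23:28;0008;pbs_mom;Job;73337.bright01-thx;kill_job",
    "10/28/2018 06:23:28;0080;pbs_mom;Job;73337.bright01-thx;task 00000001 terminated",
    "10/28/2018 06:23:38;0008;pbs_mom;Job;73337.bright01-thx;kill_job",
    "10/28/2018 06:23:38;0008;pbs_mom;Job;73337.bright01-thx;kill_job",
    "10/28/2018 06:23:44;0100;pbs_mom;Req;;Type 6 request received from root@10.30.255.254:15001, sock=8",
    "10/28/2018 06:23:44;0008;pbs_mom;Job;73337.bright01-thx;kill_job",
]

for job_id, n in count_kill_job(sample).items():
    print(job_id, n)
```

To run it against a real log, the file can be passed directly, e.g. count_kill_job(open(path)), where path points at the mom_logs file of interest.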