Warning: PBS Professional has detected core file(s) in PBS_HOME that require attention!

I see this warning in my log files, but I can’t find any obvious documentation on what it means.

Warning: PBS Professional has detected core file(s) in PBS_HOME that require attention!!!

There should be a core dump file present in your PBS_HOME directory.
Move the core* file(s) to some other location, or delete them, and then restart the PBS services. You should not see that message again afterwards.
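
The cleanup described above can be sketched as a small shell helper. This is only an illustration: archive_pbs_cores is a hypothetical function (not part of PBS), it assumes the cores sit directly inside the *_priv directories under PBS_HOME, and the destination path in the usage note is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: move core* files out of the PBS_HOME priv directories
# so they can be analysed later instead of being deleted outright.
archive_pbs_cores() {
    pbs_home=$1   # e.g. /var/spool/pbs
    dest=$2       # directory that will receive the cores (assumed writable)
    mkdir -p "$dest" || return 1
    for dir in "$pbs_home"/*_priv; do
        [ -d "$dir" ] || continue
        for core in "$dir"/core*; do
            [ -f "$core" ] || continue
            # Prefix with the priv directory name so cores from
            # different daemons do not overwrite each other.
            mv "$core" "$dest/$(basename "$dir")-$(basename "$core")"
        done
    done
}
```

Run as root it might be called as archive_pbs_cores /var/spool/pbs /root/pbs-cores, followed by a restart of the PBS services using whatever your installation provides (on many systems that is systemctl restart pbs or /etc/init.d/pbs restart, but check your own setup).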

Hey @datakid
Before you delete the cores, could you give me a little bit of information about them?

  1. Find the cores. They’ll be in one of the priv directories in PBS_HOME (e.g. /var/spool/pbs/server_priv).
  2. Run file on each core. This will tell you which daemon crashed. It’ll likely be the daemon whose priv directory you are in.
  3. Run: gdb /opt/pbs/sbin/ /var/spool/pbs//core
  4. At the gdb prompt, run ‘bt’ and post the output to the thread.

This will help us debug what happened and possibly fix a bug.
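
The file and gdb steps above could be scripted roughly as follows. This is a sketch under assumptions: execfn_of and core_backtrace are hypothetical helper names, and the parsing relies on file(1) printing an execfn: '...' field for the core, which not every version of file does.

```shell
#!/bin/sh
# Read file(1) output on stdin and print the executable path
# taken from the "execfn: '...'" field, if present.
execfn_of() {
    sed -n "s/.*execfn: '\([^']*\)'.*/\1/p"
}

# Print a backtrace for one core file without an interactive gdb session.
core_backtrace() {
    core=$1
    bin=$(file "$core" | execfn_of)
    if [ -z "$bin" ]; then
        echo "could not determine the binary for $core" >&2
        return 1
    fi
    # -batch exits after running the commands given with -ex.
    gdb -batch -ex bt "$bin" "$core"
}
```

For example, core_backtrace /var/spool/pbs/server_priv/core_0001 > core_0001.bt would save the backtrace to a file that can be pasted into the thread.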

Thanks,
Bhroam

Which binaries created cores

[user@clive server_priv]# pwd
/var/spool/pbs/server_priv
[user@clive server_priv]# ls | grep core
core_0001
core_0002
[user@clive server_priv]# file core_0001 
core_0001: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/opt/pbs/sbin/pbs_server.bin -t create -a 0', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/opt/pbs/sbin/pbs_server.bin', platform: 'x86_64'
[user@clive server_priv]# file core_0002
core_0002: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/opt/pbs/sbin/pbs_server.bin', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/opt/pbs/sbin/pbs_server.bin', platform: 'x86_64'

GDB on core_0001

[user@clive server_priv]# gdb /opt/pbs/sbin/pbs_server.bin /var/spool/pbs/server_priv/core_0001 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Reading symbols from /opt/pbs/sbin/pbs_server.bin...Reading symbols from /opt/pbs/sbin/pbs_server.bin...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 12542]
[New LWP 12543]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/opt/pbs/sbin/pbs_server.bin -t create -a 0'.
Program terminated with signal 6, Aborted.
#0  0x00007f0ee70c3337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install pbspro-server-19.1.3-0.x86_64
(gdb) bt
#0  0x00007f0ee70c3337 in raise () from /lib64/libc.so.6
#1  0x00007f0ee70c4a28 in abort () from /lib64/libc.so.6
#2  0x00007f0ee7105e87 in __libc_message () from /lib64/libc.so.6
#3  0x00007f0ee710e679 in _int_free () from /lib64/libc.so.6
#4  0x00000000004d43ce in pg_db_save_svr ()
#5  0x000000000048c944 in svr_save_db ()
#6  0x000000000042970c in main ()
(gdb) 

GDB on core_0002

[user@clive server_priv]# gdb /opt/pbs/sbin/pbs_server.bin /var/spool/pbs/server_priv/core_0002 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Reading symbols from /opt/pbs/sbin/pbs_server.bin...Reading symbols from /opt/pbs/sbin/pbs_server.bin...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 13160]
[New LWP 13162]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/opt/pbs/sbin/pbs_server.bin'.
Program terminated with signal 6, Aborted.
#0  0x00007fe84de73337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install pbspro-server-19.1.3-0.x86_64
(gdb) bt
#0  0x00007fe84de73337 in raise () from /lib64/libc.so.6
#1  0x00007fe84de74a28 in abort () from /lib64/libc.so.6
#2  0x00007fe84deb5e87 in __libc_message () from /lib64/libc.so.6
#3  0x00007fe84debe679 in _int_free () from /lib64/libc.so.6
#4  0x00000000004d43ce in pg_db_save_svr ()
#5  0x000000000048c944 in svr_save_db ()
#6  0x000000000042970c in main ()
(gdb)
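
Both backtraces show the same call chain: main, svr_save_db, pg_db_save_svr, then an abort raised from glibc's _int_free, which usually means glibc detected heap corruption such as a double or invalid free. A quick way to confirm that two cores share a call chain is to strip the addresses and compare the function names only. The helper names below (bt_frames, same_crash) are hypothetical, and the sketch assumes each backtrace was saved to a text file.

```shell
#!/bin/sh
# Reduce gdb "bt" output (stdin) to one function name per frame,
# dropping frame numbers, addresses and library paths.
bt_frames() {
    awk '/^#[0-9]+ / {
        name = $2
        for (i = 2; i < NF; i++) if ($i == "in") { name = $(i + 1); break }
        print name
    }'
}

# Succeed if the two saved backtraces have identical frame names.
same_crash() {
    [ "$(bt_frames < "$1")" = "$(bt_frames < "$2")" ]
}
```

With the two backtraces saved as core_0001.bt and core_0002.bt, same_crash core_0001.bt core_0002.bt && echo "same call chain" would report whether the frames match.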

Thanks for the info. I’ll file a bug for this. What version are you running?

Bhroam

[user@clive pbs]# pbsrun --version
pbs_version = 19.1.3

Thanks for the reply. I’ve filed the ticket. It may already be fixed in master. There have been many bugs fixed in master since 19.1.3 was released.

The server crashed when trying to save to the database. I’m not an expert on the database code, so I’ll have to let someone else jump in.

Bhroam

Has anyone determined what this problem was? I ran into the same issue with sched_priv.