An In-built Heap Profiling Feature and Post-processing Tool for OpenPBS

tl;dr: I have implemented an in-built heap profiling feature and a post-processing tool.

Hi,

A few years ago, while hunting a memory leak in a PBS daemon, I had this idea, and I finally found the time to implement it as a hobby project. I am sharing it here so that it can be useful to the community. Sorry for the long post.

It’s a run-time heap profiling infrastructure that can sit in a production environment without impacting performance (like the undolr integration) and can be triggered as and when unreasonable memory usage is noticed. The feature lets developers visualize how much heap memory is being consumed, and from which places in the code base, similar to Valgrind’s massif tool (but NOT 20 times slower to run!). I have named it the “heap track” feature. It sounds very simple - override all the heap calls with a wrapper and use a few avl trees to keep track of all the allocs and frees - but it is tricky to implement. In this post I will explain the feature: how to build it, activate it, and use it to get hints about memory usage patterns with the post-processing script that was developed along with it. Please do give it a try, and if the community / maintainers indicate this feature sounds logical and comfortable enough to be integrated into OpenPBS master, I would be happy to raise a PR and create doc pages; if not, you may back up the patch for any future use.
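
To make the wrapper idea concrete, below is a minimal sketch of the general approach. This is my illustration only, not the actual patch: the ht_malloc/ht_free names are hypothetical, and a flat table stands in for the avl trees. Every allocation goes through a wrapper that records the invocation stack (via backtrace()) and the size; every free drops the record; a dump is then just a walk over whatever is still live.

#include <execinfo.h>   /* backtrace() */
#include <stdio.h>
#include <stdlib.h>

#define HT_DEPTH 5      /* analogue of PBS_HEAP_TRACK_STACK_DEPTH */
#define HT_MAX   4096   /* sketch only: fixed-size table instead of an avl tree */

struct ht_rec {
    void  *ptr;             /* address handed to the caller */
    size_t size;            /* bytes allocated */
    void  *stack[HT_DEPTH]; /* malloc invocation stack */
};
static struct ht_rec ht_tab[HT_MAX];

void *ht_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < HT_MAX; i++) {
        if (ht_tab[i].ptr == NULL) {        /* first free slot */
            ht_tab[i].ptr = p;
            ht_tab[i].size = size;
            backtrace(ht_tab[i].stack, HT_DEPTH);
            break;
        }
    }
    return p;
}

void ht_free(void *p)
{
    for (int i = 0; i < HT_MAX; i++) {
        if (ht_tab[i].ptr == p) {           /* drop the record */
            ht_tab[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* Dump all live records; the real feature aggregates them per unique
 * stack, which is what produces the "addr:...:addr::bytes" lines shown
 * later in this post. */
void ht_dump(FILE *fp)
{
    for (int i = 0; i < HT_MAX; i++) {
        if (ht_tab[i].ptr == NULL)
            continue;
        for (int d = 0; d < HT_DEPTH; d++)
            fprintf(fp, "%p:", ht_tab[i].stack[d]);
        fprintf(fp, ":%zu\n", ht_tab[i].size);
    }
}

With avl trees keyed on the pointer, the per-call bookkeeping becomes O(log n) instead of this sketch’s linear scans, which is where the small runtime penalty mentioned in the caveats below comes from.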

With this feature you can:

  • Figure out what is in the program heap at any given time
  • Locate memory leaks
  • Find places that do a lot of allocation

I have stashed a finished, usable version of this feature in my GitHub branch here; it is based on the v20.0.1 tag of OpenPBS.

Below are the steps to demonstrate its usage

1. Build OpenPBS with the heap track feature

export CFLAGS="-g"
./autogen.sh && ./configure --prefix=/opt/pbs --enable-heap-track
make
sudo make install
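
As a rough idea of how such a configure switch plugs in, here is a simplified sketch using the hypothetical ht_malloc/ht_free names from the sketch above; the actual macro and wrapper names in the patch may differ:

/* Sketch only: with --enable-heap-track, allocations route through the
 * tracking wrappers; without it, they compile straight down to the
 * libc calls, so a build without the flag pays no cost at all. */
#ifdef PBS_HEAP_TRACK
#define MALLOC(sz) ht_malloc(sz)
#define FREE(p)    ht_free(p)
#else
#define MALLOC(sz) malloc(sz)
#define FREE(p)    free(p)
#endif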

2. Activate

export PBS_HEAP_TRACK_ENABLED=true
# disable runtime heap tracking by commenting out the above line

# optionally set the stack depth to record (default is 5)
# export PBS_HEAP_TRACK_STACK_DEPTH=8

/etc/init.d/pbs start
# or whatever way you use to start pbs daemons

Use pgrep to check that the daemons have started:

[root@pbsserver shrini]# pgrep pbs -a                                                                                
12948 /opt/pbs/sbin/pbs_comm
12965 /opt/pbs/sbin/pbs_mom
12969 /opt/pbs/sbin/pbs_sched
13251 /opt/pbs/sbin/pbs_ds_monitor monitor
13342 /opt/pbs/sbin/pbs_server.bin
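
The runtime switch itself is presumably just an environment check during daemon startup; here is a sketch of the idea (illustrative names, not the patch’s actual code):

#include <stdlib.h>
#include <string.h>

static int ht_enabled;          /* runtime on/off switch */
static int ht_stack_depth = 5;  /* default recorded stack depth */

/* Sketch: called once during daemon initialization. */
void ht_init_from_env(void)
{
    const char *v = getenv("PBS_HEAP_TRACK_ENABLED");
    ht_enabled = (v != NULL && strcmp(v, "true") == 0);

    v = getenv("PBS_HEAP_TRACK_STACK_DEPTH");
    if (v != NULL)
        ht_stack_depth = atoi(v);
}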

3. Trigger a heap profile dump

To dump heap profile data from any PBS daemon, just send it a SIGUSR1, as shown below:

[root@pbsserver shrini]# kill -USR1 13342
[shrini@pbsserver ~]$ ls -l /var/spool/pbs/server_logs/ 
total 244
-rw-r--r--. 1 root root 80078 Jul 13 04:10 20210713
-rw-r--r--. 1 root root  5008 Jul 13 03:57 Server@pbsserver_heap_track.12130.0713035736
-rw-r--r--. 1 root root 12414 Jul 13 03:57 Server@pbsserver_heap_track.12494.0713035740
-rw-r--r--. 1 root root  7485 Jul 13 03:57 Server@pbsserver_heap_track.12970.0713035747
-rw-r--r--. 1 root root 19792 Jul 13 04:10 Server@pbsserver_heap_track.13342.0713041034

Sending the USR1 signal to the pbs_server daemon generated the heap track dump file Server@pbsserver_heap_track.13342.0713041034 inside the server’s log directory.
Notice that the filename format of the dump file is <daemon_name>_heap_track.<pid>.<mmddHHMMSS>
The file starts with some header info, followed by one record per unique allocation stack: the colon-separated malloc invocation stack, then “::”, then the number of bytes currently allocated from that stack, as shown below:

[shrini@pbsserver ~]$ head -n14 /var/spool/pbs/server_logs/Server@pbsserver_heap_track.13342.0713041034 
# PBS Heap Usage diagnostic dump
# Dump Time: Tue Jul 13 04:10:34 2021

# Daemon: Server@pbsserver
# Host: pbsserver
# PID: 13342
# Stack Depth: 5

Total Heap memory: 994293 Bytes
location wise usage dump START
Malloc invocation stack:allocated Bytes
0x498613:0x4b85c6:0x45f565:0x464dfc:0x2b6a3ffc1555::1920
0x535816:0x547d3d:0x534e17:0x5352b3:0x53654a::0
0x535816:0x547d3d:0x534498:0x497478:0x497801::0
...
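
The trigger side can be as small as a SIGUSR1 handler. Below is a sketch of one safe way to wire it (again illustrative, not the patch’s actual code): since file I/O is not async-signal-safe, the handler only raises a flag, and the daemon’s main loop does the writing.

#include <signal.h>
#include <stdio.h>

void ht_dump(FILE *fp);         /* from the wrapper sketch earlier */

static volatile sig_atomic_t ht_dump_requested;

static void ht_sigusr1(int sig)
{
    (void)sig;
    ht_dump_requested = 1;      /* raise the flag, nothing else */
}

void ht_install_trigger(void)
{
    signal(SIGUSR1, ht_sigusr1);
}

/* Polled from the daemon's main loop. The real dump goes into the
 * daemon's log directory with the filename format described above;
 * the name here is a placeholder. */
void ht_maybe_dump(void)
{
    if (!ht_dump_requested)
        return;
    ht_dump_requested = 0;

    FILE *fp = fopen("heap_track.dump", "w");
    if (fp != NULL) {
        ht_dump(fp);
        fclose(fp);
    }
}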

4. Post-process

There is a Python script installed to help the developer visualize the profiled data and resolve the addresses to symbols:

[shrini@pbsserver ~]$ which pbs_heap_track_analyze.py 
/opt/pbs/bin/pbs_heap_track_analyze.py
[shrini@pbsserver ~]$ pbs_heap_track_analyze.py                                                                       

Usage: pbs_heap_track_analyze.py [-b <debug binary path>] [-a <addr2line_path>]
        [-w] [-L depth_level] [-z] [-r] <heap_track dump file> [<heap_track dump file2>]

        post-processes a <heap_track> dump file by resolving source code lines
        using "addr2line" and other processing to make the dump human readable.
        If two dump files are provided, then they will be diffed to output the
        locations of memory differences to help in detecting leaks

Options:
        -b <debug binary path> : to supply the path to the debug binary to be used with addr2line
        -a <addr2line_path> : to supply the path to the addr2line utility
        -w : write the analysis to a neighboring file (by appending ".view") instead of default stdout
        -L depth_level: number of stack levels to deduce and analyze from the
                recorded stack depth in the dump file. A "0" level arg will deduce
                all stack depth levels down to 1. Default is 1
        -z : Show stacks that have 0 allocations
        -r : Show numbers in raw instead of human readable format (in ascii graph mode)

Copyright (C) <2021>  <Shrinivas "shrini" Harapanahalli>

Let’s invoke this tool on the dump file we got. But before that, for a better view, you can install the ascii_graph Python module using the command pip install ascii_graph

[shrini@pbsserver server_logs]$ pbs_heap_track_analyze.py Server@pbsserver_heap_track.13342.0713041034
input file parse completed ! : Server@pbsserver_heap_track.13342.0713041034
found 390 stacks with stack depth 5
lines read = 402 from /var/spool/pbs/server_logs/Server@pbsserver_heap_track.13342.0713041034

Analysis Report:
# PBS Heap Usage diagnostic dump
# Dump Time: Tue Jul 13 04:10:34 2021

# Daemon: Server@pbsserver
# Host: pbsserver
# PID: 13342
# Stack Depth: 5

Total Heap memory: 994293 Bytes

Allocations at Stack depth 5 :
######################################################################################################################
████████████████████████████████████████████████████████████  244K  25.13%  0x4e4a77(crc_file at ??:?)<--0x4ee42c(hook_recov at ??:?)<--0x4604b6(pbsd_init at pbsd_init.c:1009)<--0x464dfc(main at pbsd_main.c:1439)<--0x2b6a3ffc1555(?? ??:0)
███████████████████████████████████████████████               192K  19.77%  0x4cb562(tpp_em_init at tpp_em.c:136)<--0x4f4f06(init_poll_context at net_server.c:1158)<--0x4f3a49(init_network_add at net_server.c:335)<--0x46504d(main at pbsd_main.c:1509)<--0x2b6a3ffc1555(?? ??:0)
███████████████████████████████████████████████               192K  19.77%  0x4cb562(tpp_em_init at tpp_em.c:136)<--0x4f4ec7(init_poll_context at net_server.c:1153)<--0x4f3a49(init_network_add at net_server.c:335)<--0x46504d(main at pbsd_main.c:1509)<--0x2b6a3ffc1555(?? ??:0)
███████████████████████████████████████████████               192K  19.77%  0x4cb562(tpp_em_init at tpp_em.c:136)<--0x4cc63c(tpp_transport_init at tpp_transport.c:581)<--0x4c493f(tpp_init at tpp_client.c:814)<--0x46533f(main at pbsd_main.c:1565)<--0x2b6a3ffc1555(?? ??:0)
█████████████                                                  53K  5.47%  0x460818(pbsd_init at pbsd_init.c:1099)<--0x464dfc(main at pbsd_main.c:1439)<--0x2b6a3ffc1555(?? ??:0)<--0x428849(_start at ??:?)
...

The script automatically used the server binary and the addr2line utility to resolve the addresses to source code locations. Please feel free to try out the options mentioned in the usage. You can even get multiple profiles at decreasing stack levels by using the -L option, with -L 0 covering all possible stack depths!

5. Diff two dumps to find leaks

A single heap dump gives an idea of the heap spread, but it is not easy to pinpoint a leak unless there is a huge, obvious one. Hence the script can take a second dump file from the same daemon process and diff the two, showing you exactly where the allocations changed. To demo this, let’s take two dumps, one before and one after submitting a job:

[root@pbsserver server_logs]# pkill -USR1 pbs_server
[root@pbsserver server_logs]# ls -l
-rw-r--r--. 1 root root  20127 Jul 13 04:39 Server@pbsserver_heap_track.13342.0713043910
[shrini@pbsserver ~]$ qsub  -- /bin/sleep 10000
0.pbsserver
[shrini@pbsserver ~]$ qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
0.pbsserver       STDIN            shrini            00:00:00 R workq           

[root@pbsserver server_logs]# pkill -USR1 pbs_server
[root@pbsserver server_logs]# ls -l
-rw-r--r--. 1 root root  20127 Jul 13 04:39 Server@pbsserver_heap_track.13342.0713043910
-rw-r--r--. 1 root root  25676 Jul 13 04:39 Server@pbsserver_heap_track.13342.0713043928

Now feed both dump files to the post-processing script to get a diff:

[shrini@pbsserver server_logs]$ pbs_heap_track_analyze.py Server@pbsserver_heap_track.13342.0713043910 Server@pbsserver_heap_track.13342.0713043928
input file parse completed ! : Server@pbsserver_heap_track.13342.0713043928
found 511 stacks with stack depth 5
lines read = 523 from /var/spool/pbs/server_logs/Server@pbsserver_heap_track.13342.0713043928
input file parse completed ! : Server@pbsserver_heap_track.13342.0713043910
found 397 stacks with stack depth 5
lines read = 409 from /var/spool/pbs/server_logs/Server@pbsserver_heap_track.13342.0713043910
 
Analysis Report:
# Daemon: Server@pbsserver
# Host: pbsserver
# PID: 13342
# Stack Depth: 5

DIFF between dump files:
Server@pbsserver_heap_track.13342.0713043928 <<-->> Server@pbsserver_heap_track.13342.0713043910
Allocations at Stack depth 5 :
######################################################################################################################
████████████████████████████████████████████████████████████   8K  0x535816(dis_setup_chan at dis_helpers.c:1167)<--0x547d3d(tcp_get_chan at tcp_dis.c:86)<--0x534e17(dis_get_readbuf at dis_helpers.c:605)<--0x5352b3(dis_getc at dis_helpers.c:885)<--0x53654a(disrsl_ at disrsl_.c:81)
████████████████████████████████████████████████████████████   8K  0x535862(dis_setup_chan at dis_helpers.c:1170)<--0x547d3d(tcp_get_chan at tcp_dis.c:86)<--0x534e17(dis_get_readbuf at dis_helpers.c:605)<--0x5352b3(dis_getc at dis_helpers.c:885)<--0x53654a(disrsl_ at disrsl_.c:81)
████████████████████████████████████████████████               6K  0x444a34(job_alloc at job_func.c:358)<--0x48191d(req_quejob at req_quejob.c:508)<--0x467c61(dispatch_request at process_request.c:705)<--0x467a6d(process_request at process_request.c:591)<--0x4f3fb0(process_socket at net_server.c:522)
██████████                                                     1K  0x4ad780(set_select_and_place at svr_jobfunc.c:2028)<--0x4ae8ff(set_resc_deflt at svr_jobfunc.c:2491)<--0x4ac717(svr_chkque at svr_jobfunc.c:1431)<--0x4826a6(req_quejob at req_quejob.c:1051)<--0x467c61(dispatch_request at process_request.c:705)
████████                                                       1K  0x4da49c(attrlist_alloc at attr_func.c:290)<--0x4da5f4(attrlist_create at attr_func.c:348)<--0x4d9d85(encode_str at attr_fn_str.c:151)<--0x49ae27(svrcached at stat_job.c:130)<--0x49b0d6(status_attrib at stat_job.c:215)
███████                                                       972  0x4da49c(attrlist_alloc at attr_func.c:290)<--0x4da5f4(attrlist_create at attr_func.c:348)<--0x4d8397(encode_l at attr_fn_l.c:160)<--0x4d88c4(encode_resc at attr_fn_resc.c:248)<--0x49ae27(svrcached at stat_job.c:130)
███████                                                       960  0x4d8f35(add_resource_entry at attr_fn_resc.c:584)<--0x4d89cd(set_resc at attr_fn_resc.c:314)<--0x4d35b9(attr_atomic_set at attr_atomic.c:185)<--0x47e127(modify_job_attr at req_modify.c:493)<--0x451ebe(stat_update at node_manager.c:2401)
██████                                                        904  0x4f44bf(add_conn_priority at net_server.c:761)<--0x4f4474(add_conn at net_server.c:730)<--0x4f4423(accept_conn at net_server.c:708)<--0x4f3fb0(process_socket at net_server.c:522)<--0x4f42bf(wait_request at net_server.c:637)
█████                                                         800  0x540a2b(parse_chunk_r at grunt_parse.c:384)<--0x540da0(parse_chunk at grunt_parse.c:510)<--0x4c36c9(do_schedselect at vnparse.c:4713)<--0x4ae498(make_schedselect at svr_jobfunc.c:2333)<--0x4ae9fc(set_resc_deflt at svr_jobfunc.c:2509)
█████                                                         800  0x5406c2(parse_node_resc_r at grunt_parse.c:223)<--0x5409b2(parse_node_resc at grunt_parse.c:314)<--0x45ac08(set_nodes at node_manager.c:7073)<--0x48fc7c(assign_hosts at req_runjob.c:1871)<--0x48fabf(where_to_runjob at req_runjob.c:1801)
█████                                                         718  0x4c35d5(do_schedselect at vnparse.c:4684)<--0x4ae498(make_schedselect at svr_jobfunc.c:2333)<--0x4ae9fc(set_resc_deflt at svr_jobfunc.c:2509)<--0x4ac717(svr_chkque at svr_jobfunc.c:1431)<--0x4826a6(req_quejob at req_quejob.c:1051)
████                                                          681  0x4da49c(attrlist_alloc at attr_func.c:290)<--0x4da5f4(attrlist_create at attr_func.c:348)<--0x4d8397(encode_l at attr_fn_l.c:160)<--0x49ae27(svrcached at stat_job.c:130)<--0x49b0d6(status_attrib at stat_job.c:215)
...
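
Conceptually the diff is simple: key each record by its stack string and subtract the byte counts between the two dumps. Here is a rough standalone sketch of just that core step in C (the installed Python script additionally resolves symbols, sorts, and draws the graph; the parsing below relies only on the "stack::bytes" record format shown earlier):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STACKS 8192
#define MAX_LINE   1024

struct entry {
    char stack[MAX_LINE];   /* "addr:addr:...:addr" key */
    long bytes;             /* live bytes for that stack */
};

/* Load "stack::bytes" records from one dump, skipping header lines. */
static int load(const char *path, struct entry *tab)
{
    FILE *fp = fopen(path, "r");
    char line[MAX_LINE];
    int n = 0;

    if (fp == NULL)
        return -1;
    while (n < MAX_STACKS && fgets(line, sizeof(line), fp) != NULL) {
        char *sep = strstr(line, "::");
        if (line[0] == '#' || sep == NULL)
            continue;       /* comment / info line, not a record */
        *sep = '\0';
        snprintf(tab[n].stack, MAX_LINE, "%s", line);
        tab[n].bytes = atol(sep + 2);
        n++;
    }
    fclose(fp);
    return n;
}

int main(int argc, char **argv)
{
    static struct entry older[MAX_STACKS], newer[MAX_STACKS];
    int no, nn;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <older dump> <newer dump>\n", argv[0]);
        return 1;
    }
    no = load(argv[1], older);
    nn = load(argv[2], newer);
    if (no < 0 || nn < 0)
        return 1;

    /* Report stacks whose live byte count changed between the dumps
     * (stacks that vanished entirely from the newer dump are left out
     * for brevity). */
    for (int i = 0; i < nn; i++) {
        long before = 0;
        for (int j = 0; j < no; j++)
            if (strcmp(newer[i].stack, older[j].stack) == 0) {
                before = older[j].bytes;
                break;
            }
        if (newer[i].bytes != before)
            printf("%+ld\t%s\n", newer[i].bytes - before, newer[i].stack);
    }
    return 0;
}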

Hope you get the idea. You can use it with any of the PBS daemons.

Caveats:

  1. There should be a minute speed penalty when profiling is turned on at runtime, of the same complexity as the avl tree implementation in PBS. If it is turned off at runtime, the penalty should be almost nil.
  2. Memory allocated by the Python interpreter cannot be tracked with this, as it uses a separate memory allocator (but there are Python modules to profile it).
  3. Not extensively tested (as I do not have the resources).
  4. Any unknowns I’m not aware of.

Of course, there are many heap profilers out there that may be better suited or better tested for your needs. As I said earlier, I did this as a hobby project to prove my idea.

Hope you made it this far. Do check it out!
Thank You :slight_smile:
-Shrini