PP-758: Add pbs_snapshot tool to capture state & logs from PBS

Hey Hiren,

Because we need sudo privileges, I think it’s better to have the user explicitly run pbs_snapshot with sudo instead of internally running commands with sudo without the user’s knowledge. That’s why the EDD mentions pretty clearly that you need to run with sudo explicitly.

You are right about the snapshot log, I’ve removed the optional tag from it. Please let me know if it looks good to go.

Thanks,
Ravi

Looking really nice!

It would be great if the default (no arguments) run collected everything that might possibly be useful for debugging. Since having both some services logs and some accounting logs would be generally useful, I suggest always capturing some logs. How about collecting 5 days of services logs (server, scheduler, MOM, comm) and 30 days of accounting logs and also replacing “Option -L < num days >” with:

  • --service_logs=< num days >
  • --accounting_logs=< num days >

It might be important to separate out services logs (which can be big) from accounting logs when the --additional_hosts argument is used.
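A minimal sketch of how the proposed split could look with Python's argparse, assuming the option names and defaults suggested above (5 days of service logs, 30 days of accounting logs). The `build_parser` helper is hypothetical and is not the tool's actual implementation.

```python
import argparse


def build_parser():
    """Hypothetical parser for the two proposed log options."""
    parser = argparse.ArgumentParser(prog="pbs_snapshot")
    # Service logs (server, scheduler, MOM, comm) can be big, so they get
    # their own knob and a short default window.
    parser.add_argument("--service_logs", type=int, default=5,
                        metavar="NUM_DAYS",
                        help="days of service logs to capture")
    # Accounting logs are smaller, so the suggested default window is longer.
    parser.add_argument("--accounting_logs", type=int, default=30,
                        metavar="NUM_DAYS",
                        help="days of accounting logs to capture")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.service_logs, args.accounting_logs)
```

With no arguments, both defaults apply, so a bare run would still capture some logs of each kind.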

One more…

Since a snapshot can be big… I wonder if defaulting to putting the output in /tmp is a good choice. Another option would be to force the user to specify the output file… so they would be responsible for picking a location that has enough space.

If there is a way to compress it “on the fly” that could also be a bonus… perhaps for a later version.


Under core_file_bt, I suggest adding another subdirectory, say misc, for any other core files that may be found other than the ones found in server_priv, sched_priv, mom_priv. For instance, pbs_python could dump core which might end up in server_priv/hooks or server_priv/hooks_tmp directory…

Thanks for the feedback Bill!

I like the idea of requiring the user to specify the path to the output file and have changed the EDD to reflect that. This does mean that just running “pbs_snapshot” won’t be enough: you now have to specify the -o option. Please let me know if that’s okay.

I’ve also changed the logging to now have ‘--serve-logs’ and ‘--accounting-logs’. Please let me know if it looks okay.

I hadn’t thought about that, thanks for the suggestion, I’ve added that in.


I have an implementation question that may or may not have an answer yet, but is related to the -o option discussion.

Assuming the tool is collecting all of the data in some temporary location before compressing the final output to wherever -o specified, if we are worried about providing a default -o location then I think we should REALLY be worried about where the uncompressed data goes while the tool is running since that will require much more space than the compressed final output.

Hey Scott,

That’s a good question. From my preliminary research into Python’s archiving facilities, I did the following test:

  • Launched a Docker container with the default disk space of 10 GB.
  • Copied a directory of ~7 GB into the container.
  • Walked down this directory and progressively added the files to a gzipped tarball using Python’s tarfile module.

If the program were copying the files to a temporary directory and compressing them all later, this would have failed, as the filesystem isn’t large enough for another copy of the data. So, I guess it just compressed them on the fly and created the compressed tarball. We could do something similar to create the snapshot tarball as well. What do you think?
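For reference, the experiment above can be sketched roughly like this. The function name and paths are placeholders, not the actual test script; the point is that opening the archive in "w:gz" mode compresses each member as it is written, so no uncompressed staging copy is needed.

```python
import os
import tarfile


def stream_dir_to_targz(src_dir, out_path):
    """Walk src_dir and add each file to a gzip-compressed tarball.

    Returns the number of files added. out_path should live outside
    src_dir, otherwise the archive would try to include itself.
    """
    count = 0
    # "w:gz" writes through a gzip stream, compressing as members are added
    with tarfile.open(out_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to src_dir inside the archive
                arcname = os.path.relpath(full, src_dir)
                tar.add(full, arcname=arcname, recursive=False)
                count += 1
    return count
```

Peak extra disk usage is then roughly the size of the compressed tarball, which is what made the 7 GB copy fit on the 10 GB filesystem.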

If it is a low effort thing to implement it would be nice, but honestly no one that I know of has complained about pbs_diag copying everything uncompressed to /tmp and then gzipping to /root by default (not that we shouldn’t improve on that anyway, which we are), so I don’t see any of this as a showstopper.

Thanks. Regarding --serve-logs and --accounting-logs, those options and defaults look good, thanks!

I would, however, suggest that zero means zero. So, for both, instead of:
If the value is 0, only the logs for the current day are captured.
If the value is 1, only the logs for the current day and the day before will be captured.
it should be:
If the value is 0, no logs are captured.
If the value is 1, only the logs for the current day will be captured.
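A small sketch of the "zero means zero" rule, assuming log files are named by date as YYYYMMDD. The helper name `log_names_to_capture` is made up for illustration.

```python
from datetime import date, timedelta


def log_names_to_capture(num_days, today=None):
    """Return the dated log names to capture, newest first.

    num_days == 0 -> nothing; num_days == 1 -> only today's log.
    Assumes logs are named YYYYMMDD (an assumption of this sketch).
    """
    if num_days < 0:
        raise ValueError("num_days must be >= 0")
    if today is None:
        today = date.today()
    # range(num_days) is empty for 0, so no logs are selected in that case
    return [(today - timedelta(days=i)).strftime("%Y%m%d")
            for i in range(num_days)]
```

For example, with today fixed at 2017-03-02, a value of 3 would select 20170302, 20170301 and 20170228.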

Thanks!

I think it shouldn’t be too much effort to implement it that way, so I’ll try to do it, but thanks for letting me know that this is not super important.

Ah, yes, thanks for pointing that out. Now that we are capturing logs by default, setting these to 0 seems like the best way to ask for no logs.

Could you all please review the latest changes and let me know if the design looks good to go to the implementation phase? Thanks for all the feedback!

The changes look good. I have one suggestion. We should provide an option to get all of the accounting logs without having to specify some large number. Possibly:

--accounting-logs -1 or --accounting-logs all

Thoughts?

Ok, but I sort of feel like if we add this for accounting logs, we should add the same option for service logs as well, what do you think?

I have a couple of comments:

My main comment is about the duplication of data. I understand it’s needed for support to debug, but keep in mind that obfuscation takes time. The more duplication of data, the longer taking a snapshot will take. I’m not saying we shouldn’t get it, I’m just saying we should be smart about it. It isn’t just taking extra space in a directory, it is taking extra time while taking the snapshot. If it takes 30m-1hr to take a snapshot, admins might not want to do it (logs can be huge).

If pbs_snapshot is replacing pbs_diag, I’d remove the -d option. It’s a way of taking the output of pbs_diag and converting it to pbs_snapshot format. We’re not going to use pbs_diag any more, so why have an option for it?

Just as a note, by putting the log directories one level deeper than in the normal PBS_HOME, we can’t run tracejob on them. If all the log directories were in the same place, you can point tracejob at that directory and it’ll work. The separation does keep things tidy though.

As for sudo, the real requirement is a command run as root. I wouldn’t say to run it as sudo. You don’t need sudo privileges, you need root. The sudo command is just one way of getting it. You could run su -c just as easily. You could make a side note that sudo is alright, but I wouldn’t make the interface section use it.

While it is an admin orientated tool, it can still be run as a normal user. You don’t get everything, but you get most things. This would allow those of us who have accounts on customer sites to take partial snapshots without bugging the admin. If sudo is embedded inside the tool, the admins will get upset at us for running it. I’ll add my vote to run sudo outside the tool.

Bhroam
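To illustrate the point that the requirement is root rather than sudo specifically, here is a rough sketch of how a tool could decide between a full and a partial snapshot without ever invoking sudo itself. The `snapshot_mode` helper and the "full"/"partial" labels are hypothetical, not part of the EDD.

```python
import os


def snapshot_mode(euid=None):
    """Return "full" when running as root, otherwise "partial".

    It does not matter how root was obtained (sudo, su -c, a root login);
    only the effective user ID is checked.
    """
    if euid is None:
        euid = os.geteuid()  # Unix-only call
    return "full" if euid == 0 else "partial"


if __name__ == "__main__":
    print("Taking a %s snapshot" % snapshot_mode())
```

A normal user would simply get the "partial" path, and the tool never escalates privileges on its own.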

Not sure there’s much difference between “all” and 9999 (which would be 27+ years). Is it worth the extra engineering and testing work?

I guess the all option is not needed as long as 9999 gets all the logs up to 9999 days in the past without an error. This would also need to be tested.

We are not duplicating logs though, it’s mostly the qstat info that’s being captured in multiple different ways. But ya, while implementing this, we can make sure that the extra data doesn’t take up too much time.

It’s mostly there to ease the transition from pbs_diag to pbs_snapshot. Tools like pbs_stat and pbs_loganalyzer can consume a pbs_diag right now; that will be changed to pbs_snapshot, so by having a way to convert diags into snapshots I think we’ll make sure that existing diags are still useful. Also, users who haven’t switched to pbs_snapshot yet can still send us a diag that we can convert to a snapshot and consume. So, maybe this option will go away in the future, but I think it’s useful to have right now.

I’m actually not aware of how tracejob works, I’ll take a look at it and see what can be done. One obvious option is to update tracejob to explicitly be compatible with a snapshot directory. Thanks for pointing this out.

About sudo, ya, even I was thinking of removing it from the interface section, you really just need root and that can be done in a number of ways, I’ll update the doc.

About partial snapshots, can they really be useful? We won’t get important files like sched_priv and attributes that are visible to managers only. Also, will you as a normal user be able to see all jobs on the system?

I’ll definitely test that out, thanks.