The problem:
Right now, pbs_snapshot needs root access to capture everything on the system. There’s a ‘sudo’ argument to PBSSnapUtils, but the pbs_snapshot program calls it with sudo=False so that the user has to explicitly run pbs_snapshot program as root/with sudo to capture everything. The idea was that this will be more transparent to users.
But, this really isn’t very transparent to the users afterall as they don’t know about all of the commands that pbs_snapshot will run when they run the entire thing as root. So, they can’t really make an informed decision right now about whether they should run pbs_snapshot as root or not.
Proposed Solution:
We might be able to solve this if pbs_snapshot uses “sudo=True” internally even if the user runs it as a normal user, so I was thinking that we can add an option called ‘–with-sudo’ to pbs_snapshot to make this happen.
We’ll probably have to publish a list of all commands that pbs_snapshot/PTL runs with sudo so that users can look at it and make a informed decision about whether they wanna use the --with-sudo option or not. Or maybe pbs_snapshot can list all the commands on the prompt when they use this option and take user input to confirm that it’s okay to proceed with sudo.
What do you guys think about this approach? Any concerns? Any better alternative solutions?
@agrawalravi90 This is an interesting problem and I want to understand the problem correctly.
As of today, if a user runs pbs_snapshot as a non-root user would it produce any useful results? Also, in future, with the proposed change, if a user issues command without ‘–with-sudo’ option then would it produce useful results?
To answer your question, it depends on what you consider “useful”. pbs_snapshot, if run without the appropriate permissions, will still capture everything that it can with the permissions that it has been provided. For example, it won’t capture anything inside the ‘_priv’ directories if not run with root privileges, but it will still capture “qstat -f” output. Does that answer your question?
Thanks Ravi!
Well it does answer my question to some extent. Based on your answer I think the output being “useful” or not will depend on who is running it and when it is run.
This is a diagnostic tool which captures state of the system so that one can debug a problem, so it is likely that the tool will be run by an admin only and in that case, is it not okay to expect that they run it as root to get useful data?
Unlike pbs_diag, admins can’t easily look up pbs_snapshot’s script to see what commands it is running. So, running the entire pbs_snapshot script with root privileges might be a security concern for some as they have no visibility into what commands it will run.
So, I was thinking that pbs_snapshot can use sudo to internally run the handful of commands for which it needs root privileges when “–with-sudo” is provided, and that small list of commands can be made visible to users so that they can review it, add those commands to their sudo rules if needed, and then run pbs_snapshot without worrying about it running anything dangerous with root privileges.
Is pbs_snapshot delivered as a binary that it can not be read? What you are probably trying to point is a trust issue between the admin/PBS-support and PBS. In that case, there are many commands supported by PBS present in ‘sbin’ directory which should only be run as a root. Why pbs_snapshot needs to be any different than those commands?
I personally feel this is a non issue unless raised by admins/PBS-support that they do not trust PBS commands to do the right thing. Producing half-baked data with an uncertainty that it may or may not be useful for debugging seems futile to me. Even if we make the change you are suggesting, we will depend on user (whoever is running the command) to have right permissions set in their sudoers file, so that when used with ‘–with-sudo’ option data can be gathered as a super user.
No it’s not, but one would have to walk through the nitty gritty details of all the code paths in PTL that pbs_snapshot calls to find out all of the commands that it runs. With my proposal they can just run pbs_snapshot with the “–with-sudo” option, if it works within the bounds of the existing sudo rules, great. If not, then they’ll easily be able to make a conscious choice to make an exception for the command that failed or not.
That’s true, but I thought that we wanted to go in the direction of making all of PBSPro non-root friendly, no?
I’d like to differ. When we know that for the data to be useful we will need to gather everything then why are we doing this.
NO I disagree. Direction was to make sure that PTL tests can be run in a non-root friendly environment, so that we don’t get any false positives in testing. It was not to make product changes that the commands that need to work as root start working as non-root. If this is not true then we might also need to make sure that we run pbs server/mom/comm as non-root too
Not necessarily. All of the daemon logs (except accounting logs) are non-root accessible. A lot of the server configuration is non-root accessible as well (qstat -f, qstat -Qf, qmgr print etc.). So, one could still get a lot of information without special privileges right?
But, I wasn’t suggesting that they run pbs_snapshot and capture partial data, I was suggesting that they can run pbs_snapshot with sudo and it will either capture all of the data, or it will error out saying something like “command X cannot be run with current privileges” because their sudo rules prevent it, and that will allow them to make a conscious decision about 1) whether they want to change their sudo rules and capture all of the data, or 2) use pbs_snapshot without sudo to capture partial data, which may or may not be useful, or 3) just not run pbs_snapshot.
Isn’t that a better approach than expecting users to blindly trust us and run a new software with root privileges on their live system?
Yes, but I’m trying to make the case that pbs_snapshot doesn’t really need to run as root. It can already capture a bunch of information just as a normal user. With this, it will be able to capture privileged information with sudo instead of requiring that it be run as root. Isn’t that better?
My opinion is that everyone is a lazy typer. Running ‘sudo pbs_snapshot’ takes less time to type than ‘pbs_snapshot --with-sudo’. People will either run it as non-root and capture whatever data it can or run it as root to capture it all. I don’t find the new option all that useful.
No that is incorrect, most of these commands produce role dependent results. So the results of the commands you mentioned will vary based user/operator/manager privileges.
Most of the times in the past the usual case of using pbs_diag was to gather information in case of crashes. Under a non-root user this will produce unusable results and increase back and forth communication to gather more data.
A conscious decision can only be made if one knows what are the repercussions of running a command under different privileges. Since you mentioned code is not easily readable, I suppose we will then have to expose the information about what the command produces under which privilege mode.
The fact that user has PBS installed and running mom/server under root privileges already shows the trust. In my mind, we should not doubt our own commands and give admin a tool that says if you want full snapshot and if you feel valiant enough then try running this tool with this new option.
@arungrover, @bhroam and @iestockdale, thanks for discussing this with me offline. A summary of our discussion (please do correct anything that I might be remembering incorrectly):
Even though PBSPro runs certain things as root, it’s good practice to use root/sudo only when necessary, so if pbs_snapshot can be refactored to minimize sudo usage then that’s the correct direction.
From an interface standpoint, it’s better if, by default, pbs_snapshot uses sudo internally for commands which need it. So, instead of a “–with-sudo” option, there should be a “–no-sudo” option. This way, pbs_snapshot will always try to capture all of the data by default, as it seems that a partially complete snapshot is not very useful.
There should be an option to pbs_snapshot that prints out the list of all commands which it will run as sudo.
Please let me know if I’ve captured that correctly, and if this changed proposal makes sense.
In addition to all the commands that run as sudo, I’d like to see an option to list all the commands it runs. It’d be better if both of these were in one option with a section that lists if a command is run as sudo.
Bhroam
I’d find it strange to have a command that lists just the commands run as sudo and not one that lists everything. You do have a point that running snapshot will basically tell you what it’s going to do. Of course it tells you when it does it
Ok, I see your point. But, I feel like one could always just use this option and grep for sudo to filter out the list of commands that will be run as sudo, all we have to do is list “qmgr” as “sudo qmgr”. So, what do you think about just one option which will list all commands, and the users can filter sudo from that?
OK, then I’d rather not bother with the option at all. You are right, they could grep sudo out of the command itself and find out what it’ll sudo. If they really care about not running with sudo, they’ll use the --no-sudo option.
I know of several large companies that either don’t install sudo or use alternatives altogether. One such can of worms we’d need to open for that would be Centrify’s dzdo, which is in use by some of those companies.
So yeah, I’d like to not assume sudo is even there in the first place.
@all, I think this proposal was a bit premature, maybe we should just wait and see whether users mind running pbs_snapshot as root and come up with a solution if they do.