Thanks - this is a great change. A few quick comments:
auth_set_config - in this api the first parameter “func” is actually a function pointer to a logger, why don’t we name the parameter as “logfunc” or “logger” to drive home that fact?
auth_create_ctx - instead of this api returning 0/1 error, we could return the context pointer back in return instead of a parameter of double indirection. So pointer if successful, else null? Unless, of course, we need more than a boolean return value.
auth_destroy_ctx - why does this api need a pointer to pointer as parameter?
auth_get_userinfo - could u please add examples or explanation as to how/when the realm is different and has useful information than just being the same as hostname?
auth_do_handshake - could the return value itself be used to indicate a “handshake_done” instead of the last parameter? Like if you returned 2, handshake is done?
auth_set_config - what if a particular authentication library needs to have access to more than a single cred location? Or more “internal” configuration parameters. Let’s use an example. Let’s say in future we come up with a auth library has needs a switch to control its compression behavior - how can we pass this to the library, without making it part of a “config” interface? There exists several options of course: Instead of cred_location, we could set a path to configuration file for the auth library, or we could pass options via env variables. Am not sure which one is better, but probably worth considering.
The reason behind return 0/1 and using of double-pointer is PBS doesn’t care what’s the value of returned ctx, its just opaque value to PBS. And PBS only cares about the success or failure of API. Now take an example of current Munge authentication where it doesn’t have any ctx kinda stuff, so in this API returned ctx will be NULL but still call to API is a success. So now if PBS checks for ctx value is NULL or not then for Munge it will be marked as a failure.
The use of a double-pointer is to keep the same signature as create_ctx for ctx value. But if we want we can change it to a single-pointer also.
Sure will add an example of GSS authentication where the realm will be different than the hostname.
There might be chances that a handshake is not done in one shot, for example in case mutual authentication methods like GSS/Kerberos, this API will be called multiple times, so PBS can’t say whether to mark as a failure or need more data or success from 0/1. Also same as point 1, PBS doesn’t know whether auth lib needs more data to complete a handshake or not… PBS only cares about the success or failure of call to API and whether handshake is done or not?
Honestly, I am also not sure which one is better, I am open to any option.
Should we add the prefix “pbs” to all of these interfaces, or change “auth” to “pbsauth”? It will help avoid symbol collisions with other libraries.
auth_set_config - Can this method be called multiple times? if not, how about calling this “init” instead of “set_config”, that might better convey the fact that this routine needs to be called first, and only once. Also, I do like the idea of a config file to allow for adding other configuration options in the future.
auth_create_ctx - I’m not sure I understand the purpose of “void **ctx”. The API creates the context, so it must need ‘ctx’ to be in a particular form right? The client can’t just pass it an int pointer for example, right? So why is it a void * ? Also, can you please explain the use case for having multiple contexts? if there’s only going to be one context at a time then maybe this should be created automatically by the library, hidden from clients.
auth_do_handshake: it seems like this function won’t take care of the handshake completely, in which case, isn’t it just sending and receiving a message from a remote host? Maybe you can call it something else? I’d expect a handshake routine to take care of the entire handshake protocol.
auth_encrypt_data: you mention “text data”, but the interface accepts a void*. If the API expects text then maybe use char* instead? Also, Should this interface take an “encryption method” argument? You haven’t mentioned what encryption method will be used for encryption, do we plan to only support one?
The only reason right now I have not added “pbs_” prefix is, I wanted to make this API specs/interface completely generic so not just PBS but any other application (if at all in future) wants to use this interface for authentication and/or encrypt/decrypt data, they can use it.
But if that’s not the case and still we want to add “pbs_” prefix then I can add it.
Yes, you can call this method multiple times if you want to change the logger or cred location.
I don’t think “init” is the right word here, as we are not initializing the auth library, we are just giving it a few configuration parameters. In other words, if we say “init” that means we must have to call it before calling any other API, but here “auth_set_config” is not mandatory API that you have to call it. It should be called only if you want to change those configuration parameters. If you don’t care about those parameters, then still you can use other API.
About the config file, few questions:
Who will create/maintain the config file?
What should be the format of its content?
Can we pass the function pointers in the config file? no right? then what will do for the function pointers (like logger method for now)?
Instead, how about using “structure” and in future will keep adding members to it as needed, and whatever library already implemented using old struct definition won’t have any problem because it doesn’t care about the newly added member unless it wants to use newly added members.
Just to explain my thought better, see set_tpp_config() in PBS code (here), we pass pbs_conf structure to it but set_tpp_config() only uses few members of it, now if you add new member in pbs_conf today then you don’t have to do any changes in set_tpp_config() unless you want to use newly added member in set_tpp_config()
So, the idea here is LibAuth will just take input data, process it and generate output data and now how/where to transfer those data is up to the application. Means LibAuth has no idea of how to connect to remote hosts or how to transfer data to socket etc…
Now why we need to make this way?
There might be chances that one application can have multiple ways of connecting to remote hosts and transfer data between hosts.
Take an example of PBS only, where you have two ways to connect to remote host one TCP and another TPP and both method has its way of connecting and transferring data between hosts. So either we have to teach all those ways to LibAuth or let PBS do it and LibAuth will take just take input data which PBS provides, process it and provide output data back to PBS and then PBS will transfer those data using appropriate method.
I will change “text data” to “unencrypted data”.
About “encryption method”:
We are using this API to encrypt data and get encrypted data back…
Now how to encrypt data, what encryption method to use is up to this API (or implementation of auth library). This means whatever auth library providing us this API knows hows to encrypt data using which method, and we (aka application) have no idea about it (+ we don’t care about it).
So no, we don’t want to pass the “encryption method” as an argument.
Thanks for your reply Hiren and explaining things in great detail. Some more comments:
But you do mention in the design doc that this should be called before any other API: “This API should be called first before calling any other LibAuth API.” If one wants to change the logger or cred location, they will have to restart/HUP the daemon anyways right?
Ok, yes, I like that
Ok, then it really is not doing any handshake at all. If it’s just processing the data to prepare for a handshake, then maybe call it “pbs_auth_prep_handshake_data()” or something similar ?
Trying to understand how this will work. Say somebody changes the credentials inside the ‘cred_location’ file, how will the daemons get notified that this change has happened without a HUP or a restart ?
Again just trying to understand, how will the other functions work without the configuration being set first? Will pbs_auth_create_ctx() be successful without first calling pbs_auth_set_config() ?
If pbs_auth_do_handshake() doesn’t know how to talk to another host, then how will it do the handshake? Am I missing something?
Once PBS gives cred_location to the auth library then it’s auth library’s job to keep watch on that location and do whatever necessary to do when credentials get change in the given directory.
Yes, why not? If you in my PR, currently we have to implementation of auth library one for Munge and another for GSS/Kerberos. Now, if you both implementation then don’t make use of cred_location as of now. So in such a case, if you don’t call set_config() first then still you can call create_ctx(), the only disadvantage will be printed on stderr instead of your desired location. So as I said earlier “pbs_auth_set_config” is not mandatory API to call first before calling any other API but its recommended to call so you can get log messages at the proper place.
Ok let me try to explain in other words:
To understand better will divide a complete handshake in 3 part:
Connecting to another host
Sending/receiving handshake data to/from another host
Processing received data and generate handshake data to send
So from the above 3 parts, PBS handles 1st & 2nd (aka networking stuff of connecting and transferring data between hosts) while “pbs_auth_do_handshake” handles the only 3rd.
So based on above we don’t need to teach or write anything networking related stuff in “pbs_auth_do_handshake”.
(If not then I suggest you go through this page and this file and see an implementation for auth APIs for Munge authentication)
Interesting. So do you envision a separate thread in libauth which will monitor cred_location? Can there be a race condition? admin modified cred file at time t1 to remove credentials for user1, libauth notices the change at time t2, but in the meantime user1 accesses data from PBS between t1 and t2? If situations like this are possible then it might be better to put this responsibility on the admins and require them to HUP PBS daemons. What do you think?
Ok, thanks for clarifying.
Thanks, this is what I was thinking as well. When I look at a function named “pbs_auth_do_handshake”, it makes me think that this function will do all of these 3 parts and not just the 3rd part, which actually seems to be more about pre and post processing of the handshake data than the actual handshake itself. That’s why I think that we should rename it to something like pbs_auth_cvt_handshake_data(). What do you think?
I don’t think so that there will be any race condition. Unfortunately right now I don’t have any real implementation which can show what I am trying to explain here… but let me try to explain with some fake example:
Let’s say cred_location is directory which stores user’s password in encrypted format in file named with user’s name, like file “user1” will contain user1’s encrypted password, file “user2” will contain user2’s encrypted password… and libpass.so is the library which you implemented using libauth API specs…
Now user1 invoked qstat -Qf, so qstat, first authenticate to PBS server then send statque req…
So while authenticating to server first qstat will talk to libpass and ask for password for user1, and libpass with look for file called “user1” in given cred_location and it will read it and return its content to qstat, then qstat will send it to server, now server (while validating received password) will give that password to libpass, and libpass will read file called “user1” from given cred_location then match its content with received password, if match then validated else rejected…
Now let’s say at time t1 admin removed file called “user1” and at time t2 user1 invokes qstat again, so qstat will fail to get its password because libpass fails to read “user1” file as there is not file called "user1 in given cred_location. So no race condition…
Now let’s say somehow user1 manage bypass libpass calls and sends its password to server but server will fail to validate password, because there is not “user1” file inside cred_location (and user1 can’t change server to bypass libpass calls ) so again no race condition…
Ok, (assuming “cvt” means “convert”) I think “pbs_auth_process_handshake_data()” will make more sense here (as this API is not converting handshake data but it is processing it…)