I am back with some news. I have done the implementation as agreed. The new code is on the branch kerberos_support_3, and the design doc was updated as well. The following was done:
The GSS code was unified and generalized, and redundant code was removed. The same GSS code is now used for both TCP and TPP.
The TPP encryption was completely reworked, as agreed. The connection with comm is now encrypted. This should also work between routers: anything that connects to comm (server, moms, scheduler, other comms) will use encryption, so you need a host keytab on those nodes. With the GSS code enabled, cleartext communication with comm is no longer allowed. The implementation replaces the regular TPP handlers with new gss_* handlers, which in turn call the regular leaf or router handlers. An asynchronous handshake is always expected at the beginning of communication.
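To illustrate the handler replacement described above, here is a minimal sketch in C. It is not the code from the branch: every name in it (gss_conn, gss_pkt_handler, leaf_pkt_handler, toy_wrap) is hypothetical, the security context is reduced to a token counter, and an XOR transform stands in for the real GSS wrap/unwrap calls. It only shows the shape of the idea: the gss_* wrapper consumes handshake tokens itself and passes unwrapped data on to the regular handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not the actual TPP interface. */
#define MAX_PKT 256

typedef int (*pkt_handler_t)(int conn_id, const char *data, size_t len);

enum gss_state { GSS_HANDSHAKE, GSS_ESTABLISHED };

struct gss_conn {
    enum gss_state state;
    int tokens_left;        /* stand-in for a real GSS security context */
    pkt_handler_t inner;    /* regular leaf or router handler */
};

/* Stand-in leaf handler: remembers the last cleartext payload it saw. */
static char leaf_last[MAX_PKT];
static int leaf_calls = 0;

static int leaf_pkt_handler(int conn_id, const char *data, size_t len)
{
    (void)conn_id;
    if (len >= MAX_PKT)
        return -1;
    memcpy(leaf_last, data, len);
    leaf_last[len] = '\0';
    leaf_calls++;
    return 0;
}

/* Toy transform standing in for GSS wrap/unwrap; XOR is its own inverse. */
static void toy_wrap(const char *in, char *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = (char)(in[i] ^ 0x5A);
}

/* Wrapper installed in place of the regular handler. */
static int gss_pkt_handler(struct gss_conn *gc, int conn_id,
                           const char *data, size_t len)
{
    char clear[MAX_PKT];

    assert(gc != NULL && gc->inner != NULL);
    if (len >= MAX_PKT)
        return -1;

    if (gc->state == GSS_HANDSHAKE) {
        /* Handshake tokens are consumed here and never reach the inner handler. */
        if (--gc->tokens_left <= 0)
            gc->state = GSS_ESTABLISHED;
        return 0;
    }

    toy_wrap(data, clear, len);             /* "unwrap" the payload */
    return gc->inner(conn_id, clear, len);  /* hand cleartext to the regular handler */
}
```

In this sketch the regular handler never sees a handshake token or wrapped data, which is the property that lets the leaf and router handlers stay unchanged.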
TCP was improved. If the client wants to connect to the server with encryption, it sends the auth batch request, which initiates the handshake. The new cn_ready_func notices that a handshake is in progress and processes the handshake tokens asynchronously. Once the handshake is finished, cn_ready_func returns true (after unwrapping the data) and the data is processed by the regular process_request(). The GSS code is also isolated in its own layer here: the dis_* handlers are replaced with gss_dis_* handlers, and the interface was extended as needed (e.g. tcp_read was exposed to the gss_dis_* layer via a new handler).
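The TCP flow described above can be sketched as a small state machine. Again, this is only an illustration under assumptions, not the code from the branch: the struct, the function names (gss_ready_func, start_handshake, on_readable, process_request_stub) and the return convention (0 = not ready, 1 = ready, -1 = error) are all hypothetical, the handshake is reduced to a round counter, and XOR stands in for the real unwrap call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names and layout; the real connection table differs. */
#define BUF_MAX 128

enum hs_state { HS_NONE, HS_IN_PROGRESS, HS_DONE };

struct conn {
    enum hs_state hs;
    int rounds_left;        /* stand-in for "the context needs more tokens" */
    char clear[BUF_MAX];    /* cleartext handed to the request processor */
    size_t clear_len;
};

static char last_request[BUF_MAX];
static int requests_processed = 0;

/* Stand-in for the regular process_request(): records what it was given. */
static void process_request_stub(const struct conn *c)
{
    memcpy(last_request, c->clear, c->clear_len + 1);
    requests_processed++;
}

/* Toy transform standing in for the GSS unwrap call (XOR is its own inverse). */
static void toy_unwrap(const char *in, char *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = (char)(in[i] ^ 0x5A);
}

/* Called when the auth batch request arrives and starts the handshake. */
static void start_handshake(struct conn *c, int rounds)
{
    c->hs = HS_IN_PROGRESS;
    c->rounds_left = rounds;
}

/* Returns 1 when cleartext is ready, 0 while the handshake runs, -1 on error. */
static int gss_ready_func(struct conn *c, const char *in, size_t len)
{
    assert(c != NULL);
    if (len >= BUF_MAX)
        return -1;

    switch (c->hs) {
    case HS_IN_PROGRESS:
        /* Consume one token; a reply token would be sent back here. */
        if (--c->rounds_left <= 0)
            c->hs = HS_DONE;
        return 0;
    case HS_DONE:
        toy_unwrap(in, c->clear, len);   /* encrypted connection */
        break;
    case HS_NONE:
        memcpy(c->clear, in, len);       /* cleartext connection */
        break;
    }
    c->clear[len] = '\0';
    c->clear_len = len;
    return 1;
}

/* Event-loop step: only ready data reaches the request processor. */
static int on_readable(struct conn *c, const char *in, size_t len)
{
    int ready = gss_ready_func(c, in, len);
    if (ready == 1)
        process_request_stub(c);
    return ready;
}
```

The point of the sketch is that the request processor is only ever called with cleartext, so it does not need to know whether the connection is encrypted.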
A tool for renewing credentials, “renew-test”, was added to the unsupported directory.
TCP still allows cleartext, which means that regular clients can be used with a GSS-enabled server. That is nice, and it also means that you can move a job between a regular server and a GSS-enabled server. Peer scheduling should also work. Adding encryption on TCP between the server and the scheduler should be quite easy now, but let’s keep it as a TODO for a future commit.
It is also possible to enable encryption from hooks. The code is ready for it, and it actually already works: pbs_python just needs to have a valid Kerberos ticket in the default location. Let’s keep this as a TODO, because some hooks run as users, and those should (maybe) be handled with proper user credentials.
I am quite happy with the changes. Let me know what you think. I am ready to address more comments.