Thursday, August 5, 2010

Kerberos + Open Directory + OSX Snow Leopard

I just finished tracking down one of those terrible bugs that are hard to find because the cause is just so seemingly disconnected.

I recently installed an Xserve with Snow Leopard (OSX 10.6) and Open Directory. Everything worked great, except that periodically kadmind would go into an infinite loop and never come back to this world.

So, I downloaded the latest MIT Kerberos source package, compiled it, and installed it to /opt/kerberos on my system. After verifying that it worked, I renamed the OSX-provided kadmind, copied the new one into its place, and started it up. Everything worked great, until...

Users on my Linux hosts complained that they could not change their passwords, and received the following error:

root@clipper:/etc# kpasswd tbriggs
Password for tbriggs@CS.SHIP.EDU:
Enter new password:
Enter it again:
Authentication error: Failed reading application request

After reissuing keys, rewriting configuration files, and triple-checking NTP configurations (to make sure clock skew wasn't the problem), I finally broke down (my resolve and my emotions), and broke out GDB. Having compiled kadmind from MIT source, I was able to use GDB to trace through the code. I found that it was failing in lib/krb5/krb/rd_req_dec.c, specifically when it calls krb5int_authdata_verify. I traced into that, and found that it was failing whilst running authorization plugins.

So, here is the tricksy part. I used OSX's dtruss to capture file activity, and found, to my dismay, that there were two sets of authorization plugins, one from /System/Library/KerberosPlugins/KerberosAuthDataPlugins and one from /opt/kerberos/lib. The plugin sitting there was, if you can guess, the plugin for the OSX PasswordServer. I disabled it (ok, I swore a little bit and removed it). Now I can change passwords without issue.
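
If you suspect the same thing on your own box, a quick way to see what is sitting in those two locations is a few lines of Python. This is only a rough sketch, not what I actually ran: the two paths are the ones from this post, the function name list_plugin_candidates is made up for illustration, and listing a directory does not prove that kadmind loads anything from it (tracing the process, as I did with dtruss, is what settles that).

```python
import os

# The two locations mentioned in this post. These are assumptions about
# your layout; adjust them for your own install prefix.
PLUGIN_DIRS = [
    "/System/Library/KerberosPlugins/KerberosAuthDataPlugins",
    "/opt/kerberos/lib",
]


def list_plugin_candidates(dirs):
    """Return {directory: sorted entry names}; unreadable or missing dirs map to []."""
    found = {}
    for d in dirs:
        try:
            found[d] = sorted(os.listdir(d))
        except OSError:
            # Directory does not exist or cannot be read.
            found[d] = []
    return found


if __name__ == "__main__":
    for directory, entries in list_plugin_candidates(PLUGIN_DIRS).items():
        print(directory)
        for name in entries:
            print("    " + name)
```

Anything that shows up in the system directory but has nothing to do with your MIT build is worth a second look before you spend three days in GDB.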

So, three days of serious kerberos debugging, and it was because of a residual plugin that we don't even need!