I'm making this post here in hopes I may save someone from beating
their head against the wall like I did...
I am writing a custom Name Service Switch (NSS) module to take
advantage of account information that already exists in a PostgreSQL
database. Under certain circumstances, processes hang because the same
thread locks a non-recursive mutex twice during PG connection creation.
It goes something like this:
========================================
/etc/nsswitch.conf:
passwd: files mypgmod
group: files mypgmod
========================================
[process with euid not found in /etc/passwd, looking up another
username not found in /etc/passwd]

getpwnam_r(username, ...)
  // nss doesn't find user in files module (/etc/passwd),
  // so uses mypgmod
  _nss_mypgmod_getpwnam_r(username, ...)
    pthread_mutex_lock(...)  // to protect PG connection, then...
    PQconnectdb(...)
      // any number of reasons to look up info in calling user's
      // home directory, e.g., default password, ssl certs, etc.,
      // resulting in call to getpwuid_r() to find user's homedir.
      getpwuid_r(geteuid(), ...)
        // nss doesn't find user in files module (/etc/passwd),
        // so uses mypgmod
        _nss_mypgmod_getpwuid_r(uid, ...)
          pthread_mutex_lock(...)  // to protect PG connection
          /* HANG */
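
For anyone who wants to reproduce the re-entry without actually hanging a
process, here is a minimal sketch. It is not my module's code: every name
in it (demo_getpwnam, demo_getpwuid, fake_connect, guarded_lookup,
lookup_status) is a hypothetical stand-in, and fake_connect() only imitates
the home-directory lookup that PQconnectdb() can trigger. It assumes the
mutex is created as PTHREAD_MUTEX_ERRORCHECK, so the second lock attempt by
the same thread returns EDEADLK instead of blocking forever.

```c
#define _XOPEN_SOURCE 700  /* exposes PTHREAD_MUTEX_ERRORCHECK on glibc */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins; not the real NSS status codes or entry points. */
enum lookup_status { LOOKUP_OK, LOOKUP_UNAVAIL };

static pthread_mutex_t conn_lock;
static pthread_once_t conn_lock_once = PTHREAD_ONCE_INIT;

/* Result of the lookup made from inside the simulated connect. */
enum lookup_status nested_status = LOOKUP_OK;

enum lookup_status demo_getpwuid(void);

/* Create the mutex as error-checking rather than the default type. */
static void init_conn_lock(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&conn_lock, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
}

/* Imitates a connect call that looks up the calling user's home directory. */
static void fake_connect(void)
{
    nested_status = demo_getpwuid();
}

/* Shared body of both lookups: take the lock, then "connect". */
static enum lookup_status guarded_lookup(void)
{
    pthread_once(&conn_lock_once, init_conn_lock);
    int rc = pthread_mutex_lock(&conn_lock);
    if (rc != 0)
        return LOOKUP_UNAVAIL;  /* EDEADLK: this thread already holds the lock */
    fake_connect();             /* re-enters guarded_lookup() via demo_getpwuid() */
    pthread_mutex_unlock(&conn_lock);
    return LOOKUP_OK;
}

enum lookup_status demo_getpwnam(void) { return guarded_lookup(); }
enum lookup_status demo_getpwuid(void) { return guarded_lookup(); }
```

Calling demo_getpwnam() takes the lock, re-enters through fake_connect(),
and the nested call backs out with LOOKUP_UNAVAIL while the outer call
completes. In a real module the nested call would presumably return
NSS_STATUS_UNAVAIL so that nsswitch moves on to the next service, but I
have not verified that against every libc.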
* * *
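The "fix," if you can call it that, is to run nscd, the NSS caching
daemon. Lookups then go through nscd, which typically runs as root.
Because root's account info is in /etc/passwd, the getpwuid_r() call
made inside PQconnectdb() is answered by the files module and never
re-enters mypgmod, so no second connection attempt is made.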
Also, many Linux distros ship a libnss-pgsql2 package which suffers
from the exact same problem (there are some scattered posts on the
'net about it). Same "fix" ... run nscd.
Cheers,
Daniel Popowich