Thread: LDAP authentication fails with concurrent create extensions
We are using postgresql 10.3 on Centos 7.2 with LDAP authentication (samba4 with AD domain controller). We've recently moved to LDAP authentication and are now encountering a problem where some concurrent connections that create extensions in different databases at the same time are failing with a "Can't contact LDAP server" error. The postgres error log contains:
========================================================================================
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-1] db=db_test_022,user=fullaccess LOG: 00000: LDAP login failed for user "cn=dev_fullaccess,
OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX" on server "server1.ZZ.net server2.ZZ.net": Can't contact LDAP server
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-2] db=db_test_022,user=fullaccess LOCATION: CheckLDAPAuth, auth.c:2593
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [10-1] db=db_test_022,user=fullaccess FATAL: 28000: LDAP authentication failed for user "fullaccess"
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-2] db=db_test_022,user=fullaccess DETAIL: Connection matched pg_hba.conf line 12: "host all
+systemlogin 0.0.0.0/0 ldap ldapserver="server1.ZZ.net server2.ZZ.net" ldapprefix="cn=dev_" ldapsuffix=",OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX""
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-3] db=db_test_022,user=fullaccess LOCATION: auth_failed, auth.c:328
========================================================================================
The postgres version is "PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit" although we get similar failures with Postgres 9.5.9.
The problem is reproducible with this bash script and assumes the user password is in .pgpass:
========================================================================================
#!/bin/bash
set -u
declare -r NUM_DATABASES=50
declare -r NUM_CREATE_EXTENSIONS=1
declare -r DBHOST=kvmdevdb12
declare -r PORT=5437
declare -r USER=fullaccess
declare -r EXTENSION_NAME="btree_gin"
#declare -r EXTENSION_NAME="pgcrypto"
#declare -r EXTENSION_NAME="pg_buffercache"
#declare -r EXTENSION_NAME="tablefunc"
declare -r LOG=/tmp/drop_create.log
declare -r SIGNAL_FILE=/tmp/start_drop_create.flg
declare -r ERROR_SIGNAL_FILE=/tmp/connect_error
function get_now()
{
date +'%Y-%m-%d %H:%M:%S.%N'
}
function create_extension()
{
local -r DBNAME=$1
psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "DROP EXTENSION IF EXISTS $EXTENSION_NAME; CREATE EXTENSION $EXTENSION_NAME SCHEMA extensions"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create extension \"$EXTENSION_NAME\" in database \"$DBNAME\" for user \"$USER\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
}
function drop_create_db()
{
local -r DBNAME=$1
psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "DROP DATABASE IF EXISTS $DBNAME"
psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "CREATE DATABASE $DBNAME"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create database \"$DBNAME\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "CREATE SCHEMA extensions"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create schema for \"$DBNAME\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
while [[ ! -f $SIGNAL_FILE ]]; do
:
done;
for i in $(seq 1 $NUM_CREATE_EXTENSIONS); do
create_extension $DBNAME
done
}
echo "$(get_now): Postgres version: $(psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c 'SELECT VERSION()')"
echo
echo "$(get_now): Setting up $NUM_DATABASES background jobs..."
rm -f $SIGNAL_FILE
rm -f $ERROR_SIGNAL_FILE
declare DBNAME=""
declare NUM=1
while [[ 1 ]]; do
echo "$(get_now): Creating \"$DBHOST:$PORT/$DBNAME\"..." >> $LOG
DBNAME=$(printf "db_test_%03d" $NUM)
drop_create_db $DBNAME &
if [[ $NUM -ge $NUM_DATABASES ]]; then
break
fi
NUM=$(($NUM + 1))
echo >> $LOG
done
echo "$(get_now): Signal the $NUM_DATABASES background jobs to start creating extensions..."
touch $SIGNAL_FILE
echo "$(get_now): Waiting for all the create extensions to finish..."
echo
wait
echo
if [[ -f $ERROR_SIGNAL_FILE ]]; then
echo "$(get_now): THERE WERE ERRORS."
echo
fi
echo "$(get_now): *** Finished." | tee -a $LOG
========================================================================================
I did some simple debugging into 'openldap' and could get this debug information about the failure:
========================================================================================
ldap_init: trying /var/lib/pgsql/ldap/etc/openldap/ldap.conf
ldap_init: HOME env is /var/lib/pgsql
ldap_init: trying /var/lib/pgsql/ldaprc
ldap_init: trying /var/lib/pgsql/.ldaprc
ldap_init: trying ldaprc
ldap_init: LDAPCONF env is NULL
ldap_init: LDAPRC env is NULL
ldap_create
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP server1.om.net:389
ldap_new_socket: 6
ldap_prepare_socket: 6
ldap_connect_to_host: Trying 192.168.111.111:389
ldap_pvt_connect: fd: 6 tm: -1 async: 0
attempting to connect:
connect success
ldap_open_defconn: successful
ldap_send_server_request
ldap_result ld 0x1d76380 msgid 1
wait4msg ld 0x1d76380 msgid 1 (infinite timeout)
wait4msg continue ld 0x1d76380 msgid 1 all 1
** ld 0x1d76380 Connections:
* host: server1.om.net port: 389 (default)
refcnt: 2 status: Connected
last used: Tue Apr 10 15:01:01 2018
** ld 0x1d76380 Outstanding Requests:
* msgid 1, origid 1, status InProgress
outstanding referrals 0, parent count 0
ld 0x1d76380 request count 1 (abandoned 0)
** ld 0x1d76380 Response Queue:
Empty
ld 0x1d76380 response count 0
ldap_chkResponseList ld 0x1d76380 msgid 1 all 1
ldap_chkResponseList returns ld 0x1d76380 NULL
ldap_int_select
ldap_int_select returned -1: errno 4 <====================== ERROR HERE
ldap_unbind
ldap_free_request (origid 1, msgid 1)
ldap_free_connection 1 1
ldap_send_unbind
ldap_free_connection: actually freed
ldap_err2string
========================================================================================
errno 4 is EINTR and the function actually failing is 'poll' inside the os-ip.c function 'ldap_int_select'.
Greetings, * Greg k (gregg.kay@gmail.com) wrote: > We are using postgresql 10.3 on Centos 7.2 with LDAP authentication (samba4 > with AD domain controller). We've recently moved to LDAP authentication and > are now encountering a problem where some concurrent connections that > create extensions in different databases at the same time are failing with > a "Can't contact LDAP server" error. The postgres error log contains: You really shouldn't be using LDAP in an AD environment for authentication- configure and use Kerberos instead, which is much more secure than having cleartext passwords seen by the PG server and then proxied to the LDAP server. That said, there does appear to be an issue here, thanks for creating a test case. Stephen
Attachment
Hi Stephen,
Do I need to raise this as an issue or bug?
Thanks,
Greg
On Tue, 10 Apr 2018 at 21:31 Stephen Frost <sfrost@snowman.net> wrote:
Greetings,
* Greg k (gregg.kay@gmail.com) wrote:
> We are using postgresql 10.3 on Centos 7.2 with LDAP authentication (samba4
> with AD domain controller). We've recently moved to LDAP authentication and
> are now encountering a problem where some concurrent connections that
> create extensions in different databases at the same time are failing with
> a "Can't contact LDAP server" error. The postgres error log contains:
You really shouldn't be using LDAP in an AD environment for
authentication- configure and use Kerberos instead, which is much more
secure than having cleartext passwords seen by the PG server and then
proxied to the LDAP server.
That said, there does appear to be an issue here, thanks for creating a
test case.
Stephen
On Fri, Apr 20, 2018 at 1:43 PM, Greg k <gregg.kay@gmail.com> wrote: > Do I need to raise this as an issue or bug? It seems like we might need a patch on that adds ldap_set_option(ldap, LDAP_OPT_RESTART, LDAP_OPT_ON) somewhere in auth.c to tell it to restart on EINTR. -- Thomas Munro http://www.enterprisedb.com
On Fri, Apr 20, 2018 at 2:09 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Fri, Apr 20, 2018 at 1:43 PM, Greg k <gregg.kay@gmail.com> wrote: >> Do I need to raise this as an issue or bug? > > It seems like we might need a patch on that adds ldap_set_option(ldap, > LDAP_OPT_RESTART, LDAP_OPT_ON) somewhere in auth.c to tell it to > restart on EINTR. I'm a bit confused about that; how are you supposed to do that early enough to cover all its syscalls if you don't have the handle until after you've connected? Perhaps CheckLDAPAuth() (or higher up) should be blocking signals instead, or at least SIGUSR1 (I guess that's what you're receiving -- can you tell?) -- Thomas Munro http://www.enterprisedb.com