LDAP authentication fails with concurrent create extensions - Mailing list pgsql-bugs
From | Greg k |
---|---|
Subject | LDAP authentication fails with concurrent create extensions |
Date | |
Msg-id | CAC5zpv3PZVZmSTj+ugpC54mC5uK+hRbN6w3f6gmFGR10phGGsw@mail.gmail.com Whole thread Raw |
Responses |
Re: LDAP authentication fails with concurrent create extensions
|
List | pgsql-bugs |
We are using postgresql 10.3 on Centos 7.2 with LDAP authentication (samba4 with AD domain controller). We've recently moved to LDAP authentication and are now encountering a problem where some concurrent connections that create extensions in different databases at the same time are failing with a "Can't contact LDAP server" error. The postgres error log contains:
========================================================================================
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-1] db=db_test_022,user=fullaccess LOG: 00000: LDAP login failed for user "cn=dev_fullaccess,
OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX" on server "server1.ZZ.net server2.ZZ.net": Can't contact LDAP server
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-2] db=db_test_022,user=fullaccess LOCATION: CheckLDAPAuth, auth.c:2593
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [10-1] db=db_test_022,user=fullaccess FATAL: 28000: LDAP authentication failed for user "fullaccess"
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-2] db=db_test_022,user=fullaccess DETAIL: Connection matched pg_hba.conf line 12: "host all
+systemlogin 0.0.0.0/0 ldap ldapserver="server1.ZZ.net server2.ZZ.net" ldapprefix="cn=dev_" ldapsuffix=",OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX""
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-3] db=db_test_022,user=fullaccess LOCATION: auth_failed, auth.c:328
========================================================================================
The postgres version is "PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit" although we get similar failures with Postgres 9.5.9.
The problem is reproducible with this bash script and assumes the user password is in .pgpass:
========================================================================================
#!/bin/bash
set -u
declare -r NUM_DATABASES=50
declare -r NUM_CREATE_EXTENSIONS=1
declare -r DBHOST=kvmdevdb12
declare -r PORT=5437
declare -r USER=fullaccess
declare -r EXTENSION_NAME="btree_gin"
#declare -r EXTENSION_NAME="pgcrypto"
#declare -r EXTENSION_NAME="pg_buffercache"
#declare -r EXTENSION_NAME="tablefunc"
declare -r LOG=/tmp/drop_create.log
declare -r SIGNAL_FILE=/tmp/start_drop_create.flg
declare -r ERROR_SIGNAL_FILE=/tmp/connect_error
function get_now()
{
date +'%Y-%m-%d %H:%M:%S.%N'
}
function create_extension()
{
local -r DBNAME=$1
psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "DROP EXTENSION IF EXISTS $EXTENSION_NAME; CREATE EXTENSION $EXTENSION_NAME SCHEMA extensions"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create extension \"$EXTENSION_NAME\" in database \"$DBNAME\" for user \"$USER\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
}
function drop_create_db()
{
local -r DBNAME=$1
psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "DROP DATABASE IF EXISTS $DBNAME"
psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "CREATE DATABASE $DBNAME"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create database \"$DBNAME\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "CREATE SCHEMA extensions"
if [[ $? != 0 ]] ; then
echo "$(get_now): *** Error: could not create schema for \"$DBNAME\"." | tee -a $LOG
touch $ERROR_SIGNAL_FILE
exit 1
fi
while [[ ! -f $SIGNAL_FILE ]]; do
:
done;
for i in $(seq 1 $NUM_CREATE_EXTENSIONS); do
create_extension $DBNAME
done
}
echo "$(get_now): Postgres version: $(psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c 'SELECT VERSION()')"
echo
echo "$(get_now): Setting up $NUM_DATABASES background jobs..."
rm -f $SIGNAL_FILE
rm -f $ERROR_SIGNAL_FILE
declare DBNAME=""
declare NUM=1
while [[ 1 ]]; do
echo "$(get_now): Creating \"$DBHOST:$PORT/$DBNAME\"..." >> $LOG
DBNAME=$(printf "db_test_%03d" $NUM)
drop_create_db $DBNAME &
if [[ $NUM -ge $NUM_DATABASES ]]; then
break
fi
NUM=$(($NUM + 1))
echo >> $LOG
done
echo "$(get_now): Signal the $NUM_DATABASES background jobs to start creating extensions..."
touch $SIGNAL_FILE
echo "$(get_now): Waiting for all the create extensions to finish..."
echo
wait
echo
if [[ -f $ERROR_SIGNAL_FILE ]]; then
echo "$(get_now): THERE WERE ERRORS."
echo
fi
echo "$(get_now): *** Finished." | tee -a $LOG
========================================================================================
I did some simple debugging into 'openldap' and could get this debug information about the failure:
========================================================================================
ldap_init: trying /var/lib/pgsql/ldap/etc/openldap/ldap.conf
ldap_init: HOME env is /var/lib/pgsql
ldap_init: trying /var/lib/pgsql/ldaprc
ldap_init: trying /var/lib/pgsql/.ldaprc
ldap_init: trying ldaprc
ldap_init: LDAPCONF env is NULL
ldap_init: LDAPRC env is NULL
ldap_create
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP server1.om.net:389
ldap_new_socket: 6
ldap_prepare_socket: 6
ldap_connect_to_host: Trying 192.168.111.111:389
ldap_pvt_connect: fd: 6 tm: -1 async: 0
attempting to connect:
connect success
ldap_open_defconn: successful
ldap_send_server_request
ldap_result ld 0x1d76380 msgid 1
wait4msg ld 0x1d76380 msgid 1 (infinite timeout)
wait4msg continue ld 0x1d76380 msgid 1 all 1
** ld 0x1d76380 Connections:
* host: server1.om.net port: 389 (default)
refcnt: 2 status: Connected
last used: Tue Apr 10 15:01:01 2018
** ld 0x1d76380 Outstanding Requests:
* msgid 1, origid 1, status InProgress
outstanding referrals 0, parent count 0
ld 0x1d76380 request count 1 (abandoned 0)
** ld 0x1d76380 Response Queue:
Empty
ld 0x1d76380 response count 0
ldap_chkResponseList ld 0x1d76380 msgid 1 all 1
ldap_chkResponseList returns ld 0x1d76380 NULL
ldap_int_select
ldap_int_select returned -1: errno 4 <====================== ERROR HERE
ldap_unbind
ldap_free_request (origid 1, msgid 1)
ldap_free_connection 1 1
ldap_send_unbind
ldap_free_connection: actually freed
ldap_err2string
========================================================================================
errno 4 is EINTR and the function actually failing is 'poll' inside the os-ip.c function 'ldap_int_select'.
pgsql-bugs by date: