LDAP authentication fails with concurrent create extensions - Mailing list pgsql-bugs

From Greg k
Subject LDAP authentication fails with concurrent create extensions
Date
Msg-id CAC5zpv3PZVZmSTj+ugpC54mC5uK+hRbN6w3f6gmFGR10phGGsw@mail.gmail.com
Whole thread Raw
Responses Re: LDAP authentication fails with concurrent create extensions
List pgsql-bugs
We are using postgresql 10.3 on Centos 7.2 with LDAP authentication (samba4 with AD domain controller). We've recently moved to LDAP authentication and are now encountering a problem where some concurrent connections that create extensions in different databases at the same time are failing with a "Can't contact LDAP server" error. The postgres error log contains:

========================================================================================
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-1] db=db_test_022,user=fullaccess LOG:  00000: LDAP login failed for user "cn=dev_fullaccess,
OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX" on server "server1.ZZ.net server2.ZZ.net": Can't contact LDAP server
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [9-2] db=db_test_022,user=fullaccess LOCATION:  CheckLDAPAuth, auth.c:2593
Apr 10 16:55:27.988 kvmdevdb12 postgres[10901]: [10-1] db=db_test_022,user=fullaccess FATAL:  28000: LDAP authentication failed for user "fullaccess"
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-2] db=db_test_022,user=fullaccess DETAIL:  Connection matched pg_hba.conf line 12: "host       all
           +systemlogin 0.0.0.0/0               ldap ldapserver="server1.ZZ.net server2.ZZ.net" ldapprefix="cn=dev_" ldapsuffix=",OU=StandardUsers,OU=Postgres,OU=ZZ,DC=company,DC=com,DC=XX""
Apr 10 16:55:27.989 kvmdevdb12 postgres[10901]: [10-3] db=db_test_022,user=fullaccess LOCATION:  auth_failed, auth.c:328
========================================================================================

The postgres version is "PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit" although we get similar failures with Postgres 9.5.9.

The problem is reproducible with this bash script and assumes the user password is in .pgpass:

========================================================================================
#!/bin/bash

set -u

declare -r NUM_DATABASES=50
declare -r NUM_CREATE_EXTENSIONS=1

declare -r DBHOST=kvmdevdb12
declare -r PORT=5437
declare -r USER=fullaccess

declare -r EXTENSION_NAME="btree_gin"
#declare -r EXTENSION_NAME="pgcrypto"
#declare -r EXTENSION_NAME="pg_buffercache"
#declare -r EXTENSION_NAME="tablefunc"

declare -r LOG=/tmp/drop_create.log
declare -r SIGNAL_FILE=/tmp/start_drop_create.flg
declare -r ERROR_SIGNAL_FILE=/tmp/connect_error


function get_now()
{
    date +'%Y-%m-%d %H:%M:%S.%N'
}


function create_extension()
{
    local -r DBNAME=$1

    psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "DROP EXTENSION IF EXISTS $EXTENSION_NAME; CREATE EXTENSION $EXTENSION_NAME SCHEMA extensions"
    if [[ $? != 0 ]] ; then
        echo "$(get_now): *** Error: could not create extension \"$EXTENSION_NAME\" in database \"$DBNAME\" for user \"$USER\"." | tee -a $LOG
        touch $ERROR_SIGNAL_FILE
        exit 1
    fi

}


function drop_create_db()
{
    local -r DBNAME=$1

    psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "DROP DATABASE IF EXISTS $DBNAME"
    psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c "CREATE DATABASE $DBNAME"
        if [[ $? != 0 ]] ; then
        echo "$(get_now): *** Error: could not create database \"$DBNAME\"." | tee -a $LOG
        touch $ERROR_SIGNAL_FILE
        exit 1
    fi
    psql -qt -p $PORT -h $DBHOST -d $DBNAME -U $USER -c "CREATE SCHEMA extensions"
    if [[ $? != 0 ]] ; then
        echo "$(get_now): *** Error: could not create schema for \"$DBNAME\"." | tee -a $LOG
        touch $ERROR_SIGNAL_FILE
        exit 1
    fi

    while [[ ! -f $SIGNAL_FILE ]]; do
        :
    done;

    for i in $(seq 1 $NUM_CREATE_EXTENSIONS); do
        create_extension $DBNAME
    done
}

echo "$(get_now): Postgres version: $(psql -qt -p $PORT -h $DBHOST -d postgres -U $USER -c 'SELECT VERSION()')"
echo

echo "$(get_now): Setting up $NUM_DATABASES background jobs..."

rm -f $SIGNAL_FILE
rm -f $ERROR_SIGNAL_FILE

declare DBNAME=""
declare NUM=1
while [[ 1 ]]; do
    echo "$(get_now): Creating \"$DBHOST:$PORT/$DBNAME\"..."  >> $LOG
    DBNAME=$(printf "db_test_%03d" $NUM)
    drop_create_db $DBNAME &
    if [[ $NUM -ge $NUM_DATABASES ]]; then
        break
    fi
    NUM=$(($NUM + 1))
    echo >> $LOG
done

echo "$(get_now): Signal the $NUM_DATABASES background jobs to start creating extensions..."
touch $SIGNAL_FILE

echo "$(get_now): Waiting for all the create extensions to finish..."
echo
wait
echo

if [[ -f $ERROR_SIGNAL_FILE ]]; then
    echo "$(get_now): THERE WERE ERRORS."
    echo
fi
echo "$(get_now): *** Finished." | tee -a $LOG

========================================================================================


I did some simple debugging into 'openldap' and could get this debug information about the failure:

========================================================================================
ldap_init: trying /var/lib/pgsql/ldap/etc/openldap/ldap.conf
ldap_init: HOME env is /var/lib/pgsql
ldap_init: trying /var/lib/pgsql/ldaprc
ldap_init: trying /var/lib/pgsql/.ldaprc
ldap_init: trying ldaprc
ldap_init: LDAPCONF env is NULL
ldap_init: LDAPRC env is NULL
ldap_create
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP server1.om.net:389
ldap_new_socket: 6
ldap_prepare_socket: 6
ldap_connect_to_host: Trying 192.168.111.111:389
ldap_pvt_connect: fd: 6 tm: -1 async: 0
attempting to connect:
connect success
ldap_open_defconn: successful
ldap_send_server_request
ldap_result ld 0x1d76380 msgid 1
wait4msg ld 0x1d76380 msgid 1 (infinite timeout)
wait4msg continue ld 0x1d76380 msgid 1 all 1
** ld 0x1d76380 Connections:
* host: server1.om.net  port: 389  (default)
  refcnt: 2  status: Connected
  last used: Tue Apr 10 15:01:01 2018

** ld 0x1d76380 Outstanding Requests:
 * msgid 1,  origid 1, status InProgress
   outstanding referrals 0, parent count 0
  ld 0x1d76380 request count 1 (abandoned 0)
** ld 0x1d76380 Response Queue:
   Empty
  ld 0x1d76380 response count 0
ldap_chkResponseList ld 0x1d76380 msgid 1 all 1
ldap_chkResponseList returns ld 0x1d76380 NULL
ldap_int_select
ldap_int_select returned -1: errno 4                              <====================== ERROR HERE
ldap_unbind
ldap_free_request (origid 1, msgid 1)
ldap_free_connection 1 1
ldap_send_unbind
ldap_free_connection: actually freed
ldap_err2string
========================================================================================

errno 4 is EINTR and the function actually failing is 'poll' inside the os-ip.c function 'ldap_int_select'.

pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #15148: font is too small to see
Next
From: Huong Dangminh
Date:
Subject: power() function in Windows: "value out of range: underflow"