Thread: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Mike Yeap
Date:
Hi all, I have encountered a problem with LDAP-authenticated sessions that use the Postgres foreign data wrapper (postgres_fdw).
The server crashed with the following errors, and the other active server processes were terminated as well:
2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: server process (PID 26306) was terminated by signal 11: Segmentation fault
2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: terminating any other active server processes
I can reproduce it on a test server with many other sessions connected:
1. log in as a non-LDAP-authenticated user, query local & foreign tables - OK
2. log in as an LDAP-authenticated user, query a local table - OK
3. log in as an LDAP-authenticated user, query a foreign table - ERROR, the server crashes with "signal 11: Segmentation fault" when I quit the psql session
It seems the problem occurs only when the LDAP-authenticated session (which queried the foreign table) is terminated. In the dmesg log, I can see the following:
[16385512.182231] traps: postmaster[26306] general protection ip:7f1e758b638c sp:7ffef7ed8858 error:0 in libc-2.17.so[7f1e75836000+1b6000]
Has anyone encountered a similar issue?
######################
PostgreSQL version: 10.6
Platform: CentOS Linux
######################
Thank you.
Regards,
Mike Yeap
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Laurenz Albe
Date:
Mike Yeap wrote:
> I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).
>
> The server crashed with following errors and other active server processes are terminated as well:
> 2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: server process (PID 26306) was terminated by signal 11: Segmentation fault
> 2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: terminating any other active server processes
>
> I can reproduce it in a test server with many other sessions connected:
>
> 1. login using non-LDAP-authenticated user, query local & foreign tables - OK
> 2. login using LDAP-authenticated user, query local table - OK
> 3. login using LDAP-authenticated user, query foreign table - ERROR, server crashes with signal 11: Segmentation fault error when I quit the psql session
Are the "postgres" executable and libpq linked with the same version of OpenLDAP?
Any other extensions installed?
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> Mike Yeap wrote:
>> I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).
> Are the "postgres" executable and libpq linked with the same version of OpenLDAP?
And which version is that? (And which version of Postgres?)
Digging around in our git history, I came across this:
Author: Noah Misch <noah@leadboat.com>
Branch: master Release: REL9_5_BR [d7cdf6ee3] 2014-07-22 11:01:03 -0400
Diagnose incompatible OpenLDAP versions during build and test.
With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).
which sounds a fair bit like what you are describing.
regards, tom lane
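As an illustration of the version range named in the commit message above, here is a minimal Python sketch that tests whether an OpenLDAP version string falls inside 2.4.24 through 2.4.31. The function names are hypothetical, and the handling of a packaging suffix such as "-21.el7_6" is an assumption about how the distribution formats the version. It only encodes the range quoted above; it says nothing about whether versions outside the range are safe.

```python
# Inclusive range quoted from the commit message: 2.4.24 through 2.4.31.
FLAGGED_MIN = (2, 4, 24)
FLAGGED_MAX = (2, 4, 31)


def parse_version(text):
    """Turn '2.4.44' or '2.4.44-21.el7_6' into a tuple of ints, e.g. (2, 4, 44)."""
    upstream = text.split("-", 1)[0]  # drop a packaging suffix, if any (assumption)
    return tuple(int(part) for part in upstream.split("."))


def in_flagged_range(version):
    """True if the version lies in the range the configure warning covers."""
    return FLAGGED_MIN <= parse_version(version) <= FLAGGED_MAX


if __name__ == "__main__":
    for candidate in ("2.4.23", "2.4.24", "2.4.31", "2.4.44-21.el7_6"):
        print(candidate, in_flagged_range(candidate))
```

Tuple comparison keeps the check numeric, so "2.4.9" sorts before "2.4.24" where a plain string comparison would not.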
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Mike Yeap
Date:
> Are the "postgres" executable and libpq linked with the same version of OpenLDAP?
How should I check whether they are linked?
My Postgres version is 10.6 and I have this output for "yum list | grep ldap | sort":
$ yum list | grep ldap | sort
apr-util-ldap.x86_64 1.5.2-6.el7 base
bind-dyndb-ldap.x86_64 11.1-4.el7 base
compat-openldap.i686 1:2.3.43-5.el7 base
compat-openldap.x86_64 1:2.3.43-5.el7 base
cyrus-sasl-ldap.i686 2.1.26-23.el7 base
cyrus-sasl-ldap.x86_64 2.1.26-23.el7 base
freeradius-ldap.x86_64 3.0.13-9.el7_5 base
ipsilon-authldap.noarch 1.0.0-13.el7_3 base
krb5-server-ldap.x86_64 1.15.1-37.el7_6 updates
ldapjdk-javadoc.noarch 4.19-5.el7 base
ldapjdk.noarch 4.19-5.el7 base
mod_ldap.x86_64 2.4.6-88.el7.centos base
nss-pam-ldapd.i686 0.8.13-16.el7 base
nss-pam-ldapd.x86_64 0.8.13-16.el7 base
openldap-clients.x86_64 2.4.44-21.el7_6 @updates
openldap-devel.i686 2.4.44-21.el7_6 updates
openldap-devel.x86_64 2.4.44-21.el7_6 updates
openldap.i686 2.4.44-21.el7_6 updates
openldap-servers-sql.x86_64 2.4.44-21.el7_6 updates
openldap-servers.x86_64 2.4.44-21.el7_6 updates
openldap.x86_64 2.4.44-21.el7_6 @updates
openssh-ldap.x86_64 7.4p1-16.el7 base
php-ldap.x86_64 5.4.16-46.el7 base
python-ldap2pg-doc.x86_64 4.11-1.rhel7 pgdg10
python-ldap2pg.x86_64 4.11-1.rhel7 pgdg10
python-ldap.x86_64 2.4.15-2.el7 base
sssd-ldap.x86_64 1.16.2-13.el7_6.5 updates
And in the database where I encountered this issue I have these extensions installed:
repdb=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+---------+------------+------------------------------------------------------------
hstore | 1.4 | public | data type for storing sets of (key, value) pairs
pg_stat_statements | 1.6 | repdb | track execution statistics of all SQL statements executed
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
postgres_fdw | 1.0 | repdb | foreign-data wrapper for remote PostgreSQL servers
tablefunc | 1.0 | repdb | functions that manipulate whole tables, including crosstab
(5 rows)
Thank you.
Regards,
Mike Yeap
On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> Mike Yeap wrote:
>> I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).
> Are the "postgres" executable and libpq linked with the same version of OpenLDAP?
And which version is that? (And which version of Postgres?)
Digging around in our git history, I came across this:
Author: Noah Misch <noah@leadboat.com>
Branch: master Release: REL9_5_BR [d7cdf6ee3] 2014-07-22 11:01:03 -0400
Diagnose incompatible OpenLDAP versions during build and test.
With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).
which sounds a fair bit like what you are describing.
regards, tom lane
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Mike Yeap <wkk1020@gmail.com> writes:
>> Are the "postgres" executable and libpq linked with the same version of
>> OpenLDAP?
> How should I check whether they are linked?
"ldd" should show the dependencies of whatever executable or library
you point it at.
regards, tom lane
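To make that comparison concrete, here is a minimal Python sketch that runs "ldd" on two files and reports which LDAP client library each one resolves to. The helper names are hypothetical, and the paths are taken from the command line rather than assumed, since the location of the server binary and of libpq differs between installations.

```python
import re
import subprocess
import sys

# Matches ldd lines such as:
#   libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)
LDAP_LINE = re.compile(r"^\s*(libldap\S*)\s+=>\s+(\S+)")


def ldap_libs(ldd_output):
    """Return {soname: resolved path} for every libldap* line in ldd output."""
    found = {}
    for line in ldd_output.splitlines():
        match = LDAP_LINE.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


def variants(ldd_output):
    """Return the set of variants seen: 'libldap' and/or 'libldap_r'."""
    return {
        "libldap_r" if name.startswith("libldap_r") else "libldap"
        for name in ldap_libs(ldd_output)
    }


def run_ldd(path):
    """Run ldd on a file and return its standard output as text."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Usage: python check_ldap.py <server binary> <libpq shared library>
    for path in sys.argv[1:]:
        print(path, sorted(variants(run_ldd(path))))
```

If the server binary reports one variant and libpq reports the other, a backend that loads libpq (for example through postgres_fdw or dblink) ends up with both libraries in one process.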
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Mike Yeap
Date:
Hi Tom, when I run "ldd /usr/pgsql-10/bin/postmaster" I got this output:
# ldd /usr/pgsql-10/bin/postmaster
linux-vdso.so.1 => (0x00007ffd4ec65000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007eff8b5d3000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00007eff8b268000)
libpam.so.0 => /lib64/libpam.so.0 (0x00007eff8b059000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007eff8ade7000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007eff8a985000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007eff8a738000)
librt.so.1 => /lib64/librt.so.1 (0x00007eff8a530000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007eff8a32b000)
libm.so.6 => /lib64/libm.so.6 (0x00007eff8a029000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)
libicui18n.so.50 => /lib64/libicui18n.so.50 (0x00007eff899d4000)
libicuuc.so.50 => /lib64/libicuuc.so.50 (0x00007eff8965b000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007eff89633000)
libc.so.6 => /lib64/libc.so.6 (0x00007eff89271000)
/lib64/ld-linux-x86-64.so.2 (0x00007eff8b7f9000)
libz.so.1 => /lib64/libz.so.1 (0x00007eff8905b000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007eff88e35000)
libaudit.so.1 => /lib64/libaudit.so.1 (0x00007eff88c0c000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007eff88924000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007eff88720000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007eff884ec000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007eff882de000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007eff880da000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007eff87ebf000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007eff87cb0000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00007eff87a93000)
libssl3.so => /lib64/libssl3.so (0x00007eff8784f000)
libsmime3.so => /lib64/libsmime3.so (0x00007eff87628000)
libnss3.so => /lib64/libnss3.so (0x00007eff87302000)
libnssutil3.so => /lib64/libnssutil3.so (0x00007eff870d5000)
libplds4.so => /lib64/libplds4.so (0x00007eff86ed1000)
libplc4.so => /lib64/libplc4.so (0x00007eff86ccc000)
libnspr4.so => /lib64/libnspr4.so (0x00007eff86a8d000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007eff86785000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007eff8656f000)
libicudata.so.50 => /lib64/libicudata.so.50 (0x00007eff84f9a000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007eff84d95000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007eff84b6e000)
libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x00007eff848ec000)
libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007eff846e7000)
libdw.so.1 => /lib64/libdw.so.1 (0x00007eff844a0000)
libcap-ng.so.0 => /lib64/libcap-ng.so.0 (0x00007eff84299000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007eff84062000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007eff83e5c000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007eff83bfa000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007eff839e2000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007eff837d1000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007eff835ce000)
On the line that has ldap in it:
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)
Sorry but in this case what is my libpq?
Regards,
Mike Yeap
On Thu, Feb 21, 2019 at 10:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mike Yeap <wkk1020@gmail.com> writes:
>> Are the "postgres" executable and libpq linked with the same version of
>> OpenLDAP?
> How should I check whether they are linked?
"ldd" should show the dependencies of whatever executable or library
you point it at.
regards, tom lane
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Thu, Feb 21, 2019 at 2:42 PM Mike Yeap <wkk1020@gmail.com> wrote:
> openldap-clients.x86_64 2.4.44-21.el7_6 @updates
> openldap-devel.i686 2.4.44-21.el7_6 updates
> openldap-devel.x86_64 2.4.44-21.el7_6 updates
> openldap.i686 2.4.44-21.el7_6 updates
> openldap-servers-sql.x86_64 2.4.44-21.el7_6 updates
> openldap-servers.x86_64 2.4.44-21.el7_6 updates
> openldap.x86_64 2.4.44-21.el7_6 @updates
> On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
>> backends can crash at exit. Raise a warning during "configure" based on
>> the compile-time OpenLDAP version number, and test the crash scenario in
>> the dblink test suite. Back-patch to 9.0 (all supported versions).
Clearly 2.4.44 is not in the range 2.4.24 through 2.4.31. Perhaps the
dangerous range is out of date? Hmm, so Noah's analysis[1] says this
is a clash between libldap_r.so (used by libpq) and libldap.so (used
by the server), specifically in destructor/exit code. Curiously, in a
thread about Curl's struggles with this problem, I found a claim[2]
that Debian decided to abandon the non-"_r" variant and just use _r
always. Sure enough, on my Debian buster VM I see a symlink
libldap-2.4.so.2 -> libldap_r-2.4.so.2. So essentially Debian and
friends have already forced Noah's first option on users:
> 1. Link the backend with libldap_r, so we never face the mismatch. On some
> platforms, this means also linking in threading libraries.
FreeBSD and CentOS systems near me have separate libraries still.
[1] https://www.postgresql.org/message-id/flat/20140612210219.GA705509%40tornado.leadboat.com
[2] https://www.openldap.org/lists/openldap-technical/201608/msg00094.html
--
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Mike Yeap
Date:
Hi Thomas, does that mean the bug is still there?
Regards,
Mike Yeap
On Mon, Feb 25, 2019 at 4:06 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Feb 21, 2019 at 2:42 PM Mike Yeap <wkk1020@gmail.com> wrote:
> openldap-clients.x86_64 2.4.44-21.el7_6 @updates
> openldap-devel.i686 2.4.44-21.el7_6 updates
> openldap-devel.x86_64 2.4.44-21.el7_6 updates
> openldap.i686 2.4.44-21.el7_6 updates
> openldap-servers-sql.x86_64 2.4.44-21.el7_6 updates
> openldap-servers.x86_64 2.4.44-21.el7_6 updates
> openldap.x86_64 2.4.44-21.el7_6 @updates
> On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
>> backends can crash at exit. Raise a warning during "configure" based on
>> the compile-time OpenLDAP version number, and test the crash scenario in
>> the dblink test suite. Back-patch to 9.0 (all supported versions).
Clearly 2.4.44 is not in the range 2.4.24 through 2.4.31. Perhaps the
dangerous range is out of date? Hmm, so Noah's analysis[1] says this
is a clash between libldap_r.so (used by libpq) and libldap.so (used
by the server), specifically in destructor/exit code. Curiously, in a
thread about Curl's struggles with this problem, I found a claim[2]
that Debian decided to abandon the non-"_r" variant and just use _r
always. Sure enough, on my Debian buster VM I see a symlink
libldap-2.4.so.2 -> libldap_r-2.4.so.2. So essentially Debian and
friends have already forced Noah's first option on users:
> 1. Link the backend with libldap_r, so we never face the mismatch. On some
> platforms, this means also linking in threading libraries.
FreeBSD and CentOS systems near me have separate libraries still.
[1] https://www.postgresql.org/message-id/flat/20140612210219.GA705509%40tornado.leadboat.com
[2] https://www.openldap.org/lists/openldap-technical/201608/msg00094.html
--
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <wkk1020@gmail.com> wrote:
> Hi Thomas, does that mean the bug is still there?
Hi Mike,
I haven't tried to repro this myself, but it certainly sounds like it.
It also sounds like it would probably go away if you switched to a
Debian-derived distro, instead of a Red Hat-derived distro, but I
doubt that's the kind of advice you were looking for. We need to
figure out a proper solution here, though I'm not sure what. Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?
--
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Mike Yeap
Date:
Hi Thomas, I see..... guess I can't use LDAP authentication for now, :-(
Hopefully this problem will be solved in a future version. Thank you!
Regards,
Mike Yeap
On Tue, Feb 26, 2019 at 4:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <wkk1020@gmail.com> wrote:
> Hi Thomas, does that mean the bug is still there?
Hi Mike,
I haven't tried to repro this myself, but it certainly sounds like it.
It also sounds like it would probably go away if you switched to a
Debian-derived distro, instead of a Red Hat-derived distro, but I
doubt that's the kind of advice you were looking for. We need to
figure out a proper solution here, though I'm not sure what. Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?
--
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Tue, Feb 26, 2019 at 9:11 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <wkk1020@gmail.com> wrote:
> > Hi Thomas, does that mean the bug is still there?
> I haven't tried to repro this myself, but it certainly sounds like it.
> It also sounds like it would probably go away if you switched to a
> Debian-derived distro, instead of a Red Hat-derived distro, but I
> doubt that's the kind of advice you were looking for. We need to
> figure out a proper solution here, though I'm not sure what. Question
> for the list: other stuff in the server needs libpthread (SSL, LLVM,
> ...), so why are we insisting on using non-MT LDAP?
Concretely, why don't we just kill the LDAP_LIBS_FE/LDAP_LIBS_BE
distinction and use a single LDAP_LIBS? Then it'll always match. It
can still be the non-MT variant if you build with
--disable-thread-safety (who does that?), but then it'll be the same
in the server too so that postgres_fdw + ldap works that way too.
Sketch patch attached.
--
Thomas Munro
https://enterprisedb.com
Attachment
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Stephen Frost
Date:
Greetings Mike,
* Mike Yeap (wkk1020@gmail.com) wrote:
> Hi Thomas, I see..... guess I can't use LDAP authentication for now, :-(
If you're in an active directory environment, you should really be using
Kerberos for authentication and NOT LDAP anyway. LDAP-based
authentication involves sending the user's password (cleartext) to the
PG server, which is really bad security. Hopefully you're at least
connecting to PG with SSL, and from PG to LDAP with SSL, but you still
run the issue that a compromised server would expose the password of
everyone connecting to that server, and when you're using a centralized
authentication system like LDAP, that one password gets you access to
everything that account has access to.
Thanks!
Stephen
Attachment
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Question
> for the list: other stuff in the server needs libpthread (SSL, LLVM,
> ...), so why are we insisting on using non-MT LDAP?
The traditional reason for avoiding that is the risk of a server
process becoming multi-threaded. There are live bugs of that ilk
on Darwin, and we actually have cross-checks for the case in our
code (see HAVE_PTHREAD_IS_THREADED_NP stanzas).
If pthread_is_threaded_np(), or something equivalent, is widely available
then it might be all right to try solving this going forward by switching
to libldap_r and seeing if anyone hits those cross-checks. I'd be afraid
to risk it in the back branches though ...
regards, tom lane
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Wed, Feb 27, 2019 at 3:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Question
> > for the list: other stuff in the server needs libpthread (SSL, LLVM,
> > ...), so why are we insisting on using non-MT LDAP?
>
> The traditional reason for avoiding that is the risk of a server
> process becoming multi-threaded. There are live bugs of that ilk
> on Darwin, and we actually have cross-checks for the case in our
> code (see HAVE_PTHREAD_IS_THREADED_NP stanzas).
>
> If pthread_is_threaded_np(), or something equivalent, is widely available
> then it might be all right to try solving this going forward by switching
> to libldap_r and seeing if anyone hits those cross-checks. I'd be afraid
> to risk it in the back branches though ...
Hmm. Well here is a new data point: it looks like the Red Hat family
of distributions is in the process of making the same decision as
Debian (namely: to expunge the non-MT variant, because it bites
various projects in the same way that it bites us), but they haven't
quite pulled the trigger yet:
https://fedoraproject.org/wiki/Changes/OpenLDAPwithoutNonthreadedLibraries
So if we do nothing at all, it seems likely that this problem will
eventually go away by itself on practically all Linux systems, leaving
this unfixed LDAP vs postgres_fdw bug to trip up the other Unix
systems. Bleugh.
I don't see pthread_is_threaded_np() on any non-Apple systems in my
lab. Clearly libldap_r is *capable* of creating threads: it contains a
function ldap_pvt_thread_create(), and we can see that slapd and other
OpenLDAP things use that, but AFAICT that's a private facility not
intended for end users to call, so there's no danger if you just use
the documented LDAP client API.
Since pthread_is_threaded_np() is a Mac thing, note also that Macs
aren't directly exposed to this particular choice anyway because (at
least if you use system-provided libraries rather than MacPorts et al)
libldap.dylib and libldap_r.dylib are already symlinks to the same
Apple voodoo "/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP".
--
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, Feb 27, 2019 at 3:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If pthread_is_threaded_np(), or something equivalent, is widely available
>> then it might be all right to try solving this going forward by switching
>> to libldap_r and seeing if anyone hits those cross-checks. I'd be afraid
>> to risk it in the back branches though ...
> Hmm. Well here is a new data point: it looks like the Red Hat family
> of distributions is in the process of making the same decision as
> Debian (namely: to expunge the non-MT variant, because it bites
> various projects in the same way that it bites us), but they haven't
> quite pulled the trigger yet:
> https://fedoraproject.org/wiki/Changes/OpenLDAPwithoutNonthreadedLibraries
Interesting, but that's going to be a very slow change. That says
they'll pull the trigger in Fedora 30, which I think is due to be
released this spring --- but it won't show up in RHEL till the next
major release (8 or maybe even 9 at this point), and the existing
major releases have got 10-year support lifespans.
> I don't see pthread_is_threaded_np() on any non-Apple systems in my
> lab.
Yeah, I thought that might be a Mac thing. I wonder if POSIX has any
usable equivalent.
> Clearly libldap_r is *capable* of creating threads: it contains a
> function ldap_pvt_thread_create(), and we can see that slapd and other
> OpenLDAP things use that, but AFAICT that's a private facility not
> intended for end users to call, so there's no danger if you just use
> the documented LDAP client API.
That seems promising, but I'd sure be happier if we could cross-check
that there's still just one thread at the completion of authentication.
regards, tom lane
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
Adding Noah to thread.
On Wed, Feb 27, 2019 at 11:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I don't see pthread_is_threaded_np() on any non-Apple systems in my
> > lab.
>
> Yeah, I thought that might be a Mac thing. I wonder if POSIX has any
> usable equivalent.
I don't see anything like that (the concept doesn't seem very
portable). I couldn't find a way on Glibc (but I'm not saying there
isn't one hiding somewhere). FreeBSD has a thing much like macOS's
(and I think some more BSDs do too); it's set to true by libthr when
the first thread is created, to make libc start locking various stuff.
The macOS one probably isn't a good canary to protect us from OpenLDAP
creating threads since on typical macOS builds we're using Apple's
LDAP thing (which cybersquats libldap.dylib and libldap_r.dylib via
symlinks). So adding a FreeBSD check seems like a good idea, because
at least one FreeBSD system in our buildfarm runs the ldap checks on
real OpenLDAP (elver).
> > Clearly libldap_r is *capable* of creating threads: it contains a
> > function ldap_pvt_thread_create(), and we can see that slapd and other
> > OpenLDAP things use that, but AFAICT that's a private facility not
> > intended for end users to call, so there's no danger if you just use
> > the documented LDAP client API.
>
> That seems promising, but I'd sure be happier if we could cross-check
> that there's still just one thread at the completion of authentication.
Ok, here's that patch again with a commit message and with the
configure version warning removed, and a make-sure-we're-not-threaded
patch for FreeBSD.
I'm not sure what to do about the LDAP test in
contrib/dblink/sql/dblink.sql. Do we still want this?
I propose this for master only, for now. I also think it'd be nice to
consider back-patching it after a while, especially since this
reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
for a good while. Hmm, I wonder if it's OK to subtly change library
dependencies in a minor release; I don't see any problem with it since
I expect both variants to be provided by the same package in every
distro but we'd certainly want to highlight this to the package
maintainers if we did it.
--
Thomas Munro
https://enterprisedb.com
Attachment
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Noah Misch
Date:
On Thu, Mar 07, 2019 at 10:45:56AM +1300, Thomas Munro wrote:
> On Wed, Feb 27, 2019 at 11:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Thomas Munro <thomas.munro@gmail.com> writes:
> > > I don't see pthread_is_threaded_np() on any non-Apple systems in my
> > > lab.
> >
> > Yeah, I thought that might be a Mac thing. I wonder if POSIX has any
> > usable equivalent.
>
> I don't see anything like that (the concept doesn't seem very
> portable).
I'm not aware of one.
> > > Clearly libldap_r is *capable* of creating threads: it contains a
> > > function ldap_pvt_thread_create(), and we can see that slapd and other
> > > OpenLDAP things use that, but AFAICT that's a private facility not
> > > intended for end users to call, so there's no danger if you just use
> > > the documented LDAP client API.
> >
> > That seems promising, but I'd sure be happier if we could cross-check
> > that there's still just one thread at the completion of authentication.
>
> Ok, here's that patch again with a commit message and with the
> configure version warning removed, and a make-sure-we're-not-threaded
> patch for FreeBSD.
>
> I'm not sure what to do about the LDAP test in
> contrib/dblink/sql/dblink.sql. Do we still want this?
Mike, does the dblink test suite not fail on your system? It's designed to
catch this exact problem. Has anyone else reproduced this?
> I propose this for master only, for now. I also think it'd be nice to
> consider back-patching it after a while, especially since this
> reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
> for a good while. Hmm, I wonder if it's OK to subtly change library
> dependencies in a minor release; I don't see any problem with it since
> I expect both variants to be provided by the same package in every
> distro but we'd certainly want to highlight this to the package
> maintainers if we did it.
It's not great to change library dependencies in a minor release. If every
RHEL 7 installation can crash this way, changing the dependencies is probably
the least bad thing.
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <noah@leadboat.com> wrote:
> Has anyone else reproduced this?
I tried, but could not reproduce this problem on "CentOS Linux release
7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
reported, what yum install is currently serving up).
I tried "make check" in contrib/dblink, and the only strange thing I
noticed was this FATAL error at the top of
contrib/dblink/log/postmaster.log:
2019-03-14 03:51:33.058 UTC [20131] LOG: database system is ready to accept connections
2019-03-14 03:51:33.059 UTC [20135] [unknown] FATAL: the database system is starting up
I don't see that on other systems and don't understand it.
I also tried a test of my own which I thought corresponded directly to
what Mike described, on both master and REL_10_STABLE. I'll record my
steps here so perhaps someone can see what's missing.
1. Run the regression test under src/test/ldap so that you get some
canned slapd configuration files.
2. cd into src/test/ldap/tmp_check and run "slapd -f slapd.conf -h
ldap://localhost:5555". It should daemonify itself, and run until you
kill it with SIGINT.
3. Put this into pg_hba.conf:
host postgres test1 127.0.0.1/32 ldap ldapserver=localhost ldapport=5555 ldapbasedn="dc=example,dc=net"
4. Create database objects as superuser:
create user test1;
create table t (i int);
grant all on t to test1;
create extension postgres_fdw;
create server foreign_server foreign data wrapper postgres_fdw options (dbname 'postgres', host '127.0.0.1');
create foreign table ft (i int) server foreign_server options (table_name 't');
create user mapping for test1 server foreign_server options (user 'test1', password 'secret1');
grant all on ft to test1;
5. Now you should be able to log in with "psql -h 127.0.0.1 postgres
test1" and password "secret1", and run queries like:
select * from ft;
When exiting the session, I was expecting the backend to crash,
because it had executed libldap.so code during authentication, and
then it had linked in libldap_r.so via libpq.so while connecting via
postgres_fdw. But it doesn't crash. I wonder what is different for
Mike; am I missing something, or is there non-determinism here?
> > I propose this for master only, for now. I also think it'd be nice to
> > consider back-patching it after a while, especially since this
> > reported broke on CentOS/RHEL7, a pretty popular OS that'll be around
> > for a good while. Hmm, I wonder if it's OK to subtly change library
> > dependencies in a minor release; I don't see any problem with it since
> > I expect both variants to be provided by the same package in every
> > distro but we'd certainly want to highlight this to the package
> > maintainers if we did it.
>
> It's not great to change library dependencies in a minor release. If every
> RHEL 7 installation can crash this way, changing the dependencies is probably
> the least bad thing.
+1, once we get a repro and/or better understanding.
--
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentationfault, PostgresSQL server terminates other active server processes
From
Noah Misch
Date:
On Thu, Mar 14, 2019 at 05:18:49PM +1300, Thomas Munro wrote:
> On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <noah@leadboat.com> wrote:
> > Has anyone else reproduced this?
>
> I tried, but could not reproduce this problem on "CentOS Linux release
> 7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
> reported, what yum install is currently serving up).

> When exiting the session, I was expecting the backend to crash,
> because it had executed libldap.so code during authentication, and
> then it had linked in libldap_r.so via libpq.so while connecting via
> postgres_fdw. But it doesn't crash. I wonder what is different for
> Mike; am I missing something, or is there non-determinism here?

The test is deterministic. I'm guessing Mike's system is finding ldap libraries other than the usual system ones. Mike, would you check as follows?

$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
[1] 2530123
pg_backend_pid
----------------
2530124
(1 row)
LOAD
$ gdb --batch --pid 2530124 -ex 'info sharedlibrary ldap'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ffff6303463 in __epoll_wait_nocancel () from /lib64/libc.so.6
From To Syms Read Shared Object Library
0x00007ffff65e1ee0 0x00007ffff6613304 Yes (*) /lib64/libldap-2.4.so.2
0x00007fffe998f6d0 0x00007fffe99c3ae4 Yes (*) /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.
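Where attaching gdb is inconvenient, roughly the same question (which LDAP libraries does this backend have mapped?) can be approximated on Linux by reading the process's maps file. The following is only a sketch under stated assumptions: the function name ldap_libs_in_maps is made up for illustration, it assumes a Linux /proc filesystem and a grep that supports -o, and unlike gdb it says nothing about symbol or version mismatches.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list the distinct libldap*
# shared objects named in a /proc/<pid>/maps-style file.
ldap_libs_in_maps() {
    # $1: path to a maps file, for example /proc/9732/maps
    grep -oE '/[^[:space:]]*libldap[^[:space:]]*' "$1" | sort -u
}

# Example use against a live backend, with the PID from pg_backend_pid():
#   ldap_libs_in_maps /proc/9732/maps
```

If both a libldap and a libldap_r path are printed, the backend has loaded both variants, which is the situation the gdb output above shows.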
Re: LDAP authenticated session terminated by signal 11: Segmentationfault, PostgresSQL server terminates other active server processes
From
Mike Yeap
Date:
Hi Noah, below is the output from one of the servers having this issue:
$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
[1] 9731
$ select pg_backend_pid(); load 'dblink'; select pg_sleep(100)
pg_backend_pid
----------------
9732
(1 row)
LOAD
$ gdb --batch --pid 9732 -ex 'info sharedlibrary ldap'
warning: .dynamic section for "/lib64/libldap-2.4.so.2" is not at the expected address (wrong library or version mismatch?)
warning: .dynamic section for "/lib64/liblber-2.4.so.2" is not at the expected address (wrong library or version mismatch?)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f1e7592dcf3 in __epoll_wait_nocancel () from /lib64/libc.so.6
From To Syms Read Shared Object Library
0x00007f1e7637d0f8 0x00007f1e763ae51c Yes (*) /lib64/libldap-2.4.so.2
0x00007f1d9f2c16d0 0x00007f1d9f2f5ae4 Yes (*) /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.
Regards,
Mike Yeap
On Thu, Mar 14, 2019 at 1:42 PM Noah Misch <noah@leadboat.com> wrote:
On Thu, Mar 14, 2019 at 05:18:49PM +1300, Thomas Munro wrote:
> On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <noah@leadboat.com> wrote:
> > Has anyone else reproduced this?
>
> I tried, but could not reproduce this problem on "CentOS Linux release
> 7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
> reported, what yum install is currently serving up).
> When exiting the session, I was expecting the backend to crash,
> because it had executed libldap.so code during authentication, and
> then it had linked in libldap_r.so via libpq.so while connecting via
> postgres_fdw. But it doesn't crash. I wonder what is different for
> Mike; am I missing something, or is there non-determinism here?
The test is deterministic. I'm guessing Mike's system is finding ldap
libraries other than the usual system ones. Mike, would you check as follows?
$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
[1] 2530123
pg_backend_pid
----------------
2530124
(1 row)
LOAD
$ gdb --batch --pid 2530124 -ex 'info sharedlibrary ldap'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ffff6303463 in __epoll_wait_nocancel () from /lib64/libc.so.6
From To Syms Read Shared Object Library
0x00007ffff65e1ee0 0x00007ffff6613304 Yes (*) /lib64/libldap-2.4.so.2
0x00007fffe998f6d0 0x00007fffe99c3ae4 Yes (*) /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.
Re: LDAP authenticated session terminated by signal 11: Segmentationfault, PostgresSQL server terminates other active server processes
From
Noah Misch
Date:
On Fri, Mar 15, 2019 at 12:10:59AM +0800, Mike Yeap wrote:
> Hi Noah, below is the output from one of the servers having this issue:
>
> $ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
> [1] 9731
>
> $ select pg_backend_pid(); load 'dblink'; select pg_sleep(100)
> pg_backend_pid
> ----------------
> 9732
> (1 row)
>
> LOAD
>
> $ gdb --batch --pid 9732 -ex 'info sharedlibrary ldap'
>
> warning: .dynamic section for "/lib64/libldap-2.4.so.2" is not at the expected address (wrong library or version mismatch?)
>
> warning: .dynamic section for "/lib64/liblber-2.4.so.2" is not at the expected address (wrong library or version mismatch?)
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 0x00007f1e7592dcf3 in __epoll_wait_nocancel () from /lib64/libc.so.6
> From To Syms Read Shared Object Library
> 0x00007f1e7637d0f8 0x00007f1e763ae51c Yes (*) /lib64/libldap-2.4.so.2
> 0x00007f1d9f2c16d0 0x00007f1d9f2f5ae4 Yes (*) /lib64/libldap_r-2.4.so.2
> (*): Shared library is missing debugging information.

Thanks. That rules out my guess. I don't have another guess at this time.
Re: LDAP authenticated session terminated by signal 11: Segmentationfault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Fri, Mar 15, 2019 at 4:46 PM Noah Misch <noah@leadboat.com> wrote:
> Thanks. That rules out my guess. I don't have another guess at this time.

Even though I can't reproduce the problem myself, I'm quite keen to go ahead and push the patch I proposed for v12 anyway, and close this case. Otherwise this problem could just keep coming back until libldap.so is eventually entirely phased out by all distros. In 2023 I want to be working on quantum parallelism or something, not LDAP bug reports. Any objections?

-- 
Thomas Munro
https://enterprisedb.com
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Even though I can't reproduce the problem myself, I'm quite keen to go
> ahead and push the patch I proposed for v12 anyway, and close this
> case. Otherwise this problem could just keep coming back until
> libldap.so is eventually entirely phased out by all distros. In 2023
> I want to be working on quantum parallelism or something, not LDAP bug
> reports. Any objections?

Do we have any clear reason to believe this'd actually fix Mike's problem? AFAIK the analogy to the old destructor-conflict issue is just a guess, and we don't really know exactly what is going wrong.

It's reasonable to assume that the proposed patch won't cause real issues on any modern platform, but I'm not sure we can assume that for old ones, so the whole thing is making me a bit nervous. Still, it's nice simplification to not have different frontend and backend LDAP libs.

As far as the specifics of the patch go, I don't like that you didn't adjust any of the comments near pthread_is_threaded_np() usages.

regards, tom lane
Re: LDAP authenticated session terminated by signal 11: Segmentationfault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Wed, Mar 20, 2019 at 10:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Even though I can't reproduce the problem myself, I'm quite keen to go
> > ahead and push the patch I proposed for v12 anyway, and close this
> > case. Otherwise this problem could just keep coming back until
> > libldap.so is eventually entirely phased out by all distros. In 2023
> > I want to be working on quantum parallelism or something, not LDAP bug
> > reports. Any objections?
>
> Do we have any clear reason to believe this'd actually fix Mike's problem?
> AFAIK the analogy to the old destructor-conflict issue is just a guess,
> and we don't really know exactly what is going wrong.

Right, we don't know. To learn more about the reported crash I think we'll need Mike to install debug symbols, attach with gdb and make it crash, then show us the output of "bt". More info here:

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

It'd be nice to be able to rule it out in any future bug reports with these symptoms though, and it's roughly in line with what we see the rest of the open source ecosystem doing about this problem.

> It's reasonable to assume that the proposed patch won't cause real issues
> on any modern platform, but I'm not sure we can assume that for old ones,
> so the whole thing is making me a bit nervous. Still, it's nice
> simplification to not have different frontend and backend LDAP libs.

Sure, it's possible that some BF animal will fail to link the backend for some reason that requires a bit of investigation and a follow-up patch. Are you thinking of systems not covered by the BF?

Unless the server is being built with an extremely small set of configure options enabled, it's almost certainly already linking something that pulls in the platform's threading library (SSL, GSSAPI, XML2, ...). If someone out there is not enabling any of that stuff because their system doesn't like threads, they can use --disable-thread-safety to avoid the effects of this change.

> As far as the specifics of the patch go, I don't like that you didn't
> adjust any of the comments near pthread_is_threaded_np() usages.

Hmm. The comments seemed OK to me without adjustment, is there something specific that bothered you? The errhint about LC_ALL is wrong though, it's macOS-specific. So I think I should change the hint to "On macOS, ...", or I guess make it conditional.

-- 
Thomas Munro
https://enterprisedb.com
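For anyone following along, the "attach with gdb and make it crash, then show us the output of bt" procedure can be sketched as a small shell wrapper. This is an illustration rather than a tested recipe for Mike's system: the function name backend_bt is hypothetical, it assumes gdb and the PostgreSQL debug symbols are installed, and it must run as a user allowed to trace the backend (typically the postgres OS user or root).

```shell
#!/bin/sh
# Hypothetical wrapper (not from the thread): attach gdb to one backend,
# let it keep running, and print a backtrace once it stops on a fatal signal.
backend_bt() {
    pid=$1
    case $pid in
        ''|*[!0-9]*)
            echo "usage: backend_bt <backend-pid>" >&2
            return 2
            ;;
    esac
    # SIGUSR1 is used routinely between PostgreSQL processes, so pass it
    # through silently instead of stopping on it.
    gdb --batch -p "$pid" \
        -ex 'handle SIGUSR1 nostop noprint pass' \
        -ex 'continue' \
        -ex 'bt'
}
```

A possible way to use it: in the LDAP-authenticated psql session run "select pg_backend_pid();" and query the foreign table, start "backend_bt <that pid>" in a second terminal, then quit psql. If the backend segfaults on exit, gdb should stop there and print the stack.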
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Wed, Mar 20, 2019 at 10:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's reasonable to assume that the proposed patch won't cause real issues
>> on any modern platform, but I'm not sure we can assume that for old ones,
>> so the whole thing is making me a bit nervous.

> Sure, it's possible that some BF animal will fail to link the backend
> for some reason that requires a bit of investigation and a follow-up
> patch. Are you thinking of systems not covered by the BF?

No, I'm thinking that a "followup patch" might be impossible.

> Unless the server is being built with an extremely small set of
> configure options enabled, it's almost certainly already linking
> something that pulls in the platform's threading library (SSL, GSSAPI,
> XML2, ...).

Yeah, but if somebody is relying on LDAP and not any of those other things, they won't be happy.

> If someone out there is not enabling any of that stuff
> because their system doesn't like threads, they can use
> --disable-thread-safety to avoid the effects of this change.

No, that's nonsense; --disable-thread-safety only affects what happens on the frontend side.

>> As far as the specifics of the patch go, I don't like that you didn't
>> adjust any of the comments near pthread_is_threaded_np() usages.

> Hmm. The comments seemed OK to me without adjustment, is there
> something specific that bothered you?

The comment at postmaster.c:1339 is very specific about how there's a problem with macOS's libintl. On the basis of that, nobody would expect that there's a need to do anything on any other platform. I think we should at least add something about how we're worried about libldap_r maybe causing the backend to become multithreaded.

> The errhint about LC_ALL is
> wrong though, it's macOS-specific.

Yeah, but that's part and parcel with the comment.

regards, tom lane
Re: LDAP authenticated session terminated by signal 11: Segmentationfault, PostgresSQL server terminates other active server processes
From
Thomas Munro
Date:
On Thu, Mar 21, 2019 at 5:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > If someone out there is not enabling any of that stuff
> > because their system doesn't like threads, they can use
> > --disable-thread-safety to avoid the effects of this change.
>
> No, that's nonsense; --disable-thread-safety only affects what happens
> on the frontend side.

That's exactly what I'm talking about changing. With the patch, BE's LDAP library variant would also be controlled by that configure switch, so it would always match the FE. Almost all users would continue to choose libldap_r.so for the FE, so they'd start getting that in the BE too (if they didn't already due to distro-supplied symlinks). People using --disable-thread-safety would continue to get libldap.so for FE and BE, as they do today.

-- 
Thomas Munro
https://enterprisedb.com
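The selection rule described in this message can be illustrated with a toy shell function. To be clear about assumptions: this is not the patch's actual configure code, and the function name choose_ldap_lib and its "yes"/"no" argument are invented for illustration; it only restates the rule that one thread-safety setting picks a single LDAP variant for both frontend and backend.

```shell
#!/bin/sh
# Illustrative only: this is NOT the patch's configure code.
# Echo the linker flag for the LDAP library variant that frontend and
# backend would share, given the thread-safety setting ("yes" or "no").
choose_ldap_lib() {
    if [ "$1" = "yes" ]; then
        echo "-lldap_r"   # default builds: thread-safe variant everywhere
    else
        echo "-lldap"     # --disable-thread-safety: plain variant everywhere
    fi
}

for ts in yes no; do
    lib=$(choose_ldap_lib "$ts")
    echo "thread_safety=$ts frontend=$lib backend=$lib"
done
```

The point of the sketch is that frontend and backend can no longer disagree, which is the mismatch suspected of causing the crash.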
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Thu, Mar 21, 2019 at 5:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thomas Munro <thomas.munro@gmail.com> writes:
>>> If someone out there is not enabling any of that stuff
>>> because their system doesn't like threads, they can use
>>> --disable-thread-safety to avoid the effects of this change.

>> No, that's nonsense; --disable-thread-safety only affects what happens
>> on the frontend side.

> That's exactly what I'm talking about changing. With the patch, BE's
> LDAP library variant would also be controlled by that configure
> switch, so it would always match the FE. Almost all users would
> continue to choose libldap_r.so for the FE, so they'd start getting
> that in the BE too (if they didn't already due to distro-supplied
> symlinks). People using --disable-thread-safety would continue to get
> libldap.so for FE and BE, as they do today.

Ah, I see. Seems reasonable.

I still wish we could confirm this fixes the reported problem before we pull the trigger.

regards, tom lane