Thread: Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries

Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries

From: Ants Aasma
Date: Wed, Nov 26, 2014 at 01:34:18PM +0200
I had to make oracle_fdw work with PostgreSQL compiled using
--with-ldap. The issue there is that Oracle's client library has the
delightful property of linking against a bundled LDAP library that
has symbol conflicts with OpenLDAP. At PostgreSQL startup libldap is
loaded, so when libclntsh.so (the Oracle client) is loaded it gets
bound to OpenLDAP symbols, and unsurprisingly crashes with a segfault
when those functions get used.

glibc-2.3.4+ has a flag called RTLD_DEEPBIND for dlopen that prefers
symbols loaded by the library to those provided by the caller. Using
this flag fixes my issue: PostgreSQL gets the ldap functions from
libldap, and the Oracle client gets them from whatever it links to. Both
work fine.
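
For readers who have not used the flag, here is a minimal sketch of what
passing RTLD_DEEPBIND to dlopen "when available" could look like. It is not
the attached patch. The function names module_dlopen_flags and load_module
are hypothetical, and the RTLD_NOW | RTLD_GLOBAL base flags are an assumption
about how a module loader would typically call dlopen.

```c
#define _GNU_SOURCE             /* glibc only exposes RTLD_DEEPBIND with this */
#include <dlfcn.h>
#include <stdio.h>

/*
 * Build the dlopen flags for loading an extension module (hypothetical
 * helper).  RTLD_DEEPBIND is added only where the headers define it, so
 * the code still compiles on non-glibc platforms.
 */
int
module_dlopen_flags(void)
{
    int flags = RTLD_NOW | RTLD_GLOBAL;     /* assumed base flags */

#ifdef RTLD_DEEPBIND
    /* Put the library's own lookup scope ahead of the global scope. */
    flags |= RTLD_DEEPBIND;
#endif
    return flags;
}

/*
 * Load a shared library with the flags above (hypothetical helper).
 * Returns the handle, or NULL after reporting the dlerror() message.
 */
void *
load_module(const char *path)
{
    void *handle = dlopen(path, module_dlopen_flags());

    if (handle == NULL)
        fprintf(stderr, "could not load \"%s\": %s\n", path, dlerror());
    return handle;
}
```

A caller would pass the module path to load_module and then use dlsym on the
returned handle. On older glibc versions the program may need to be linked
with -ldl.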

Attached is a patch that enables this flag on Linux when available.
This specific case could also be fixed by rewriting oracle_fdw to use
dlopen for libclntsh.so and pass this flag, but I think it would be
better to enable it for all PostgreSQL loaded extension modules. I
can't think of a sane use case where it would be correct to prefer
PostgreSQL loaded symbols to those the library was actually linked
against.

Does anybody know of a case where this flag wouldn't be a good idea?
Are there any similar options for other platforms? Alternatively, does
anyone know of linker flags that would give a similar effect?

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


Re: Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries

From: Albe Laurenz
Date:
Ants Aasma wrote:
> I had to make oracle_fdw work with PostgreSQL compiled using
> --with-ldap. The issue there is that Oracle's client library has the
> delightful property of linking against a ldap library they bundle that
> has symbol conflicts with OpenLDAP. At PostgreSQL startup libldap is
> loaded, so when libclntsh.so (the Oracle client) is loaded it gets
> bound to OpenLDAP symbols, and unsurprisingly crashes with a segfault
> when those functions get used.
> 
> glibc-2.3.4+ has a flag called RTLD_DEEPBIND for dlopen that prefers
> symbols loaded by the library to those provided by the caller. Using
> this flag fixes my issue, PostgreSQL gets the ldap functions from
> libldap, Oracle client gets them from whatever it links to. Both work
> fine.

I am aware of the problem, but this solution is new to me.
My workaround so far has been to load OpenLDAP with the LD_PRELOAD
environment variable on PostgreSQL start.
But then you get a crash when Oracle uses LDAP functionality (directory naming).

> Attached is a patch that enables this flag on Linux when available.
> This specific case could also be fixed by rewriting oracle_fdw to use
> dlopen for libclntsh.so and pass this flag, but I think it would be
> better to enable it for all PostgreSQL loaded extension modules.

I'll consider changing oracle_fdw in that fashion, even if that will
only remedy the problem on Linux.
I think that this patch is a good idea though.

Yours,
Laurenz Albe

Re: Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries

From: Noah Misch
Date:
On Wed, Nov 26, 2014 at 01:34:18PM +0200, Ants Aasma wrote:
> glibc-2.3.4+ has a flag called RTLD_DEEPBIND for dlopen that prefers
> symbols loaded by the library to those provided by the caller. Using
> this flag fixes my issue, PostgreSQL gets the ldap functions from
> libldap, Oracle client gets them from whatever it links to. Both work
> fine.
> 
> Attached is a patch that enables this flag on Linux when available.
> This specific case could also be fixed by rewriting oracle_fdw to use
> dlopen for libclntsh.so and pass this flag, but I think it would be
> better to enable it for all PostgreSQL loaded extension modules. I
> can't think of a sane use case where it would be correct to prefer
> PostgreSQL loaded symbols to those the library was actually linked
> against.
> 
> Does anybody know of a case where this flag wouldn't be a good idea?

There's a meta-downside that any bug the flag prevents will still exist on
non-glibc targets, and that's the primary reason not to make this change in
core PostgreSQL.  Most hackers use GNU/Linux as a primary development
platform, so RTLD_DEEPBIND would delay discovery of still-present bugs.

Standard POSIX symbol resolution has some advantages over RTLD_DEEPBIND.  Any
given program has at most one global symbol called "malloc", and so-named
symbols usually get away with assuming they own the brk() space.  LD_PRELOAD
can overload any global symbol.  Those points by themselves would not stop me
from using RTLD_DEEPBIND in specific cases like oracle_fdw, though.
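
As a rough illustration of the "one global symbol" point, the sketch below
looks a name up through the process-wide global scope, which is where
LD_PRELOAD interposers are found first. The helper name
global_symbol_address is hypothetical. A library opened with RTLD_DEEPBIND
searches its own scope before this one, so it can end up bound to a
different definition of the same name.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Resolve a symbol name through the global scope of the running process
 * (hypothetical helper).  dlopen(NULL, ...) returns a handle for the main
 * program; dlsym on that handle searches the program and the libraries
 * loaded with it, in load order.  Returns NULL if the name is not found.
 */
void *
global_symbol_address(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY);
    void *addr;

    if (self == NULL)
        return NULL;
    addr = dlsym(self, name);
    dlclose(self);
    return addr;
}
```

Calling global_symbol_address("malloc") in a dynamically linked program
returns the single definition that default resolution hands to every
object that was not loaded with RTLD_DEEPBIND.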

> Are there any similar options for other platforms? Alternatively, does
> anyone know of linker flags that would give a similar effect?

It has some overlap with the -Bsymbolic linker option.