
Re: Regression tests fail with musl libc because libpq.so can't be loaded - Mailing list pgsql-bugs

From	Thomas Munro
Subject	Re: Regression tests fail with musl libc because libpq.so can't be loaded
Date	March 19 00:17:36
Msg-id	CA+hUKG+Tq3GK7bPd03N0Eox3YY4-Hjd7qQjo_QZFjdbhTqQGQA@mail.gmail.com
In response to	Re: Regression tests fail with musl libc because libpq.so can't be loaded (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Regression tests fail with musl libc because libpq.so can't be loaded (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-bugs


On Tue, Mar 19, 2024 at 3:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > (Hmm, I think it's not that unreasonable on their part to assume the
> > initial environment is immutable if their implementation doesn't
> > mutate it, and our doing so is undeniably UB; surprising, maybe, given
> > that the technique works on that other popular brand of C library on
> > that kind of kernel, not to mention dozens of old Unixen of yore...
>
> Does their implementation also ignore the effects of putenv() or
> setenv() on LD_LIBRARY_PATH?  They have no moral high ground
> whatsoever if that's the case.  But if it doesn't, an alternative
> route to a solution could be to scan the original environment, strdup
> and putenv each entry to move it to freshly malloc'd space, and
> then reclaim the old environment area.

Yes, the musl linker/loader ignores putenv()/setenv() changes to
LD_LIBRARY_PATH after process start (that is, changes only affect the
search path when injected into a new program with exec*()).  So does
glibc, it's just that it captures by copy instead of by reference
(according to one of the links above, I didn't check the source).  So
setenv() has no effect on dlopen() in *this* program, and using
putenv() in that way won't help.  We simply can't move the value of
LD_LIBRARY_PATH (though my patch could be a little sneakier and steal
all the bytes right up to the = sign to get more space for our
message!).
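
To make the copy-vs-reference distinction concrete, here is a minimal
sketch that models it with an ordinary buffer rather than a real
loader.  The struct and function names are hypothetical, made up for
illustration; nothing here is taken from the musl or glibc sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loader's saved search path. */
struct saved_path {
    const char *by_reference;   /* points into the caller's buffer */
    char       *by_copy;        /* private duplicate owned by this struct */
};

/* Record the value both ways, as a loader might at process start.
 * Returns 0 on success, -1 if the duplicate cannot be allocated. */
int saved_path_capture(struct saved_path *sp, const char *value)
{
    assert(sp != NULL && value != NULL);
    sp->by_reference = value;
    sp->by_copy = malloc(strlen(value) + 1);
    if (sp->by_copy == NULL)
        return -1;
    strcpy(sp->by_copy, value);
    return 0;
}

/* Free the private duplicate and clear both fields. */
void saved_path_release(struct saved_path *sp)
{
    assert(sp != NULL);
    free(sp->by_copy);
    sp->by_copy = NULL;
    sp->by_reference = NULL;
}
```

If the caller overwrites its buffer after the capture, by_reference
sees the new bytes while by_copy keeps the old ones, which is the
difference that matters when the original environment area is
clobbered.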

One way to tell if a copy has been made is to trace a program that does:

        getenv("LD_LIBRARY_PATH")[2] = 'X';
        dlopen("foo.so", RTLD_NOW | RTLD_GLOBAL);

... when run with LD_LIBRARY_PATH set to /asdf.  On FreeBSD I see it
tries to open "/aXdf...", so now I know that FreeBSD also captures it
by reference like musl.  But we don't use the clobber trick on
FreeBSD, it has a proper setproctitle() function that knows how to
negotiate with the kernel, so it doesn't matter.  FreeBSD's loader also
ignores changes made with setenv()/putenv(), because those create fresh
entries but leave the initial environment strings untouched.
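
For anyone wanting to repeat the experiment, the overwrite step could
be wrapped in a small checked helper along these lines.  The function
name is hypothetical, and writing through the pointer returned by
getenv() is exactly the undefined territory discussed in this thread,
so treat it as a probe only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Overwrite one byte of an environment value in place.
 * Hypothetical helper: returns 0 on success, -1 if the variable is
 * unset or the index is past the end of its value.  POSIX does not
 * permit modifying the string returned by getenv(), so this relies on
 * behavior the standard leaves undefined.
 */
int clobber_env_byte(const char *name, size_t index, char replacement)
{
    assert(name != NULL);
    char *value = getenv(name);
    if (value == NULL || index >= strlen(value))
        return -1;
    value[index] = replacement;
    return 0;
}
```

Calling clobber_env_byte("LD_LIBRARY_PATH", 2, 'X') and then dlopen()
under a system call tracer shows which path the loader actually
searches.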

Solaris also ignores changes made after startup (it's in the dlopen
man page), and from a very quick look at its ld_lib_setup() I think it
achieved that with a copy.  I believe its ancestor SunOS 4 invented
all of these conventions (and the mmap/virtual memory concepts they
rode in on), later nailed down to some degree in the System V ABI and
very widely adopted, but I don't see anything in the latter that
specifically addresses this point, e.g. LD_LIBRARY_PATH copy vs reference and
interaction with dlopen() (perhaps I didn't look hard enough).  I'm
not sure what else you can point to to make strong claims about this
stuff, but I bet every system ignores changes after startup, it's just
that they found two ways to achieve that.  POSIX says of dlopen that
the "file [argument] is used in an implementation-defined manner", and
of environ that we're welcome to swap a whole new environ, but doesn't
seem to tell us anything about the one that is replaced (who owns it?
is the initial one set up at execution time special? etc).  The line
banning manipulation of the pointers environ refers to doesn't exactly
describe what we're doing (we're manipulating the strings pointed to
by the *previous* environ).  UB.
