Re: PG17beta1: Unable to test Postgres on Fedora due to fatal Error in psql: undefined symbol: PQsocketPoll - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PG17beta1: Unable to test Postgres on Fedora due to fatal Error in psql: undefined symbol: PQsocketPoll
Msg-id 675336.1716572059@sss.pgh.pa.us
In response to PG17beta1: Unable to test Postgres on Fedora due to fatal Error in psql: undefined symbol: PQsocketPoll  (Hans Buschmann <buschmann@nidsa.net>)
List pgsql-hackers
Hans Buschmann <buschmann@nidsa.net> writes:
> When I tried to connect to the restored database with psql \c I got:
> ...
> postgres=# \c cpsdb
> pgbeta/bin/psql: symbol lookup error: pgbeta/bin/psql: undefined symbol: PQsocketPoll

> (To my understanding) the problem comes from incompatible libpq.so libraries on the system.

Right, you must have a v16-or-earlier libpq lying around somewhere,
and psql has bound to that rather than to the beta-test version.
PQsocketPoll is new in v17.
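
One quick way to tell whether a given libpq is new enough is to ask it for
the symbol directly. The following is a minimal sketch using Python's ctypes;
the helper name has_symbol is made up for illustration, and any library path
passed to it is an assumption about the local install layout.

```python
import ctypes

def has_symbol(lib_path, symbol):
    """Return True if the shared library at lib_path exports symbol.

    has_symbol is a hypothetical helper, not part of PostgreSQL.
    """
    lib = ctypes.CDLL(lib_path)  # raises OSError if the library cannot be loaded
    try:
        getattr(lib, symbol)     # ctypes looks the name up in the loaded library
        return True
    except AttributeError:       # raised when the symbol is not exported
        return False
```

For example, has_symbol("/usr/lib64/libpq.so.5", "PQsocketPoll") would be
expected to return False for a v16-or-earlier library, assuming that is where
the system copy lives.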

> - Why doesn't psql use the just created lib64/libpq.so.5.17 from ninja install?

It's on you to ensure that happens, especially on Linux systems which
have a strong bias towards pulling libraries from /usr/lib[64].
Normally our --enable-rpath option is sufficient; while that's the
default in an autoconf-based build, I'm not sure that it is
in a meson build.  Also, if your beta libpq is not where the
rpath option expected it to get installed, the dynamic linker will
silently fall back to /usr/lib[64].
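
To see which libpq a binary actually resolves to, the output of ldd can be
inspected. Here is a small sketch that does so from Python; parse_ldd and
libpq_used_by are illustrative names, and the parsing assumes the usual
glibc ldd line format ("soname => path (address)").

```python
import subprocess

def parse_ldd(output):
    """Map each soname in ldd output to its resolved path (None if not found)."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skip entries without a mapping, e.g. the vdso
        soname, _, rest = line.partition("=>")
        rest = rest.strip()
        if rest.startswith("not found"):
            resolved[soname.strip()] = None
        else:
            # drop the trailing " (0x...)" load address
            resolved[soname.strip()] = rest.rsplit(" (", 1)[0].strip()
    return resolved

def libpq_used_by(binary):
    """Run ldd on binary and return only its libpq entries."""
    out = subprocess.run(["ldd", binary], capture_output=True,
                         text=True, check=True).stdout
    return {k: v for k, v in parse_ldd(out).items() if k.startswith("libpq.so")}
```

Calling libpq_used_by("pgbeta/bin/psql") should then show whether the path
points into the beta install tree or into /usr/lib[64].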

> The loading of the locally available libpq.so should always have priority over a system wide in /usr/lib64

Tell it to the Linux developers --- they think the opposite.
Likewise, all of your other proposals need to be addressed to
the various distros' packagers; this is not the place to complain.

The main thing that is bothering me about the behavior you
describe is that it didn't fail until psql actually tried to
call PQsocketPoll.  (AFAICT from a quick look, that occurs
during \c but not during the startup connection.)  I had thought
that we select link options that result in early binding and
hence startup-time failure for a case like this.  I can confirm
though that this acts as described on my RHEL8 box if I force
current psql to link to v16 libpq, so either we've broken that
or it never did apply to frontend programs.  But it doesn't
seem to me to be a great thing for it to behave like this.
You could easily miss that you have a broken setup until
after you deploy it.
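
Until the link options are sorted out, one way to catch such a setup early is
to force eager symbol resolution when smoke-testing a binary. The sketch below
assumes a glibc-based system, where setting LD_BIND_NOW makes the dynamic
linker resolve all function symbols at startup; starts_with_eager_binding is
an illustrative name, not an existing tool.

```python
import os
import subprocess
import sys

def starts_with_eager_binding(argv):
    """Run argv with LD_BIND_NOW=1; return True if it exits with status 0."""
    # LD_BIND_NOW asks glibc's loader to resolve every symbol at startup,
    # so a missing function should fail here rather than at first call.
    env = dict(os.environ, LD_BIND_NOW="1")
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr.strip(), file=sys.stderr)
    return result.returncode == 0
```

With a psql bound to a too-old libpq, something like
starts_with_eager_binding(["pgbeta/bin/psql", "--version"]) would be expected
to report failure immediately instead of waiting for \c.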

            regards, tom lane


