Thread: PATCH: Report libpq version and configuration

PATCH: Report libpq version and configuration

From
Craig Ringer
Date:
The attached patches propose new interfaces for exposing more configuration and versioning information from libpq at runtime. They are to be used by applications to obtain finer grained information about libpq's configuration (SSL, GSSAPI, etc), to identify libpq binaries, and for applications that use libpq to report diagnostic information


Patch 0001 adds PQlibInfo(), which returns an array of key/value description items reporting on configuration like the full version string, SSL support, gssapi support, thread safety, default port and default unix socket path. This is for application use and application diagnostics. It also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout. See the commit message in patch 0001 for details.


Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be accessed by a debugger even when the library cannot be loaded or executed, and unlike macros are available even in a stripped executable. So they can be used to identify a libpq binary found in the wild. Their storage is shared with PQlibInfo()'s static data, so they only cost three symbol table entries.

$ cp ./build/src/interfaces/libpq/libpq.so libpq.so.stripped
$ strip libpq.so.stripped
$ gdb -batch -ex 'p (int)LIBPQ_VERSION_NUM' -ex 'p (const char *)LIBPQ_VERSION_STR' -ex 'p (const char *)LIBPQ_CONFIGURE_ARGS' ./libpq.so.stripped
$1 = 140000
$2 = 0x285f0 "PostgreSQL 14devel on x86_64-pc-linux-gnu, ...."
$3 = 0x28660 " '--cache-file=config.cache-' '--prefix=/home/craig/pg/master' '--enable-debug' '--enable-cassert' '--enable-tap-tests' '--enable-dtrace' 'CC=/usr/lib64/ccache/gcc' 'CFLAGS=-Og -ggdb3' ..."



Patch 0003 allows libpq.so to be executed directly from the command line to print its version, configure arguments etc exactly as PQlibInfoPrint() would output them. This is only enabled on x64 linux for now but can be extended to other targets quite simply.

$  ./build/src/interfaces/libpq/libpq.so
VERSION_NUM: 140000
VERSION: PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit
CONFIGURE_ARGS:  '--cache-file=config.cache-' '--prefix=/home/craig/pg/master' '--enable-debug' '--enable-cassert' '--enable-tap-tests' '--enable-dtrace' 'CC=/usr/lib64/ccache/gcc' 'CFLAGS=-Og -ggdb3' 'CPPFLAGS=' 'CPP=/usr/lib64/ccache/gcc -E'
USE_SSL: 0
ENABLE_GSS: 0
ENABLE_THREAD_SAFETY: 1
HAVE_UNIX_SOCKETS: 1
DEFAULT_PGSOCKET_DIR: /tmp
DEF_PGPORT: 5432
Attachment

Re: PATCH: Report libpq version and configuration

From
Alvaro Herrera
Date:
On 2020-Oct-26, Craig Ringer wrote:

> Patch 0001 adds PQlibInfo(), which returns an array of key/value
> description items reporting on configuration like the full version string,
> SSL support, gssapi support, thread safety, default port and default unix
> socket path. This is for application use and application diagnostics. It
> also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout.
> See the commit message in patch 0001 for details.

Sounds useful. I'd have PQlibInfoPrint(FILE *) instead, so you can pass
stdout or whichever fd you want.

> Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and
> LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be
> accessed by a debugger even when the library cannot be loaded or executed,
> and unlike macros are available even in a stripped executable. So they can
> be used to identify a libpq binary found in the wild. Their storage is
> shared with PQlibInfo()'s static data, so they only cost three symbol table
> entries.

Interesting.  Is this real-world useful?  I'm thinking most of the time
I'd just run the library, but maybe you know of cases where that doesn't
work?

> Patch 0003 allows libpq.so to be executed directly from the command line to
> print its version, configure arguments etc exactly as PQlibInfoPrint()
> would output them. This is only enabled on x64 linux for now but can be
> extended to other targets quite simply.

+1 --- to me this is the bit that would be most useful, I expect.



Re: PATCH: Report libpq version and configuration

From
Abhijit Menon-Sen
Date:
At 2020-10-26 20:56:57 +0800, craig.ringer@enterprisedb.com wrote:
>
> $  ./build/src/interfaces/libpq/libpq.so
> VERSION_NUM: 140000
> VERSION: PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC)
> 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit
> CONFIGURE_ARGS:  '--cache-file=config.cache-'
> '--prefix=/home/craig/pg/master' '--enable-debug' '--enable-cassert'
> '--enable-tap-tests' '--enable-dtrace' 'CC=/usr/lib64/ccache/gcc'
> 'CFLAGS=-Og -ggdb3' 'CPPFLAGS=' 'CPP=/usr/lib64/ccache/gcc -E'
> USE_SSL: 0
> ENABLE_GSS: 0
> ENABLE_THREAD_SAFETY: 1
> HAVE_UNIX_SOCKETS: 1
> DEFAULT_PGSOCKET_DIR: /tmp
> DEF_PGPORT: 5432

This is excellent.

-- Abhijit



Re: PATCH: Report libpq version and configuration

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2020-Oct-26, Craig Ringer wrote:
>> also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout.

> Sounds useful. I'd have PQlibInfoPrint(FILE *) instead, so you can pass
> stdout or whichever fd you want.

+1.  Are we concerned about translatability of these strings?  I think
I'd vote against, as it would complicate applications, but it's worth
thinking about it now not later.

>> Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and
>> LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be
>> accessed by a debugger even when the library cannot be loaded or executed,
>> and unlike macros are available even in a stripped executable. So they can
>> be used to identify a libpq binary found in the wild. Their storage is
>> shared with PQlibInfo()'s static data, so they only cost three symbol table
>> entries.

> Interesting.  Is this real-world useful?

-1, I think this is making way too many assumptions about the content
and format of a shlib.

>> Patch 0003 allows libpq.so to be executed directly from the command line to
>> print its version, configure arguments etc exactly as PQlibInfoPrint()
>> would output them. This is only enabled on x64 linux for now but can be
>> extended to other targets quite simply.

> +1 --- to me this is the bit that would be most useful, I expect.

Again, I'm not exactly excited about this.  I do not one bit like
patches that assume that x64 linux is the universe, or at least
all of it that need be catered to.  Reminds me of people who thought
Windows was the universe, not too many years ago.

I'd rather try to set this up so that some fairly standard tooling
like "strings" + "grep" can be used to pull out the info.  Sure,
it would be less convenient, but honestly how often is this really
going to be necessary?

            regards, tom lane



Re: PATCH: Report libpq version and configuration

From
Craig Ringer
Date:
On Tue, Oct 27, 2020 at 12:41 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2020-Oct-26, Craig Ringer wrote:

> Patch 0001 adds PQlibInfo(), which returns an array of key/value
> description items reporting on configuration like the full version string,
> SSL support, gssapi support, thread safety, default port and default unix
> socket path. This is for application use and application diagnostics. It
> also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout.
> See the commit message in patch 0001 for details.

Sounds useful. I'd have PQlibInfoPrint(FILE *) instead, so you can pass
stdout or whichever fd you want.

The decision not to do so was deliberate. On any platform where a shared library could be linked to a different C runtime library than the main executable or other libraries it is not safe to pass a FILE*. This is most common on Windows.

I figured it's just a trivial wrapper anyway, so people can just write or copy it if they really care.

> Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and
> LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be
> accessed by a debugger even when the library cannot be loaded or executed,
> and unlike macros are available even in a stripped executable. So they can
> be used to identify a libpq binary found in the wild. Their storage is
> shared with PQlibInfo()'s static data, so they only cost three symbol table
> entries.

Interesting.  Is this real-world useful?  I'm thinking most of the time
I'd just run the library, but maybe you know of cases where that doesn't
work?

It was prompted by a support conversation about how to identify a  libpq. So I'd say yes.

In that case the eventual approach used was to use Python's ctypes to dynamically load libpq then call PQlibVersion().

> Patch 0003 allows libpq.so to be executed directly from the command line to
> print its version, configure arguments etc exactly as PQlibInfoPrint()
> would output them. This is only enabled on x64 linux for now but can be
> extended to other targets quite simply.

+1 --- to me this is the bit that would be most useful, I expect.

It's also kinda cool.

But it's using a bit of a platform quirk that's not supported by the toolchain as well as I'd really like - annoyingly, when you pass a --entrypoint to GNU ld or to LLVM's ld.lld, it should really emit the default .interp section to point to /bin/ld.so.2 or /lib64/ld-linux-x86-64.so.2 as appropriate. But when building -shared they don't seem to want to, nor do they expose a sensible macro that lets you get the default string yourself.

So I thought there was a moderate to high chance that this patch would trip someone's "yuck" meter.

Re: PATCH: Report libpq version and configuration

From
Craig Ringer
Date:
On Tue, Oct 27, 2020 at 12:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2020-Oct-26, Craig Ringer wrote:
> >> also adds PQlibInfoPrint() which dumps PQlibInfo() keys/values to stdout.
>
> > Sounds useful. I'd have PQlibInfoPrint(FILE *) instead, so you can pass
> > stdout or whichever fd you want.
>
> +1.  Are we concerned about translatability of these strings?  I think
> I'd vote against, as it would complicate applications, but it's worth
> thinking about it now not later.


It's necessary not to translate the key names, they are identifiers
not descriptive text. I don't object to having translations too, but
the translation teams have quite enough to do already with user-facing
text that will get regularly seen. So while it'd be potentially
interesting to expose translated versions too, I'm not entirely
convinced. It's a bit like translating macro names. You could, but ...
why?

> >> Patch 0002 exposes LIBPQ_VERSION_STR, LIBPQ_VERSION_NUM and
> >> LIBPQ_CONFIGURE_ARGS symbols in the dynamic symbol table. These can be
> >> accessed by a debugger even when the library cannot be loaded or executed,
> >> and unlike macros are available even in a stripped executable. So they can
> >> be used to identify a libpq binary found in the wild. Their storage is
> >> shared with PQlibInfo()'s static data, so they only cost three symbol table
> >> entries.
>
> > Interesting.  Is this real-world useful?
>
> -1, I think this is making way too many assumptions about the content
> and format of a shlib.


I'm not sure I understand what assumptions you're concerned about or
their consequences. On any ELF it should be just fine, and Mach-O
should be too. I do need to check that MSVC generates direct symbols
for WIN32 PE, not indirect thunked data symbols.

It doesn't help that I failed to supply the final revision of this
patch, which does this:

-const char * const LIBPQ_VERSION_STR = PG_VERSION_STR;
+const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;

-const char * const LIBPQ_CONFIGURE_ARGS = CONFIGURE_ARGS;
+const char LIBPQ_CONFIGURE_ARGS[] = CONFIGURE_ARGS;

... to properly ensure the string symbols go into the read-only data section:

$ eu-nm --defined-only -D $LIBPQ | grep LIBPQ_
LIBPQ_CONFIGURE_ARGS           |0000000000028640|GLOBAL|OBJECT
|00000000000000e4|  libpq-version.c:74|.rodata
LIBPQ_VERSION_NUM              |0000000000028620|GLOBAL|OBJECT
|0000000000000004|  libpq-version.c:75|.rodata
LIBPQ_VERSION_STR              |0000000000028740|GLOBAL|OBJECT
|000000000000006c|  libpq-version.c:73|.rodata

I don't propose these to replace information functions or macros, I'm
suggesting we add them as an aid to tooling and for debugging. I have
had quite enough times when I've faced a mystery libpq, and it's not
always practical in a given target environment to just compile a tool
to print the version.

In addition to easy binary identification, having symbolic references
to the version info is useful for dynamic tracing tools like perf and
systemtap - they cannot execute functions directly in the target
address space, but they can read data symbols. I actually want to
expose matching symbols in postgres itself, for the use of dynamic
tracing utilities, so they can autodetect the target postgres at
runtime even without -ggdb3 level debuginfo with macros, and correctly
adapt to version specifics of the target postgres.

In terms of standard tooling here are some different ways you can get
this information symbolically.

$ LIBPQ=/path/to/libpq.so

$ gdb -batch -ex 'p (int) LIBPQ_VERSION_NUM' -ex 'p (const char *)
LIBPQ_VERSION_STR' $LIBPQ
$1 = 140000
$2 = "PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (GCC)
10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit"

$ perl getpqver.pl $LIBPQ
LIBPQ_VERSION_NUM=140000
LIBPQ_VERSION_STR=PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled
by gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), 64-bit

I've attached getpqver.pl. It uses eu-nm from elfutils to get symbol
offset and length, which is pretty standard stuff. And it's quite
simple to adapt it to use legacy binutils "nm" by invoking

    nm --dynamic --defined -S $LIBPQ

and tweaking the reader.

If you really want something strings-able, I'm sure that's reasonably
feasible, but I don't think it's particularly unreasonable to expect
to be able to inspect the symbol table using appropriate platform
tools or a simple debugger command.

> Again, I'm not exactly excited about this.  I do not one bit like
> patches that assume that x64 linux is the universe, or at least
> all of it that need be catered to.  Reminds me of people who thought
> Windows was the universe, not too many years ago.

Yeah. I figured you'd say that, and don't disagree. It's why I split
this patch out - it's kind of a sacrificial patch.

I actually wrote this part first.

Then I wrote  PQlibInfo() when I realised that there was no sensible
pre-existing way to get the information I wanted to dump from libpq at
the API level, and adapted the executable .so output to call it.

> I'd rather try to set this up so that some fairly standard tooling
> like "strings" + "grep" can be used to pull out the info.  Sure,
> it would be less convenient, but honestly how often is this really
> going to be necessary?


eu-readelf and objdump are pretty standard tooling. But I really don't
much care if the executable .so hack gets in, it's mostly a fun PoC.
If you can execute libpq then the dynamic linker must be able to load
it and resolve its symbols, in which case you can probably just as
easily do this:

    python -c "import sys, ctypes;
ctypes.cdll.LoadLibrary(sys.argv[1]).PQlibInfoPrint()"
build/src/interfaces/libpq/libpq.so

or compile and run a trivial C one-liner.

As much as anything I thought it was a good way to stimulate
discussion and give you something easy to reject ;)

Attachment

Re: PATCH: Report libpq version and configuration

From
Alvaro Herrera
Date:
On 2020-Oct-27, Craig Ringer wrote:

> On Tue, Oct 27, 2020 at 12:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > +1.  Are we concerned about translatability of these strings?  I think
> > I'd vote against, as it would complicate applications, but it's worth
> > thinking about it now not later.
> 
> It's necessary not to translate the key names, they are identifiers
> not descriptive text. I don't object to having translations too, but
> the translation teams have quite enough to do already with user-facing
> text that will get regularly seen. So while it'd be potentially
> interesting to expose translated versions too, I'm not entirely
> convinced. It's a bit like translating macro names. You could, but ...
> why?

I don't think translating these values is useful for much.  I see it
similar to translating pg_controldata output: it is troublesome (to
pg_upgrade for instance) and serves no public that I know of.


> > Again, I'm not exactly excited about this.  I do not one bit like
> > patches that assume that x64 linux is the universe, or at least
> > all of it that need be catered to.  Reminds me of people who thought
> > Windows was the universe, not too many years ago.
> 
> Yeah. I figured you'd say that, and don't disagree. It's why I split
> this patch out - it's kind of a sacrificial patch.

Well, if we can make it run in more systems than just Linux, then it
seems worth having.  The submitted patch seems a little bit on the
naughty side.



Re: PATCH: Report libpq version and configuration

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Well, if we can make it run in more systems than just Linux, then it
> seems worth having.  The submitted patch seems a little bit on the
> naughty side.

I agree that the facility seems possibly useful, as long as we can
minimize its platform dependency.  Just embedding some strings, as
I suggested upthread, seems like it'd go far in that direction.
Yeah, you could spend a lot of effort to make it a bit more user
friendly, but is the effort really going to be repaid?  The use
case for this isn't that large, I don't think.

            regards, tom lane



Re: PATCH: Report libpq version and configuration

From
Craig Ringer
Date:
On Tue, Nov 10, 2020 at 12:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Well, if we can make it run in more systems than just Linux, then it
> seems worth having.  The submitted patch seems a little bit on the
> naughty side.

I agree that the facility seems possibly useful, as long as we can
minimize its platform dependency.  Just embedding some strings, as
I suggested upthread, seems like it'd go far in that direction.
Yeah, you could spend a lot of effort to make it a bit more user
friendly, but is the effort really going to be repaid?  The use
case for this isn't that large, I don't think.

The reason I want to expose symbols is mainly for tracing tooling - perf, systemtap, etc.

I thought it'd make sense to also provide another way to identify the libpq binary.

My other hesitation about using a "strings libpq.so" approach is that it's not something I'd be super happy about automating and relying on in scripts etc. It could break depending on how the compiler decides to arrange things or due to unrelated changes in libpq that create similar-looking strings later. I'd prefer to do it deterministically. You can already use "strings" to identify an unstripped binary built with -ggdb3 (macros in DWARF debuginfo), but we don't compile the PG_VERSION into the binary, so you can't even get the basic version string like "postgres (PostgreSQL) 11.9" from 'strings'.

The whole PQlibInfo() thing came about because I thought it'd be potentially useful. I've had issues before with applications being built against a newer version of libpq than what they're linked against, and it's been rather frustrating to make the app tolerant of that. But it can be solved (clumsily) using various existing workarounds.

The main things I'd really like to get in place are a way to get the version as an ELF data symbol, and a simple way to ID the binary.

So the minimal change would be to declare:

const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;
const int LIBPQ_VERSION_NUM = PG_VERSION_NUM;

then change PQgetVersion() to return LIBPQ_VERSION_NUM and add a PQgetVersionStr() that returns LIBPQ_VERSION_STR.

That OK with you?


Re: PATCH: Report libpq version and configuration

From
Craig Ringer
Date:
On Tue, Nov 10, 2020 at 2:22 PM Craig Ringer <craig.ringer@enterprisedb.com> wrote:

The main things I'd really like to get in place are a way to get the version as an ELF data symbol, and a simple way to ID the binary.

So the minimal change would be to declare:

const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;
const int LIBPQ_VERSION_NUM = PG_VERSION_NUM;

then change PQgetVersion() to return LIBPQ_VERSION_NUM and add a PQgetVersionStr() that returns LIBPQ_VERSION_STR.

That OK with you?

Proposed minimal patch attached.

Attachment

Re: PATCH: Report libpq version and configuration

From
Masahiko Sawada
Date:
Hi Craig,

On Wed, Nov 11, 2020 at 1:11 PM Craig Ringer
<craig.ringer@enterprisedb.com> wrote:
>
> On Tue, Nov 10, 2020 at 2:22 PM Craig Ringer <craig.ringer@enterprisedb.com> wrote:
>>
>>
>> The main things I'd really like to get in place are a way to get the version as an ELF data symbol, and a simple way
toID the binary.
 
>>
>> So the minimal change would be to declare:
>>
>> const char LIBPQ_VERSION_STR[] = PG_VERSION_STR;
>> const int LIBPQ_VERSION_NUM = PG_VERSION_NUM;
>>
>> then change PQgetVersion() to return LIBPQ_VERSION_NUM and add a PQgetVersionStr() that returns LIBPQ_VERSION_STR.
>>
>> That OK with you?
>
>
> Proposed minimal patch attached.

You sent in your patch, v2-0001-Add-PQlibVersionString-to-libpq.patch
to pgsql-hackers on Nov 11, but you did not post it to the next
CommitFest[1].  If this was intentional, then you need to take no
action.  However, if you want your patch to be reviewed as part of the
upcoming CommitFest, then you need to add it yourself before
2021-01-01 AOE[2]. Thanks for your contributions.

Regards,

[1] https://commitfest.postgresql.org/31/
[2] https://en.wikipedia.org/wiki/Anywhere_on_Earth

-- 
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/