Re: Static PostgreSQL Binaries (Linux + Windows) - Mailing list pgsql-general

From Zach van Rijn
Subject Re: Static PostgreSQL Binaries (Linux + Windows)
Msg-id 1547408553.4002.24.camel@zv.io
In response to Re: Static PostgreSQL Binaries (Linux + Windows)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Sun, 2019-01-13 at 09:35 -0500, Tom Lane wrote:
> Zach van Rijn <me@zv.io> writes:
> > ...
> > The workaround is simply to ignore these errors during build
> > until I or someone else can get around to supplying patches
> > (in the next week or so; I have other commitments).
>
> TBH, there's going to be zero community interest in such
> patches.

Hi Tom,


Thank you for writing.

Given that several others have raised the topic before, there is
at least a little interest. It might not be for the general use
case, but in certain scenarios static binaries are quite useful.

I wrote previously to offer this option and to solicit help in
testing and finding ways to improve these binaries.

If this doesn't align with the broader PostgreSQL community then
I apologize in advance for the mail.

> There is no reason to avoid shared libraries,

Why not?

  * It eliminates a whole class of vulnerabilities, namely those
    related to LD_PRELOAD;

  * It ensures programs behave consistently ("this file produced
    those results"), which is significantly more difficult when
    a shared library is updated (e.g., on a shared system) and
    the user, unable to reproduce earlier results after their
    company performs "scheduled maintenance", is left wondering
    why the output differs (even if the deviation is due to a
    bug in the underlying library);

  * It ensures programs behave consistently across compatible
    machines, regardless of the underlying system configuration,
    as in the case of poorly-managed computing clusters, which
    is unfortunately a reality for some people;

among others, including for performance reasons.
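
To make the LD_PRELOAD point concrete: preloading is carried out by
the dynamic loader, which the kernel starts only when an executable
names one in a PT_INTERP program header. The following is a minimal
Python sketch (the function names are illustrative and do not come
from PostgreSQL or xstatic tooling) that reads the ELF program
headers and reports whether a loader is requested. It assumes a
well-formed ELF image and does no bounds checking.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def elf_program_header_types(data):
    """Return the p_type of every program header in an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big
    if is_64:
        # ELF64 header: e_phoff at 32, e_phentsize at 54, e_phnum at 56
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        # ELF32 header: e_phoff at 28, e_phentsize at 42, e_phnum at 44
        (e_phoff,) = struct.unpack_from(endian + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 42)
    # p_type is the first 4-byte field of each program header entry
    return [
        struct.unpack_from(endian + "I", data, e_phoff + i * e_phentsize)[0]
        for i in range(e_phnum)
    ]


def uses_dynamic_loader(data):
    """True if the image asks the kernel to start a dynamic loader."""
    return PT_INTERP in elf_program_header_types(data)
```

Calling uses_dynamic_loader(open(path, "rb").read()) on a fully
static build would be expected to return False, where path is a
placeholder for the binary being inspected.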

When scientific papers are published, one expects to be able to
reproduce the results exactly. That last bullet point is from my
personal experience, whereby a bug in one of the system's math
libraries caused slight variations in simulation output even
though our own code was later verified to be correct. The system
libc is not usually the first place one looks for errors, though
it cannot be ruled out.

> and they're an essential part of the modern Postgres build
> architecture --- particularly our extensibility story.

If you're referring to the loading of shared "user-code" modules
then that's a fair argument. In which case, you're correct that
it will be difficult or impossible to dynamically load modules
without some non-trivial design changes. But not all users will
need or want this feature.

Otherwise, would you kindly elaborate on "extensibility story"
and how that factors into why shared libraries are essential to
everyone who uses PostgreSQL? I'm not too familiar with its
internals or system architecture and this is a sincere request.

> Personally, I find your claim that purely-static linking is
> somehow a security advantage to be quite bizarre.

I said "may have additional performance / safety benefits." --
though I should have added "in certain applications."

Have you seen this post [1]? That situation is quite bizarre. If
they knew of this issue before proceeding with the upgrades, how
would they have avoided it? What if some machines could not be
upgraded to matching libc versions? These sorts of issues are
why xstatic is being developed.

My position is that PostgreSQL is just one of many packages that
I'd like to support in my free time. It's not practical to
expect or achieve perfection the first time around, but it is
one step forward. What will computing look like 30 years from
now?

> All modern Linux distros actually forbid static linking, I
> believe, because it'd put them in an impossible rebuild
> situation when some low-level component requires an update ---
> possibly for security reasons.

I haven't heard this before. Could you please point me to some
documentation on this? Debian permits it (see 8.2-3) [2], Gentoo
recommends against it [3]; others argue similarly and provide
context and further justification based on their intended uses.

Most of the reasons outlined in those links are valid, for
general-purpose Linux distros. I'm not going for that use case,
nor is this initiative intended to benefit all PostgreSQL users.

While it is true that shared libraries make system-wide updates
easier, there are several "modern" Linux distributions which
leverage static linking, e.g.: Sabotage Linux [4]. These are not
intended to be general-purpose either. And that's perfectly OK.

As to the "impossible rebuild situation", a full rebuild is not
necessary: the affected binaries only need to be re-linked.
Future versions of xstatic will cache build artifacts for fast
updates. Perhaps it would still be inconvenient, but it is
certainly not impossible. Compression techniques could be used
to minimize wasted disk space.

> Are you going to promise immediate updates anytime glibc gets
> patched, across all the platforms you're proposing to support
> this on?

No; xstatic packages use musl libc [5], not glibc, so even if a
security-related bug in the libc came up, statistically speaking
it's less maintenance:

  https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=glibc (131)

  https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=musl  (  4)

As you know, static binaries only include the code they use, not
full copies of the entire dependency library/libraries. So if an
issue in the dependency library is discovered, the static binary
may not be affected. But on this level, we're splitting hairs.
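
A rough model of why that is, assuming traditional archive (.a)
linking: the linker pulls in only those archive members (object
files) that define a symbol which is still undefined, then repeats
for whatever those members reference. The granularity is therefore
the object file rather than the individual function, unless section
garbage collection is enabled. The sketch below simulates that
resolution loop in Python; the member and symbol names are
hypothetical and the function is not part of any real linker.

```python
def members_linked(undefined, archive):
    """Simulate which archive members a static link would pull in.

    undefined: symbols the program itself references.
    archive:   dict of member name -> (defined symbols, referenced symbols).
    """
    needed = set(undefined)
    defined = set()
    linked = []
    progress = True
    while progress:                      # repeat until nothing new is pulled in
        progress = False
        for name, (defs, refs) in archive.items():
            if name in linked:
                continue
            if defs & (needed - defined):   # member resolves an open symbol
                linked.append(name)
                defined |= defs
                needed |= refs              # its own references must resolve too
                progress = True
    return linked


# Hypothetical libc-like archive: qsort.o is never referenced.
archive = {
    "printf.o": ({"printf"}, {"vfprintf"}),
    "vfprintf.o": ({"vfprintf"}, set()),
    "qsort.o": ({"qsort"}, set()),
}
print(members_linked({"printf"}, archive))
```

In this model a flaw confined to qsort.o would not be present in the
resulting binary, which is the situation described above.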

What happens if you discover that your system's glibc is
vulnerable (and maybe some of your system's other software) when
your favorite static software is not? It's a two-way street.

In addition, at least in restricted environments, such as in
government and healthcare research settings, users are often
forbidden to perform system upgrades and must wait for their
system administrators to apply "immediate" updates, whenever
that actually happens.

Both types of linking (or even a hybrid solution) can certainly
be appropriate. It's not a one-size-fits-all solution, and this
fits into xstatic's extensibility story: allowing software to
extend the user, not the other way around.

It's about getting more mileage out of software. Modern vehicles
often recommend against using "conventional" oil and my point is
that "conventional" wisdom about static vs. dynamic linking may
be a thing of the past.

People who say things like "there's going to be zero community
interest" and spread one-sided logic are the reason why projects
like xstatic exist, why people ask on forums and mailing lists
how to build such software, and why responses like this one are
warranted.

Your arguments "There is no reason to avoid shared libraries"
and "All modern Linux distros actually forbid static linking"
plus the implication that libc vulnerabilities are inevitable
and therefore one should not be bothered with rebuilding static
binaries, come across as absolutist and off-topic.


>             regards, tom lane

ZV


[1]: https://www.postgresql.org/message-id/flat/BA6132ED-1F6B-4A0B-AC22-81278F5AB81E%40tripadvisor.com

[2]: https://www.debian.org/doc/debian-policy/ch-sharedlibs.html#shared-library-support-files

[3]: https://wiki.gentoo.org/wiki/Why_not_bundle_dependencies

[4]: https://github.com/sabotage-linux/sabotage/wiki/Project-Goals

[5]: https://www.musl-libc.org/


