Thread: Building pg_xlogdump reproducibly

Building pg_xlogdump reproducibly

From
Christoph Berg
Date:
The list of objects used to link pg_xlogdump is coming from
$(wildcard *desc.c) which returns them in filesystem order. This makes
the build result depend on this ordering, yielding different
compilation results.

This patch fixes the reproducibility issue:

--- a/src/bin/pg_xlogdump/Makefile
+++ b/src/bin/pg_xlogdump/Makefile
@@ -12,7 +12,7 @@ OBJS = pg_xlogdump.o compat.o xlogreader
 override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
-RMGRDESCSOURCES = $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc.c))
+RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc.c)))
 RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
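
The effect of the change can be seen in isolation with a small standalone makefile. This is only a sketch, assuming GNU make; the file name order.mk is hypothetical and nothing here comes from the PostgreSQL tree. It prints the list as $(wildcard) returns it next to the same list passed through $(sort):

```makefile
# order.mk: hypothetical standalone sketch, not part of PostgreSQL.
# Run as "make -f order.mk" in a directory that holds a few *desc.c files.

# As returned by $(wildcard); the order is not guaranteed and may simply
# follow the directory order of the underlying filesystem.
UNSORTED := $(wildcard *desc.c)

# $(sort) orders the words lexically and removes duplicates, so the
# result no longer depends on the filesystem.
SORTED := $(sort $(UNSORTED))

# $(info) prints while the makefile is being read, so no recipe is needed.
$(info unsorted: $(UNSORTED))
$(info sorted:   $(SORTED))

.PHONY: all
all: ;
```

On many filesystems the two lines will look identical, which is why the problem tends to go unnoticed until the build runs somewhere that returns directory entries in a different order.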
 

Spotted by Debian's reproducible builds project:
https://wiki.debian.org/ReproducibleBuilds
https://reproducible-builds.org/

Christoph



Re: Building pg_xlogdump reproducibly

From
Andres Freund
Date:
Hi,

On 2016-01-04 15:59:46 +0100, Christoph Berg wrote:
> The list of objects used to link pg_xlogdump is coming from
> $(wildcard *desc.c) which returns them in filesystem order. This makes
> the build result depend on this ordering, yielding different
> compilation results.

> -RMGRDESCSOURCES = $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc.c))
> +RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc.c)))
>  RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))

That's probably not the only non-deterministic rule in postgres, given
nobody paid attention to that so far? At least transform modules added
in 9.5 (hstore_plpython et al) look like they might have similar issues.

Wonder if we should instead define a wildcard wrapper in
Makefile.global.in that does the sorting, including an explanation?
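
A minimal sketch of what such a wrapper might look like, assuming GNU make. The name sorted_wildcard is hypothetical and does not exist in Makefile.global.in; the comment stands in for the kind of explanation mentioned above:

```makefile
# Hypothetical helper, not part of PostgreSQL's Makefile.global.in.
# $(wildcard) makes no promise about the order of its results, so they
# may follow directory order, which differs between filesystems and
# machines. $(sort) orders the words lexically (and drops duplicates),
# which keeps lists of link objects stable from one build to the next.
sorted_wildcard = $(sort $(wildcard $(1)))

# Usage: the same pattern as in the quoted patch, routed through the helper.
RMGRDESCSOURCES = $(notdir $(call sorted_wildcard,$(top_srcdir)/src/backend/access/rmgrdesc/*desc.c))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
```

Here the sort happens before $(notdir) rather than after it as in the patch; since all matched files live in one directory, the resulting order is the same.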

Andres



Re: Building pg_xlogdump reproducibly

From
David Fetter
Date:
On Mon, Jan 04, 2016 at 04:51:25PM +0100, Andres Freund wrote:
> Hi,
> 
> On 2016-01-04 15:59:46 +0100, Christoph Berg wrote:
> > The list of objects used to link pg_xlogdump is coming from
> > $(wildcard *desc.c) which returns them in filesystem order. This makes
> > the build result depend on this ordering, yielding different
> > compilation results.
> 
> > -RMGRDESCSOURCES = $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc.c))
> > +RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc.c)))
> >  RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
> 
> That's probably not the only non-deterministic rule in postgres, given
> nobody paid attention to that so far? At least transform modules added
> in 9.5 (hstore_plpython et al) look like they might have similar issues.
> 
> Wonder if we should instead define a wildcard wrapper in
> Makefile.global.in that does the sorting, including an explanation?

That sounds like it will avert a lot of pain in the future, and the
sort overhead is negligible compared to the build time.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Building pg_xlogdump reproducibly

From
Christoph Berg
Date:
Re: Andres Freund 2016-01-04 <20160104155125.GD28025@awork2.anarazel.de>
> That's probably not the only non-deterministic rule in postgres, given
> nobody paid attention to that so far? At least transform modules added
> in 9.5 (hstore_plpython et al) look like they might have similar issues.

I was wondering the same. At least for 9.4, this seems to be the only
issue:

https://reproducible.debian.net/dbd/unstable/armhf/postgresql-9.4_9.4.5-2.diffoscope.html

The armhf builds there are running on "disorderfs" which triggers
ordering problems more easily.

I don't have data for 9.5/9.6 atm. (We will have for 9.5.0 soonish...)

Christoph
-- 
Senior Consultant, Tel.: +49 (0)21 61 / 46 43-187
credativ GmbH, HRB Mönchengladbach 12080, VAT ID number: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE



Re: Building pg_xlogdump reproducibly

From
Alvaro Herrera
Date:
Christoph Berg wrote:
> Re: Andres Freund 2016-01-04 <20160104155125.GD28025@awork2.anarazel.de>
> > That's probably not the only non-deterministic rule in postgres, given
> > nobody paid attention to that so far? At least transform modules added
> > in 9.5 (hstore_plpython et al) look like they might have similar issues.
> 
> I was wondering the same. At least for 9.4, this seems to be the only
> issue:

Seems okay to me, then -- the requirement that link order is consistent
doesn't sound terribly strong.

> https://reproducible.debian.net/dbd/unstable/armhf/postgresql-9.4_9.4.5-2.diffoscope.html

Ugh.  I guess this output is helpful enough given that it mentions the
offending executable; since our Makefiles are simple enough, we
shouldn't have much trouble finding the problem spot.  I do wonder if
the CMake conversion is going to cause problems.

> > At least transform modules added in 9.5 (hstore_plpython et al) look
> > like they might have similar issues.

Hmm.  hstore_plperl uses $(wildcard) but only in the AIX and Win32
cases, unless I'm misreading.

I don't see any other $(wildcard) used to build executables; it's used
for tests and flags in many places, but that shouldn't matter.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Building pg_xlogdump reproducibly

From
Christoph Berg
Date:
Re: Alvaro Herrera 2016-01-04 <20160104175623.GA170910@alvherre.pgsql>
> > https://reproducible.debian.net/dbd/unstable/armhf/postgresql-9.4_9.4.5-2.diffoscope.html

9.5 was already tested as well, I just couldn't find the link
yesterday:

https://reproducible.debian.net/rb-pkg/experimental/armhf/postgresql-9.5.html

Again, the only real difference between the two builds there is in
pg_xlogdump, the remaining differences are about the checksums of the
surrounding containers (.deb files are really ar files containing
tarballs).

> Ugh.  I guess this output is helpful enough given that it mentions the
> offending executable; since our Makefiles are simple enough, we
> shouldn't have much trouble finding the problem spot.  I do wonder if
> the CMake conversion is going to cause problems.

CMake generates makefiles, so I wouldn't expect many problems. (But
it could be the case that problems are harder to fix.)

> > > At least transform modules added in 9.5 (hstore_plpython et al) look
> > > like they might have similar issues.
>
> Hmm.  hstore_plperl uses $(wildcard) but only in the AIX and Win32
> cases, unless I'm misreading.
>
> I don't see any other $(wildcard) used to build executables; it's used
> for tests and flags in many places, but that shouldn't matter.

Nod. Attached is a patch that covers all relevant $(wildcard)
occurrences in Makefiles for devel.

 contrib/hstore_plperl/Makefile   |    2 !!
 contrib/hstore_plpython/Makefile |    4 !!!!
 contrib/ltree_plpython/Makefile  |    4 !!!!
 src/bin/pg_xlogdump/Makefile     |    2 !!
 4 files changed, 12 modifications(!)

Best regards,
Christoph Berg
--
Senior Consultant, Tel.: +49 (0)21 61 / 46 43-187
credativ GmbH, HRB Mönchengladbach 12080, VAT ID number: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE

Attachment

Re: Building pg_xlogdump reproducibly

From
Tom Lane
Date:
Christoph Berg <christoph.berg@credativ.de> writes:
> Re: Alvaro Herrera 2016-01-04 <20160104175623.GA170910@alvherre.pgsql>
>> I don't see any other $(wildcard) used to build executables; it's used
>> for tests and flags in many places, but that shouldn't matter.

> Nod. Attached is a patch that covers all relevant $(wildcard)
> occurrences in Makefiles for devel.

There was some upthread discussion of trying to centralize this logic
in a macro, but I think this patch is good as-is.  A macro wouldn't
really add any readability, nor would it do much to help us remember
to do the right thing in future places that might need this.
        regards, tom lane



Re: Building pg_xlogdump reproducibly

From
Tom Lane
Date:
Christoph Berg <christoph.berg@credativ.de> writes:
> Nod. Attached is a patch that covers all relevant $(wildcard)
> occurrences in Makefiles for devel.

Applied back to 9.3, which is as far as any of these cases exist.
        regards, tom lane