Thread: Reproducible builds: genbki.pl vs schemapg.h
Hi, Debian's reproducible builds project has revealed that the full build path gets embedded into server/catalog/schemapg.h: /*------------------------------------------------------------------------- * * schemapg.h * Schema_pg_xxx macros for use by relcache.c * * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * * NOTES * ****************************** * *** DO NOT EDIT THIS FILE! *** * ****************************** * * It has been GENERATED by /build/postgresql-11-05gRdu/build/../src/backend/catalog/genbki.pl * *------------------------------------------------------------------------- */ This information will vary on rebuilding the binaries, needlessly causing different checksums. I'm proposing to strip that down to "It has been GENERATED genbki.pl". Patch attached. Christoph
Attachment
Christoph Berg <christoph.berg@credativ.de> writes: > Debian's reproducible builds project has revealed that the full build > path gets embedded into server/catalog/schemapg.h: genbki.pl is hardly our only script that prints its $0 ... regards, tom lane
Re: Tom Lane 2017-12-15 <9616.1513351766@sss.pgh.pa.us> > Christoph Berg <christoph.berg@credativ.de> writes: > > Debian's reproducible builds project has revealed that the full build > > path gets embedded into server/catalog/schemapg.h: > > genbki.pl is hardly our only script that prints its $0 ... As per https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, that's the only place that makes it into the resulting binary. I wouldn't be sending a patch if it didn't fix the issue. Christoph -- Senior Berater, Tel.: +49 2166 9901 187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
On Sat, Dec 16, 2017 at 3:13 AM, Christoph Berg <christoph.berg@credativ.de> wrote: > Re: Tom Lane 2017-12-15 <9616.1513351766@sss.pgh.pa.us> >> Christoph Berg <christoph.berg@credativ.de> writes: >> > Debian's reproducible builds project has revealed that the full build >> > path gets embedded into server/catalog/schemapg.h: >> >> genbki.pl is hardly our only script that prints its $0 ... > > As per > https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, > that's the only place that makes it into the resulting binary. > I wouldn't be sending a patch if it didn't fix the issue. Why not fixing that? Reproducible builds are a trend of these days, and what's proposed here is really simple to make PG more compliant with this purpose in mind. -- Michael
On 2017-12-16 07:52:41 +0900, Michael Paquier wrote: > On Sat, Dec 16, 2017 at 3:13 AM, Christoph Berg > <christoph.berg@credativ.de> wrote: > > Re: Tom Lane 2017-12-15 <9616.1513351766@sss.pgh.pa.us> > >> Christoph Berg <christoph.berg@credativ.de> writes: > >> > Debian's reproducible builds project has revealed that the full build > >> > path gets embedded into server/catalog/schemapg.h: > >> > >> genbki.pl is hardly our only script that prints its $0 ... > > > > As per > > https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, > > that's the only place that makes it into the resulting binary. > > I wouldn't be sending a patch if it didn't fix the issue. > > Why not fixing that? Reproducible builds are a trend of these days, > and what's proposed here is really simple to make PG more compliant > with this purpose in mind. It's not like $0 instead of a hardcoded name in the header actually buys us anything afaict. Greetings, Andres Freund
On Fri, Dec 15, 2017 at 3:21 PM, Andres Freund <andres@anarazel.de> wrote: >> Why not fixing that? Reproducible builds are a trend of these days, >> and what's proposed here is really simple to make PG more compliant >> with this purpose in mind. > > It's not like $0 instead of a hardcoded name in the header actually buys > us anything afaict. +1. I think that reproducible builds are a worthwhile goal, and I welcome Christoph's continued work on them. -- Peter Geoghegan
Andres Freund <andres@anarazel.de> writes: > On 2017-12-16 07:52:41 +0900, Michael Paquier wrote: >> On Sat, Dec 16, 2017 at 3:13 AM, Christoph Berg >> <christoph.berg@credativ.de> wrote: >>> Re: Tom Lane 2017-12-15 <9616.1513351766@sss.pgh.pa.us> >>>> genbki.pl is hardly our only script that prints its $0 ... >>> As per >>> https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, >>> that's the only place that makes it into the resulting binary. I'm fairly confused by this claim. Since the string in question is in a comment, it really shouldn't affect built binaries at all. I can believe that it would affect the non-binary contents of the finished package, because we include schemapg.h as one of the installed headers. However, we also include fmgroids.h, which also has an expanded version of $0 in it. So if schemapg.h is affected by build path, why isn't fmgroids.h? In my build, neither one of these files contains any path information; I speculate that you need to use a VPATH build to have an issue, or maybe Debian's build environment does something even weirder. But I feel confident in saying that if indeed fmgroids.h is invariant for you today, that's a phase-of-the-moon behavior that will break someday if we don't make a similar change in Gen_fmgrtab.pl. Or else we should propagate what's preventing it back to genbki.pl. plperl's text2macro.pl is also emitting $0, and there may be other places I missed (I grepped for 'DO NOT EDIT', not for $0 per se). We seem not to install the output file of that one, perlchunks.h, but that's a decision that somebody might change too. > It's not like $0 instead of a hardcoded name in the header actually buys > us anything afaict. Agreed so far as the script name goes. However, two out of three of these scripts also print their input file names, and I'm suspicious that that output is also gonna change in a VPATH build. I'm a little less inclined to buy the claim that we're not losing anything if we suppress that :-( regards, tom lane
Re: Tom Lane 2017-12-16 <5525.1513381145@sss.pgh.pa.us> > >>> As per > >>> https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/postgresql-10.html, > >>> that's the only place that makes it into the resulting binary. > > I'm fairly confused by this claim. Since the string in question is in a > comment, it really shouldn't affect built binaries at all. I can believe > that it would affect the non-binary contents of the finished package, "Binary" in that context was the .deb package file (in contrast to the .dsc source package). > In my build, neither one of these files contains any path information; > I speculate that you need to use a VPATH build to have an issue, or > maybe Debian's build environment does something even weirder. This is a VPATH build, yes. > > It's not like $0 instead of a hardcoded name in the header actually buys > > us anything afaict. > > Agreed so far as the script name goes. However, two out of three of these > scripts also print their input file names, and I'm suspicious that that > output is also gonna change in a VPATH build. I'm a little less inclined > to buy the claim that we're not losing anything if we suppress that :-( Well, patching this instance of $0 would fix a binary-package variation in practise. Of course there might be more issues waiting to come into effect, but I don't see why that would be an argument against fixing the current issue. Christoph -- Senior Berater, Tel.: +49 2166 9901 187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Christoph Berg <christoph.berg@credativ.de> writes: >> Agreed so far as the script name goes. However, two out of three of these >> scripts also print their input file names, and I'm suspicious that that >> output is also gonna change in a VPATH build. I'm a little less inclined >> to buy the claim that we're not losing anything if we suppress that :-( > Well, patching this instance of $0 would fix a binary-package > variation in practise. Of course there might be more issues waiting to > come into effect, but I don't see why that would be an argument > against fixing the current issue. I think we're talking at cross-purposes. I'm not saying we should not fix this problem. I'm saying that the proposed fix appears incomplete, which means that (a) even if it solves your problem, it probably does not solve related problems for other people; (b) since it's not clear why this patch is apparently sufficient for you, I'd like to understand that in some detail before deeming the problem solved; and (c) leaving instances of the problematic code in our tree is just about guaranteed to mean you'll have the same problem in future, when somebody either copies that coding pattern into some new script or tweaks the way those existing scripts are being used. regards, tom lane
Re: Tom Lane 2017-12-16 <417.1513438031@sss.pgh.pa.us> > I think we're talking at cross-purposes. I'm not saying we should not fix > this problem. I'm saying that the proposed fix appears incomplete, which > means that (a) even if it solves your problem, it probably does not solve > related problems for other people; (b) since it's not clear why this > patch is apparently sufficient for you, I'd like to understand that in > some detail before deeming the problem solved; and (c) leaving instances > of the problematic code in our tree is just about guaranteed to mean > you'll have the same problem in future, when somebody either copies that > coding pattern into some new script or tweaks the way those existing > scripts are being used. Grepping through the source, there are three places where $0 printed to files in regular operation (as opposed to being used in --help): * src/backend/catalog/genbki.pl * It has been GENERATED by $0 * src/backend/utils/Gen_fmgrtab.pl * It has been GENERATED by $0 * src/backend/utils/mb/Unicode/convutils.pm, a perl module used by src/backend/utils/mb/Unicode/*_to_*.pl my $this_script = $0; print $out "/* This file is generated by $this_script */\n\n"; The latter is not relevant because the generated .map files just included in other .c files, but not shipped in source. (And a changed comment does not modify the build result.) The first case is fixed in my original patch, the updated patch catches the second as well. I believe the reason why we've only been seeing half of the problem yet is that the generated files are shipped with the tarballs, so it might be a timestamping issue determining if the scripts are re-executed. Filed as https://commitfest.postgresql.org/16/1417/ Christoph -- Senior Berater, Tel.: +49 2166 9901 187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Attachment
Christoph Berg <christoph.berg@credativ.de> writes: > Re: Tom Lane 2017-12-16 <417.1513438031@sss.pgh.pa.us> >> I think we're talking at cross-purposes. I'm not saying we should not fix >> this problem. I'm saying that the proposed fix appears incomplete ... > Grepping through the source, there are three places where $0 printed > to files in regular operation (as opposed to being used in --help): I poked around and found a few more. > I believe the reason why we've only been seeing half of the problem > yet is that the generated files are shipped with the tarballs, so it > might be a timestamping issue determining if the scripts are > re-executed. Right; some parts of this problem would only materialize for you if you needed to rebuild the generated files that are included in the tarball, which should basically not be happening in normal packager builds. Rather the risk is at our end: if we ever switched the tarball creation process to be a VPATH build, then there'd be path dependencies in the created tarballs. That would be bad. More generally, my concern here is not just that we fix this problem but that it stays fixed. If some individual scripts print $0 into their output and it happens to not affect any built distribution files today, it's still bad, because tomorrow somebody might copy that coding pattern into someplace else where it matters more. I think we need a project policy that thou shalt not print $0 into generated files, period. Also, experimenting with a VPATH build, I verified that such "helpful" practices as printing $infile or @ARGV into the output file will also create path dependencies. So I think we need to lose those too. It's not like they're adding any info you can't find out from the Makefiles. On the other hand, there is something we can do that will improve matters: rather than just printing the base name of the script, let's print its full relative path within the PG sources, eg instead of Gen_fmgrtab.pl let's print src/backend/utils/Gen_fmgrtab.pl. My thought here is that if you're not already intimately familiar with a script you might not remember where it lives, the more so if you're looking at a file that's been put into an installation tree far away from where it was generated. I see that this policy was already followed in some places, just not in the ones that were using the $0 shortcut. In short, I propose the attached more-extensive patch. Some of the files generated by these scripts, particularly the map files generated by the src/backend/utils/mb/Unicode/ scripts, are not just present in tarballs but are actually in our git repo. So changing those scripts won't affect anything until/unless someone updates the repo's generated files, which I've not done here and don't feel much need to do. I just want to establish a principle that we don't print path-dependent info into generated files. regards, tom lane diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index 256a9c9..e4a0b8b 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -340,7 +340,7 @@ print $schemapg <<EOM; * *** DO NOT EDIT THIS FILE! *** * ****************************** * - * It has been GENERATED by $0 + * It has been GENERATED by src/backend/catalog/genbki.pl * *------------------------------------------------------------------------- */ diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl index ee89d50..26b428b 100644 --- a/src/backend/utils/Gen_fmgrtab.pl +++ b/src/backend/utils/Gen_fmgrtab.pl @@ -118,8 +118,7 @@ qq|/*------------------------------------------------------------------------- * *** DO NOT EDIT THIS FILE! *** * ****************************** * - * It has been GENERATED by $0 - * from $infile + * It has been GENERATED by src/backend/utils/Gen_fmgrtab.pl * *------------------------------------------------------------------------- */ @@ -153,8 +152,7 @@ qq|/*------------------------------------------------------------------------- * *** DO NOT EDIT THIS FILE! *** * ****************************** * - * It has been GENERATED by $0 - * from $infile + * It has been GENERATED by src/backend/utils/Gen_fmgrtab.pl * *------------------------------------------------------------------------- */ @@ -181,8 +179,7 @@ qq|/*------------------------------------------------------------------------- * *** DO NOT EDIT THIS FILE! *** * ****************************** * - * It has been GENERATED by $0 - * from $infile + * It has been GENERATED by src/backend/utils/Gen_fmgrtab.pl * *------------------------------------------------------------------------- */ diff --git a/src/backend/utils/mb/Unicode/UCS_to_BIG5.pl b/src/backend/utils/mb/Unicode/UCS_to_BIG5.pl index 05822b2..f73e2a1 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_BIG5.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_BIG5.pl @@ -27,7 +27,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_BIG5.pl'; # Load BIG5.TXT my $all = &read_source("BIG5.TXT"); diff --git a/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl b/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl index f7f9496..63f5e1b 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl @@ -16,7 +16,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl'; # Read the input diff --git a/src/backend/utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl b/src/backend/utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl index 2290031..4dac7d2 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl @@ -10,7 +10,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl'; # first generate UTF-8 --> EUC_JIS_2004 table diff --git a/src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl b/src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl index 5db663f..77ce273 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl @@ -14,7 +14,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl'; # Load JIS0212.TXT my $jis0212 = &read_source("JIS0212.TXT"); diff --git a/src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl b/src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl index f0d978d..8e0126c 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl @@ -19,7 +19,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl'; # Load the source file. diff --git a/src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl b/src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl index 0bfcbd5..61013e3 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl @@ -20,7 +20,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl'; my $mapping = &read_source("CNS11643.TXT"); diff --git a/src/backend/utils/mb/Unicode/UCS_to_GB18030.pl b/src/backend/utils/mb/Unicode/UCS_to_GB18030.pl index 4469cc7..e9f816c 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_GB18030.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_GB18030.pl @@ -16,7 +16,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_GB18030.pl'; # Read the input diff --git a/src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl b/src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl index 7c6f526..be10d52 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl @@ -18,7 +18,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl'; # Load the source file. diff --git a/src/backend/utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl b/src/backend/utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl index d1b36c0..98a6ee7 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl @@ -12,7 +12,7 @@ use convutils; # first generate UTF-8 --> SHIFT_JIS_2004 table -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl'; my $in_file = "sjis-0213-2004-std.txt"; diff --git a/src/backend/utils/mb/Unicode/UCS_to_SJIS.pl b/src/backend/utils/mb/Unicode/UCS_to_SJIS.pl index 7460964..cc1edcc 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_SJIS.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_SJIS.pl @@ -13,7 +13,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_SJIS.pl'; my $mapping = read_source("CP932.TXT"); diff --git a/src/backend/utils/mb/Unicode/UCS_to_UHC.pl b/src/backend/utils/mb/Unicode/UCS_to_UHC.pl index 8d99ca3..640a2ec 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_UHC.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_UHC.pl @@ -16,7 +16,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_UHC.pl'; # Read the input diff --git a/src/backend/utils/mb/Unicode/UCS_to_most.pl b/src/backend/utils/mb/Unicode/UCS_to_most.pl index 1c3922e..2d69748 100755 --- a/src/backend/utils/mb/Unicode/UCS_to_most.pl +++ b/src/backend/utils/mb/Unicode/UCS_to_most.pl @@ -18,7 +18,7 @@ use strict; use convutils; -my $this_script = $0; +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_most.pl'; my %filename = ( 'WIN866' => 'CP866.TXT', diff --git a/src/bin/psql/create_help.pl b/src/bin/psql/create_help.pl index cedb767..9fa1855 100644 --- a/src/bin/psql/create_help.pl +++ b/src/bin/psql/create_help.pl @@ -51,8 +51,7 @@ print $hfile_handle "/* * *** Do not change this file by hand. It is automatically * *** generated from the DocBook documentation. * - * generated by - * $^X $0 @ARGV + * generated by src/bin/psql/create_help.pl * */ @@ -76,8 +75,7 @@ print $cfile_handle "/* * *** Do not change this file by hand. It is automatically * *** generated from the DocBook documentation. * - * generated by - * $^X $0 @ARGV + * generated by src/bin/psql/create_help.pl * */ @@ -131,7 +129,7 @@ foreach my $file (sort readdir DIR) my $nl_count = () = $cmdsynopsis =~ /\n/g; $cmdsynopsis =~ m!</>! - and die "$0:$file: null end tag not supported in synopsis\n"; + and die "$0: $file: null end tag not supported in synopsis\n"; $cmdsynopsis =~ s/%/%%/g; while ($cmdsynopsis =~ m!<(\w+)[^>]*>(.+?)</\1[^>]*>!) diff --git a/src/pl/plperl/text2macro.pl b/src/pl/plperl/text2macro.pl index e681fca..27c6ef7 100644 --- a/src/pl/plperl/text2macro.pl +++ b/src/pl/plperl/text2macro.pl @@ -40,7 +40,7 @@ die "No text files specified" print qq{ /* * DO NOT EDIT - THIS FILE IS AUTOGENERATED - CHANGES WILL BE LOST - * Written by $0 from @ARGV + * Generated by src/pl/plperl/text2macro.pl */ }; diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm index 6019040..72826d5 100644 --- a/src/test/perl/TestLib.pm +++ b/src/test/perl/TestLib.pm @@ -66,7 +66,7 @@ BEGIN delete $ENV{PGPORT}; delete $ENV{PGHOST}; - $ENV{PGAPPNAME} = $0; + $ENV{PGAPPNAME} = basename($0); # Must be set early $windows_os = $Config{osname} eq 'MSWin32' || $Config{osname} eq 'msys';
On 12/20/2017 09:50 PM, Tom Lane wrote: > Christoph Berg <christoph.berg@credativ.de> writes: >> Re: Tom Lane 2017-12-16 <417.1513438031@sss.pgh.pa.us> >>> I think we're talking at cross-purposes. I'm not saying we should not fix >>> this problem. I'm saying that the proposed fix appears incomplete ... >> Grepping through the source, there are three places where $0 printed >> to files in regular operation (as opposed to being used in --help): > I poked around and found a few more. > >> I believe the reason why we've only been seeing half of the problem >> yet is that the generated files are shipped with the tarballs, so it >> might be a timestamping issue determining if the scripts are >> re-executed. > Right; some parts of this problem would only materialize for you if you > needed to rebuild the generated files that are included in the tarball, > which should basically not be happening in normal packager builds. > Rather the risk is at our end: if we ever switched the tarball creation > process to be a VPATH build, then there'd be path dependencies in the > created tarballs. That would be bad. > > More generally, my concern here is not just that we fix this problem > but that it stays fixed. If some individual scripts print $0 into > their output and it happens to not affect any built distribution files > today, it's still bad, because tomorrow somebody might copy that coding > pattern into someplace else where it matters more. I think we need a > project policy that thou shalt not print $0 into generated files, period. > > Also, experimenting with a VPATH build, I verified that such "helpful" > practices as printing $infile or @ARGV into the output file will also > create path dependencies. So I think we need to lose those too. > It's not like they're adding any info you can't find out from the > Makefiles. > > On the other hand, there is something we can do that will improve > matters: rather than just printing the base name of the script, > let's print its full relative path within the PG sources, eg instead > of Gen_fmgrtab.pl let's print src/backend/utils/Gen_fmgrtab.pl. > My thought here is that if you're not already intimately familiar > with a script you might not remember where it lives, the more so > if you're looking at a file that's been put into an installation > tree far away from where it was generated. I see that this policy > was already followed in some places, just not in the ones that were > using the $0 shortcut. > > In short, I propose the attached more-extensive patch. > > Some of the files generated by these scripts, particularly the map > files generated by the src/backend/utils/mb/Unicode/ scripts, are > not just present in tarballs but are actually in our git repo. > So changing those scripts won't affect anything until/unless someone > updates the repo's generated files, which I've not done here and > don't feel much need to do. I just want to establish a principle > that we don't print path-dependent info into generated files. > > Looks reasonable. Regarding the change to TestLib.pm, we should make sure that the tests have unique names. There is a small amount of duplication currently: ./src/bin/pg_dump/t/001_basic.pl ./src/bin/pg_rewind/t/001_basic.pl ./src/test/modules/commit_ts/t/001_base.pl ./src/test/modules/test_pg_dump/t/001_base.pl cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 21, 2017 at 10:13 PM, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > Looks reasonable. Regarding the change to TestLib.pm, we should make > sure that the tests have unique names. There is a small amount of > duplication currently: > > ./src/bin/pg_dump/t/001_basic.pl > ./src/bin/pg_rewind/t/001_basic.pl > ./src/test/modules/commit_ts/t/001_base.pl > ./src/test/modules/test_pg_dump/t/001_base.pl Why does the uniqueness of application_name set in the TAP test suite matter? Each instance of Postgres is self-contained into its own Unix domain path and port. -- Michael
On 12/21/2017 08:27 AM, Michael Paquier wrote: > On Thu, Dec 21, 2017 at 10:13 PM, Andrew Dunstan > <andrew.dunstan@2ndquadrant.com> wrote: >> Looks reasonable. Regarding the change to TestLib.pm, we should make >> sure that the tests have unique names. There is a small amount of >> duplication currently: >> >> ./src/bin/pg_dump/t/001_basic.pl >> ./src/bin/pg_rewind/t/001_basic.pl >> ./src/test/modules/commit_ts/t/001_base.pl >> ./src/test/modules/test_pg_dump/t/001_base.pl > Why does the uniqueness of application_name set in the TAP test suite > matter? Each instance of Postgres is self-contained into its own Unix > domain path and port. I've found it annoying in the past. Probably now that the buildfarm reports each test series separately it matters less. Previously we reported many of the TAP tests together. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Andrew Dunstan 2017-12-21 <d68d254a-6f97-a8c2-4656-ab8886fe2c38@2ndQuadrant.com> > >> ./src/bin/pg_dump/t/001_basic.pl > >> ./src/bin/pg_rewind/t/001_basic.pl > >> ./src/test/modules/commit_ts/t/001_base.pl > >> ./src/test/modules/test_pg_dump/t/001_base.pl > > Why does the uniqueness of application_name set in the TAP test suite > > matter? Each instance of Postgres is self-contained into its own Unix > > domain path and port. > > I've found it annoying in the past. Probably now that the buildfarm > reports each test series separately it matters less. Previously we > reported many of the TAP tests together. Rename them to the directory name: ./src/bin/pg_dump/t/001_pg_dump.pl ./src/bin/pg_rewind/t/001_pg_rewind.pl ./src/test/modules/commit_ts/t/001_commit_ts.pl ./src/test/modules/test_pg_dump/t/001_test_pg_dump.pl Christoph -- Senior Berater, Tel.: +49 2166 9901 187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Christoph Berg wrote: > Re: Andrew Dunstan 2017-12-21 <d68d254a-6f97-a8c2-4656-ab8886fe2c38@2ndQuadrant.com> > > I've found it annoying in the past. Probably now that the buildfarm > > reports each test series separately it matters less. Previously we > > reported many of the TAP tests together. > > Rename them to the directory name: > > ./src/bin/pg_rewind/t/001_pg_rewind.pl > ./src/test/modules/commit_ts/t/001_commit_ts.pl > ./src/test/modules/test_pg_dump/t/001_test_pg_dump.pl These sound good to me. > ./src/bin/pg_dump/t/001_pg_dump.pl There already is t/002_pg_dump.pl. src/bin/pg_dump/t/001_basic.pl is all about command lines, perhaps we can find a name that evokes that -- 001_dump_cmdline.pl? (prefixed "_dump_" to be in line with t/010_dump_connstr.pl) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > Looks reasonable. Regarding the change to TestLib.pm, we should make > sure that the tests have unique names. There is a small amount of > duplication currently: > ./src/bin/pg_dump/t/001_basic.pl > ./src/bin/pg_rewind/t/001_basic.pl > ./src/test/modules/commit_ts/t/001_base.pl > ./src/test/modules/test_pg_dump/t/001_base.pl Perhaps, but that's a separate/pre-existing problem no? The point of the change in my patch was just to avoid getting different behavior in a VPATH build. regards, tom lane
On 12/21/17 08:13, Andrew Dunstan wrote: > Looks reasonable. Regarding the change to TestLib.pm, we should make > sure that the tests have unique names. There is a small amount of > duplication currently: > > ./src/bin/pg_dump/t/001_basic.pl > ./src/bin/pg_rewind/t/001_basic.pl > ./src/test/modules/commit_ts/t/001_base.pl > ./src/test/modules/test_pg_dump/t/001_base.pl But that's one of the reasons one has directories, so you don't need to have globally unique names. I don't actually see why the change in TestLib.pm is necessary or how it relates to this thread. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/20/17 21:50, Tom Lane wrote: > -my $this_script = $0; > +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_BIG5.pl'; This kind of things looks awful. Why is this better? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Dec 24, 2017 at 12:26:53PM -0500, Peter Eisentraut wrote: > On 12/20/17 21:50, Tom Lane wrote: > > -my $this_script = $0; > > +my $this_script = 'src/backend/utils/mb/Unicode/UCS_to_BIG5.pl'; > > This kind of things looks awful. Why is this better? What's wrong with using the path? As you point out elsewhere on this thread, it's a way to identify the program uniquely and reproducibly. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Dec 25, 2017 at 2:25 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 12/21/17 08:13, Andrew Dunstan wrote: >> Looks reasonable. Regarding the change to TestLib.pm, we should make >> sure that the tests have unique names. There is a small amount of >> duplication currently: >> >> ./src/bin/pg_dump/t/001_basic.pl >> ./src/bin/pg_rewind/t/001_basic.pl >> ./src/test/modules/commit_ts/t/001_base.pl >> ./src/test/modules/test_pg_dump/t/001_base.pl > > But that's one of the reasons one has directories, so you don't need to > have globally unique names. I don't actually see why the change in > TestLib.pm is necessary or how it relates to this thread. +1 to both things here. I have a hard time undestanding why changing the output of TAP test logs has anything to do with this thread, as well as why having unique file names is mandatory. If this really matters, I would suggest spawning a different thread. -- Michael
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 12/21/17 08:13, Andrew Dunstan wrote: >> Looks reasonable. Regarding the change to TestLib.pm, we should make >> sure that the tests have unique names. There is a small amount of >> duplication currently: >> >> ./src/bin/pg_dump/t/001_basic.pl >> ./src/bin/pg_rewind/t/001_basic.pl >> ./src/test/modules/commit_ts/t/001_base.pl >> ./src/test/modules/test_pg_dump/t/001_base.pl > But that's one of the reasons one has directories, so you don't need to > have globally unique names. I agree that it's not obvious that we need to rename these scripts. In the current setup for running regression tests, the associated postmaster logs would go into different subdirectories, so the conflict in "application" names appearing in log entries doesn't seem like it would lead to real confusion. > I don't actually see why the change in > TestLib.pm is necessary or how it relates to this thread. Basically, as the code stood: $ENV{PGAPPNAME} = $0; running any TAP test in a VPATH build would result in PGAPPNAME becoming a full path, resulting in different log entries than you get in a non-VPATH build. That seemed undesirable to me, so I added basename() to eliminate the difference in behavior. There's some room to argue that given that we have duplicate script names, a (partial) path would be a good thing. But if you believe that, then we should attempt to make that happen the same way with or without VPATH. regards, tom lane