Thread: fairywren is generating bogus BASE_BACKUP commands
Thomas Munro pointed out this failure to me on fairywren: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-01-21%2020%3A10%3A22 He theorizes that I need some perl2host magic in there, which may well be true. But I also noticed this: # Running: pg_basebackup --no-sync -cfast --target server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver -X none pg_basebackup: error: could not initiate base backup: ERROR: unrecognized target: "server;C" "server" is a valid backup target, but "server;C" is not. And I think this must be a bug on the client side, because the server logs the generated query: 2022-01-21 20:53:11.618 UTC [8404:10] 010_pg_basebackup.pl LOG: received replication command: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, CHECKPOINT 'fast', MANIFEST 'yes', TABLESPACE_MAP, TARGET 'server;C', TARGET_DETAIL '\\tools\\msys64\\home\\pgrunner\\bf\\root\\HEAD\\pgsql.build\\src\\bin\\pg_basebackup\\tmp_check\\tmp_test_Ag8r\\backuponserver') So it's not that the server parses the query and introduces gibberish -- the client has already introduced gibberish when constructing the query. But the code to do that is pretty straightforward -- we just call strchr to find the colon in the backup target, and then pnstrdup() the part before the colon and use the latter part as-is. If pnstrdup were failing to add a terminating \0 then this would be quite plausible, but I think it shouldn't. Unless the operating sytem's strnlen() is buggy? That seems like a stretch, so feel free to tell me what obvious stupid thing I did here and am not seeing... -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > "server" is a valid backup target, but "server;C" is not. And I think > this must be a bug on the client side, because the server logs the > generated query: > 2022-01-21 20:53:11.618 UTC [8404:10] 010_pg_basebackup.pl LOG: > received replication command: BASE_BACKUP ( LABEL 'pg_basebackup base > backup', PROGRESS, CHECKPOINT 'fast', MANIFEST 'yes', > TABLESPACE_MAP, TARGET 'server;C', TARGET_DETAIL > '\\tools\\msys64\\home\\pgrunner\\bf\\root\\HEAD\\pgsql.build\\src\\bin\\pg_basebackup\\tmp_check\\tmp_test_Ag8r\\backuponserver') > So it's not that the server parses the query and introduces gibberish > -- the client has already introduced gibberish when constructing the > query. But the code to do that is pretty straightforward -- we just > call strchr to find the colon in the backup target, and then > pnstrdup() the part before the colon and use the latter part as-is. If > pnstrdup were failing to add a terminating \0 then this would be quite > plausible, but I think it shouldn't. Unless the operating sytem's > strnlen() is buggy? That seems like a stretch, so feel free to tell me > what obvious stupid thing I did here and am not seeing... I think the backup_target string was already corrupted that way when pg_basebackup absorbed it from optarg. It's pretty hard to believe that the strchr/pnstrdup stanza got it wrong. However, comparing the TARGET_DETAIL to what the TAP test says it issued: # Running: pg_basebackup --no-sync -cfast --target server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver-X none pg_basebackup: error: could not initiate base backup: ERROR: unrecognized target: "server;C" it's absolutely clear that something decided to munge the target string. Given that colon is reserved in Windows filename syntax, it's not so surprising if it munged it wrong, or at least more aggressively than you expected. I kinda wonder if this notation for the target was well-chosen. Keeping the file name strictly separate from the "type" keyword might be a wiser move. Quite aside from Windows-isms, there are going to be usages where this is hard to tell from a URL. (If memory serves, double leading slash is significant to some networked file systems.) While we're on the subject of ill-chosen option syntax: "-cfast" with non double dashes? Really? That's horribly ambiguous. Most programs would parse something like that as five single-letter options, and most users who know Unix would expect it to mean that. regards, tom lane
On Sat, Jan 22, 2022 at 10:42 AM Robert Haas <robertmhaas@gmail.com> wrote: > # Running: pg_basebackup --no-sync -cfast --target > server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver > -X none > pg_basebackup: error: could not initiate base backup: ERROR: > unrecognized target: "server;C" > > "server" is a valid backup target, but "server;C" is not. And I think > this must be a bug on the client side, because the server logs the > generated query: It looks a bit like msys perl could be recognising "server:/home/pgrunner/..." and converting it to "server;C:\tools\msys64\home\pgrunner\...". From a bit of light googling I see that such conversions happen in msys perl's system() unless you turn them off with MSYS_NO_PATHCONV, and then we'd have to do it ourselves in the right places.
On Fri, Jan 21, 2022 at 5:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think the backup_target string was already corrupted that way when > pg_basebackup absorbed it from optarg. It's pretty hard to believe that > the strchr/pnstrdup stanza got it wrong. However, comparing the > TARGET_DETAIL to what the TAP test says it issued: > > # Running: pg_basebackup --no-sync -cfast --target server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver-X none > pg_basebackup: error: could not initiate base backup: ERROR: unrecognized target: "server;C" > > it's absolutely clear that something decided to munge the target string. > Given that colon is reserved in Windows filename syntax, it's not > so surprising if it munged it wrong, or at least more aggressively > than you expected. Nothing in the Perl code tells it that the particular argument in question is a path rather than anything else, so it must be applying some heuristic to decide whether to munge it. That's horrible. > I kinda wonder if this notation for the target was well-chosen. > Keeping the file name strictly separate from the "type" keyword > might be a wiser move. Quite aside from Windows-isms, there > are going to be usages where this is hard to tell from a URL. > (If memory serves, double leading slash is significant to some > networked file systems.) Maybe. I think it's important that the notation is not ridiculously verbose, and -t server --target-detail $PATH is a LOT more typing. > While we're on the subject of ill-chosen option syntax: "-cfast" > with non double dashes? Really? That's horribly ambiguous. > Most programs would parse something like that as five single-letter > options, and most users who know Unix would expect it to mean that. I'm not sure whether you're complaining that we accept that syntax or using it, but AFAIK I'm responsible for neither. I think the syntax has been accepted since pg_basebackup was added in 2011, and Andres added it to this test case earlier this week (with -cfast in the subject line of the commit message). FWIW, though, I've been aware of that syntax for a long time and never thought it was a problem. I usually spell the option in exactly that way when I use it, and I'm relatively sure that things I've given to customers would break if we removed support for it. I don't know how we'd do that anyway, since all that's happening here is a call to getopt_long(). -- Robert Haas EDB: http://www.enterprisedb.com
On 1/21/22 17:10, Thomas Munro wrote: > On Sat, Jan 22, 2022 at 10:42 AM Robert Haas <robertmhaas@gmail.com> wrote: >> # Running: pg_basebackup --no-sync -cfast --target >> server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver >> -X none >> pg_basebackup: error: could not initiate base backup: ERROR: >> unrecognized target: "server;C" >> >> "server" is a valid backup target, but "server;C" is not. And I think >> this must be a bug on the client side, because the server logs the >> generated query: > It looks a bit like msys perl could be recognising > "server:/home/pgrunner/..." and converting it to > "server;C:\tools\msys64\home\pgrunner\...". From a bit of light > googling I see that such conversions happen in msys perl's system() > unless you turn them off with MSYS_NO_PATHCONV, and then we'd have to > do it ourselves in the right places. c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says: my $source_ts_prefix = $source_ts_path; $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!; ... # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; Probably in this case just setting it to 'server:' would do the trick. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Jan 21, 2022 at 5:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> While we're on the subject of ill-chosen option syntax: "-cfast" >> with non double dashes? Really? That's horribly ambiguous. > I'm not sure whether you're complaining that we accept that syntax or > using it, but AFAIK I'm responsible for neither. I think the syntax > has been accepted since pg_basebackup was added in 2011, and Andres > added it to this test case earlier this week (with -cfast in the > subject line of the commit message). pg_basebackup's help defines the syntax as -c, --checkpoint=fast|spread set fast or spread checkpointing which I'd read as requiring a space (or possibly equal sign) between "-c" and "fast". If it works as written in this test, that's an accident of the particular getopt implementation, and I'll bet it won't be too long before we come across a getopt that doesn't like it. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says: > my $source_ts_prefix = $source_ts_path; > $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!; > ... > # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces > local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; > Probably in this case just setting it to 'server:' would do the trick. The point I was trying to make is that if we have to jump through that sort of hoop in the test scripts, then real users are going to have to jump through it as well, and they won't like that (and we will get bug reports about it). It'd be better to design the option syntax to avoid such requirements. regards, tom lane
Hi, On 2022-01-21 17:42:45 -0500, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says: > > my $source_ts_prefix = $source_ts_path; > > $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!; > > ... > > > # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces > > local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; > > > Probably in this case just setting it to 'server:' would do the trick. > > The point I was trying to make is that if we have to jump through > that sort of hoop in the test scripts, then real users are going > to have to jump through it as well, and they won't like that > (and we will get bug reports about it). It'd be better to design > the option syntax to avoid such requirements. Normal users aren't going to invoke a "native" basebackup from inside msys. I assume the translation happens because an "msys world" perl invokes a "native" pg_basebackup via msys system(), right? If pg_basebackup instead is "normally" invoked from a windows terminal, or anything else "native" windows, the problem won't exist, no? As we're building a "native" postgres in this case, none of our tools should internally have such translations happening. So I don't think it'll be a huge issue for users themselves? Not that I think that there are all that many users of mingw built postgres on windows... I think it's mostly msvc built postgres in that world? Greetings, Andres Freund
On Fri, Jan 21, 2022 at 5:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The point I was trying to make is that if we have to jump through > that sort of hoop in the test scripts, then real users are going > to have to jump through it as well, and they won't like that > (and we will get bug reports about it). It'd be better to design > the option syntax to avoid such requirements. Well, as Andrew points out, pg_basebackup's -T foo=bar syntax causes the same issue, so you'll need to redesign that, too. But even that is not really a proper fix. The real root of this problem is that the operating system's notion of a valid path differs from PostgreSQL's notion of a valid path on this platform, and I imagine that fixing that is a rather large project. ISTM that you're basically just complaining about options syntax that you don't like, but I think there's nothing particularly worse about this syntax than lots of other things we type all the time. psql -v VAR=VALUE -P OTHERKINDOFVAR=OTHERVALUE? curl -u USER:PASSWORD? pg_basebackup -T OLD=NEW? perl -d[t]:MODULE=OPT,OPT? I mean, that last one actually seems kinda horrible and if this were as bad as that I'd say yeah, it should be redesigned. But I don't think it is. There's plenty of precedent for bundling closely-related values into a single command-line option, which is all I've done here. -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On 2022-01-21 17:26:32 -0500, Robert Haas wrote: > I think the syntax has been accepted since pg_basebackup was added in 2011, > and Andres added it to this test case earlier this week (with -cfast in the > subject line of the commit message). The reason I used -cfast instead of -c fast or --checkpoint=fast is that the way perltidy formats leads to very wide lines already, and making them even longer seemed unattractive... Given the -cfast syntax successfully passed tests on at least AIX, freebsd, linux, macos, netbsd, openbsd, windows msvc, windows msys, I'm not too worried about portability either. Greetings, Andres Freund
On Fri, Jan 21, 2022 at 5:35 PM Andrew Dunstan <andrew@dunslane.net> wrote: > # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces > local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; > Probably in this case just setting it to 'server:' would do the trick. Oh, thanks for the tip. Do you want to push a commit that does that, or ... should I do it? -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, Jan 22, 2022 at 3:55 PM Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jan 21, 2022 at 5:35 PM Andrew Dunstan <andrew@dunslane.net> wrote: > > # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces > > local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; > > Probably in this case just setting it to 'server:' would do the trick. > > Oh, thanks for the tip. Do you want to push a commit that does that, > or ... should I do it? Just a thought: Would it prevent the magic path translation and all just work if the path were already in Windows form? So, if we did just this change at the top: -my $tempdir = PostgreSQL::Test::Utils::tempdir; +my $tempdir = PostgreSQL::Test::Utils::perl2host(PostgreSQL::Test::Utils::tempdir);
On 1/21/22 18:04, Andres Freund wrote: > Hi, > > On 2022-01-21 17:42:45 -0500, Tom Lane wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >>> c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says: >>> my $source_ts_prefix = $source_ts_path; >>> $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!; >>> ... >>> # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces >>> local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; >>> Probably in this case just setting it to 'server:' would do the trick. >> The point I was trying to make is that if we have to jump through >> that sort of hoop in the test scripts, then real users are going >> to have to jump through it as well, and they won't like that >> (and we will get bug reports about it). It'd be better to design >> the option syntax to avoid such requirements. > Normal users aren't going to invoke a "native" basebackup from inside msys. I > assume the translation happens because an "msys world" perl invokes > a "native" pg_basebackup via msys system(), right? If pg_basebackup instead is > "normally" invoked from a windows terminal, or anything else "native" windows, > the problem won't exist, no? > > As we're building a "native" postgres in this case, none of our tools should > internally have such translations happening. So I don't think it'll be a huge > issue for users themselves? All true. This is purely an issue for our testing regime, and not for end users. > > Not that I think that there are all that many users of mingw built postgres on > windows... I think it's mostly msvc built postgres in that world? > The vast majority use the installer which is built with MSVC. Very few in my experience build their own. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 1/21/22 22:43, Thomas Munro wrote: > On Sat, Jan 22, 2022 at 3:55 PM Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Jan 21, 2022 at 5:35 PM Andrew Dunstan <andrew@dunslane.net> wrote: >>> # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces >>> local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix; >>> Probably in this case just setting it to 'server:' would do the trick. >> Oh, thanks for the tip. Do you want to push a commit that does that, >> or ... should I do it? > Just a thought: Would it prevent the magic path translation and all > just work if the path were already in Windows form? So, if we did > just this change at the top: > > -my $tempdir = PostgreSQL::Test::Utils::tempdir; > +my $tempdir = PostgreSQL::Test::Utils::perl2host(PostgreSQL::Test::Utils::tempdir); It's not as simple as that :-( But you're on the right track. My suggestion above doesn't work. The rule for paths is: when you're passing a path to an external program that's not msys aware (typically, one of our build artefacts like psql or pg_basebackup) it needs to be a native path. But when you're passing it to a perl function (e.g. mkdir) or to an external program that's msys aware it needs to be a virtual path, i.e. one not mangled by perl2host. Some recent commits to this file especially have not obeyed this rule. Here's a patch that does it consistently for the whole file. I have tested it on a system very like fairywren, and the test passes. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Attachment
On Sun, Jan 23, 2022 at 12:20 PM Andrew Dunstan <andrew@dunslane.net> wrote: > It's not as simple as that :-( But you're on the right track. My > suggestion above doesn't work. > > The rule for paths is: when you're passing a path to an external program > that's not msys aware (typically, one of our build artefacts like psql > or pg_basebackup) it needs to be a native path. But when you're passing > it to a perl function (e.g. mkdir) or to an external program that's msys > aware it needs to be a virtual path, i.e. one not mangled by perl2host. > > Some recent commits to this file especially have not obeyed this rule. > Here's a patch that does it consistently for the whole file. I have > tested it on a system very like fairywren, and the test passes. I can't understand how this would prevent server:/what/ever from getting turned into server;c:\what\ever. But if it does, great! Maybe we need to have a README in the tree somewhere that tries to explain this. Or maybe we should make our build artifacts msys-aware, if that's possible, so that this just works. Or maybe supporting msys is not worth the trouble. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > Maybe we need to have a README in the tree somewhere that tries to > explain this. Or maybe we should make our build artifacts msys-aware, > if that's possible, so that this just works. Or maybe supporting msys > is not worth the trouble. I've been wondering that last myself. Supporting Windows-native is already a huge amount of work, which we put up with because there are a lot of users. If msys is going to add another large chunk of work, has it got enough users to justify that? The recent argument that this behavior isn't user-visible doesn't do anything to mollify me on that point; it appears to me to be tantamount to a concession that no real users actually care about msys. regards, tom lane
On 1/23/22 15:07, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Maybe we need to have a README in the tree somewhere that tries to >> explain this. Or maybe we should make our build artifacts msys-aware, >> if that's possible, so that this just works. Or maybe supporting msys >> is not worth the trouble. > I've been wondering that last myself. Supporting Windows-native is > already a huge amount of work, which we put up with because there > are a lot of users. If msys is going to add another large chunk of > work, has it got enough users to justify that? > > The recent argument that this behavior isn't user-visible doesn't do > anything to mollify me on that point; it appears to me to be tantamount > to a concession that no real users actually care about msys. Msys is a unix-like environment that is useful to build Postgres. It's not intended as a general runtime environment. We therefore don't build msys-aware Postgres. We use msys to build standalone Postgres binaries that don't need or use any msys runtime. There is nothing in the least bit new about this - that's the way it's been since day one of the Windows port nearly 20 years ago. Speaking as someone who (for my sins) regularly deals with problems on Windows, I find msys much easier to deal with than VisualStudio, probably because it's so much like what I use elsewhere. So I think dropping msys support would be a serious mistake. The most common issues we get are around this issue of virtualized paths in the TAP tests. If people followed the rule I suggested upthread, 99% of those problems would go away. I realize it's annoying - I've been caught by it myself on more than one occasion. Maybe there's a way to avoid it, but if there is I'm unaware of it. But I don't think it's in any way a good reason to drop msys support. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > The most common issues we get are around this issue of virtualized paths > in the TAP tests. If people followed the rule I suggested upthread, 99% > of those problems would go away. I realize it's annoying - I've been > caught by it myself on more than one occasion. Maybe there's a way to > avoid it, but if there is I'm unaware of it. But I don't think it's in > any way a good reason to drop msys support. Well, let's go back to Robert's other suggestion: some actual documentation of these rules, in the source tree, might help people to follow them. src/test/perl/README seems like an appropriate place. regards, tom lane
Hi, On 2022-01-23 16:09:01 -0500, Andrew Dunstan wrote: > Msys is a unix-like environment that is useful to build Postgres. It's > not intended as a general runtime environment. We therefore don't build > msys-aware Postgres. We use msys to build standalone Postgres binaries > that don't need or use any msys runtime. There is nothing in the least > bit new about this - that's the way it's been since day one of the > Windows port nearly 20 years ago. > > Speaking as someone who (for my sins) regularly deals with problems on > Windows, I find msys much easier to deal with than VisualStudio, > probably because it's so much like what I use elsewhere. So I think > dropping msys support would be a serious mistake. I agree that msys support is useful (although configure is *so* slow that I find its usefullness reduced substantially). > The most common issues we get are around this issue of virtualized paths > in the TAP tests. If people followed the rule I suggested upthread, 99% > of those problems would go away. I realize it's annoying - I've been > caught by it myself on more than one occasion. Maybe there's a way to > avoid it, but if there is I'm unaware of it. But I don't think it's in > any way a good reason to drop msys support. Needing to sprinkle perl2host and MSYS2_ARG_CONV_EXCL over a good number of tests, getting weird errors when failing, etc IMO isn't a scalable approach, for a platform that most of use never use. Can't we solve this in a generic way? E.g. by insisting that the test run with a native perl and normalizing the few virtual paths we get invoked with centrally? Making the msys initial setup a bit more cumbersome would IMO be an OK price to pay for making it maintainable / predictable from other platforms, as long as we have some decent docs and decent error messages. Greetings, Andres Freund
On 1/23/22 16:31, Andres Freund wrote: >> The most common issues we get are around this issue of virtualized paths >> in the TAP tests. If people followed the rule I suggested upthread, 99% >> of those problems would go away. I realize it's annoying - I've been >> caught by it myself on more than one occasion. Maybe there's a way to >> avoid it, but if there is I'm unaware of it. But I don't think it's in >> any way a good reason to drop msys support. > Needing to sprinkle perl2host and MSYS2_ARG_CONV_EXCL over a good number of > tests, getting weird errors when failing, etc IMO isn't a scalable approach, > for a platform that most of use never use. > > Can't we solve this in a generic way? E.g. by insisting that the test run with > a native perl and normalizing the few virtual paths we get invoked with > centrally? Making the msys initial setup a bit more cumbersome would IMO be > an OK price to pay for making it maintainable / predictable from other > platforms, as long as we have some decent docs and decent error messages. > Nice idea. I have a suspicion that it's going to be harder than you think, but I'll be very happy to be proved wrong. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2022-01-23 17:38:26 -0500, Andrew Dunstan wrote: > Nice idea. I have a suspicion that it's going to be harder than you > think, but I'll be very happy to be proved wrong. FWIW, a manual invocation of the pg_basebackup tests works via a ucrt perl (ucrt64/mingw-w64-ucrt-x86_64-perl package). It did require installing IPC::Run via cpan though, or at least I think it did. That's a bit annoying. But not all that onerous if we document it? I did make sure that the ucrt perl deals with windows paths. For good measure I forced an early return in perl2host. I think we would have to modify the values for a few environment variables. Stuff like PG_REGRESS, TAR etc. Either where we specify them, or Utils.pm or such. Greetings, Andres Freund
On Sun, Jan 23, 2022 at 4:09 PM Andrew Dunstan <andrew@dunslane.net> wrote: > The most common issues we get are around this issue of virtualized paths > in the TAP tests. If people followed the rule I suggested upthread, 99% > of those problems would go away. Well, that makes it sound like it's the fault of people for not following the rules, but I don't think that's really a fair assessment. Even your first guess as to how to solve this particular problem wasn't correct, and if you can't guess right on the first try, I don't know how anyone else is supposed to do it. I still don't even understand why your first guess wasn't right. I feel like every time I try to add a TAP test Msys blows up, and I can't figure out how to fix it myself. Which is not a great feeling. -- Robert Haas EDB: http://www.enterprisedb.com
On 1/23/22 22:52, Robert Haas wrote: > On Sun, Jan 23, 2022 at 4:09 PM Andrew Dunstan <andrew@dunslane.net> wrote: >> The most common issues we get are around this issue of virtualized paths >> in the TAP tests. If people followed the rule I suggested upthread, 99% >> of those problems would go away. > Well, that makes it sound like it's the fault of people for not > following the rules, but I don't think that's really a fair > assessment. Even your first guess as to how to solve this particular > problem wasn't correct, and if you can't guess right on the first try, > I don't know how anyone else is supposed to do it. I still don't even > understand why your first guess wasn't right. I feel like every time I > try to add a TAP test Msys blows up, and I can't figure out how to fix > it myself. Which is not a great feeling. > Well if we can get Andres' suggestion to work all this might go away, which would keep everyone happy, especially me. You're right that I was a little careless upthread. Mea culpa. Meanwhile I am committing a minimal one line fix. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Mon, Jan 24, 2022 at 2:01 PM Andrew Dunstan <andrew@dunslane.net> wrote: > Well if we can get Andres' suggestion to work all this might go away, > which would keep everyone happy, especially me. You're right that I was > a little careless upthread. Mea culpa. Meanwhile I am committing a > minimal one line fix. I in no way intended to accuse you of being careless. I was just pointing out that this stuff seems to be hard to get right, even for smart people. I really hate committing stuff that turns out to be broken. It's such a fire drill when the build farm turns red. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Jan 24, 2022 at 2:27 PM Robert Haas <robertmhaas@gmail.com> wrote: > I really hate committing stuff that turns out to be broken. It's such > a fire drill when the build farm turns red. And there's a good chance it's about to break again, because I just committed the next patch in the series which, shockingly, also includes tests. I'd like to tell you I believe I got it right this time ... but I don't. -- Robert Haas EDB: http://www.enterprisedb.com
On 1/24/22 15:17, Robert Haas wrote: > On Mon, Jan 24, 2022 at 2:27 PM Robert Haas <robertmhaas@gmail.com> wrote: >> I really hate committing stuff that turns out to be broken. It's such >> a fire drill when the build farm turns red. > And there's a good chance it's about to break again, because I just > committed the next patch in the series which, shockingly, also > includes tests. > > I'd like to tell you I believe I got it right this time ... but I don't. I'll just keep playing whackamole :-) cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2022-01-24 14:01:37 -0500, Andrew Dunstan wrote: > Well if we can get Andres' suggestion to work all this might go away, > which would keep everyone happy, especially me. I successfully tried it for a few tests. But I see tests hanging a lot independent of the way I run the tests, presumably due to the issues discussed in [1]. So we need to do something about that. I don't have the cycles to finish changing over to that way of running tests - do you have some time to work on it, if I clean up the bit I have? - Andres [1] https://postgr.es/m/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com
On 1/24/22 16:39, Andres Freund wrote: > Hi, > > On 2022-01-24 14:01:37 -0500, Andrew Dunstan wrote: >> Well if we can get Andres' suggestion to work all this might go away, >> which would keep everyone happy, especially me. > I successfully tried it for a few tests. But I see tests hanging a lot > independent of the way I run the tests, presumably due to the issues discussed > in [1]. So we need to do something about that. > > I don't have the cycles to finish changing over to that way of running tests - > do you have some time to work on it, if I clean up the bit I have? > > - Andres > > [1] https://postgr.es/m/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com Give me what you can and I'll see what I can do. I have a couple of moderately high priority items on my plate, but I will probably be able to fit in some testing when those make my eyes completely glaze over. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2022-01-24 16:47:28 -0500, Andrew Dunstan wrote: > Give me what you can and I'll see what I can do. I have a couple of > moderately high priority items on my plate, but I will probably be able > to fit in some testing when those make my eyes completely glaze over. Steps: # install msys from https://www.msys2.org/ # install dependencies in msys shell pacman -S git bison flex make ucrt64/mingw-w64-ucrt-x86_64-perl ucrt64/mingw-w64-ucrt-x86_64-gcc ucrt64/mingw-w64-ucrt-x86_64-zlibucrt64/mingw-w64-ucrt-x86_64-ccache diffutils # start mingw ucrt64 x64 shell cpan install -T IPC::Run perl -MIPC::Run -e 1 # verify ipc run is installed cd /c/dev # I added --reference postgres to accelerate the cloning git clone https://git.postgresql.org/git/postgresql.git postgres-mingw cd /c/dev/postgres-mingw git revert ed52c3707bcf8858defb0d9de4b55f5c7f18fed7 git revert 6051857fc953a62db318329c4ceec5f9668fd42a # apply attached patch # see below why out-of-tree is easier or now mkdir build cd build # path parameters probably not necessary, I thought I needed them earlier, not sure why ../configure --without-readline --cache cache --enable-tap-tests PROVE=/ucrt64/bin/core_perl/prove PERL=/ucrt64/bin/perl.exeCC="ccache gcc" make -j8 -s world-bin && make -j8 -s -C src/interfaces/ecpg/test make -j8 -s temp-install # pg_regress' make_temp_socketdir() otherwise picks up the wrong TMPDIR mkdir /c/dev/postgres-mingw/build/tmp # the TAR= ensures that tests pick up a tar accessible with a windows path # PG_TEST_USE_UNIX_SOCKETS=1 is required for test concurrency, otherwise there are port conflicts (make -Otarget -j12 check-world NO_TEMP_INSTALL=1 PG_TEST_USE_UNIX_SOCKETS=1 TMPDIR=C:/dev/postgres-mingw/tmp TAR="C:\Windows\System32\tar.exe"2>&1 && echo test-world-success || echo test-world-fail) 2>&1 |tee test-world.log To make tests in "in-tree" builds work, a bit more hackery would be needed. The problem is that windows chooses binaries from the current working directory *before* PATH. That's a problem for things like initdb.exe or pg_ctl.exe that want to find postgres.exe, as that only works with the program in their proper location, rather than CWD. Greetings, Andres Freund
Attachment
On 1/24/22 21:36, Andres Freund wrote: > Hi, > > On 2022-01-24 16:47:28 -0500, Andrew Dunstan wrote: >> Give me what you can and I'll see what I can do. I have a couple of >> moderately high priority items on my plate, but I will probably be able >> to fit in some testing when those make my eyes completely glaze over. > Steps: > > # install msys from https://www.msys2.org/ > # install dependencies in msys shell > pacman -S git bison flex make ucrt64/mingw-w64-ucrt-x86_64-perl ucrt64/mingw-w64-ucrt-x86_64-gcc ucrt64/mingw-w64-ucrt-x86_64-zlibucrt64/mingw-w64-ucrt-x86_64-ccache diffutils > > > # start mingw ucrt64 x64 shell > cpan install -T IPC::Run > perl -MIPC::Run -e 1 # verify ipc run is installed > > cd /c/dev > # I added --reference postgres to accelerate the cloning > git clone https://git.postgresql.org/git/postgresql.git postgres-mingw > cd /c/dev/postgres-mingw > > git revert ed52c3707bcf8858defb0d9de4b55f5c7f18fed7 > git revert 6051857fc953a62db318329c4ceec5f9668fd42a > > # apply attached patch > > # see below why out-of-tree is easier or now > mkdir build > cd build > # path parameters probably not necessary, I thought I needed them earlier, not sure why > ../configure --without-readline --cache cache --enable-tap-tests PROVE=/ucrt64/bin/core_perl/prove PERL=/ucrt64/bin/perl.exeCC="ccache gcc" > make -j8 -s world-bin && make -j8 -s -C src/interfaces/ecpg/test > make -j8 -s temp-install > > # pg_regress' make_temp_socketdir() otherwise picks up the wrong TMPDIR > mkdir /c/dev/postgres-mingw/build/tmp > > # the TAR= ensures that tests pick up a tar accessible with a windows path > # PG_TEST_USE_UNIX_SOCKETS=1 is required for test concurrency, otherwise there are port conflicts > > (make -Otarget -j12 check-world NO_TEMP_INSTALL=1 PG_TEST_USE_UNIX_SOCKETS=1 TMPDIR=C:/dev/postgres-mingw/tmp TAR="C:\Windows\System32\tar.exe"2>&1 && echo test-world-success || echo test-world-fail) 2>&1 |tee test-world.log OK, I have all the pieces working and I know what I need to do to adapt fairywren. The patch you provided is not necessary any more. (I think your TMPDIR spec is missing a /build/) The recipe worked (mutatis mutandis) for the mingw64 toolchain as well as for the ucrt64 toolchain. Is there a reason to prefer ucrt64? I think the next steps are: * do those two reverts * adjust fairywren * get rid of perl2host At that stage jacana will no longer be able to run TAP tests. I can do one of these: * disable the TAP tests on jacana * migrate jacana to msys2 * kiss jacana goodbye. Thoughts? > To make tests in "in-tree" builds work, a bit more hackery would be > needed. The problem is that windows chooses binaries from the current working > directory *before* PATH. That's a problem for things like initdb.exe or > pg_ctl.exe that want to find postgres.exe, as that only works with the program > in their proper location, rather than CWD. > Yeah, we should do something about that. For example, we could possibly use the new install_path option of PostgreSQL::Test::Cluster::new() so it would find these in the right location. However, I don't need it as my animals all use vpath builds. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2022-02-03 17:25:51 -0500, Andrew Dunstan wrote: > OK, I have all the pieces working and I know what I need to do to adapt > fairywren. The patch you provided is not necessary any more. Cool. Are you going to post that? > (I think your TMPDIR spec is missing a /build/) I think I went back/forth between in-tree/out-of-tree build... > The recipe worked (mutatis mutandis) for the mingw64 toolchain as well > as for the ucrt64 toolchain. Is there a reason to prefer ucrt64? There's a lot of oddities in the mingw64 target, due to targetting the much older C runtime library (lots of bugs, missing functionality). MSVC targets UCRT by default for quite a few years by now. Targetting msvcrt is basically on its way out from what I understand. > I think the next steps are: > > * do those two reverts > * adjust fairywren > * get rid of perl2host > > At that stage jacana will no longer be able to run TAP tests. I can do > one of these: I guess because its install is too old? > * disable the TAP tests on jacana > * migrate jacana to msys2 > * kiss jacana goodbye. Having a non-server mingw animal seems like it could be useful (I think that's just Jacana), even if server / client versions of windows have grown closer. So I think an update to msys2 makes the most sense? > > To make tests in "in-tree" builds work, a bit more hackery would be > > needed. The problem is that windows chooses binaries from the current working > > directory *before* PATH. That's a problem for things like initdb.exe or > > pg_ctl.exe that want to find postgres.exe, as that only works with the program > > in their proper location, rather than CWD. > Yeah, we should do something about that. For example, we could possibly > use the new install_path option of PostgreSQL::Test::Cluster::new() so > it would find these in the right location. It'd be easy enough to adjust the central invocations of initdb. I think the bigger problem is that there's plenty calls to initdb, pg_ctl "directly" in the respective test scripts. > However, I don't need it as my animals all use vpath builds. I think it'd be fine to just error out in non-vpath builds on msvc. The search-for-binaries behaviour is just too weird. Greetings, Andres Freund
On 2/3/22 20:51, Andres Freund wrote: > Hi, > > On 2022-02-03 17:25:51 -0500, Andrew Dunstan wrote: >> OK, I have all the pieces working and I know what I need to do to adapt >> fairywren. The patch you provided is not necessary any more. > Cool. Are you going to post that? About the only thing missing in your recipe is this: # force ucrt64 prove to use the ucrt64 perl rather than whatever is in the path sed -i 's,^#!perl,#!/ucrt64/bin/perl,' /ucrt64/bin/core_perl/prove Given that, you don't need to set PERL, and configure can find the perl to build against from the PATH. > > > Is there a reason to prefer ucrt64? > There's a lot of oddities in the mingw64 target, due to targetting the much > older C runtime library (lots of bugs, missing functionality). MSVC targets > UCRT by default for quite a few years by now. Targetting msvcrt is basically > on its way out from what I understand. OK. >> I think the next steps are: >> >> * do those two reverts >> * adjust fairywren >> * get rid of perl2host >> >> At that stage jacana will no longer be able to run TAP tests. I can do >> one of these: > I guess because its install is too old? Yeah. fairywren is now running with ucrt64-perl for TAP tests. >> * disable the TAP tests on jacana >> * migrate jacana to msys2 >> * kiss jacana goodbye. > Having a non-server mingw animal seems like it could be useful (I think that's > just Jacana), even if server / client versions of windows have grown > closer. So I think an update to msys2 makes the most sense? Working on that. There appear to be some issues with third party libraries. I might need to rebuild libxml2 and zlib for example. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote: > Working on that. There appear to be some issues with third party > libraries. I might need to rebuild libxml2 and zlib for example. Any reason not to use the ones from msys2?
Hi, On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote: > About the only thing missing in your recipe is this: Re requiring out-of-tree builds: Thomas on IM noted that there's the NoDefaultCurrentDirectoryInExePath environment variable. That should avoid the problem leading to requiring out-of-tree builds. But I've not tested it. > # force ucrt64 prove to use the ucrt64 perl rather than whatever is in > the path > sed -i 's,^#!perl,#!/ucrt64/bin/perl,' /ucrt64/bin/core_perl/prove > > > Given that, you don't need to set PERL, and configure can find the perl > to build against from the PATH. That shouldn't even be needed from what I understand now. If correctly started the msys shell shoul dhave the right perl in path? Greetings, Andres Freund
On 2/14/22 18:02, Andres Freund wrote: > Hi, > > On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote: >> About the only thing missing in your recipe is this: > Re requiring out-of-tree builds: Thomas on IM noted that there's the > NoDefaultCurrentDirectoryInExePath environment variable. That should avoid the > problem leading to requiring out-of-tree builds. But I've not tested it. Good to know. > > >> # force ucrt64 prove to use the ucrt64 perl rather than whatever is in >> the path >> sed -i 's,^#!perl,#!/ucrt64/bin/perl,' /ucrt64/bin/core_perl/prove >> >> >> Given that, you don't need to set PERL, and configure can find the perl >> to build against from the PATH. > That shouldn't even be needed from what I understand now. If correctly started > the msys shell shoul dhave the right perl in path? FSVO "the right perl". However, jacana is building against a separate installation of AS perl, and I was trying to preserve that. For a buildfarm animal, there is one extra package that is needed: perl-LWP-Protocol-https cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2/14/22 17:37, Andres Freund wrote: > On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote: >> Working on that. There appear to be some issues with third party >> libraries. I might need to rebuild libxml2 and zlib for example. > Any reason not to use the ones from msys2? That seems to work. Needed to add these 2 packages to the recipe: ucrt64/mingw-w64-ucrt-x86_64-libxml2 ucrt64/mingw-w64-ucrt-x86_64-libxslt cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com