Thread: fairywren is generating bogus BASE_BACKUP commands

fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
Thomas Munro pointed out this failure to me on fairywren:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2022-01-21%2020%3A10%3A22

He theorizes that I need some perl2host magic in there, which may well
be true. But I also noticed this:

# Running: pg_basebackup --no-sync -cfast --target
server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver
-X none
pg_basebackup: error: could not initiate base backup: ERROR:
unrecognized target: "server;C"

"server" is a valid backup target, but "server;C" is not. And I think
this must be a bug on the client side, because the server logs the
generated query:

2022-01-21 20:53:11.618 UTC [8404:10] 010_pg_basebackup.pl LOG:
received replication command: BASE_BACKUP ( LABEL 'pg_basebackup base
backup',  PROGRESS,  CHECKPOINT 'fast',  MANIFEST 'yes',
TABLESPACE_MAP,  TARGET 'server;C',  TARGET_DETAIL

'\\tools\\msys64\\home\\pgrunner\\bf\\root\\HEAD\\pgsql.build\\src\\bin\\pg_basebackup\\tmp_check\\tmp_test_Ag8r\\backuponserver')

So it's not that the server parses the query and introduces gibberish
-- the client has already introduced gibberish when constructing the
query. But the code to do that is pretty straightforward -- we just
call strchr to find the colon in the backup target, and then
pnstrdup() the part before the colon and use the latter part as-is. If
pnstrdup were failing to add a terminating \0 then this would be quite
plausible, but I think it shouldn't. Unless the operating sytem's
strnlen() is buggy? That seems like a stretch, so feel free to tell me
what obvious stupid thing I did here and am not seeing...

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> "server" is a valid backup target, but "server;C" is not. And I think
> this must be a bug on the client side, because the server logs the
> generated query:

> 2022-01-21 20:53:11.618 UTC [8404:10] 010_pg_basebackup.pl LOG:
> received replication command: BASE_BACKUP ( LABEL 'pg_basebackup base
> backup',  PROGRESS,  CHECKPOINT 'fast',  MANIFEST 'yes',
> TABLESPACE_MAP,  TARGET 'server;C',  TARGET_DETAIL
>
'\\tools\\msys64\\home\\pgrunner\\bf\\root\\HEAD\\pgsql.build\\src\\bin\\pg_basebackup\\tmp_check\\tmp_test_Ag8r\\backuponserver')

> So it's not that the server parses the query and introduces gibberish
> -- the client has already introduced gibberish when constructing the
> query. But the code to do that is pretty straightforward -- we just
> call strchr to find the colon in the backup target, and then
> pnstrdup() the part before the colon and use the latter part as-is. If
> pnstrdup were failing to add a terminating \0 then this would be quite
> plausible, but I think it shouldn't. Unless the operating sytem's
> strnlen() is buggy? That seems like a stretch, so feel free to tell me
> what obvious stupid thing I did here and am not seeing...

I think the backup_target string was already corrupted that way when
pg_basebackup absorbed it from optarg.  It's pretty hard to believe that
the strchr/pnstrdup stanza got it wrong.  However, comparing the
TARGET_DETAIL to what the TAP test says it issued:

# Running: pg_basebackup --no-sync -cfast --target
server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver-X none 
pg_basebackup: error: could not initiate base backup: ERROR:  unrecognized target: "server;C"

it's absolutely clear that something decided to munge the target string.
Given that colon is reserved in Windows filename syntax, it's not
so surprising if it munged it wrong, or at least more aggressively
than you expected.

I kinda wonder if this notation for the target was well-chosen.
Keeping the file name strictly separate from the "type" keyword
might be a wiser move.  Quite aside from Windows-isms, there
are going to be usages where this is hard to tell from a URL.
(If memory serves, double leading slash is significant to some
networked file systems.)

While we're on the subject of ill-chosen option syntax: "-cfast"
with non double dashes?  Really?  That's horribly ambiguous.
Most programs would parse something like that as five single-letter
options, and most users who know Unix would expect it to mean that.

            regards, tom lane



Re: fairywren is generating bogus BASE_BACKUP commands

From
Thomas Munro
Date:
On Sat, Jan 22, 2022 at 10:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
> # Running: pg_basebackup --no-sync -cfast --target
> server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver
> -X none
> pg_basebackup: error: could not initiate base backup: ERROR:
> unrecognized target: "server;C"
>
> "server" is a valid backup target, but "server;C" is not. And I think
> this must be a bug on the client side, because the server logs the
> generated query:

It looks a bit like msys perl could be recognising
"server:/home/pgrunner/..." and converting it to
"server;C:\tools\msys64\home\pgrunner\...".  From a bit of light
googling I see that such conversions happen in msys perl's system()
unless you turn them off with MSYS_NO_PATHCONV, and then we'd have to
do it ourselves in the right places.



Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Fri, Jan 21, 2022 at 5:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think the backup_target string was already corrupted that way when
> pg_basebackup absorbed it from optarg.  It's pretty hard to believe that
> the strchr/pnstrdup stanza got it wrong.  However, comparing the
> TARGET_DETAIL to what the TAP test says it issued:
>
> # Running: pg_basebackup --no-sync -cfast --target
server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver-X none
 
> pg_basebackup: error: could not initiate base backup: ERROR:  unrecognized target: "server;C"
>
> it's absolutely clear that something decided to munge the target string.
> Given that colon is reserved in Windows filename syntax, it's not
> so surprising if it munged it wrong, or at least more aggressively
> than you expected.

Nothing in the Perl code tells it that the particular argument in
question is a path rather than anything else, so it must be applying
some heuristic to decide whether to munge it. That's horrible.

> I kinda wonder if this notation for the target was well-chosen.
> Keeping the file name strictly separate from the "type" keyword
> might be a wiser move.  Quite aside from Windows-isms, there
> are going to be usages where this is hard to tell from a URL.
> (If memory serves, double leading slash is significant to some
> networked file systems.)

Maybe. I think it's important that the notation is not ridiculously
verbose, and -t server --target-detail $PATH is a LOT more typing.

> While we're on the subject of ill-chosen option syntax: "-cfast"
> with non double dashes?  Really?  That's horribly ambiguous.
> Most programs would parse something like that as five single-letter
> options, and most users who know Unix would expect it to mean that.

I'm not sure whether you're complaining that we accept that syntax or
using it, but AFAIK I'm responsible for neither. I think the syntax
has been accepted since pg_basebackup was added in 2011, and Andres
added it to this test case earlier this week (with -cfast in the
subject line of the commit message). FWIW, though, I've been aware of
that syntax for a long time and never thought it was a problem. I
usually spell the option in exactly that way when I use it, and I'm
relatively sure that things I've given to customers would break if we
removed support for it. I don't know how we'd do that anyway, since
all that's happening here is a call to getopt_long().

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/21/22 17:10, Thomas Munro wrote:
> On Sat, Jan 22, 2022 at 10:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> # Running: pg_basebackup --no-sync -cfast --target
>> server:/home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/tmp_test_Ag8r/backuponserver
>> -X none
>> pg_basebackup: error: could not initiate base backup: ERROR:
>> unrecognized target: "server;C"
>>
>> "server" is a valid backup target, but "server;C" is not. And I think
>> this must be a bug on the client side, because the server logs the
>> generated query:
> It looks a bit like msys perl could be recognising
> "server:/home/pgrunner/..." and converting it to
> "server;C:\tools\msys64\home\pgrunner\...".  From a bit of light
> googling I see that such conversions happen in msys perl's system()
> unless you turn them off with MSYS_NO_PATHCONV, and then we'd have to
> do it ourselves in the right places.



c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says:


    my $source_ts_prefix = $source_ts_path;
    $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!;
    ...

    # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
    local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;


Probably in this case just setting it to 'server:' would do the trick.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jan 21, 2022 at 5:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> While we're on the subject of ill-chosen option syntax: "-cfast"
>> with non double dashes?  Really?  That's horribly ambiguous.

> I'm not sure whether you're complaining that we accept that syntax or
> using it, but AFAIK I'm responsible for neither. I think the syntax
> has been accepted since pg_basebackup was added in 2011, and Andres
> added it to this test case earlier this week (with -cfast in the
> subject line of the commit message).

pg_basebackup's help defines the syntax as

  -c, --checkpoint=fast|spread
                         set fast or spread checkpointing

which I'd read as requiring a space (or possibly equal sign)
between "-c" and "fast".  If it works as written in this test,
that's an accident of the particular getopt implementation,
and I'll bet it won't be too long before we come across
a getopt that doesn't like it.

            regards, tom lane



Re: fairywren is generating bogus BASE_BACKUP commands

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says:
>     my $source_ts_prefix = $source_ts_path;
>     $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!;
>     ...

>     # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
>     local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;

> Probably in this case just setting it to 'server:' would do the trick.

The point I was trying to make is that if we have to jump through
that sort of hoop in the test scripts, then real users are going
to have to jump through it as well, and they won't like that
(and we will get bug reports about it).  It'd be better to design
the option syntax to avoid such requirements.

            regards, tom lane



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-01-21 17:42:45 -0500, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says:
> >     my $source_ts_prefix = $source_ts_path;
> >     $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!;
> >     ...
>
> >     # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
> >     local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;
>
> > Probably in this case just setting it to 'server:' would do the trick.
>
> The point I was trying to make is that if we have to jump through
> that sort of hoop in the test scripts, then real users are going
> to have to jump through it as well, and they won't like that
> (and we will get bug reports about it).  It'd be better to design
> the option syntax to avoid such requirements.

Normal users aren't going to invoke a "native" basebackup from inside msys. I
assume the translation happens because an "msys world" perl invokes
a "native" pg_basebackup via msys system(), right?  If pg_basebackup instead is
"normally" invoked from a windows terminal, or anything else "native" windows,
the problem won't exist, no?

As we're building a "native" postgres in this case, none of our tools should
internally have such translations happening. So I don't think it'll be a huge
issue for users themselves?

Not that I think that there are all that many users of mingw built postgres on
windows... I think it's mostly msvc built postgres in that world?

Greetings,

Andres Freund



Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Fri, Jan 21, 2022 at 5:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The point I was trying to make is that if we have to jump through
> that sort of hoop in the test scripts, then real users are going
> to have to jump through it as well, and they won't like that
> (and we will get bug reports about it).  It'd be better to design
> the option syntax to avoid such requirements.

Well, as Andrew points out, pg_basebackup's -T foo=bar syntax causes
the same issue, so you'll need to redesign that, too. But even that is
not really a proper fix. The real root of this problem is that the
operating system's notion of a valid path differs from PostgreSQL's
notion of a valid path on this platform, and I imagine that fixing
that is a rather large project.

ISTM that you're basically just complaining about options syntax that
you don't like, but I think there's nothing particularly worse about
this syntax than lots of other things we type all the time. psql -v
VAR=VALUE -P OTHERKINDOFVAR=OTHERVALUE? curl -u USER:PASSWORD?
pg_basebackup -T OLD=NEW? perl -d[t]:MODULE=OPT,OPT? I mean, that last
one actually seems kinda horrible and if this were as bad as that I'd
say yeah, it should be redesigned. But I don't think it is. There's
plenty of precedent for bundling closely-related values into a single
command-line option, which is all I've done here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-01-21 17:26:32 -0500, Robert Haas wrote:
> I think the syntax has been accepted since pg_basebackup was added in 2011,
> and Andres added it to this test case earlier this week (with -cfast in the
> subject line of the commit message).

The reason I used -cfast instead of -c fast or --checkpoint=fast is that the
way perltidy formats leads to very wide lines already, and making them even
longer seemed unattractive...

Given the -cfast syntax successfully passed tests on at least AIX, freebsd,
linux, macos, netbsd, openbsd, windows msvc, windows msys, I'm not too worried
about portability either.

Greetings,

Andres Freund



Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Fri, Jan 21, 2022 at 5:35 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>     # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
>     local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;
> Probably in this case just setting it to 'server:' would do the trick.

Oh, thanks for the tip. Do you want to push a commit that does that,
or ... should I do it?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Thomas Munro
Date:
On Sat, Jan 22, 2022 at 3:55 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jan 21, 2022 at 5:35 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> >     # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
> >     local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;
> > Probably in this case just setting it to 'server:' would do the trick.
>
> Oh, thanks for the tip. Do you want to push a commit that does that,
> or ... should I do it?

Just a thought:  Would it prevent the magic path translation and all
just work if the path were already in Windows form?  So, if we did
just this change at the top:

-my $tempdir = PostgreSQL::Test::Utils::tempdir;
+my $tempdir = PostgreSQL::Test::Utils::perl2host(PostgreSQL::Test::Utils::tempdir);



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/21/22 18:04, Andres Freund wrote:
> Hi,
>
> On 2022-01-21 17:42:45 -0500, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> c.f. src/bin/pg_verifybackup/t/003_corruption.pl which says:
>>>     my $source_ts_prefix = $source_ts_path;
>>>     $source_ts_prefix =~ s!(^[A-Z]:/[^/]*)/.*!$1!;
>>>     ...
>>>     # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
>>>     local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;
>>> Probably in this case just setting it to 'server:' would do the trick.
>> The point I was trying to make is that if we have to jump through
>> that sort of hoop in the test scripts, then real users are going
>> to have to jump through it as well, and they won't like that
>> (and we will get bug reports about it).  It'd be better to design
>> the option syntax to avoid such requirements.
> Normal users aren't going to invoke a "native" basebackup from inside msys. I
> assume the translation happens because an "msys world" perl invokes
> a "native" pg_basebackup via msys system(), right?  If pg_basebackup instead is
> "normally" invoked from a windows terminal, or anything else "native" windows,
> the problem won't exist, no?
>
> As we're building a "native" postgres in this case, none of our tools should
> internally have such translations happening. So I don't think it'll be a huge
> issue for users themselves?


All true. This is purely an issue for our testing regime, and not for
end users.


>
> Not that I think that there are all that many users of mingw built postgres on
> windows... I think it's mostly msvc built postgres in that world?
>

The vast majority use the installer which is built with MSVC. Very few
in my experience build their own.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/21/22 22:43, Thomas Munro wrote:
> On Sat, Jan 22, 2022 at 3:55 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Jan 21, 2022 at 5:35 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>>     # See https://www.msys2.org/wiki/Porting/#filesystem-namespaces
>>>     local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix;
>>> Probably in this case just setting it to 'server:' would do the trick.
>> Oh, thanks for the tip. Do you want to push a commit that does that,
>> or ... should I do it?
> Just a thought:  Would it prevent the magic path translation and all
> just work if the path were already in Windows form?  So, if we did
> just this change at the top:
>
> -my $tempdir = PostgreSQL::Test::Utils::tempdir;
> +my $tempdir = PostgreSQL::Test::Utils::perl2host(PostgreSQL::Test::Utils::tempdir);


It's not as simple as that :-(  But you're on the right track. My
suggestion above doesn't work.

The rule for paths is: when you're passing a path to an external program
that's not msys aware (typically, one of our build artefacts like psql
or pg_basebackup) it needs to be a native path. But when you're passing
it to a perl function (e.g. mkdir) or to an external program that's msys
aware it needs to be a virtual path, i.e. one not mangled by perl2host.

Some recent commits to this file especially have not obeyed this rule.
Here's a patch that does it consistently for the whole file. I have
tested it on a system very like fairywren, and the test passes.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Sun, Jan 23, 2022 at 12:20 PM Andrew Dunstan <andrew@dunslane.net> wrote:

> It's not as simple as that :-(  But you're on the right track. My
> suggestion above doesn't work.
>
> The rule for paths is: when you're passing a path to an external program
> that's not msys aware (typically, one of our build artefacts like psql
> or pg_basebackup) it needs to be a native path. But when you're passing
> it to a perl function (e.g. mkdir) or to an external program that's msys
> aware it needs to be a virtual path, i.e. one not mangled by perl2host.
>
> Some recent commits to this file especially have not obeyed this rule.
> Here's a patch that does it consistently for the whole file. I have
> tested it on a system very like fairywren, and the test passes.

I can't understand how this would prevent server:/what/ever from
getting turned into server;c:\what\ever. But if it does, great!

Maybe we need to have a README in the tree somewhere that tries to
explain this. Or maybe we should make our build artifacts msys-aware,
if that's possible, so that this just works. Or maybe supporting msys
is not worth the trouble.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Maybe we need to have a README in the tree somewhere that tries to
> explain this. Or maybe we should make our build artifacts msys-aware,
> if that's possible, so that this just works. Or maybe supporting msys
> is not worth the trouble.

I've been wondering that last myself.  Supporting Windows-native is
already a huge amount of work, which we put up with because there
are a lot of users.  If msys is going to add another large chunk of
work, has it got enough users to justify that?

The recent argument that this behavior isn't user-visible doesn't do
anything to mollify me on that point; it appears to me to be tantamount
to a concession that no real users actually care about msys.

            regards, tom lane



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/23/22 15:07, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Maybe we need to have a README in the tree somewhere that tries to
>> explain this. Or maybe we should make our build artifacts msys-aware,
>> if that's possible, so that this just works. Or maybe supporting msys
>> is not worth the trouble.
> I've been wondering that last myself.  Supporting Windows-native is
> already a huge amount of work, which we put up with because there
> are a lot of users.  If msys is going to add another large chunk of
> work, has it got enough users to justify that?
>
> The recent argument that this behavior isn't user-visible doesn't do
> anything to mollify me on that point; it appears to me to be tantamount
> to a concession that no real users actually care about msys.


Msys is a unix-like environment that is useful to build Postgres. It's
not intended as a general runtime environment. We therefore don't build
msys-aware Postgres. We use msys to build standalone Postgres binaries
that don't need or use any msys runtime. There is nothing in the least
bit new about this - that's the way it's been since day one of the
Windows port nearly 20 years ago.

Speaking as someone who (for my sins) regularly deals with problems on
Windows, I find msys much easier to deal with than VisualStudio,
probably because it's so much like what I use elsewhere. So I think
dropping msys support would be a serious mistake.

The most common issues we get are around this issue of virtualized paths
in the TAP tests. If people followed the rule I suggested upthread, 99%
of those problems would go away. I realize it's annoying - I've been
caught by it myself on more than one occasion. Maybe there's a way to
avoid it, but if there is I'm unaware of it. But I don't think it's in
any way a good reason to drop msys support.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> The most common issues we get are around this issue of virtualized paths
> in the TAP tests. If people followed the rule I suggested upthread, 99%
> of those problems would go away. I realize it's annoying - I've been
> caught by it myself on more than one occasion. Maybe there's a way to
> avoid it, but if there is I'm unaware of it. But I don't think it's in
> any way a good reason to drop msys support.

Well, let's go back to Robert's other suggestion: some actual
documentation of these rules, in the source tree, might help
people to follow them.  src/test/perl/README seems like an
appropriate place.

            regards, tom lane



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-01-23 16:09:01 -0500, Andrew Dunstan wrote:
> Msys is a unix-like environment that is useful to build Postgres. It's
> not intended as a general runtime environment. We therefore don't build
> msys-aware Postgres. We use msys to build standalone Postgres binaries
> that don't need or use any msys runtime. There is nothing in the least
> bit new about this - that's the way it's been since day one of the
> Windows port nearly 20 years ago.
>
> Speaking as someone who (for my sins) regularly deals with problems on
> Windows, I find msys much easier to deal with than VisualStudio,
> probably because it's so much like what I use elsewhere. So I think
> dropping msys support would be a serious mistake.

I agree that msys support is useful (although configure is *so* slow that I
find its usefullness reduced substantially).


> The most common issues we get are around this issue of virtualized paths
> in the TAP tests. If people followed the rule I suggested upthread, 99%
> of those problems would go away. I realize it's annoying - I've been
> caught by it myself on more than one occasion. Maybe there's a way to
> avoid it, but if there is I'm unaware of it. But I don't think it's in
> any way a good reason to drop msys support.

Needing to sprinkle perl2host and MSYS2_ARG_CONV_EXCL over a good number of
tests, getting weird errors when failing, etc IMO isn't a scalable approach,
for a platform that most of use never use.

Can't we solve this in a generic way? E.g. by insisting that the test run with
a native perl and normalizing the few virtual paths we get invoked with
centrally?  Making the msys initial setup a bit more cumbersome would IMO be
an OK price to pay for making it maintainable / predictable from other
platforms, as long as we have some decent docs and decent error messages.

Greetings,

Andres Freund



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/23/22 16:31, Andres Freund wrote:
>> The most common issues we get are around this issue of virtualized paths
>> in the TAP tests. If people followed the rule I suggested upthread, 99%
>> of those problems would go away. I realize it's annoying - I've been
>> caught by it myself on more than one occasion. Maybe there's a way to
>> avoid it, but if there is I'm unaware of it. But I don't think it's in
>> any way a good reason to drop msys support.
> Needing to sprinkle perl2host and MSYS2_ARG_CONV_EXCL over a good number of
> tests, getting weird errors when failing, etc IMO isn't a scalable approach,
> for a platform that most of use never use.
>
> Can't we solve this in a generic way? E.g. by insisting that the test run with
> a native perl and normalizing the few virtual paths we get invoked with
> centrally?  Making the msys initial setup a bit more cumbersome would IMO be
> an OK price to pay for making it maintainable / predictable from other
> platforms, as long as we have some decent docs and decent error messages.
>

Nice idea. I have a suspicion that it's going to be harder than you
think, but I'll be very happy to be proved wrong.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-01-23 17:38:26 -0500, Andrew Dunstan wrote:
> Nice idea. I have a suspicion that it's going to be harder than you
> think, but I'll be very happy to be proved wrong.

FWIW, a manual invocation of the pg_basebackup tests works via a ucrt perl
(ucrt64/mingw-w64-ucrt-x86_64-perl package).

It did require installing IPC::Run via cpan though, or at least I think it
did. That's a bit annoying. But not all that onerous if we document it?

I did make sure that the ucrt perl deals with windows paths. For good measure
I forced an early return in perl2host.


I think we would have to modify the values for a few environment
variables. Stuff like PG_REGRESS, TAR etc. Either where we specify them, or
Utils.pm or such.

Greetings,

Andres Freund



Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Sun, Jan 23, 2022 at 4:09 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> The most common issues we get are around this issue of virtualized paths
> in the TAP tests. If people followed the rule I suggested upthread, 99%
> of those problems would go away.

Well, that makes it sound like it's the fault of people for not
following the rules, but I don't think that's really a fair
assessment. Even your first guess as to how to solve this particular
problem wasn't correct, and if you can't guess right on the first try,
I don't know how anyone else is supposed to do it. I still don't even
understand why your first guess wasn't right. I feel like every time I
try to add a TAP test Msys blows up, and I can't figure out how to fix
it myself. Which is not a great feeling.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/23/22 22:52, Robert Haas wrote:
> On Sun, Jan 23, 2022 at 4:09 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>> The most common issues we get are around this issue of virtualized paths
>> in the TAP tests. If people followed the rule I suggested upthread, 99%
>> of those problems would go away.
> Well, that makes it sound like it's the fault of people for not
> following the rules, but I don't think that's really a fair
> assessment. Even your first guess as to how to solve this particular
> problem wasn't correct, and if you can't guess right on the first try,
> I don't know how anyone else is supposed to do it. I still don't even
> understand why your first guess wasn't right. I feel like every time I
> try to add a TAP test Msys blows up, and I can't figure out how to fix
> it myself. Which is not a great feeling.
>


Well if we can get Andres' suggestion to work all this might go away,
which would keep everyone happy, especially me. You're right that I was
a little careless upthread. Mea culpa. Meanwhile I am committing a
minimal one line fix.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Mon, Jan 24, 2022 at 2:01 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> Well if we can get Andres' suggestion to work all this might go away,
> which would keep everyone happy, especially me. You're right that I was
> a little careless upthread. Mea culpa. Meanwhile I am committing a
> minimal one line fix.

I in no way intended to accuse you of being careless. I was just
pointing out that this stuff seems to be hard to get right, even for
smart people.

I really hate committing stuff that turns out to be broken. It's such
a fire drill when the build farm turns red.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Robert Haas
Date:
On Mon, Jan 24, 2022 at 2:27 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I really hate committing stuff that turns out to be broken. It's such
> a fire drill when the build farm turns red.

And there's a good chance it's about to break again, because I just
committed the next patch in the series which, shockingly, also
includes tests.

I'd like to tell you I believe I got it right this time ... but I don't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/24/22 15:17, Robert Haas wrote:
> On Mon, Jan 24, 2022 at 2:27 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> I really hate committing stuff that turns out to be broken. It's such
>> a fire drill when the build farm turns red.
> And there's a good chance it's about to break again, because I just
> committed the next patch in the series which, shockingly, also
> includes tests.
>
> I'd like to tell you I believe I got it right this time ... but I don't.



I'll just keep playing whackamole :-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-01-24 14:01:37 -0500, Andrew Dunstan wrote:
> Well if we can get Andres' suggestion to work all this might go away,
> which would keep everyone happy, especially me.

I successfully tried it for a few tests. But I see tests hanging a lot
independent of the way I run the tests, presumably due to the issues discussed
in [1]. So we need to do something about that.

I don't have the cycles to finish changing over to that way of running tests -
do you have some time to work on it, if I clean up the bit I have?

- Andres

[1] https://postgr.es/m/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/24/22 16:39, Andres Freund wrote:
> Hi,
>
> On 2022-01-24 14:01:37 -0500, Andrew Dunstan wrote:
>> Well if we can get Andres' suggestion to work all this might go away,
>> which would keep everyone happy, especially me.
> I successfully tried it for a few tests. But I see tests hanging a lot
> independent of the way I run the tests, presumably due to the issues discussed
> in [1]. So we need to do something about that.
>
> I don't have the cycles to finish changing over to that way of running tests -
> do you have some time to work on it, if I clean up the bit I have?
>
> - Andres
>
> [1] https://postgr.es/m/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com



Give me what you can and I'll see what I can do. I have a couple of
moderately high priority items on my plate, but I will probably be able
to fit in some testing when those make my eyes completely glaze over.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-01-24 16:47:28 -0500, Andrew Dunstan wrote:
> Give me what you can and I'll see what I can do. I have a couple of
> moderately high priority items on my plate, but I will probably be able
> to fit in some testing when those make my eyes completely glaze over.

Steps:

# install msys from https://www.msys2.org/
# install dependencies in msys shell
pacman -S git bison flex make ucrt64/mingw-w64-ucrt-x86_64-perl ucrt64/mingw-w64-ucrt-x86_64-gcc
ucrt64/mingw-w64-ucrt-x86_64-zlibucrt64/mingw-w64-ucrt-x86_64-ccache diffutils
 


# start mingw ucrt64 x64 shell
cpan install -T IPC::Run
perl -MIPC::Run -e 1 # verify ipc run is installed

cd /c/dev
# I added --reference postgres to accelerate the cloning
git clone https://git.postgresql.org/git/postgresql.git postgres-mingw
cd /c/dev/postgres-mingw

git revert ed52c3707bcf8858defb0d9de4b55f5c7f18fed7
git revert 6051857fc953a62db318329c4ceec5f9668fd42a

# apply attached patch

# see below why out-of-tree is easier or now
mkdir build
cd build
# path parameters probably not necessary, I thought I needed them earlier, not sure why
../configure --without-readline --cache cache --enable-tap-tests PROVE=/ucrt64/bin/core_perl/prove
PERL=/ucrt64/bin/perl.exeCC="ccache gcc"
 
make -j8 -s world-bin && make -j8 -s -C src/interfaces/ecpg/test
make -j8 -s temp-install

# pg_regress' make_temp_socketdir() otherwise picks up the wrong TMPDIR
mkdir /c/dev/postgres-mingw/build/tmp

# the TAR= ensures that tests pick up a tar accessible with a windows path
# PG_TEST_USE_UNIX_SOCKETS=1 is required for test concurrency, otherwise there are port conflicts

(make -Otarget -j12 check-world NO_TEMP_INSTALL=1 PG_TEST_USE_UNIX_SOCKETS=1 TMPDIR=C:/dev/postgres-mingw/tmp
TAR="C:\Windows\System32\tar.exe"2>&1 && echo test-world-success || echo test-world-fail) 2>&1 |tee test-world.log
 


To make tests in "in-tree" builds work, a bit more hackery would be
needed. The problem is that windows chooses binaries from the current working
directory *before* PATH. That's a problem for things like initdb.exe or
pg_ctl.exe that want to find postgres.exe, as that only works with the program
in their proper location, rather than CWD.

Greetings,

Andres Freund

Attachment

Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 1/24/22 21:36, Andres Freund wrote:
> Hi,
>
> On 2022-01-24 16:47:28 -0500, Andrew Dunstan wrote:
>> Give me what you can and I'll see what I can do. I have a couple of
>> moderately high priority items on my plate, but I will probably be able
>> to fit in some testing when those make my eyes completely glaze over.
> Steps:
>
> # install msys from https://www.msys2.org/
> # install dependencies in msys shell
> pacman -S git bison flex make ucrt64/mingw-w64-ucrt-x86_64-perl ucrt64/mingw-w64-ucrt-x86_64-gcc
ucrt64/mingw-w64-ucrt-x86_64-zlibucrt64/mingw-w64-ucrt-x86_64-ccache diffutils
 
>
>
> # start mingw ucrt64 x64 shell
> cpan install -T IPC::Run
> perl -MIPC::Run -e 1 # verify ipc run is installed
>
> cd /c/dev
> # I added --reference postgres to accelerate the cloning
> git clone https://git.postgresql.org/git/postgresql.git postgres-mingw
> cd /c/dev/postgres-mingw
>
> git revert ed52c3707bcf8858defb0d9de4b55f5c7f18fed7
> git revert 6051857fc953a62db318329c4ceec5f9668fd42a
>
> # apply attached patch
>
> # see below why out-of-tree is easier or now
> mkdir build
> cd build
> # path parameters probably not necessary, I thought I needed them earlier, not sure why
> ../configure --without-readline --cache cache --enable-tap-tests PROVE=/ucrt64/bin/core_perl/prove
PERL=/ucrt64/bin/perl.exeCC="ccache gcc"
 
> make -j8 -s world-bin && make -j8 -s -C src/interfaces/ecpg/test
> make -j8 -s temp-install
>
> # pg_regress' make_temp_socketdir() otherwise picks up the wrong TMPDIR
> mkdir /c/dev/postgres-mingw/build/tmp
>
> # the TAR= ensures that tests pick up a tar accessible with a windows path
> # PG_TEST_USE_UNIX_SOCKETS=1 is required for test concurrency, otherwise there are port conflicts
>
> (make -Otarget -j12 check-world NO_TEMP_INSTALL=1 PG_TEST_USE_UNIX_SOCKETS=1 TMPDIR=C:/dev/postgres-mingw/tmp
TAR="C:\Windows\System32\tar.exe"2>&1 && echo test-world-success || echo test-world-fail) 2>&1 |tee test-world.log
 



OK, I have all the pieces working and I know what I need to do to adapt
fairywren. The patch you provided is not necessary any more.

(I think your TMPDIR spec is missing a /build/)

The recipe worked (mutatis mutandis) for the mingw64 toolchain as well
as for the ucrt64 toolchain. Is there a reason to prefer ucrt64?

I think the next steps are:

  * do those two reverts
  * adjust fairywren
  * get rid of perl2host

At that stage jacana will no longer be able to run TAP tests. I can do
one of these:

  * disable the TAP tests on jacana
  * migrate jacana to msys2
  * kiss jacana goodbye.

Thoughts?


> To make tests in "in-tree" builds work, a bit more hackery would be
> needed. The problem is that windows chooses binaries from the current working
> directory *before* PATH. That's a problem for things like initdb.exe or
> pg_ctl.exe that want to find postgres.exe, as that only works with the program
> in their proper location, rather than CWD.
>

Yeah, we should do something about that. For example, we could possibly
use the new install_path option of PostgreSQL::Test::Cluster::new() so
it would find these in the right location.


However, I don't need it as my animals all use vpath builds.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-02-03 17:25:51 -0500, Andrew Dunstan wrote:
> OK, I have all the pieces working and I know what I need to do to adapt
> fairywren. The patch you provided is not necessary any more.

Cool. Are you going to post that?


> (I think your TMPDIR spec is missing a /build/)

I think I went back/forth between in-tree/out-of-tree build...


> The recipe worked (mutatis mutandis) for the mingw64 toolchain as well
> as for the ucrt64 toolchain. Is there a reason to prefer ucrt64?

There's a lot of oddities in the mingw64 target, due to targetting the much
older C runtime library (lots of bugs, missing functionality). MSVC targets
UCRT by default for quite a few years by now. Targetting msvcrt is basically
on its way out from what I understand.


> I think the next steps are:
> 
>   * do those two reverts
>   * adjust fairywren
>   * get rid of perl2host
> 
> At that stage jacana will no longer be able to run TAP tests. I can do
> one of these:

I guess because its install is too old?


>   * disable the TAP tests on jacana
>   * migrate jacana to msys2
>   * kiss jacana goodbye.

Having a non-server mingw animal seems like it could be useful (I think that's
just Jacana), even if server / client versions of windows have grown
closer. So I think an update to msys2 makes the most sense?


> > To make tests in "in-tree" builds work, a bit more hackery would be
> > needed. The problem is that windows chooses binaries from the current working
> > directory *before* PATH. That's a problem for things like initdb.exe or
> > pg_ctl.exe that want to find postgres.exe, as that only works with the program
> > in their proper location, rather than CWD.

> Yeah, we should do something about that. For example, we could possibly
> use the new install_path option of PostgreSQL::Test::Cluster::new() so
> it would find these in the right location.

It'd be easy enough to adjust the central invocations of initdb. I think the
bigger problem is that there's plenty calls to initdb, pg_ctl "directly" in
the respective test scripts.


> However, I don't need it as my animals all use vpath builds.

I think it'd be fine to just error out in non-vpath builds on msvc. The
search-for-binaries behaviour is just too weird.

Greetings,

Andres Freund



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 2/3/22 20:51, Andres Freund wrote:
> Hi,
>
> On 2022-02-03 17:25:51 -0500, Andrew Dunstan wrote:
>> OK, I have all the pieces working and I know what I need to do to adapt
>> fairywren. The patch you provided is not necessary any more.
> Cool. Are you going to post that?



About the only thing missing in your recipe is this:


# force ucrt64 prove to use the ucrt64 perl rather than whatever is in
the path
sed -i 's,^#!perl,#!/ucrt64/bin/perl,' /ucrt64/bin/core_perl/prove


Given that, you don't need to set PERL, and configure can find the perl
to build against from the PATH.



>
>
>  Is there a reason to prefer ucrt64?
> There's a lot of oddities in the mingw64 target, due to targetting the much
> older C runtime library (lots of bugs, missing functionality). MSVC targets
> UCRT by default for quite a few years by now. Targetting msvcrt is basically
> on its way out from what I understand.


OK.


>> I think the next steps are:
>>
>>   * do those two reverts
>>   * adjust fairywren
>>   * get rid of perl2host
>>
>> At that stage jacana will no longer be able to run TAP tests. I can do
>> one of these:
> I guess because its install is too old?


Yeah. fairywren is now running with ucrt64-perl for TAP tests. 


>>   * disable the TAP tests on jacana
>>   * migrate jacana to msys2
>>   * kiss jacana goodbye.
> Having a non-server mingw animal seems like it could be useful (I think that's
> just Jacana), even if server / client versions of windows have grown
> closer. So I think an update to msys2 makes the most sense?


Working on that. There appear to be some issues with third party
libraries. I might need to rebuild libxml2 and zlib for example.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote:
> Working on that. There appear to be some issues with third party
> libraries. I might need to rebuild libxml2 and zlib for example.

Any reason not to use the ones from msys2?



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andres Freund
Date:
Hi,

On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote:
> About the only thing missing in your recipe is this:

Re requiring out-of-tree builds: Thomas on IM noted that there's the
NoDefaultCurrentDirectoryInExePath environment variable. That should avoid the
problem leading to requiring out-of-tree builds. But I've not tested it.


> # force ucrt64 prove to use the ucrt64 perl rather than whatever is in
> the path
> sed -i 's,^#!perl,#!/ucrt64/bin/perl,' /ucrt64/bin/core_perl/prove
> 
> 
> Given that, you don't need to set PERL, and configure can find the perl
> to build against from the PATH.

That shouldn't even be needed from what I understand now. If correctly started
the msys shell shoul dhave the right perl in path?


Greetings,

Andres Freund



Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 2/14/22 18:02, Andres Freund wrote:
> Hi,
>
> On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote:
>> About the only thing missing in your recipe is this:
> Re requiring out-of-tree builds: Thomas on IM noted that there's the
> NoDefaultCurrentDirectoryInExePath environment variable. That should avoid the
> problem leading to requiring out-of-tree builds. But I've not tested it.


Good to know.


>
>
>> # force ucrt64 prove to use the ucrt64 perl rather than whatever is in
>> the path
>> sed -i 's,^#!perl,#!/ucrt64/bin/perl,' /ucrt64/bin/core_perl/prove
>>
>>
>> Given that, you don't need to set PERL, and configure can find the perl
>> to build against from the PATH.
> That shouldn't even be needed from what I understand now. If correctly started
> the msys shell shoul dhave the right perl in path?



FSVO "the right perl". However, jacana is building against a separate
installation of AS perl, and I was trying to preserve that.


For a buildfarm animal, there is one extra package that is needed:


perl-LWP-Protocol-https


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

From
Andrew Dunstan
Date:
On 2/14/22 17:37, Andres Freund wrote:
> On 2022-02-14 17:32:11 -0500, Andrew Dunstan wrote:
>> Working on that. There appear to be some issues with third party
>> libraries. I might need to rebuild libxml2 and zlib for example.
> Any reason not to use the ones from msys2?



That seems to work. Needed to add these 2 packages to the recipe:


ucrt64/mingw-w64-ucrt-x86_64-libxml2

ucrt64/mingw-w64-ucrt-x86_64-libxslt


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com