Thread: [PATCH] pgbench tap tests fail if the path contains a perl special character

[PATCH] pgbench tap tests fail if the path contains a perl special character

From
Raúl Marín Rodríguez
Date:
Hi,

As mentioned in the subject, if the $PATH where you are building postgres
contains a perl's special character [1], for example a `+`, pgbench's tap
tests will fail.

The outputs looks something like this (full at [2]):
```
#   Failed test 'file name format'
#   at t/001_pgbench_with_server.pl line 805.

#   Failed test 'file name format'
#   at t/001_pgbench_with_server.pl line 805.
# Looks like you failed 2 tests of 312.
t/001_pgbench_with_server.pl ..
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/312 subtests
t/002_pgbench_no_server.pl .... ok
```

This error affects both PG11 and master, and a way to reproduce it is by
renaming the source folder from `postgresql` to `postgresql+XXX`.

I'm attaching a patch to fix this by using quotemeta [3]. It's generated
from master and also applies to REL_11_STABLE.

1 - http://jkorpela.fi/perl/regexp.html
2 - https://launchpadlibrarian.net/406634533/buildlog_ubuntu-xenial-amd64.postgresql-11_11.1.1+carto-2_BUILDING.txt.gz
3 - https://perldoc.perl.org/functions/quotemeta.html


Regards,
--
Raúl Marín Rodríguez
carto.com

Attachment
=?UTF-8?B?UmHDumwgTWFyw61uIFJvZHLDrWd1ZXo=?= <rmrodriguez@carto.com> writes:
> As mentioned in the subject, if the $PATH where you are building postgres
> contains a perl's special character [1], for example a `+`, pgbench's tap
> tests will fail.

Fun.  But is that really the only place that fails?

> I'm attaching a patch to fix this by using quotemeta [3].

This does not look right to me: glob has a different set of special
characters than regexes, no?

            regards, tom lane


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Raúl Marín Rodríguez
Date:
Hi,

> Fun.  But is that really the only place that fails?

Yes, other than this it builds flawlessly.

> This does not look right to me: glob has a different set of special
> characters than regexes, no?

The issue itself manifests in the grep call, not in glob, but I tried
using `quotemeta` only for grep without success (probably because I didn't
know how to properly call `quotemeta` over @logs.

This is the first time I do anything in perl beyond adding some simple tap
tests, so this could be me not knowing how to deal with it properly. I used
this small script to understand perl's behaviour:

```
$ cat a.pl
#!/usr/bin/perl
use strict;
use warnings;

my @prefix = '~/postgres+XXXX/001_pgbench_log_2.20745';
my @quoted_prefix = quotemeta(@prefix);

my @found1 = grep(/^@prefix\.\d+(\.\d+)?$/, "@prefix.20745");
my @found2 = grep(/^@prefix\.\d+(\.\d+)?$/, "@quoted_prefix.20745");
my @found3 = grep(/^@quoted_prefix\.\d+(\.\d+)?$/, "@prefix.20745");
my @found4 = grep(/^@quoted_prefix\.\d+(\.\d+)?$/, "@quoted_prefix.20745");
print "1: @found1\n";
print "2: @found2\n";
print "3: @found3\n";
print "4: @found4\n";

$ perl a.pl
1:
2:
3:
4: 1.20745
```

Regards,
--
Raúl Marín Rodríguez
carto.com


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Alvaro Herrera
Date:
On 2019-Jan-18, Raúl Marín Rodríguez wrote:

> Hi,
> 
> > Fun.  But is that really the only place that fails?
> 
> Yes, other than this it builds flawlessly.
> 
> > This does not look right to me: glob has a different set of special
> > characters than regexes, no?
> 
> The issue itself manifests in the grep call, not in glob, but I tried
> using `quotemeta` only for grep without success (probably because I didn't
> know how to properly call `quotemeta` over @logs.

Perhaps you can do
   @quoted = map { quotemeta($_) } @logs;

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>>> This does not look right to me: glob has a different set of special
>>> characters than regexes, no?

> Perhaps you can do
>    @quoted = map { quotemeta($_) } @logs;

Actually, playing with it here, it seems that quotemeta() quotes
anything that's a special character for either purpose (in fact,
it looks like it quotes anything that's not alphanumeric).
So the patch as given should work.

            regards, tom lane


I wrote:
> So the patch as given should work.

... well, modulo the use of "@foo" where you should've used "$foo".
I don't know enough Perl to understand why that seemed to work, but
I do know it's not the right thing here.

Pushed with that correction.

            regards, tom lane


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Raúl Marín Rodríguez
Date:
> Pushed with that correction.

Thanks a lot!

--
Raúl Marín Rodríguez
carto.com


=?UTF-8?B?UmHDumwgTWFyw61uIFJvZHLDrWd1ZXo=?= <rmrodriguez@carto.com> writes:
>> Pushed with that correction.
> Thanks a lot!

Hm, so bowerbird (a Windows machine) has been failing the pgbench tests
since this went in, cf

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-20%2004%3A57%3A01

I'm just guessing, but I suspect that bowerbird is using a path spec that
includes a backslash directory separator and that's somehow bollixing
things.  If so, we might be able to fix it by converting backslashes to
forward slashes before applying quotemeta().

It's also possible that on Windows, "glob" handles backslashes
differently than it does elsewhere, which would be harder to fix
--- that would bring back my original fear that the appropriate
quoting rules are different for "glob" than for regexes.

Andrew, any insight?

            regards, tom lane


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Fabien COELHO
Date:
Hello Tom,

> Hm, so bowerbird (a Windows machine) has been failing the pgbench tests
> since this went in, cf
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-20%2004%3A57%3A01
>
> I'm just guessing, but I suspect that bowerbird is using a path spec that
> includes a backslash directory separator and that's somehow bollixing
> things.

This point is unclear from the log, where plain slashes are used in the 
log prefix path, and furthermore no strange characters appear in the 
log prefix path:

  # Running: pgbench -n -S -t 50 -c 2 --log
--log-prefix=G:/prog/bf/root/HEAD/pgsql.build/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_log_2
--sampling-rate=0.5
  ok 345 - pgbench logs status (got 0 vs expected 0)
  ok 346 - pgbench logs stdout /(?^:select only)/
  ok 347 - pgbench logs stdout /(?^:processed: 100/100)/
  ok 348 - pgbench logs stderr /(?^:^$)/
  not ok 349 - number of log files

> If so, we might be able to fix it by converting backslashes to
> forward slashes before applying quotemeta().
>
> It's also possible that on Windows, "glob" handles backslashes
> differently than it does elsewhere, which would be harder to fix
> --- that would bring back my original fear that the appropriate
> quoting rules are different for "glob" than for regexes.

I'm afraid it could indeed be due to platform-specific behavior, so 
testing on the target machine to understand the actual behavior would 
help.

I just tested the current version on my laptop within a directory 
containing spaces and other misc chars: Pgbench TAP tests are currently 
broken in this context on master, and this may be used to a collection of 
issues, not just one, eg pgbench function splits parameters on spaces 
which breaks if there are spaces in bdir.

I'm going to investigate, possibly over next week-end, so please be 
patient.

About windows-specific issues, from File::Glob man page:

""" On DOSISH systems, backslash is a valid directory separator character. 
In this case, use of backslash as a quoting character (via GLOB_QUOTE) 
interferes with the use of backslash as a directory separator. ... """

It seems to suggest that quotemeta backslash-on-nearly-everything approach 
is not appropriate.

-- 
Fabien.


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Andrew Dunstan
Date:
On 1/21/19 4:50 AM, Fabien COELHO wrote:
>
> Hello Tom,
>
>> Hm, so bowerbird (a Windows machine) has been failing the pgbench tests
>> since this went in, cf
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-01-20%2004%3A57%3A01
>>
>>
>> I'm just guessing, but I suspect that bowerbird is using a path spec
>> that
>> includes a backslash directory separator and that's somehow bollixing
>> things.
>
> This point is unclear from the log, where plain slashes are used in
> the log prefix path, and furthermore no strange characters appear in
> the log prefix path:
>
>  # Running: pgbench -n -S -t 50 -c 2 --log
>
--log-prefix=G:/prog/bf/root/HEAD/pgsql.build/src/bin/pgbench/tmp_check/t_001_pgbench_with_server_main_data/001_pgbench_log_2
> --sampling-rate=0.5
>  ok 345 - pgbench logs status (got 0 vs expected 0)
>  ok 346 - pgbench logs stdout /(?^:select only)/
>  ok 347 - pgbench logs stdout /(?^:processed: 100/100)/
>  ok 348 - pgbench logs stderr /(?^:^$)/
>  not ok 349 - number of log files
>
>> If so, we might be able to fix it by converting backslashes to
>> forward slashes before applying quotemeta().
>>
>> It's also possible that on Windows, "glob" handles backslashes
>> differently than it does elsewhere, which would be harder to fix
>> --- that would bring back my original fear that the appropriate
>> quoting rules are different for "glob" than for regexes.
>
> I'm afraid it could indeed be due to platform-specific behavior, so
> testing on the target machine to understand the actual behavior would
> help.
>
> I just tested the current version on my laptop within a directory
> containing spaces and other misc chars: Pgbench TAP tests are
> currently broken in this context on master, and this may be used to a
> collection of issues, not just one, eg pgbench function splits
> parameters on spaces which breaks if there are spaces in bdir.
>
> I'm going to investigate, possibly over next week-end, so please be
> patient.
>
> About windows-specific issues, from File::Glob man page:
>
> """ On DOSISH systems, backslash is a valid directory separator
> character. In this case, use of backslash as a quoting character (via
> GLOB_QUOTE) interferes with the use of backslash as a directory
> separator. ... """
>
> It seems to suggest that quotemeta backslash-on-nearly-everything
> approach is not appropriate.
>


File globbing is not the same thing as regex matching. For example, '.'
is a literal character in a glob pattern but a metacharacter in a regex,
while ' ' is a literal character in a regex but a globbing metacharacter
(but see below), and '*' is a metacharacter in both but has different
meanings. quotemeta is intended for regexes but being used here on an
expression used for globbing.

Perhaps it would be OK we changed back the glob line to use $prefix
instead of $qprefix, but kept the use of $qprefix in the later regex.

To deal with the issue of spaces in file names (not an issue eher ob
bowerbird), we should consider adding this:

    use File::Glob ':bsd_glob';

This removes the globbing metcharacter nature of the space character,
although it might have other effects that cause pain. See `perldoc
File::Glob`


cheers


andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> Perhaps it would be OK we changed back the glob line to use $prefix
> instead of $qprefix, but kept the use of $qprefix in the later regex.

Well, glob does have some metacharacters, so not doing any quoting
at all still leaves us with failure modes.

I did a little bit of googling on this subject last night, and it
seems that at least some people believe that the answer is to not
use glob, period, but read the directory for yourself.  Maybe there's
a better way, but the breakage around backslashes doesn't really
leave me hopeful.

As a short-term move to un-break the buildfarm, I'm just going to
revert that patch altogether.  We can reapply it once we've
figured out how to do the glob part correctly.

            regards, tom lane


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Fabien COELHO
Date:
Hello Andrew,

>> About windows-specific issues, from File::Glob man page:
>>
>> """ On DOSISH systems, backslash is a valid directory separator
>> character. In this case, use of backslash as a quoting character (via
>> GLOB_QUOTE) interferes with the use of backslash as a directory
>> separator. ... """
>>
>> It seems to suggest that quotemeta backslash-on-nearly-everything
>> approach is not appropriate.
>
> File globbing is not the same thing as regex matching. For example, '.'
> is a literal character in a glob pattern but a metacharacter in a regex,
> while ' ' is a literal character in a regex but a globbing metacharacter
> (but see below), and '*' is a metacharacter in both but has different
> meanings. quotemeta is intended for regexes but being used here on an
> expression used for globbing.
>
> Perhaps it would be OK we changed back the glob line to use $prefix
> instead of $qprefix, but kept the use of $qprefix in the later regex.

Yep, possibly. I'll have to test, though.

> To deal with the issue of spaces in file names (not an issue eher ob
> bowerbird), we should consider adding this:
>
>     use File::Glob ':bsd_glob';

I was planning that so that the behavior is the same on all systems.

> This removes the globbing metcharacter nature of the space character,
> although it might have other effects that cause pain. See `perldoc
> File::Glob`

Yep. Thanks for doing my homework:-) I'll test starting with these 
options.

-- 
Fabien.

Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Fabien COELHO
Date:
Hello Tom,

> Well, glob does have some metacharacters, so not doing any quoting
> at all still leaves us with failure modes.
>
> I did a little bit of googling on this subject last night, and it
> seems that at least some people believe that the answer is to not
> use glob, period, but read the directory for yourself.

Yep, would work, it can then be filtered with standard regexpr, although 
it will take a few lines (opendir/grep on readdir/closedir).

> Maybe there's a better way, but the breakage around backslashes doesn't 
> really leave me hopeful.

Indeed. I was thinking of defining my own quote function, but without 
being convinced that it is such a good idea.

> As a short-term move to un-break the buildfarm, I'm just going to
> revert that patch altogether.

I was considering to suggest that, so I'm ok with that.

> We can reapply it once we've figured out how to do the glob part 
> correctly.

Yep. No need to spend time much on that, I'll have a look at it, although 
not right now, allow me a few days.

-- 
Fabien.


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Andrew Dunstan
Date:
On 1/21/19 11:18 AM, Fabien COELHO wrote:
>
> Hello Tom,
>
>> Well, glob does have some metacharacters, so not doing any quoting
>> at all still leaves us with failure modes.
>>
>> I did a little bit of googling on this subject last night, and it
>> seems that at least some people believe that the answer is to not
>> use glob, period, but read the directory for yourself.
>
> Yep, would work, it can then be filtered with standard regexpr,
> although it will take a few lines (opendir/grep on readdir/closedir).


Sure, probably the best solution. Given that Perl has
opendir/readdir/closedir, it should be only a handful of lines.


cheers


andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Fabien COELHO
Date:
Hello Tom,

> I did a little bit of googling on this subject last night, and it
> seems that at least some people believe that the answer is to not
> use glob, period, but read the directory for yourself.
>
> As a short-term move to un-break the buildfarm, I'm just going to
> revert that patch altogether.
>
> We can reapply it once we've figured out how to do the glob part correctly.

Here is a proposal patch which:

  - works around pgbench command splitting on spaces,
    if postgres sources are in a strangely named directory…
    I tested within a directory named "pg .* dir".

  - works aroung "glob" lack of portability by reading the directory
    directly and filtering with a standard re (see list_files).

  - adds a few comments

  - removes a spurious i option on an empty re

-- 
Fabien.
Attachment

Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Raúl Marín Rodríguez
Date:
Hi,

>   - works around pgbench command splitting on spaces,
>    if postgres sources are in a strangely named directory…
>    I tested within a directory named "pg .* dir".

I've also tested it with the initial case (including a +) and it works.


--
Raúl Marín Rodríguez
carto.com


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Fabien COELHO
Date:
>>   - works around pgbench command splitting on spaces,
>>    if postgres sources are in a strangely named directory…
>>    I tested within a directory named "pg .* dir".
>
> I've also tested it with the initial case (including a +) and it works.

Good, thanks for checking!

-- 
Fabien.

Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Raúl Marín Rodríguez
Date:
Hi,

Any way I can help to get this commited? Should we add this to the commitfest?

Regards,
--
Raúl Marín Rodríguez
carto.com


Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Daniel Gustafsson
Date:
> On 8 Feb 2019, at 13:33, Raúl Marín Rodríguez <rmrodriguez@carto.com> wrote:

> Any way I can help to get this commited? Should we add this to the commitfest?

Please do, it’s a good way to ensure that the patch is tracked and not
forgotten about.

cheers ./daniel

Re: [PATCH] pgbench tap tests fail if the path contains a perl special character

From
Raúl Marín
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

Tested `pgbench-tap-glob-fix-1.patch` with a "+" in the path and the tap tests now pass.
On 2/8/19 3:30 PM, Raúl Marín wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:       not tested
> Spec compliant:           not tested
> Documentation:            not tested
> 
> Tested `pgbench-tap-glob-fix-1.patch` with a "+" in the path and the tap tests now pass.

Is this now ready for committer or is there anything else Fabien needs 
to do?

Marking the patch as either Ready for Committer or Waiting on Author 
will help move things along.

Regards,
-- 
-David
david@pgmasters.net


Re: Re: [PATCH] pgbench tap tests fail if the path contains a perlspecial character

From
Raúl Marín Rodríguez
Date:
> Marking the patch as either Ready for Committer or Waiting on Author
will help move things along.

Thanks for the heads-up, I've retested it again and it applies cleanly on
top of master and addresses the issue.

I've moved it to `Ready for Committer`.



--
Raúl Marín Rodríguez
carto.com