Thread: [HACKERS] Still another race condition in recovery TAP tests

[HACKERS] Still another race condition in recovery TAP tests

From: Tom Lane
In a moment of idleness I tried to run the TAP tests on pademelon,
which is a mighty old and slow machine.  Behold,
src/test/recovery/t/010_logical_decoding_timelines.pl fell over,
with the relevant section of its log contents being:

# testing logical timeline following with a filesystem-level copy
# Taking filesystem backup b1 from node "master"
# pg_start_backup: 0/2000028
could not
opendir(/home/postgres/pgsql/src/test/recovery/tmp_check/t_010_logical_decoding_timelines_master_data/pgdata/pg_wal/archive_status/000000010000000000000001.ready):
No such file or directory at ../../../src/test/perl//RecursiveCopy.pm line 115.
### Stopping node "master" using mode immediate

The postmaster log has this relevant entry:

2017-09-08 22:03:22.917 EDT [19160] DEBUG:  archived write-ahead log file "000000010000000000000001"

It looks to me like the archiver removed 000000010000000000000001.ready
between where RecursiveCopy.pm checks that $srcpath is a regular file
or directory (line 95) and where it rechecks whether it's a regular
file (line 105).  Then the "-f" test on line 105 fails, allowing it to
fall through to the its-a-directory path, and unsurprisingly the opendir
at line 115 fails with the above symptom.

In short, RecursiveCopy.pm is woefully unprepared for the idea that the
source directory tree might be changing underneath it.
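
To illustrate, here is a minimal sketch (not the module's actual code) of
the check-then-act shape that loses this race; the archiver can unlink
$srcpath between the first type test and the recheck:

    use strict;
    use warnings;
    use File::Copy;

    # Sketch only: the entry is tested once, then re-tested.  If the file
    # vanishes in between, the -f recheck fails and we wrongly take the
    # directory branch, where opendir() then dies with ENOENT.
    sub copy_entry
    {
        my ($srcpath, $destpath) = @_;

        die "\"$srcpath\" is not a regular file or directory"
          unless -f $srcpath or -d $srcpath;    # first check

        if (-f $srcpath)                        # recheck: file may be gone by now
        {
            copy($srcpath, $destpath)
              or die "copy $srcpath -> $destpath failed: $!";
        }
        else                                    # falls through once the file vanished
        {
            opendir(my $dir, $srcpath)
              or die "could not opendir($srcpath): $!";
            closedir($dir);
        }
        return 1;
    }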

I'm not real sure if the appropriate answer to this is "we need to fix
RecursiveCopy" or "we need to fix the callers to not call RecursiveCopy
until the source directory is stable".  Thoughts?

(I do kinda wonder why we rolled our own RecursiveCopy; surely there's
a better implementation in CPAN?)
        regards, tom lane



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Michael Paquier
On Sat, Sep 9, 2017 at 11:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In a moment of idleness I tried to run the TAP tests on pademelon,
> which is a mighty old and slow machine.

How long did it take? Just wondering if that's actually the slowest
machine to run the full set of recovery tests on. That would be mighty
slow. Even hamster never detected this failure, I think.

> It looks to me like the archiver removed 000000010000000000000001.ready
> between where RecursiveCopy.pm checks that $srcpath is a regular file
> or directory (line 95) and where it rechecks whether it's a regular
> file (line 105).  Then the "-f" test on line 105 fails, allowing it to
> fall through to the its-a-directory path, and unsurprisingly the opendir
> at line 115 fails with the above symptom.
>
> In short, RecursiveCopy.pm is woefully unprepared for the idea that the
> source directory tree might be changing underneath it.

Yes, before 010_logical_decoding_timelines and _backup_fs were
introduced, we only used RecursiveCopy with a static source in
init_from_backup(), which presented no risk.

> I'm not real sure if the appropriate answer to this is "we need to fix
> RecursiveCopy" or "we need to fix the callers to not call RecursiveCopy
> until the source directory is stable".  Thoughts?

We could make RecursiveCopy smarter here. In backup_fs_hot, if for
example another segment switch happens between pg_start_backup() and
the RecursiveCopy call, say because of a low archive_timeout in a TAP
test using this configuration, then we would be back at the same point.
Another thing we could use here is the fact that _backup_fs assumes
that archiving must be enabled (a sanity check would be nice!), so we
could just exclude pg_wal from the FS-backup and avoid the problem, at
least for this test, but that won't save us from other paths getting
modified on disk. So making RecursiveCopy smarter really looks like
the best bet in the long term.
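
For what it's worth, a rough sketch of that pg_wal-exclusion idea using
copypath's existing filterfn hook; the node and backup-path variables are
just illustrative names, and it assumes filterfn sees the entry's path
relative to the source root:

    # Hypothetical sketch only: take the FS-backup while skipping pg_wal.
    RecursiveCopy::copypath(
        $node_master->data_dir,
        $backup_path,
        filterfn => sub {
            my ($path) = @_;
            return $path !~ m{^pg_wal(/|$)};    # exclude WAL from the FS-backup
        });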

> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
> a better implementation in CPAN?)

This comes from here, which is something I got involved in:
commit: 1caef31d9e550408d0cbc5788a422dcb69736df5
author: Alvaro Herrera <alvherre@alvh.no-ip.org>
date: Wed, 2 Dec 2015 18:46:16 -0300
Refactor Perl test code

So the idea here is really to have:
1) Something compatible down to perl 5.8.
2) Something which is not external, which is where we could have used
File::Copy::Recursive (only compatible from 5.12 if I recall
correctly? I am too lazy to check now). Note as well that a copy of
the whole directory is used for backups instead of the tar format of
pg_basebackup, because there is no easy support for tar files down to
5.8.
-- 
Michael



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Tom Lane
Michael Paquier <michael.paquier@gmail.com> writes:
> On Sat, Sep 9, 2017 at 11:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In a moment of idleness I tried to run the TAP tests on pademelon,
>> which is a mighty old and slow machine.

> How long did it take?

The last time I tried it, make check-world took about 3 hours IIRC.
But that was a few months ago, I suspect it's more now.  I re-launched
the run after diagnosing this failure; it's gotten past the recovery
tests (proving that this is a race condition not a hard failure) and
is somewhere in the contrib tests, about 2 hours in.

>> I'm not real sure if the appropriate answer to this is "we need to fix
>> RecursiveCopy" or "we need to fix the callers to not call RecursiveCopy
>> until the source directory is stable".  Thoughts?

> ... So making RecursiveCopy smarter really looks
> like the best bet in the long term.

Yeah, even if we fixed this particular call site, I'm sure the issue
would come up again.  Certainly we expect hot backups to work with
a changing source directory.

>> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
>> a better implementation in CPAN?)

> This comes from here, which is something I got involved in:

> So the idea here is really to have:
> 1) Something compatible down to perl 5.8.
> 2) Something which is not external, which is where we could have used
> File::Copy::Recursive (only compatible from 5.12 if I recall
> correctly? I am too lazy to check now).

Hm.  Even if we don't want to use File::Copy::Recursive, maybe we should
look at it and see what (if anything) it does about source-tree changes.
        regards, tom lane



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Michael Paquier
On Sat, Sep 9, 2017 at 1:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>>> I'm not real sure if the appropriate answer to this is "we need to fix
>>> RecursiveCopy" or "we need to fix the callers to not call RecursiveCopy
>>> until the source directory is stable".  Thoughts?
>
>> ... So making RecursiveCopy smarter really looks
>> like the best bet in the long term.
>
> Yeah, even if we fixed this particular call site, I'm sure the issue
> would come up again.  Certainly we expect hot backups to work with
> a changing source directory.

Definitely...

> Hm.  Even if we don't want to use File::Copy::Recursive, maybe we should
> look at it and see what (if anything) it does about source-tree changes.

Actually, looking at the CPAN file of the module here:
http://cpansearch.perl.org/src/DMUEY/File-Copy-Recursive-0.38/Recursive.pm
There is no minimum version defined. There is even some handling for
perl 5.6, so I think that we should actually be fine from a
compatibility point of view. Also, dircopy() uses fcopy(), which just
uses File::Copy::copy() to copy a file. A difference with
RecursiveCopy is that File::Copy::Recursive does not fail if copy()
itself fails (see "copy(@_) or return;" in "sub fcopy"). It just
returns on failure.
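
For comparison, the CPAN module's entry point would be used roughly like
so, assuming File::Copy::Recursive is installed (which is exactly the
dependency we have been avoiding) and with $src_dir and $dest_dir standing
in for whatever source and destination you have:

    use File::Copy::Recursive qw(dircopy);

    # dircopy() recurses on its own; per the source linked above, each file
    # goes through fcopy(), which silently gives up ("copy(@_) or return;")
    # when the underlying copy() fails.
    dircopy($src_dir, $dest_dir)
      or die "recursive copy $src_dir -> $dest_dir failed: $!";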

We actually discussed RecursiveCopy and File::Copy::Recursive two
years ago:
https://www.postgresql.org/message-id/20151123212707.GD4073@alvherre.pgsql
The first version of RecursiveCopy did not need anything fancier than
a simple recursive function, and I personally still like the fact that
symlinks are not handled. Should we actually care about that for cases
where we take an FS-backup with tablespaces and a pg_wal symlink? Such
a case has not shown up yet, but if it does I think we could consider
integrating File::Copy::Recursive.

In short, I'd still like to keep RecursiveCopy for now, but change its
code so that a failed copy() is not a hard failure. What do you think?
-- 
Michael



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Tom Lane
Michael Paquier <michael.paquier@gmail.com> writes:
> On Sat, Sep 9, 2017 at 1:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, even if we fixed this particular call site, I'm sure the issue
>> would come up again.  Certainly we expect hot backups to work with
>> a changing source directory.

> In short, I'd still like to keep RecursiveCopy for now, but change its
> code so that a failed copy() is not a hard failure. What do you think?

The specific case we need to allow is "ENOENT on a file/dir that was
there a moment ago".  I think it still behooves us to complain about
anything else.  If you think it's a simple fix, have at it.  But
I see at least three ways for _copypath_recurse to fail depending on
exactly when the file disappears.
        regards, tom lane



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Michael Paquier
On Sun, Sep 10, 2017 at 12:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Sat, Sep 9, 2017 at 1:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, even if we fixed this particular call site, I'm sure the issue
>>> would come up again.  Certainly we expect hot backups to work with
>>> a changing source directory.
>
>> In short, I'd still like to keep RecursiveCopy for now, but change its
>> code so that a failed copy() is not a hard failure. What do you think?
>
> The specific case we need to allow is "ENOENT on a file/dir that was
> there a moment ago".  I think it still behooves us to complain about
> anything else.  If you think it's a simple fix, have at it.  But
> I see at least three ways for _copypath_recurse to fail depending on
> exactly when the file disappears.

I have finally been able to take a couple of hours and look at the
problems in this module.

Errno is supported down to 5.8.8 per the docs, and the error values
are set on Windows as well; copy() and opendir() complain with ENOENT
for missing entries:
http://perldoc.perl.org/5.8.8/Errno.html
readdir() does not, though. If for example a directory gets removed in
the middle of a scan, an empty list is returned. mkdir() would not
fail with ENOENT either, since the parent directory has already been
created by the time a child directory is being scanned.
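
In code form, the tolerance rule that allows looks roughly like this (a
sketch of the idea with an illustrative helper name, not the exact hunks
in the attached patch):

    use strict;
    use warnings;
    use Errno;    # documented down to 5.8.8, per the link above

    # Sketch: treat ENOENT as "the source entry vanished under us" and keep
    # going, but still die on any other failure.
    sub open_src_dir_tolerantly
    {
        my ($srcpath) = @_;
        my $dir;
        unless (opendir($dir, $srcpath))
        {
            return undef if $!{ENOENT};    # directory disappeared: not an error
            die "could not opendir($srcpath): $!";
        }
        return $dir;
    }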

With the checks for -d and -f, this indeed brings the count to three
code paths. With the patch attached, I have added some manual sleep
calls in RecursiveCopy.pm and triggered manual deletions of the source
directory. The copy is able to complete in any case, warning about
missing directories or files. I have added warn messages in the patch
when ENOENT is triggered, though I think those should be removed in
the final patch.

By the way, 010_logical_decoding_timelines.pl does not need to include
RecursiveCopy.
-- 
Michael



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Tom Lane
Michael Paquier <michael.paquier@gmail.com> writes:
> On Sun, Sep 10, 2017 at 12:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The specific case we need to allow is "ENOENT on a file/dir that was
>> there a moment ago".  I think it still behooves us to complain about
>> anything else.  If you think it's a simple fix, have at it.  But
>> I see at least three ways for _copypath_recurse to fail depending on
>> exactly when the file disappears.

> With the checks for -d and -f, this indeed brings the count to three
> code paths. With the patch attached, I have added some manual sleep
> calls in RecursiveCopy.pm and triggered manual deletions of the source
> directory. The copy is able to complete in any case, warning about
> missing directories or files. I have added warn messages in the patch
> when ENOENT is triggered, though I think those should be removed in
> the final patch.

Hm, I don't much like having it silently ignore files that are present;
that seems like a foot-gun in the long run.  What do you think of the
attached?

> By the way, 010_logical_decoding_timelines.pl does not need to include
> RecursiveCopy.

Good catch.

            regards, tom lane

diff --git a/src/test/perl/RecursiveCopy.pm b/src/test/perl/RecursiveCopy.pm
index 28ecaf6..19f7dd2 100644
*** a/src/test/perl/RecursiveCopy.pm
--- b/src/test/perl/RecursiveCopy.pm
*************** use File::Copy;
*** 29,40 ****
  =head2 copypath($from, $to, %params)

  Recursively copy all files and directories from $from to $to.

  Only regular files and subdirectories are copied.  Trying to copy other types
  of directory entries raises an exception.

  Raises an exception if a file would be overwritten, the source directory can't
! be read, or any I/O operation fails. Always returns true.

  If the B<filterfn> parameter is given, it must be a subroutine reference.
  This subroutine will be called for each entry in the source directory with its
--- 29,45 ----
  =head2 copypath($from, $to, %params)

  Recursively copy all files and directories from $from to $to.
+ Does not preserve file metadata (e.g., permissions).

  Only regular files and subdirectories are copied.  Trying to copy other types
  of directory entries raises an exception.

  Raises an exception if a file would be overwritten, the source directory can't
! be read, or any I/O operation fails.  However, we silently ignore ENOENT on
! open, because when copying from a live database it's possible for a file/dir
! to be deleted after we see its directory entry but before we can open it.
!
! Always returns true.

  If the B<filterfn> parameter is given, it must be a subroutine reference.
  This subroutine will be called for each entry in the source directory with its
*************** sub copypath
*** 74,79 ****
--- 79,87 ----
          $filterfn = sub { return 1; };
      }

+     # Complain if original path is bogus, because _copypath_recurse won't.
+     die "\"$base_src_dir\" does not exist" if !-e $base_src_dir;
+
      # Start recursive copy from current directory
      return _copypath_recurse($base_src_dir, $base_dest_dir, "", $filterfn);
  }
*************** sub _copypath_recurse
*** 89,100 ****
      return 1 unless &$filterfn($curr_path);

      # Check for symlink -- needed only on source dir
!     die "Cannot operate on symlinks" if -l $srcpath;
!
!     # Can't handle symlinks or other weird things
!     die "Source path \"$srcpath\" is not a regular file or directory"
!       unless -f $srcpath
!           or -d $srcpath;

      # Abort if destination path already exists.  Should we allow directories
      # to exist already?
--- 97,104 ----
      return 1 unless &$filterfn($curr_path);

      # Check for symlink -- needed only on source dir
!     # (note: this will fall through quietly if file is already gone)
!     die "Cannot operate on symlink \"$srcpath\"" if -l $srcpath;

      # Abort if destination path already exists.  Should we allow directories
      # to exist already?
*************** sub _copypath_recurse
*** 104,128 ****
      # same name and we're done.
      if (-f $srcpath)
      {
!         copy($srcpath, $destpath)
            or die "copy $srcpath -> $destpath failed: $!";
          return 1;
      }

!     # Otherwise this is directory: create it on dest and recurse onto it.
!     mkdir($destpath) or die "mkdir($destpath) failed: $!";
!
!     opendir(my $directory, $srcpath) or die "could not opendir($srcpath): $!";
!     while (my $entry = readdir($directory))
      {
!         next if ($entry eq '.' or $entry eq '..');
!         _copypath_recurse($base_src_dir, $base_dest_dir,
!             $curr_path eq '' ? $entry : "$curr_path/$entry", $filterfn)
!           or die "copypath $srcpath/$entry -> $destpath/$entry failed";
      }
-     closedir($directory);

!     return 1;
  }

  1;
--- 108,154 ----
      # same name and we're done.
      if (-f $srcpath)
      {
!         my $fh;
!         unless (open($fh, '<', $srcpath))
!         {
!             return 1 if ($!{ENOENT});
!             die "open($srcpath) failed: $!";
!         }
!         copy($fh, $destpath)
            or die "copy $srcpath -> $destpath failed: $!";
+         close $fh;
          return 1;
      }

!     # If it's a directory, create it on dest and recurse into it.
!     if (-d $srcpath)
      {
!         my $directory;
!         unless (opendir($directory, $srcpath))
!         {
!             return 1 if ($!{ENOENT});
!             die "opendir($srcpath) failed: $!";
!         }
!
!         mkdir($destpath) or die "mkdir($destpath) failed: $!";
!
!         while (my $entry = readdir($directory))
!         {
!             next if ($entry eq '.' or $entry eq '..');
!             _copypath_recurse($base_src_dir, $base_dest_dir,
!                 $curr_path eq '' ? $entry : "$curr_path/$entry", $filterfn)
!               or die "copypath $srcpath/$entry -> $destpath/$entry failed";
!         }
!
!         closedir($directory);
!         return 1;
      }

!     # If it disappeared from sight, that's OK.
!     return 1 if !-e $srcpath;
!
!     # Else it's some weird file type; complain.
!     die "Source path \"$srcpath\" is not a regular file or directory";
  }

  1;
diff --git a/src/test/recovery/t/010_logical_decoding_timelines.pl b/src/test/recovery/t/010_logical_decoding_timelines.pl
index edc0219..5620450 100644
*** a/src/test/recovery/t/010_logical_decoding_timelines.pl
--- b/src/test/recovery/t/010_logical_decoding_timelines.pl
*************** use warnings;
*** 24,30 ****
  use PostgresNode;
  use TestLib;
  use Test::More tests => 13;
- use RecursiveCopy;
  use File::Copy;
  use IPC::Run ();
  use Scalar::Util qw(blessed);
--- 24,29 ----


Re: [HACKERS] Still another race condition in recovery TAP tests

From: Michael Paquier
On Mon, Sep 11, 2017 at 11:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hm, I don't much like having it silently ignore files that are present;
> that seems like a foot-gun in the long run.  What do you think of the
> attached?

That looks good to me. I have tried pretty hard to break it, but could not.
-- 
Michael



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Tom Lane
Michael Paquier <michael.paquier@gmail.com> writes:
> On Mon, Sep 11, 2017 at 11:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm, I don't much like having it silently ignore files that are present;
>> that seems like a foot-gun in the long run.  What do you think of the
>> attached?

> That looks good to me. I have tried pretty hard to break it, but could not.

Pushed, thanks for testing!
        regards, tom lane



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Noah Misch
On Fri, Sep 08, 2017 at 10:32:03PM -0400, Tom Lane wrote:
> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
> a better implementation in CPAN?)

Fewer people will test as we grow the list of modules they must first install.
Bundling a copy is tempting, but most CPAN modules use the Perl license.
We're in that hole already[1], but I perceived a consensus to stop digging.
Code used only for testing is less concerning, but I'd still not bundle it.

[1] https://www.postgresql.org/message-id/flat/54EC6D80.5090507%40vmware.com



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Craig Ringer
On 6 October 2017 at 14:03, Noah Misch <noah@leadboat.com> wrote:
> On Fri, Sep 08, 2017 at 10:32:03PM -0400, Tom Lane wrote:
>> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
>> a better implementation in CPAN?)
>
> Fewer people will test as we grow the list of modules they must first install.

Meh, I don't buy that. At worst, all we have to do is provide a script
that fetches them, from distro repos if possible, and failing that
from CPAN.

With cpanminus, that's pretty darn simple too.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Noah Misch
On Fri, Oct 06, 2017 at 05:57:24PM +0800, Craig Ringer wrote:
> On 6 October 2017 at 14:03, Noah Misch <noah@leadboat.com> wrote:
> > On Fri, Sep 08, 2017 at 10:32:03PM -0400, Tom Lane wrote:
> >> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
> >> a better implementation in CPAN?)
> >
> > Fewer people will test as we grow the list of modules they must first install.
> 
> Meh, I don't buy that. At worst, all we have to do is provide a script
> that fetches them, from distro repos if possible, and failing that
> from CPAN.
> 
> With cpanminus, that's pretty darn simple too.

If the tree had such a script and it were reliable, then yes, it would matter
little whether the script procured one module or five.



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Andrew Dunstan

On 10/13/2017 01:04 AM, Noah Misch wrote:
> On Fri, Oct 06, 2017 at 05:57:24PM +0800, Craig Ringer wrote:
>> On 6 October 2017 at 14:03, Noah Misch <noah@leadboat.com> wrote:
>>> On Fri, Sep 08, 2017 at 10:32:03PM -0400, Tom Lane wrote:
>>>> (I do kinda wonder why we rolled our own RecursiveCopy; surely there's
>>>> a better implementation in CPAN?)
>>> Fewer people will test as we grow the list of modules they must first install.
>> Meh, I don't buy that. At worst, all we have to do is provide a script
>> that fetches them, from distro repos if possible, and failing that
>> from CPAN.
>>
>> With cpanminus, that's pretty darn simple too.
> If the tree had such a script and it were reliable, then yes, it would matter
> little whether the script procured one module or five.
>
>



Not everyone has cpanminus installed either. My approach in the
buildfarm code is to bend over backwards in order to avoid non-standard
modules. For the problem at hand we use cp/xcopy, but the tree being
copied is stable, so we don't run into the disappearing/changing file
problem.


cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Still another race condition in recovery TAP tests

From: Robert Haas
On Fri, Oct 6, 2017 at 5:57 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> Fewer people will test as we grow the list of modules they must first install.
>
> Meh, I don't buy that.

Well, I do.  Prerequisites are a pain, and the more of them there are,
the more pain it is.

> At worst, all we have to do is provide a script
> that fetches them, from distro repos if possible, and failing that
> from CPAN.
>
> With cpanminus, that's pretty darn simple too.

I think the idea that this is going to work for everybody is not very
realistic.  I am not going to run some random script to install stuff
on my system in whatever manner it feels like, because when I do stuff
like that I usually end up regretting it.  If it's a throw-away VM,
sure, but not otherwise.

We don't even have good documentation of what packages you should
install on various packaging systems to build the server, let alone
all of the additional things that may be required depending on build
options and TAP tests you might want to run.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Still another race condition in recovery TAP tests

From: Michael Paquier
On Tue, Oct 17, 2017 at 5:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Oct 6, 2017 at 5:57 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>>> Fewer people will test as we grow the list of modules they must first install.
>> At worst, all we have to do is provide a script
>> that fetches them, from distro repos if possible, and failing that
>> from CPAN.
>>
>> With cpanminus, that's pretty darn simple too.
>
> I think the idea that this is going to work for everybody is not very
> realistic.  I am not going to run some random script to install stuff
> on my system in whatever manner it feels like, because when I do stuff
> like that I usually end up regretting it.  If it's a throw-away VM,
> sure, but not otherwise.

Yeah, having something which is network-dependent to run the full set
of regression tests is not a good idea. I have seen users, at least in
Japan, deploying Postgres on hosts that have no external connection,
just a base OS image built with all packages present. Data center
rules can be very strict.

> We don't even have good documentation of what packages you should
> install on various packaging systems to build the server, let alone
> all of the additional things that may be required depending on build
> options and TAP tests you might want to run.

On top of that, we take the time to improve the modules we already
have in the code tree. Let's not forget that ;)
-- 
Michael

