Thread: Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
Michael Paquier <michael@paquier.xyz> writes:
> Add TAP test for archive_cleanup_command and recovery_end_command

grassquit just showed a non-reproducible failure in this test [1]:

# Postmaster PID for node "standby" is 291160
ok 1 - check content from archives
not ok 2 - archive_cleanup_command executed on checkpoint

#   Failed test 'archive_cleanup_command executed on checkpoint'
#   at t/002_archiving.pl line 74.

This test is sending a CHECKPOINT command to the standby and
expecting it to run the archive_cleanup_command, but it looks
like the standby did not actually run any checkpoint:

2022-04-07 16:11:33.060 UTC [291806][not initialized][:0] LOG:  connection received: host=[local]
2022-04-07 16:11:33.078 UTC [291806][client backend][2/15:0] LOG:  connection authorized: user=bf database=postgres application_name=002_archiving.pl
2022-04-07 16:11:33.084 UTC [291806][client backend][2/16:0] LOG:  statement: CHECKPOINT
2022-04-07 16:11:33.092 UTC [291806][client backend][:0] LOG:  disconnection: session time: 0:00:00.032 user=bf database=postgres host=[local]

I am suspicious that the reason is that ProcessUtility does not
ask for a forced checkpoint when in recovery:

            RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
                              (RecoveryInProgress() ? 0 : CHECKPOINT_FORCE));

The trouble with this theory is that this test has been there for
nearly six months and this is the first such failure (I scraped the
buildfarm logs to be sure).  Seems like failures should be a lot
more common than that.

I wondered if the recent pg_stats changes could have affected this,
but I don't really see how.

Thoughts?

			regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2022-04-07%2015%3A45%3A48
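For readers without the test open, the failing check boils down to
something like this (a simplified sketch, not the test's exact code;
the marker-file name and cleanup command are illustrative):

    # archive_cleanup_command drops a marker file when a restartpoint runs it.
    my $data_dir = $node_standby->data_dir;
    $node_standby->append_conf('postgresql.conf',
    	"archive_cleanup_command = 'touch $data_dir/archive_cleanup_ran'");
    $node_standby->reload;

    # A CHECKPOINT on the standby is expected to trigger a restartpoint,
    # which in turn runs the archive_cleanup_command.
    $node_standby->safe_psql('postgres', 'CHECKPOINT');
    ok( -f "$data_dir/archive_cleanup_ran",
    	'archive_cleanup_command executed on checkpoint');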
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Andres Freund
Hi,

On 2022-04-07 13:40:30 -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > Add TAP test for archive_cleanup_command and recovery_end_command
>
> grassquit just showed a non-reproducible failure in this test [1]:

I was just staring at that as well.

> # Postmaster PID for node "standby" is 291160
> ok 1 - check content from archives
> not ok 2 - archive_cleanup_command executed on checkpoint
>
> #   Failed test 'archive_cleanup_command executed on checkpoint'
> #   at t/002_archiving.pl line 74.
>
> This test is sending a CHECKPOINT command to the standby and
> expecting it to run the archive_cleanup_command, but it looks
> like the standby did not actually run any checkpoint:
>
> 2022-04-07 16:11:33.060 UTC [291806][not initialized][:0] LOG:  connection received: host=[local]
> 2022-04-07 16:11:33.078 UTC [291806][client backend][2/15:0] LOG:  connection authorized: user=bf database=postgres application_name=002_archiving.pl
> 2022-04-07 16:11:33.084 UTC [291806][client backend][2/16:0] LOG:  statement: CHECKPOINT
> 2022-04-07 16:11:33.092 UTC [291806][client backend][:0] LOG:  disconnection: session time: 0:00:00.032 user=bf database=postgres host=[local]
>
> I am suspicious that the reason is that ProcessUtility does not
> ask for a forced checkpoint when in recovery:
>
>             RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
>                               (RecoveryInProgress() ? 0 : CHECKPOINT_FORCE));
>
> The trouble with this theory is that this test has been there for
> nearly six months and this is the first such failure (I scraped the
> buildfarm logs to be sure).  Seems like failures should be a lot
> more common than that.

> I wondered if the recent pg_stats changes could have affected this, but I
> don't really see how.

I don't really see either. It's a bit more conceivable that the recovery
prefetching changes could affect the timing sufficiently?

It's also possible that it requires an animal of a certain speed to happen -
we didn't have an -fsanitize=address animal until recently.

I guess we'll have to wait and see what the frequency of the problem is?

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2022-04-07 13:40:30 -0400, Tom Lane wrote:
>> This test is sending a CHECKPOINT command to the standby and
>> expecting it to run the archive_cleanup_command, but it looks
>> like the standby did not actually run any checkpoint:
>> ...
>> I wondered if the recent pg_stats changes could have affected this, but I
>> don't really see how.

> I don't really see either. It's a bit more conceivable that the recovery
> prefetching changes could affect the timing sufficiently?

Oh, that's at least a little plausible.

> I guess we'll have to wait and see what the frequency of the problem is?

Yeah, with only one instance it could just be cosmic rays or something.
However, assuming it is real, I guess I wonder why we don't say
CHECKPOINT_FORCE in standby mode too.

			regards, tom lane
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Andres Freund
Hi,

On 2022-04-07 13:57:45 -0400, Tom Lane wrote:
> Yeah, with only one instance it could just be cosmic rays or something.
> However, assuming it is real, I guess I wonder why we don't say
> CHECKPOINT_FORCE in standby mode too.

I guess it might partially be that restartpoints require a checkpoint to have
happened on the primary. If we used FORCE, we'd have to wait till the next
checkpoint on the primary, which'd be a problem if it's e.g. a manually issued
CHECKPOINT before shutting the standby down.

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2022-04-07 13:57:45 -0400, Tom Lane wrote:
>> Yeah, with only one instance it could just be cosmic rays or something.

Not cosmic rays: skink has shown the same symptom three times running.
Looks like maybe the archive_cleanup_command itself is doing something
it shouldn't?

			regards, tom lane
Andres Freund <andres@anarazel.de> writes:
> On 2022-04-07 13:57:45 -0400, Tom Lane wrote:
>> Yeah, with only one instance it could just be cosmic rays or something.
>> However, assuming it is real, I guess I wonder why we don't say
>> CHECKPOINT_FORCE in standby mode too.

> I guess it might partially be that restartpoints require a checkpoint to have
> happened on the primary. If we used FORCE, we'd have to wait till the next
> checkpoint on the primary, which'd be a problem if it's e.g. a manually issued
> CHECKPOINT before shutting the standby down.

After seeing skink's results, I tried running that test under valgrind
here, and it fails just like that every time.  skink's history allows
us to bound the failure introduction between 79b716cfb7 and
d7ab2a9a3c, which I think makes it just about certain that it was
5dc0418fab (Prefetch data referenced by the WAL, take II), though I've
not bisected to be 100% sure.

Adding some debug printouts to ExecuteRecoveryCommand convinces me that
indeed the archive_cleanup_command is NOT getting called by the
problematic CHECKPOINT command.  I surmise based on Andres' comment
above that the standby isn't making a restartpoint for lack of an
available primary checkpoint, which looks to me like it could be a
pre-existing bug in the test case: it's sure not doing anything to
guarantee that the primary's checkpoint record has reached the standby.

I tried adjusting the patch so it does guarantee that (as attached),
and in two out of two tries it got past the archive_cleanup_command
failure but then hung up waiting for standby2 to promote.

On the whole, I'm not sure that the WAL prefetch logic is noticeably
more stable than when we booted it out last year :-(.  However, I also
wonder why it is that this test case wasn't occasionally failing
already.

			regards, tom lane

diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index c8f5ffbaf0..1032d8a388 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -44,13 +44,14 @@ $node_standby->start;
 
 # Create some content on primary
 $node_primary->safe_psql('postgres',
 	"CREATE TABLE tab_int AS SELECT generate_series(1,1000) AS a");
-my $current_lsn =
-  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
 
 # Note the presence of this checkpoint for the archive_cleanup_command
 # check done below, before switching to a new segment.
 $node_primary->safe_psql('postgres', "CHECKPOINT");
 
+my $current_lsn =
+  $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
+
 # Force archiving of WAL file to make it present on primary
 $node_primary->safe_psql('postgres', "SELECT pg_switch_wal()");
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Andres Freund
Hi,

On 2022-04-08 17:55:51 -0400, Tom Lane wrote:
> After seeing skink's results, I tried running that test under valgrind
> here, and it fails just like that every time.  skink's history allows
> us to bound the failure introduction between 79b716cfb7 and
> d7ab2a9a3c, which I think makes it just about certain that it was
> 5dc0418fab (Prefetch data referenced by the WAL, take II), though I've
> not bisected to be 100% sure.

I've tested it, it's 5dc0418fab that makes the difference. I reduced the
cycle time by making initdb not go through valgrind, but have normal
postgres instances go through it.

> On the whole, I'm not sure that the WAL prefetch logic is noticeably
> more stable than when we booted it out last year :-(.

IDK. Last year's issues seem to have largely been caused by a flaky
machine. And a bug, if it's that, in some archiving corner case that's not
normally reached during tests...

Greetings,

Andres Freund
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Andres Freund
Hi,

On 2022-04-08 17:55:51 -0400, Tom Lane wrote:
> I tried adjusting the patch so it does guarantee that (as attached),
> and in two out of two tries it got past the archive_cleanup_command
> failure but then hung up waiting for standby2 to promote.

Adding
    $node_standby->safe_psql('postgres', "SELECT pg_switch_wal()");
just after
    $node_standby2->start;
makes the tests pass here.

What is that second test really testing?

# Check the presence of temporary files specifically generated during
# archive recovery.  To ensure the presence of the temporary history
# file, switch to a timeline large enough to allow a standby to recover
# a history file from an archive.  As this requires at least two timeline
# switches, promote the existing standby first.  Then create a second
# standby based on the promoted one.  Finally, the second standby is
# promoted.

Note "Then create a second standby based on the promoted one." - but that's
not actually what's happening:

$node_standby2->init_from_backup($node_primary, $backup_name,
	has_restoring => 1);

It's created from the original primary, not the first standby, as the
description says...

Both nodes get promoted independently, in a run without valgrind:

standby:
2022-04-08 17:23:42.966 PDT [2463835][startup][1/0:0][] LOG:  INSERT @ 0/4000058:  - XLOG/END_OF_RECOVERY: tli 2; prev tli 1; time 2022-04-08 17:23:42.96686-07
2022-04-08 17:23:42.966 PDT [2463835][startup][1/0:0][] LOG:  xlog flush request 0/4000058; write 0/4000000; flush 0/4000000

standby2:
2022-04-08 17:23:43.307 PDT [2463999][startup][1/0:0][] LOG:  INSERT @ 0/4000058:  - XLOG/END_OF_RECOVERY: tli 3; prev tli 1; time 2022-04-08 17:23:43.307443-07
2022-04-08 17:23:43.307 PDT [2463999][startup][1/0:0][] LOG:  xlog flush request 0/4000058; write 0/4000000; flush 0/4000000

except that standby2 can't choose tli 2 because it finds it used.

Sure looks like something is funky with that test.

But I think there's also something funky in the prefetching logic. I think it
may attempt restoring during prefetching somehow, even though there's code
that appears to try to prevent that?

On standby2 I can see replay progress like the following:

2022-04-08 17:02:12.310 PDT [2441453][startup][1/0:0][] LOG:  REDO @ 0/3024488; LSN 0/30244C8: prev 0/3024448; xid 725; len 3; blkref #0: rel 1663/5/16384, blk 4 - Heap/INSERT: off 60 flags 0x00
2022-04-08 17:02:12.311 PDT [2441453][startup][1/0:0][] DEBUG:  record known xact 725 latestObservedXid 725
2022-04-08 17:02:12.311 PDT [2441453][startup][1/0:0][] CONTEXT:  WAL redo at 0/3024488 for Heap/INSERT: off 60 flags 0x00; blkref #0: rel 1663/5/16384, blk 4
2022-04-08 17:02:12.312 PDT [2441453][startup][1/0:0][] DEBUG:  executing restore command "cp "/home/andres/build/postgres/dev-assert/vpath/src/test/recovery/tmp_check/t_002_archiving_primary_data/archives/000000010000000000000004" "pg_wal/RECOVERYXLOG""
2022-04-08 17:02:13.855 PDT [2441453][startup][1/0:0][] DEBUG:  could not restore file "000000010000000000000004" from archive: child process exited with exit code 1
2022-04-08 17:02:13.855 PDT [2441453][startup][1/0:0][] DEBUG:  could not open file "pg_wal/000000010000000000000004": No such file or directory
2022-04-08 17:02:13.856 PDT [2441453][startup][1/0:0][] LOG:  REDO @ 0/30244C8; LSN 0/3024508: prev 0/3024488; xid 725; len 3; blkref #0: rel 1663/5/16384, blk 4 - Heap/INSERT: off 61 flags 0x00
2022-04-08 17:02:13.856 PDT [2441453][startup][1/0:0][] DEBUG:  record known xact 725 latestObservedXid 725
2022-04-08 17:02:13.856 PDT [2441453][startup][1/0:0][] CONTEXT:  WAL redo at 0/30244C8 for Heap/INSERT: off 61 flags 0x00; blkref #0: rel 1663/5/16384, blk 4
2022-04-08 17:02:13.857 PDT [2441453][startup][1/0:0][] DEBUG:  executing restore command "cp "/home/andres/build/postgres/dev-assert/vpath/src/test/recovery/tmp_check/t_002_archiving_primary_data/archives/000000010000000000000004" "pg_wal/RECOVERYXLOG""
2022-04-08 17:02:15.413 PDT [2441453][startup][1/0:0][] DEBUG:  could not restore file "000000010000000000000004" from archive: child process exited with exit code 1
2022-04-08 17:02:15.413 PDT [2441453][startup][1/0:0][] DEBUG:  could not open file "pg_wal/000000010000000000000004": No such file or directory
2022-04-08 17:02:15.414 PDT [2441453][startup][1/0:0][] LOG:  REDO @ 0/3024508; LSN 0/3024548: prev 0/30244C8; xid 725; len 3; blkref #0: rel 1663/5/16384, blk 4 - Heap/INSERT: off 62 flags 0x00
2022-04-08 17:02:15.414 PDT [2441453][startup][1/0:0][] DEBUG:  record known xact 725 latestObservedXid 725
2022-04-08 17:02:15.414 PDT [2441453][startup][1/0:0][] CONTEXT:  WAL redo at 0/3024508 for Heap/INSERT: off 62 flags 0x00; blkref #0: rel 1663/5/16384, blk 4
2022-04-08 17:02:15.415 PDT [2441453][startup][1/0:0][] DEBUG:  executing restore command "cp "/home/andres/build/postgres/dev-assert/vpath/src/test/recovery/tmp_check/t_002_archiving_primary_data/archives/000000010000000000000004" "pg_wal/RECOVERYXLOG""

note that we appear to wait between replaying of records, even though we
apparently have WAL for the next record!

And interestingly I'm not seeing the
"switched WAL source from stream to archive after failure"
lines I'd expect.

Greetings,

Andres Freund
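In context, the pg_switch_wal() addition mentioned at the top of this
message would land roughly like this (a sketch against 002_archiving.pl;
the surrounding lines are paraphrased rather than quoted exactly):

    # Start the second standby, then make the just-promoted standby switch
    # WAL segments so that its end-of-recovery record and the new timeline's
    # history file reach the archives promptly.
    $node_standby2->start;
    $node_standby->safe_psql('postgres', "SELECT pg_switch_wal()");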
On Sat, Apr 9, 2022 at 12:59 PM Andres Freund <andres@anarazel.de> wrote:
> On 2022-04-08 17:55:51 -0400, Tom Lane wrote:
> > I tried adjusting the patch so it does guarantee that (as attached),
> > and in two out of two tries it got past the archive_cleanup_command
> > failure but then hung up waiting for standby2 to promote.
>
> Adding
>     $node_standby->safe_psql('postgres', "SELECT pg_switch_wal()");
> just after
>     $node_standby2->start;
> makes the tests pass here.

Sorry for the delay... I got a bit confused about the different things
going on in this thread but I hope I've got it now:

1.  This test had some pre-existing bugs/races, which hadn't failed
before due to scheduling, even under Valgrind.  The above changes
appear to fix those problems.  To Michael for comment.

> What is that second test really testing?
>
> # Check the presence of temporary files specifically generated during
> # archive recovery.  To ensure the presence of the temporary history
> # file, switch to a timeline large enough to allow a standby to recover
> # a history file from an archive.  As this requires at least two timeline
> # switches, promote the existing standby first.  Then create a second
> # standby based on the promoted one.  Finally, the second standby is
> # promoted.
>
> Note "Then create a second standby based on the promoted one." - but that's
> not actually what's happening:

2.  There may also be other problems with the test but those aren't
relevant to skink's failure, which starts on the 5th test.  To Michael
for comment.

> But I think there's also something funky in the prefetching logic. I think it
> may attempt restoring during prefetching somehow, even though there's code
> that appears to try to prevent that?

3.  Urghl.  Yeah.  There is indeed code to report XLREAD_WOULDBLOCK for
the lastSourceFailed case, but we don't really want to do that more
often than every 5 seconds.  So I think I need to make that "sticky", so
that we don't attempt to prefetch again (ie to read from the next
segment, which invokes restore_command) until we've replayed all the
records we already have, then hit the end and go to sleep.  The attached
patch does that, and makes the offending test pass under Valgrind for
me, even without the other changes already mentioned.  If I understand
correctly, this is due to a timing race in the tests (though I didn't
check where exactly), because all those extra fork/exec calls are
extremely slow under Valgrind.

> And interestingly I'm not seeing the
> "switched WAL source from stream to archive after failure"
> lines I'd expect.

I see them now.  It's because it gives up when it's reading ahead
(nonblocking), which may not be strictly necessary but I found it
simpler to think about.  Then when it tries again in 5 seconds it's in
blocking mode so it doesn't give up so easily.

2022-04-11 18:15:08.220 NZST [524796] DEBUG:  switched WAL source from stream to archive after failure
cp: cannot stat '/tmp/archive/000000010000000000000017': No such file or directory
2022-04-11 18:15:08.226 NZST [524796] DEBUG:  could not restore file "000000010000000000000017" from archive: child process exited with exit code 1
2022-04-11 18:15:08.226 NZST [524796] DEBUG:  could not open file "pg_wal/000000010000000000000017": No such file or directory
2022-04-11 18:15:08.226 NZST [524796] DEBUG:  switched WAL source from archive to stream after failure
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Michael Paquier
On Mon, Apr 11, 2022 at 06:48:58PM +1200, Thomas Munro wrote:
> Sorry for the delay... I got a bit confused about the different things
> going on in this thread but I hope I've got it now:
>
> 1.  This test had some pre-existing bugs/races, which hadn't failed
> before due to scheduling, even under Valgrind.  The above changes
> appear to fix those problems.  To Michael for comment.

I have seen the thread, and there is a lot in it.  I will try to look
tomorrow at the parts I got involved in.
--
Michael
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Michael Paquier
On Mon, Apr 11, 2022 at 06:48:58PM +1200, Thomas Munro wrote:
> 1.  This test had some pre-existing bugs/races, which hadn't failed
> before due to scheduling, even under Valgrind.  The above changes
> appear to fix those problems.  To Michael for comment.

Yeah, there are two problems here.  From what I can see, ensuring the
execution of archive_cleanup_command on the standby needs the
checkpoint on the primary and the restart point on the standby.  So
pg_current_wal_lsn() should be located after the primary's checkpoint
and not before it, so that we are sure that the checkpoint record
finds its way to the standby.  That's what Tom mentioned upthread.

The second problem is to make sure that $standby2 sees the promotion
of $standby and its history file, but we also want to recover
00000002.history from some archives to create a RECOVERYHISTORY at
recovery for the purpose of the test.  Switching to a new segment as
proposed by Andres does not seem completely right to me because we
are not 100% sure of the ordering in which archiving is going to
happen, no?  I think that the logic to create $standby2 from the
initial backup of the primary is right, because there is no
00000002.history in it, but we also need to be sure that
00000002.history has been archived once the promotion of $standby is
done.  This can be validated thanks to the logs, actually.

>> What is that second test really testing?
>>
>> # Check the presence of temporary files specifically generated during
>> # archive recovery.  To ensure the presence of the temporary history
>> # file, switch to a timeline large enough to allow a standby to recover
>> # a history file from an archive.  As this requires at least two timeline
>> # switches, promote the existing standby first.  Then create a second
>> # standby based on the promoted one.  Finally, the second standby is
>> # promoted.
>>
>> Note "Then create a second standby based on the promoted one." - but that's
>> not actually what's happening:

> 2.  There may also be other problems with the test but those aren't
> relevant to skink's failure, which starts on the 5th test.  To Michael
> for comment.

This comes from df86e52, where we want to recover a history file that
would be created as RECOVERYHISTORY and make sure that the file gets
removed at the end of recovery.  So $standby2 should choose a new
timeline different from the one chosen by $standby.  Looking back at
what has been done, it seems to me that the comment is the incorrect
part:
https://www.postgresql.org/message-id/20190930080340.GO2888@paquier.xyz

All that stuff leads me to the attached.  Thoughts?
--
Michael
On Tue, Apr 12, 2022 at 3:49 PM Michael Paquier <michael@paquier.xyz> wrote:
> All that stuff leads me to the attached.  Thoughts?

Under valgrind I got "Undefined subroutine &main::usleep called at
t/002_archiving.pl line 103" so I added "use Time::HiRes qw(usleep);",
and now I get past the first 4 tests with your patch, but then
promotion times out, not sure why:

+++ tap check in src/test/recovery +++
t/002_archiving.pl ..
ok 1 - check content from archives
ok 2 - archive_cleanup_command executed on checkpoint
ok 3 - recovery_end_command not executed yet
# found 00000002.history after 14 attempts
ok 4 - recovery_end_command executed after promotion
Bailout called.  Further testing stopped:  command "pg_ctl -D /home/tmunro/projects/postgresql/src/test/recovery/tmp_check/t_002_archiving_standby2_data/pgdata -l /home/tmunro/projects/postgresql/src/test/recovery/tmp_check/log/002_archiving_standby2.log promote" exited with value 1

Since it's quite painful to run TAP tests under valgrind, I found a
place to stick a plain old sleep to repro these problems:

--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -1035,7 +1035,7 @@ sub enable_restoring
 	my $copy_command =
 	  $PostgreSQL::Test::Utils::windows_os
 	  ? qq{copy "$path\\\\%f" "%p"}
-	  : qq{cp "$path/%f" "%p"};
+	  : qq{sleep 1 && cp "$path/%f" "%p"};

Soon I'll push the fix to the slowness that xlogprefetcher.c
accidentally introduced to continuous archive recovery, ie the problem
of calling a failing restore_command repeatedly as we approach the end
of a WAL segment instead of just once every 5 seconds after we run out
of data, and after that you'll probably need to revert that fix locally
to repro this.
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Michael Paquier
On Sun, Apr 17, 2022 at 08:56:33AM +1200, Thomas Munro wrote:
> Under valgrind I got "Undefined subroutine &main::usleep called at
> t/002_archiving.pl line 103" so I added "use Time::HiRes qw(usleep);",
> and now I get past the first 4 tests with your patch, but then
> promotion times out, not sure why:
>
> +++ tap check in src/test/recovery +++
> t/002_archiving.pl ..
> ok 1 - check content from archives
> ok 2 - archive_cleanup_command executed on checkpoint
> ok 3 - recovery_end_command not executed yet
> # found 00000002.history after 14 attempts
> ok 4 - recovery_end_command executed after promotion
> Bailout called.  Further testing stopped:  command "pg_ctl -D
> /home/tmunro/projects/postgresql/src/test/recovery/tmp_check/t_002_archiving_standby2_data/pgdata
> -l /home/tmunro/projects/postgresql/src/test/recovery/tmp_check/log/002_archiving_standby2.log
> promote" exited with value 1

Hmm.  As far as I can see, aren't you just hitting the 60s timeout of
pg_ctl here due to the slowness of valgrind?

> Since it's quite painful to run TAP tests under valgrind, I found a
> place to stick a plain old sleep to repro these problems:

Actually, I am wondering how you are patching Cluster.pm to do that.

> Soon I'll push the fix to the slowness that xlogprefetcher.c
> accidentally introduced to continuous archive recovery, ie the problem
> of calling a failing restore_command repeatedly as we approach the end
> of a WAL segment instead of just once every 5 seconds after we run out
> of data, and after that you'll probably need to revert that fix locally
> to repro this.

Okay.  Thanks.

Anyway, I'll do something about that tomorrow (no room to look at the
buildfarm today), and I was thinking about replacing the while loop I
had in the last version of the patch with a poll_query_until() that
does a pg_stat_file() with an absolute path to the history file to
avoid the dependency on usleep() in the test, splitting the fix into
two commits as there is more than one problem, each applying to
different branches.
--
Michael
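A minimal sketch of that polling approach, assuming $archive_dir points
at the archive directory the test uses (the variable name and exact
query are illustrative, not necessarily what the final patch does):

    # Wait until the standby's promotion has archived the new timeline's
    # history file.  pg_stat_file() with missing_ok => true returns NULLs
    # while the file is absent, so the query flips to true once it shows up.
    $node_standby->poll_query_until('postgres',
    	"SELECT size IS NOT NULL FROM pg_stat_file('$archive_dir/00000002.history', true)")
      or die "timed out waiting for 00000002.history to be archived";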
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Andrew Dunstan
On 2022-04-17 Su 00:17, Michael Paquier wrote:
>> Since it's quite painful to run TAP tests under valgrind, I found a
>> place to stick a plain old sleep to repro these problems:

> Actually, I am wondering how you are patching Cluster.pm to do that.

I don't really think it's Cluster.pm's business to deal with that.  It
takes an install path as given either explicitly or implicitly.

It shouldn't be too hard to get Makefile.global to install valgrind
wrappers into the tmp_install/bin directory.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Michael Paquier
On Sun, Apr 17, 2022 at 10:56:08AM -0400, Andrew Dunstan wrote:
> I don't really think it's Cluster.pm's business to deal with that.  It
> takes an install path as given either explicitly or implicitly.
>
> It shouldn't be too hard to get Makefile.global to install valgrind
> wrappers into the tmp_install/bin directory.

Or what gets used is just a wrapper of the contents of bin/ that gets
enforced to be first in PATH?
--
Michael
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Michael Paquier
On Tue, Apr 12, 2022 at 12:49:48PM +0900, Michael Paquier wrote:
> This comes from df86e52, where we want to recover a history file that
> would be created as RECOVERYHISTORY and make sure that the file gets
> removed at the end of recovery.  So $standby2 should choose a new
> timeline different from the one chosen by $standby.  Looking back
> at what has been done, it seems to me that the comment is the
> incorrect part:
> https://www.postgresql.org/message-id/20190930080340.GO2888@paquier.xyz

acf1dd42 has taken care of the failures of this test with skink, and I
have just taken care of the two races in the tests with e61efaf and
1a8b110.  I have left e61efaf out of REL_10_STABLE as the idea of
relying on a poll_query_until() with pg_stat_file() and an absolute
path would not work there, and the branch will be EOL'd soon while
there were no complaints about this test for two years.
--
Michael
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Andrew Dunstan
On 2022-04-17 Su 19:49, Michael Paquier wrote:
> On Sun, Apr 17, 2022 at 10:56:08AM -0400, Andrew Dunstan wrote:
>> I don't really think it's Cluster.pm's business to deal with that.  It
>> takes an install path as given either explicitly or implicitly.
>>
>> It shouldn't be too hard to get Makefile.global to install valgrind
>> wrappers into the tmp_install/bin directory.

> Or what gets used is just a wrapper of the contents of bin/ that gets
> enforced to be first in PATH?

That seems likely to be difficult.  For example pg_ctl might find its
colocated postgres rather than the valgrind wrapper.  Ditto initdb.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Mon, Apr 18, 2022 at 11:49 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Sun, Apr 17, 2022 at 10:56:08AM -0400, Andrew Dunstan wrote:
> > I don't really think it's Cluster.pm's business to deal with that.  It
> > takes an install path as given either explicitly or implicitly.
> >
> > It shouldn't be too hard to get Makefile.global to install valgrind
> > wrappers into the tmp_install/bin directory.
>
> Or what gets used is just a wrapper of the contents of bin/ that gets
> enforced to be first in PATH?

Delayed response to the question on how I did that, because it was a 4
day weekend down here and I got distracted by sunshine...

A horribly slow way to do it is to build with -DUSE_VALGRIND and then
just run the whole process tree (eg make, perl, psql, ... and all)
under valgrind with --trace-children=yes:

tmunro@x1:~/projects/postgresql$ valgrind --quiet --suppressions=`pwd`/src/tools/valgrind.supp --trace-children=yes --track-origins=yes --run-libc-freeres=no --vgdb=no --error-markers=VALGRINDERROR-BEGIN,VALGRINDERROR-END make -C src/test/recovery/ check PROVE_TESTS=t/002_* PROVE_FLAGS=-v

I think that sort of thing actually worked when I tried it on a beefier
workstation, but it sent my Thinkpad that "only" has 16GB of RAM into
some kind of death spiral.  The way I succeeded was indeed using a
wrapper script, based on a suggestion from Andres, my
kludgy-hardcoded-path-assuming implementation of which looked like:

=== install-postgres-valgrind, to be run once after building ===
#!/bin/sh
SRC=$HOME/projects/postgresql
# move the real binary out of the way
mv $SRC/src/backend/postgres $SRC/src/backend/postgres.real
# install the wrapper, in the location it'll be copied from for tmp_install
cp postgres.valgrind $SRC/src/backend/postgres
===

=== postgres.valgrind wrapper script ===
#!/bin/sh
exec /usr/bin/valgrind \
    --quiet \
    --error-exitcode=128 \
    --suppressions=$HOME/projects/postgresql/src/tools/valgrind.supp \
    --trace-children=yes --track-origins=yes --read-var-info=no \
    --leak-check=no \
    --run-libc-freeres=no \
    --vgdb=no \
    --error-markers=VALGRINDERROR-BEGIN,VALGRINDERROR-END \
    $HOME/projects/postgresql/src/backend/postgres.real \
    "$@"
===

Then just:

make -C src/test/recovery/ check PROVE_TESTS=t/002_* PROVE_FLAGS=-v

Yeah, it might be quite neat to find a tool-supported way to do that.

Tangentially, I'd also like to look into making
PostgreSQL-under-Valgrind work on FreeBSD and macOS, which didn't work
last time I tried it for reasons that might, I hope, have been fixed on
the Valgrind side by now.
Re: pgsql: Add TAP test for archive_cleanup_command and recovery_end_comman
From: Michael Paquier
On Tue, Apr 19, 2022 at 09:45:11AM +1200, Thomas Munro wrote:
> Delayed response to the question on how I did that, because it was a 4
> day weekend down here and I got distracted by sunshine...

Happy Easter.

> I think that sort of thing actually worked when I tried it on a beefier
> workstation, but it sent my Thinkpad that "only" has 16GB of RAM into
> some kind of death spiral.  The way I succeeded was indeed using a
> wrapper script, based on a suggestion from Andres, my
> kludgy-hardcoded-path-assuming implementation of which looked like:
>
> Yeah, it might be quite neat to find a tool-supported way to do that.

Thanks for the details.  I feared that it was something like that for
the backend.  At least that's better than having valgrind spawn all the
processes kicked by the make command. :/

> Tangentially, I'd also like to look into making
> PostgreSQL-under-Valgrind work on FreeBSD and macOS, which didn't work
> last time I tried it for reasons that might, I hope, have been fixed on
> the Valgrind side by now.

Okay.  As a side note, skink has cooled down since acf1dd4, and did not
complain either after the additions of e61efaf and 1a8b110.
--
Michael