Re: last_archived_wal is not necessary the latest WAL file (was Re: pgsql: Add test case for an archive recovery corner case.) - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: last_archived_wal is not necessary the latest WAL file (was Re: pgsql: Add test case for an archive recovery corner case.)
Date
Msg-id YrmHKCOBA1tR9PFy@paquier.xyz
In response to Re: last_archived_wal is not necessary the latest WAL file (was Re: pgsql: Add test case for an archive recovery corner case.)  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Mon, Jun 27, 2022 at 12:04:57AM -0700, Noah Misch wrote:
> For me, it reproduces consistently with a sleep just before the startup
> process exits:

Nice catch.

> One can adapt the test to the server behavior by having the test wait for the
> archiver to start, as attached.  This is sufficient to make check-world pass
> with the above sleep in place.  I think we should also modify the PostgresNode
> archive_command to log a message.  That lack of logging was an obstacle
> upthread (as seen in commit 3279cef) and again here.

          ? qq{copy "%p" "$path\\\\%f"}
-         : qq{cp "%p" "$path/%f"};
+         : qq{echo >&2 "ARCHIVE_COMMAND %p"; cp "%p" "$path/%f"};

This is a bit inelegant.  Perhaps it would be better done through a Perl
wrapper like cp_history_files?
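
For illustration, here is a minimal sketch of what such a wrapper might
look like.  The sub name archive_with_log and the way the script would be
wired into archive_command are assumptions, not part of the posted patch;
it only mirrors what the one-liner above does (log to stderr, then copy).

```perl
#!/usr/bin/perl
# Hypothetical archive_command wrapper (an assumption for illustration,
# not a file in the PostgreSQL tree).  It logs each invocation to stderr
# and then copies the WAL segment to the archive location.
use strict;
use warnings;
use File::Copy qw(copy);

sub archive_with_log
{
	my ($source, $target) = @_;
	die "wrong number of arguments"
	  unless @_ == 2 && defined $source && defined $target;

	# Messages written to stderr by archive_command show up in the
	# server log, which is what makes the archiver activity visible.
	print STDERR "ARCHIVE_COMMAND $source\n";

	# File::Copy::copy returns false on failure; dying gives a non-zero
	# exit status, which the archiver treats as a failed attempt.
	copy($source, $target)
	  or die "could not copy $source to $target: $!";
	return 0;
}

# When run as a script: <wrapper> "%p" "/path/to/archive/%f"
exit(archive_with_log(@ARGV)) if @ARGV;
```

Being plain Perl, the same script could in principle serve both branches
of the archive_command conditional instead of only the non-Windows one.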

> An alternative would be to declare that the test is right and the server is
> wrong.  The postmaster knows how to start the checkpointer if the checkpointer
> is not running when the postmaster needs a shutdown checkpoint.  It could
> start the archiver around that same area:
>
>                 /* Start the checkpointer if not running */
>                 if (CheckpointerPID == 0)
>                     CheckpointerPID = StartCheckpointer();
>                 /* And tell it to shut down */
>                 if (CheckpointerPID != 0)
>                 {
>                     signal_child(CheckpointerPID, SIGUSR2);
>                     pmState = PM_SHUTDOWN;
>                 }
>
> Any opinions between the change-test and change-server approaches?

The startup sequence can sometimes be tricky.  Though I don't have a
specific argument in mind, I would stick to a fix in the test.
--
Michael
