Re: pgsql: Fix race in test of pg_switch_wal(). - Mailing list pgsql-committers

From Michael Paquier
Subject Re: pgsql: Fix race in test of pg_switch_wal().
Msg-id 20201007023150.GE2256@paquier.xyz
In response to Re: pgsql: Fix race in test of pg_switch_wal().  (Noah Misch <noah@leadboat.com>)
List pgsql-committers
On Tue, Oct 06, 2020 at 07:03:27PM -0700, Noah Misch wrote:
> There's a new 020_archive_status.pl failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2020-10-05%2023%3A02%3A17
>
> Would you like to diagnose/fix that one?

Wow, thanks.  This does not look like an issue coming directly from
the test though:
2020-10-06 00:20:46.786 UTC [20906622:8] LOG:  restored log file "000000010000000000000003" from archive
2020-10-06 00:20:46.803 UTC [10748670:1] ERROR:  could not open file "pg_xlog/000000010000000000000003": No such file or directory
2020-10-06 00:20:46.880 UTC [21496712:4] psql ERROR:  checkpoint request failed
2020-10-06 00:20:46.880 UTC [21496712:5] psql HINT:  Consult recent messages in the server log for details.
[...]
error running SQL: 'psql:<stdin>:1: ERROR:  checkpoint request failed
HINT:  Consult recent messages in the server log for details.'

And it looks like a race condition between the checkpointer and the
startup process.  This failure involves the first checkpoint triggered
in $standby2 after it gets created, with this standby reaching a
consistent point before triggering a manual restartpoint.  That's a
bit strange though: the startup process considers this segment
restored, but the checkpointer complains that it does not actually
exist, which contradicts what the startup process tells us.  :/
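
For anyone wanting to check other buildfarm logs for the same pattern, here
is a rough sketch of a scan that flags a segment reported as restored and
later reported as missing.  The function name and the two regular
expressions are illustrative assumptions modeled on the two log lines
quoted above (with the wrapped line joined); this is not part of the test
suite or of any PostgreSQL tool.

```python
import re

# Patterns modeled on the two log messages quoted in this mail (assumption:
# other logs use the same wording and 24-hex-digit segment names).
RESTORED = re.compile(r'restored log file "([0-9A-F]{24})" from archive')
MISSING = re.compile(
    r'could not open file "[^"]*/([0-9A-F]{24})": No such file or directory'
)


def find_contradictions(lines):
    """Return segments logged as restored and later logged as missing."""
    restored = set()
    contradictions = []
    for line in lines:
        m = RESTORED.search(line)
        if m:
            restored.add(m.group(1))
            continue
        m = MISSING.search(line)
        if m and m.group(1) in restored:
            contradictions.append(m.group(1))
    return contradictions


sample = [
    '2020-10-06 00:20:46.786 UTC [20906622:8] LOG:  restored log file '
    '"000000010000000000000003" from archive',
    '2020-10-06 00:20:46.803 UTC [10748670:1] ERROR:  could not open file '
    '"pg_xlog/000000010000000000000003": No such file or directory',
]
print(find_contradictions(sample))
```

A hit only shows that the two messages appear in that order for the same
segment; it does not say which process is at fault.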
--
Michael
