Thread: pgsql: Fix handling of WAL segments ready to be archived during crash r

pgsql: Fix handling of WAL segments ready to be archived during crash r

From
Michael Paquier
Date:
Fix handling of WAL segments ready to be archived during crash recovery

78ea8b5 has fixed an issue related to the recycling of WAL segments on
standbys depending on archive_mode.  However, it has introduced a
regression with the handling of WAL segments ready to be archived during
crash recovery, causing those files to be recycled without getting
archived.

This commit fixes the regression by tracking in shared memory if a live
cluster is either in crash recovery or archive recovery as the handling
of WAL segments ready to be archived is different in both cases (those
WAL segments should not be removed during crash recovery), and by using
this new shared memory state to decide if a segment can be recycled or
not.  Previously, it was not possible to know if a cluster was in crash
recovery or archive recovery as the shared state was able to track only
if recovery was happening or not, leading to the problem.

A set of TAP tests is added to close the gap here, making sure that WAL
segments ready to be archived are correctly handled when a cluster is in
archive or crash recovery with archive_mode set to "on" or "always", for
both standby and primary.

Reported-by: Benoît Lobréau
Author: Jehan-Guillaume de Rorthais
Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/20200331172229.40ee00dc@firost
Backpatch-through: 9.5

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/4e87c4836ab9059cdec17b0a288db3622a42ac18

Modified Files
--------------
src/backend/access/transam/xlog.c         |  57 ++++++--
src/backend/access/transam/xlogarchive.c  |  21 ++-
src/include/access/xlog.h                 |   9 ++
src/test/recovery/t/020_archive_status.pl | 214 ++++++++++++++++++++++++++++++
src/tools/pgindent/typedefs.list          |   1 +
5 files changed, 286 insertions(+), 16 deletions(-)


Re: pgsql: Fix handling of WAL segments ready to be archived duringcrash r

From
Michael Paquier
Date:
On Thu, Apr 23, 2020 at 11:51:16PM +0000, Michael Paquier wrote:
> Fix handling of WAL segments ready to be archived during crash recovery
>
> 78ea8b5 has fixed an issue related to the recycling of WAL segments on
> standbys depending on archive_mode.  However, it has introduced a
> regression with the handling of WAL segments ready to be archived during
> crash recovery, causing those files to be recycled without getting
> archived.

And the buildfarm is reporting a couple of failures related to the
stability of the test:
- The first reports show that on REL9_5_STABLE and REL9_6_STABLE, the
first round has showed to be rather stable even on Windows, except for
three animals using gcc 6.3.
- In 11~, crake and piculet have been complaining.

I have not been able to see any failures in the runs I did across all
the branches, even on Windows.  But here, all failures are related to
the three following tests on standbys:
not ok 8 - .ready file for WAL segment 000000010000000000000001
present in backup got removed with archive_mode=on on standby
not ok 10 - .done file for WAL segment 000000010000000000000002
created when archive_mode=on on standby
not ok 12 - .ready file for WAL segment 000000010000000000000002
created with archive_mode=always on standby

And this visibly comes down to the fact that we don't take care enough
of the timing between the restartpoints done, the startup process
doing its recycling work and the archiver.  The rest of the test
relies on the reports of pg_stat_archiver a points to wait at as
published by the archiver process.  So there are two things we could
do here:
1) Just remove the unstable parts of the tests (the three ones above),
and keep coverage based on everything we have using pg_stat_archiver.
2) Remove the test entirely, though I would rather have us keep some
coverage, particularly for primaries as this got broken.

I'd rather do 2), any thoughts?
--
Michael

Attachment

Re: pgsql: Fix handling of WAL segments ready to be archived duringcrash r

From
Michael Paquier
Date:
On Fri, Apr 24, 2020 at 09:59:29AM +0900, Michael Paquier wrote:
> And this visibly comes down to the fact that we don't take care enough
> of the timing between the restartpoints done, the startup process
> doing its recycling work and the archiver.  The rest of the test
> relies on the reports of pg_stat_archiver a points to wait at as
> published by the archiver process.  So there are two things we could
> do here:
> 1) Just remove the unstable parts of the tests (the three ones above),
> and keep coverage based on everything we have using pg_stat_archiver.
> 2) Remove the test entirely, though I would rather have us keep some
> coverage, particularly for primaries as this got broken.
>
> I'd rather do 2), any thoughts?

Oops, sorry.  I sent this message too quickly.  I would rather
actually do 1) and keep the major parts of the tests.  All the
buildfarm failures are just around the three checks mentioned
upthread.
--
Michael

Attachment

Re: pgsql: Fix handling of WAL segments ready to be archivedduring crash r

From
Kyotaro Horiguchi
Date:
At Fri, 24 Apr 2020 10:14:37 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> On Fri, Apr 24, 2020 at 09:59:29AM +0900, Michael Paquier wrote:
> > And this visibly comes down to the fact that we don't take care enough
> > of the timing between the restartpoints done, the startup process
> > doing its recycling work and the archiver.  The rest of the test
> > relies on the reports of pg_stat_archiver a points to wait at as
> > published by the archiver process.  So there are two things we could
> > do here:
> > 1) Just remove the unstable parts of the tests (the three ones above),
> > and keep coverage based on everything we have using pg_stat_archiver.
> > 2) Remove the test entirely, though I would rather have us keep some
> > coverage, particularly for primaries as this got broken.
> > 
> > I'd rather do 2), any thoughts?
> 
> Oops, sorry.  I sent this message too quickly.  I would rather
> actually do 1) and keep the major parts of the tests.  All the
> buildfarm failures are just around the three checks mentioned
> upthread.

Thanks for your trouble fixing the failures. 

I think we can reimplement them by waiting
pg_stat_archiver.last_failed_wal at least for archive_mode=always
case. I'm not sure about the case where archive_mode=on, though.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pgsql: Fix handling of WAL segments ready to be archived duringcrash r

From
Michael Paquier
Date:
On Fri, Apr 24, 2020 at 05:30:21PM +0900, Kyotaro Horiguchi wrote:
> I think we can reimplement them by waiting
> pg_stat_archiver.last_failed_wal at least for archive_mode=always
> case. I'm not sure about the case where archive_mode=on, though.

Yeah, the second case is more tricky, and there is no point to add a
poll query using pg_stat_file() on a .ready file either.
--
Michael

Attachment

Re: pgsql: Fix handling of WAL segments ready to be archived duringcrash r

From
Jehan-Guillaume de Rorthais
Date:
On Sat, 25 Apr 2020 09:31:14 +0900
Michael Paquier <michael@paquier.xyz> wrote:

> On Fri, Apr 24, 2020 at 05:30:21PM +0900, Kyotaro Horiguchi wrote:
> > I think we can reimplement them by waiting
> > pg_stat_archiver.last_failed_wal at least for archive_mode=always
> > case. I'm not sure about the case where archive_mode=on, though.  
> 
> Yeah, the second case is more tricky, and there is no point to add a
> poll query using pg_stat_file() on a .ready file either.

I suggested a patch in the original thread, waiting for the LSN gathered on 
the primary.