Home > mailing lists

Re: needless complexity in StartupXLOG - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: needless complexity in StartupXLOG
Date	July 26, 2021 22:53:20
Msg-id	CA+TgmoaKy9RRm-zaeFF=UCwR7J=R0xEbVvQ-WLR=0mAbOZ2P+A@mail.gmail.com Whole thread Raw
In response to	Re: needless complexity in StartupXLOG (Stephen Frost <sfrost@snowman.net>)
Responses	Re: needless complexity in StartupXLOG Re: needless complexity in StartupXLOG
List	pgsql-hackers

Tree view

On Mon, Jul 26, 2021 at 1:32 PM Stephen Frost <sfrost@snowman.net> wrote:
> Yeah, tend to agree with this too ... but something I find a bit curious
> is the comment:
>
> * Insert a special WAL record to mark the end of
> * recovery, since we aren't doing a checkpoint.
>
> ... immediately after setting promoted = true, and then at the end of
> StartupXLOG() having:
>
> if (promoted)
>         RequestCheckpoint(CHECKPOINT_FORCE);
>
> maybe I'm missing something, but seems like that comment isn't being
> terribly clear.  Perhaps we aren't doing a full checkpoint *there*, but
> sure looks like we're going to do one moments later regardless of
> anything else since we've set promoted to true..?

Yep. So it's a question of whether we allow operations that might
write WAL in the meantime. When we write the checkpoint record right
here, there can't be any WAL from the new server lifetime until the
checkpoint completes. When we write an end-of-recovery record, there
can. And there could actually be quite a bit, because if we do the
checkpoint right in this section of code, it will be a fast
checkpoint, whereas in the code you quoted above, it's a spread
checkpoint, which takes a lot longer. So the question is whether it's
reasonable to give the checkpoint some time to complete or whether it
needs to be completed right now.

> Agreed that simpler logic is better, provided it's correct logic, of
> course.  Finding better ways to test all of this would be really nice.

Yeah, and there again, it's a lot easier to test if we don't have as
many cases. Now no single change is going to fix that, but the number
of flag variables in xlog.c is simply bonkers. This particular stretch
of code uses 3 of them to even decide whether to attempt the test in
question, and all of those are set in complex ways depending on the
values of still more flag variables. The comments where
bgwriterLaunched is set claim that we only do this during archive
recovery, not crash recovery, but the code depends on the value of
ArchiveRecoveryRequested, not InArchiveRecovery. So I wonder if we
can't get the bgwriter to run even during crash recovery in the
scenario described by:

         * It's possible that archive recovery was requested, but we don't
         * know how far we need to replay the WAL before we reach consistency.
         * This can happen for example if a base backup is taken from a
         * running server using an atomic filesystem snapshot, without calling
         * pg_start/stop_backup. Or if you just kill a running primary server
         * and put it into archive recovery by creating a recovery signal
         * file.

If we ran the bgwriter all the time during crash recovery, we'd know
for sure whether that causes any problems. If we only do it like this
in certain corner cases, it's much more likely that we have bugs.
Grumble, grumble.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Alvaro Herrera
Date: 26 July 2021, 22:35:29
Subject: Re: Some code cleanup for pgbench and pg_verifybackup

From: Tom Lane
Date: 26 July 2021, 22:56:01
Subject: Re: [PATCH] proposal for regexp_count, regexp_instr, regexp_substr and regexp_replace

Re: needless complexity in StartupXLOG - Mailing list pgsql-hackers

Previous

Next