Re: New WAL record to detect the checkpoint redo location - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: New WAL record to detect the checkpoint redo location
Date
Msg-id CAA4eK1Kv68f7WegbJF=KK8HAROnHm6EuZ=xL9_130epqw5Q-Ew@mail.gmail.com
Whole thread Raw
In response to Re: New WAL record to detect the checkpoint redo location  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: New WAL record to detect the checkpoint redo location
List pgsql-hackers
On Thu, Sep 21, 2023 at 7:05 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Sep 18, 2023 at 2:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > I've been brainstorming about this today, trying to figure out some
> > ideas to make it work.
>
> Here are some patches.
>
> 0001 refactors XLogInsertRecord to unify a couple of separate tests of
> isLogSwitch, hopefully making it cleaner and cheaper to add more
> special cases.
>
> 0002 is a very minimal patch to add XLOG_CHECKPOINT_REDO without using
> it for anything.
>
> 0003 modifies CreateCheckPoint() to insert an XLOG_CHECKPOINT_REDO
> record for any non-shutdown checkpoint, and modifies
> XLogInsertRecord() to treat that as a new special case, wherein after
> inserting the record the redo pointer is reset while still holding the
> WAL insertion locks.
>

After the 0003 patch, do we need acquire exclusive lock via
WALInsertLockAcquireExclusive() for non-shutdown checkpoints. Even the
comment "We must block concurrent insertions while examining insert
state to determine the checkpoint REDO pointer." seems to indicate
that it is not required. If it is required then we may want to change
the comments and also acquiring the locks twice will have more cost
than acquiring it once and write the new WAL record under that lock.

One minor comment:
+ else if (XLOG_CHECKPOINT_REDO)
+ class = WALINSERT_SPECIAL_CHECKPOINT;
+ }

Isn't the check needs to compare the record type with info?

Your v6-0001* patch looks like an improvement to me even without the
other two patches.

BTW, I would like to mention that there is a slight interaction of
this work with the patch to upgrade/migrate slots [1]. Basically in
[1], to allow slots migration from lower to higher version, we need to
ensure that all the WAL has been consumed by the slots before clean
shutdown. However, during upgrade we can generate few records like
checkpoint which we will ignore for the slot consistency checking as
such records doesn't matter for data consistency after upgrade. We
probably need to add this record to that list. I'll keep an eye on
both the patches so that we don't miss that interaction but mentioned
it here to make others also aware of the same.

[1] -
https://www.postgresql.org/message-id/TYAPR01MB586615579356A84A8CF29A00F5F9A%40TYAPR01MB5866.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Next
From: Maxim Orlov
Date:
Subject: Re: should frontend tools use syncfs() ?