Re: Impact of checkpointer during pg_upgrade - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Impact of checkpointer during pg_upgrade
Date
Msg-id CAA4eK1JKycamKLaepB-gh2v__mDRtZTp+GKZZmOz28RHdEQoNA@mail.gmail.com
Whole thread Raw
In response to Re: Impact of checkpointer during pg_upgrade  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Impact of checkpointer during pg_upgrade
Re: Impact of checkpointer during pg_upgrade
List pgsql-hackers
On Thu, Sep 7, 2023 at 12:55 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> (Just dumping what I have in mind while reading the thread.)
>
> On Sat, Sep 02, 2023 at 10:08:51AM +0530, Amit Kapila wrote:
> > During pg_upgrade, we start the server for the old cluster which can
> > allow the checkpointer to remove the WAL files. It has been noticed
> > that we do generate certain types of WAL records (e.g
> > XLOG_RUNNING_XACTS, XLOG_CHECKPOINT_ONLINE, and XLOG_FPI_FOR_HINT)
> > even during pg_upgrade for old cluster, so additional WAL records
> > could let checkpointer decide that certain WAL segments can be removed
> > (e.g. say wal size crosses max_slot_wal_keep_size_mb) and invalidate
> > the slots. Currently, I can't see any problem with this but for future
> > work where we want to migrate logical slots during an upgrade[1], we
> > need to decide what to do for such cases. The initial idea we had was
> > that if the old cluster has some invalid slots, we won't allow an
> > upgrade unless the user removes such slots or uses some option like
> > --exclude-slots. It is quite possible that slots got invalidated
> > during pg_upgrade due to no user activity. Now, even though the
> > possibility of the same is less I think it is worth considering what
> > should be the behavior.
> >
> > The other possibilities apart from not allowing an upgrade in such a
> > case could be (a) Before starting the old cluster, we fetch the slots
> > directly from the disk using some tool like [2] and make the decisions
> > based on that state; (b) During the upgrade, we don't allow WAL to be
> > removed if it can invalidate slots; (c) Copy/Migrate the invalid slots
> > as well but for that, we need to expose an API to invalidate the
> > slots; (d) somehow distinguish the slots that are invalidated during
> > an upgrade and then simply copy such slots because anyway we ensure
> > that all the WAL required by slot is sent before shutdown.
>
> Checking for any invalid slots at the beginning of the upgrade and
> complain sounds like a good thing to do, because these are not
> expected to begin with, no?  That's similar to a pre-check requirement
> that the slots should have fed the subscribers with all the data
> available up to the shutdown checkpoint when the publisher was stopped
> before running pg_upgrade.  So (a) is a good idea to prevent an
> upgrade based on a state we don't expect from the start, as something
> in check.c, I assume.
>
> On top of that, (b) sounds like a good idea to me anyway to be more
> defensive.  But couldn't we do that just like we do for autovacuum and
> force the GUCs that could remove the slot's WAL to not do their work
> instead?
>

I think if we just make max_slot_wal_keep_size to -1 that should be
sufficient to not let any slots get invalidated during upgrade. Do you
have anything else in mind?

  An upgrade is a special state of the cluster, and I'm not
> much into painting more checks based on IsBinaryUpgrade to prevent WAL
> segments to be removed while we can have a full control of what we
> want with just the GUCs that force the hand of the slots.  That just
> seems simpler to me, and the WAL recycling logic is already complex
> particularly with all the GUCs popping lately to force some conditions
> to do the WAL recycling and/or removal.
>
> During the upgrade, if some segments are removed and some of the slots
> we expect to still be valid once the upgrade is done are marked as
> invalid and become unusable, I think that we should just copy these
> slots, but also update their state data so as they can still be used
> with what we expect, as these could be wanted by the subscribers.
> That's what you mean with (d), I assume. Do you think that it would
> be possible to force the slot's data on the publishers so as they use
> a local LSN based on the new WAL we are resetting at?
>

Yes.

>
> At least that
> seems more friendly to me as this limits the set of manipulations to
> do on the slots for the end-user.  The protection from (b) ought to be
> enough, in itself.  (c) overlaps with (a), especially if we want to be
> able to read or write some of the slot's data offline.
>

If we do (b) either via GUCs or IsBinaryUpgrade check we don't need to
do any of (a), (b), or (d). I feel that would be a minimal and
sufficient fix to prevent any side impact of checkpointer on slots
during an upgrade.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: tender wang
Date:
Subject: Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Next
From: Ashutosh Bapat
Date:
Subject: Re: persist logical slots to disk during shutdown checkpoint