Re: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date
Msg-id CAA4eK1+8btsYhNQvw6QJ4iTw1wFhkFXXABT=ED1eHFvtekRanQ@mail.gmail.com
Whole thread Raw
In response to RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses RE: [PoC] pg_upgrade: allow to upgrade publisher node
List pgsql-hackers
On Wed, Aug 2, 2023 at 1:43 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > 3.
> > + /*
> > + * Get replication slots.
> > + *
> > + * XXX: Which information must be extracted from old node? Currently three
> > + * attributes are extracted because they are used by
> > + * pg_create_logical_replication_slot().
> > + */
> > + appendPQExpBufferStr(query,
> > + "SELECT slot_name, plugin, two_phase "
> > + "FROM pg_catalog.pg_replication_slots "
> > + "WHERE database = current_database() AND temporary = false "
> > + "AND wal_status IN ('reserved', 'extended');");
> >
> > Why are we ignoring the slots that have wal status as WALAVAIL_REMOVED
> > or WALAVAIL_UNRESERVED? I think the slots where wal status is
> > WALAVAIL_REMOVED, the corresponding slots are invalidated at some
> > point. I think such slots can't be used for decoding but these will be
> > dropped along with the subscription or when a user does it manually.
> > So, if we don't copy such slots after the upgrade then there could be
> > a problem in dropping the corresponding subscription. If we don't want
> > to copy over such slots then we need to provide instructions on what
> > users should do in such cases. OTOH, if we want to copy over such
> > slots then we need to find a way to invalidate such slots after copy.
> > Either way, this needs more analysis.
>
> I considered again here. At least WALAVAIL_UNRESERVED should be supported because
> the slot is still usable. It can return reserved or extended.
>
> As for WALAVAIL_REMOVED, I don't think it should be so that I added a description
> to the document.
>
> This feature re-create slots which have same name/plugins as old ones, not replicate
> its state. So if we copy them as-is slots become usable again. If subscribers refer
> the slot and then connect again at that time, changes between 'WALAVAIL_REMOVED'
> may be lost.
>
> Based on above slots must be copied as WALAVAIL_REMOVED, but as you said, we do
> not have a way to control that. the status is calculated by using restart_lsn,
> but there are no function to modify directly.
>
> One approach is adding an SQL funciton which set restart_lsn to aritrary value
> (or 0/0, invalid), but it seems dangerous.
>

So, we have three options here (a) As you have done in the patch,
document this limitation and request user to perform some manual steps
to drop the subscription; (b) don't allow upgrade to proceed if there
are invalid slots in the old cluster; (c) provide a new function like
pg_copy_logical_replication_slot_contents() where we copy the required
contents like invalid status(ReplicationSlotInvalidationCause), etc.

Personally, I would prefer (b) because it will minimize the steps
required to perform by the user after the upgrade and looks cleaner
solution.

Thoughts?

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Dmitry Dolgov
Date:
Subject: Re: [RFC] Clang plugin for catching suspicious typedef casting
Next
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby