RE: [PoC] pg_upgrade: allow to upgrade publisher node - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: [PoC] pg_upgrade: allow to upgrade publisher node |
Date | |
Msg-id | OS0PR01MB5716CE2A338136DD1129D2AC9414A@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: [PoC] pg_upgrade: allow to upgrade publisher node (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
RE: [PoC] pg_upgrade: allow to upgrade publisher node
|
List | pgsql-hackers |
On Tuesday, August 15, 2023 11:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> > wrote: > > > > On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > > > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada > <sawada.mshk@gmail.com> wrote: > > > > Another idea is (which might have already discussed thoguh) that we > check if the latest shutdown checkpoint LSN in the control file matches the > confirmed_flush_lsn in pg_replication_slots view. That way, we can ensure that > the slot has consumed all WAL records before the last shutdown. We don't > need to worry about WAL records generated after starting the old cluster > during the upgrade, at least for logical replication slots. > > > > > > > > > > Right, this is somewhat closer to what Patch is already doing. But > > > remember in this case we need to remember and use the latest > > > checkpoint from the control file before the old cluster is started > > > because otherwise the latest checkpoint location could be even > > > updated during the upgrade. So, instead of reading from WAL, we need > > > to change so that we rely on the control file's latest LSN. > > > > Yes, I was thinking the same idea. > > > > But it works for only replication slots for logical replication. Do we > > want to check if no meaningful WAL records are generated after the > > latest shutdown checkpoint, for manually created slots (or non-logical > > replication slots)? If so, we would need to have something reading WAL > > records in the end. > > > > > > I would prefer this > > > idea than to invent a new API/tool like pg_replslotdata. > > > > +1 Changed the check to compare the latest checkpoint lsn from pg_controldata with the confirmed_flush_lsn in pg_replication_slots view. > > > > > > > > The other point you and Bruce seem to be favoring is that instead of > > > dumping/restoring slots via pg_dump, we remember the required > > > information of slots retrieved during their validation in pg_upgrade > > > itself and use that to create the slots in the new cluster. Though I > > > am not aware of doing similar treatment for other objects we restore > > > in this case it seems reasonable especially because slots are not > > > stored in the catalog and we anyway already need to retrieve the > > > required information to validate them, so trying to again retrieve > > > it via pg_dump doesn't seem useful unless I am missing something. > > > Does this match your understanding? > > > > If there are use cases for --logical-replication-slots-only option > > other than pg_upgrade, it would be good to have it in pg_dump. I was > > just not sure of other use cases. > > > > It was primarily for upgrade purposes only. So, as we can't see a good reason to > go via pg_dump let's do it in upgrade unless someone thinks otherwise. Removed the new option in pg_dump and modified the pg_upgrade directly use the slot info to restore the slot in new cluster. > > > > > > > Yet another thing I am trying to consider is whether we can allow to > > > upgrade slots from 16 or 15 to later versions. As of now, the patch > > > has the following check: > > > getLogicalReplicationSlots() > > > { > > > ... > > > + /* Check whether we should dump or not */ if (fout->remoteVersion > > > + < 170000) return; > > > ... > > > } > > > > > > If we decide to use the existing view pg_replication_slots then can > > > we consider upgrading slots from the prior version to 17? Now, if we > > > want to invent any new API similar to pg_replslotdata then we can't > > > do this because it won't exist in prior versions but OTOH using > > > existing view pg_replication_slots can allow us to fetch slot info > > > from older versions as well. So, I think it is worth considering. > > > > I think that without 0001 patch the replication slots will not be able > > to pass the confirmed_flush_lsn check. > > > > Right, but we can think of backpatching the same. Anyway, we can do that as a > separate work by starting a new thread to see if there is a broader agreement > for backpatching such a change. For now, we can focus on >=v17. > Here is the new version patch which addressed above points. The new version patch also removes the --exclude-logical-replication-slots option due to recent comment. Thanks Kuroda-san for addressing most of the points. Best Regards, Hou zj
Attachment
pgsql-hackers by date: