Re: pg_upgrade and logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pg_upgrade and logical replication
Date
Msg-id CAA4eK1JEW+tJ_k04KrOLRSTn5iWzZdwoGdHPWL6daYGo9XR=ug@mail.gmail.com
Whole thread Raw
In response to Re: pg_upgrade and logical replication  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: pg_upgrade and logical replication
List pgsql-hackers
On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Fri, Feb 17, 2023 at 04:12:54PM +0530, Amit Kapila wrote:
> > On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > An easy workaround that I tried is to allow something like
> > >
> > > ALTER SUBSCRIPTION ...  ENABLE WITH (refresh = true, copy_data = false)
> > >
> > > so that the refresh internally happens before the apply worker is started and
> > > you just keep consuming the delta, which works on naive scenario.
> > >
> > > One concern I have with this approach is that the default values for both
> > > "refresh" and "copy_data" for all other subcommands is "true, but we would
> > > probably need a different default value in that exact scenario (as we know we
> > > already have the data).  I think that it would otherwise be safe in my very
> > > specific scenario, assuming that you created the slot beforehand and moved the
> > > slot's LSN at the promotion point, as even if you add non-empty tables to the
> > > publication you will only need the delta whether those were initially empty or
> > > not given your initial physical replica state.
> > >
> >
> > This point is not very clear. Why would one just need delta even for new tables?
>
> Because in my scenario I'm coming from physical replication, so I know that I
> did replicate everything until the promotion LSN.  Any table later added in the
> publication is either already fully replicated until that LSN on the upgraded
> node, so only the delta is needed, or has been created after that LSN.  In the
> latter case, the entirety of the table will be replicated with the logical
> replication as a delta right?
>

That makes sense to me.

> > >  Any other scenario would make
> > > this new option dangerous, if not entirely useless, but not more than any of
> > > the current commands that lead to refreshing a subscription and have the same
> > > options I guess.
> > >
> > > All in all, currently the only way to somewhat safely resume logical
> > > replication after a pg_upgrade is to drop all the subscriptions that were
> > > transferred during pg_upgrade on all databases and recreate them (using the
> > > existing slots on the publisher side obviously), allowing the initial
> > > connection.  But this approach only works in the exact scenario I mentioned
> > > (physical to logical replication, or at least a case where *all* the tables
> > > where logically replicated prior to the pg_ugprade), otherwise you have to
> > > recreate the follower node from scratch using logical repication.
> > >
> >
> > I think if you dropped and recreated the subscriptions by retaining
> > old slots, the replication should resume from where it left off before
> > the upgrade. Which scenario are you concerned about?
>
> I'm concerned about people not coming from physical replication.  If you just
> had some "normal" logical replication, you can't assume that you already have
> all the data from the upstream subscription.  If it was modified and a non
> empty table is added, you might need to copy the data of part of the tables and
> keep replicating for the rest.  It's hard to be sure from a user point of view,
> and even if you knew you have no way to express it.
>

Can't the user create a separate publication for such newly added
tables and a corresponding new subscription on the downstream node?
Now, I think it would be a bit tricky if the user already has a
publication defined with FOR ALL TABLES. In that case, we probably
need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
we currently don't have.

> > > Is that indeed the current behavior, or did I miss something?
> > >
> > > Is this "resume logical replication on pg_upgraded node" something we want to
> > > support better?  I was thinking that we could add a new pg_dump mode (maybe
> > > only usable during pg_upgrade) that also restores the pg_subscription_rel
> > > content in each subscription or something like that.  If not, should pg_upgrade
> > > keep preserving the subscriptions as it doesn't seem safe to use them, or at
> > > least document the hazards (I didn't find anything about it in the
> > > documentation)?
> > >
> > >
> >
> > There is a mention of this in pg_dump docs. See [1] (When dumping
> > logical replication subscriptions ...)
>
> Indeed, but it's barely saying "It is then up to the user to reactivate the
> subscriptions in a suitable way" and "It might also be appropriate to truncate
> the target tables before initiating a new full table copy".  As I mentioned, I
> don't think there's a suitable way to reactivate the subscription, at least if
> you don't want to miss some records, so truncating all target tables is the
> only fully safe way to proceed.  It seems quite silly to have to do so just
> because pg_upgrade doesn't retain the list of relation per subscription.
>

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16
Next
From: Pavel Stehule
Date:
Subject: Re: Share variable between psql backends in CustomScan