Re: pg_upgrade and logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pg_upgrade and logical replication
Date
Msg-id CAA4eK1KD-hZ3syruxJA6fK-JtSBzL6etkwToPuTmVkrCvT6ASw@mail.gmail.com
Whole thread Raw
In response to Re: pg_upgrade and logical replication  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: pg_upgrade and logical replication
List pgsql-hackers
On Wed, Feb 22, 2023 at 12:13 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Mon, Feb 20, 2023 at 03:07:37PM +0800, Julien Rouhaud wrote:
> > On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:
> > >
> > > I think the current mechanism tries to provide more flexibility to the
> > > users. OTOH, in some of the cases where users don't want to change
> > > anything in the logical replication (both upstream and downstream
> > > function as it is) after the upgrade then they need to do more work. I
> > > think ideally there should be some option in pg_dump that allows us to
> > > dump the contents of pg_subscription_rel as well, so that is easier
> > > for users to continue replication after the upgrade. We can then use
> > > it for binary-upgrade mode as well.
> >
> > Is there really a use case for dumping the content of pg_subscription_rel
> > outside of pg_upgrade?

I think the users who want to take a dump and restore the entire
cluster may need it there for the same reason as pg_upgrade needs it.
TBH, I have not seen such a request but this is what I imagine one
would expect if we provide this functionality via pg_upgrade.

> >  I'm not particularly worried about the publisher going
> > away or changing while pg_upgrade is running , but for a normal pg_dump /
> > pg_restore I don't really see how anyone would actually want to resume logical
> > replication from a pg_dump, especially since it's almost guaranteed that the
> > node will already have consumed data from the publication that won't be in the
> > dump in the first place.
> >
> > Are you ok with the suggested syntax above (probably with extra parens to avoid
> > adding new keywords), or do you have some better suggestion?  I'm a bit worried
> > about adding some O(n) commands, as it can add some noticeable slow-down for
> > pg_upgrade-ing logical replica, but I don't really see how to avoid that.  Note
> > that if we make this option available to end-users, we will have to use the
> > relation name rather than its oid, which will make this option even more
> > expensive when restoring due to the extra lookups.
> >
> > For the pg_upgrade use-case, do you see any reason to not restore the
> > pg_subscription_rel by default?

As I said earlier, one can very well say that giving more flexibility
(in terms of where the publications will be after restore) after a
restore is a better idea. Also, we are doing the same till now without
any major complaints about the same, so it makes sense to keep the
current behavior as default.

> >  Maybe having an option to not restore it would
> > make sense if it indeed add noticeable overhead when publications have a lot of
> > tables?

Yeah, that could be another reason to not do it default.

>
> Since I didn't hear any objection I worked on a POC patch with this approach.
>
> For now when pg_dump is invoked with --binary, it will always emit extra
> commands to restore the relation list.  This command is only allowed when the
> server is started in binary upgrade mode.
>
> The new command is of the form
>
> ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')
>
> with the lsn part being optional.
>

BTW, do we restore the origin and its LSN after the upgrade? Because
without that this won't be sufficient as that is required for apply
worker to ensure that it is in sync with table sync workers.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: samay sharma
Date:
Subject: Re: Documentation for building with meson
Next
From: Michael Paquier
Date:
Subject: Re: Doc update for pg_stat_statements normalization