Thread: pg_upgrade and logical replication

pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

I was working on testing a major upgrade scenario using a mix of physical and
logical replication when I faced an unexpected problem leading to missing
rows.  Note that my motivation is to rely on physical replication / physical
backup to avoid recreating a node from scratch using logical replication, as
the initial sync with logical replication is much more costly and impacting
compared to pg_basebackup / restoring a physical backup, but the same problem
exists if you just pg_upgrade a node that has subscriptions.

The problem is that pg_upgrade creates the subscriptions on the newly upgraded
node using "WITH (connect = false)", which seems expected as you obviously
don't want to try to connect to the publisher at that point.  But then once the
newly upgraded node is restarted and ready to replace the previous one, unless
I'm missing something there's absolutely no way to use the created
subscriptions without losing some data from the publisher.

The reason is that the subscription doesn't have a local list of relations to
process until you refresh the subscription, but you can't refresh the
subscription without enabling it (and you can't enable it in a transaction),
which means that you have to let the logical worker start, consume and ignore
all changes that happened on the publisher side until the refresh happens.
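
For illustration, a minimal sketch of that dead end (the subscription name
"sub1" is hypothetical):

-- state after pg_upgrade: "sub1" was restored WITH (connect = false), so it
-- is disabled and pg_subscription_rel has no rows for it
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
-- fails: refresh is not allowed while the subscription is disabled

-- the only way forward: enable first, which starts the apply worker and lets
-- it consume (and discard) publisher changes before the refresh can run
ALTER SUBSCRIPTION sub1 ENABLE;
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION WITH (copy_data = false);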

An easy workaround that I tried is to allow something like

ALTER SUBSCRIPTION ...  ENABLE WITH (refresh = true, copy_data = false)

so that the refresh internally happens before the apply worker is started and
you just keep consuming the delta, which works in a naive scenario.

One concern I have with this approach is that the default values for both
"refresh" and "copy_data" for all other subcommands is "true, but we would
probably need a different default value in that exact scenario (as we know we
already have the data).  I think that it would otherwise be safe in my very
specific scenario, assuming that you created the slot beforehand and moved the
slot's LSN to the promotion point, as even if you add non-empty tables to the
publication you will only need the delta, whether those were initially empty or
not, given your initial physical replica state.  Any other scenario would make
this new option dangerous, if not entirely useless, but not more than any of
the current commands that lead to refreshing a subscription and have the same
options I guess.

All in all, currently the only way to somewhat safely resume logical
replication after a pg_upgrade is to drop all the subscriptions that were
transferred during pg_upgrade on all databases and recreate them (using the
existing slots on the publisher side obviously), allowing the initial
connection.  But this approach only works in the exact scenario I mentioned
(physical to logical replication, or at least a case where *all* the tables
were logically replicated prior to the pg_upgrade), otherwise you have to
recreate the follower node from scratch using logical replication.
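
Concretely, that drop-and-recreate dance would look something like this (all
names and the connection string are hypothetical):

ALTER SUBSCRIPTION sub1 DISABLE;
-- detach the subscription from the remote slot so that DROP doesn't destroy it
ALTER SUBSCRIPTION sub1 SET (slot_name = NONE);
DROP SUBSCRIPTION sub1;
-- recreate it, reusing the existing slot and skipping the initial copy
CREATE SUBSCRIPTION sub1
    CONNECTION 'host=publisher dbname=mydb'
    PUBLICATION pub1
    WITH (create_slot = false, slot_name = 'sub1', copy_data = false);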

Is that indeed the current behavior, or did I miss something?

Is this "resume logical replication on pg_upgraded node" something we want to
support better?  I was thinking that we could add a new pg_dump mode (maybe
only usable during pg_upgrade) that also restores the pg_subscription_rel
content in each subscription or something like that.  If not, should pg_upgrade
keep preserving the subscriptions as it doesn't seem safe to use them, or at
least document the hazards (I didn't find anything about it in the
documentation)?



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> I was working on testing a major upgrade scenario using a mix of physical and
> logical replication when I faced an unexpected problem leading to missing
> rows.  Note that my motivation is to rely on physical replication / physical
> backup to avoid recreating a node from scratch using logical replication, as
> the initial sync with logical replication is much more costly and impacting
> compared to pg_basebackup / restoring a physical backup, but the same problem
> exists if you just pg_upgrade a node that has subscriptions.
>
> The problem is that pg_upgrade creates the subscriptions on the newly upgraded
> node using "WITH (connect = false)", which seems expected as you obviously
> don't want to try to connect to the publisher at that point.  But then once the
> newly upgraded node is restarted and ready to replace the previous one, unless
> I'm missing something there's absolutely no way to use the created
> subscriptions without losing some data from the publisher.
>
> The reason is that the subscription doesn't have a local list of relations to
> process until you refresh the subscription, but you can't refresh the
> subscription without enabling it (and you can't enable it in a transaction),
> which means that you have to let the logical worker start, consume and ignore
> all changes that happened on the publisher side until the refresh happens.
>
> An easy workaround that I tried is to allow something like
>
> ALTER SUBSCRIPTION ...  ENABLE WITH (refresh = true, copy_data = false)
>
> so that the refresh internally happens before the apply worker is started and
> you just keep consuming the delta, which works in a naive scenario.
>
> One concern I have with this approach is that the default values for both
> "refresh" and "copy_data" for all other subcommands is "true, but we would
> probably need a different default value in that exact scenario (as we know we
> already have the data).  I think that it would otherwise be safe in my very
> specific scenario, assuming that you created the slot beforehand and moved the
> slot's LSN to the promotion point, as even if you add non-empty tables to the
> publication you will only need the delta, whether those were initially empty or
> not, given your initial physical replica state.
>

This point is not very clear. Why would one just need the delta even for new tables?

>  Any other scenario would make
> this new option dangerous, if not entirely useless, but not more than any of
> the current commands that lead to refreshing a subscription and have the same
> options I guess.
>
> All in all, currently the only way to somewhat safely resume logical
> replication after a pg_upgrade is to drop all the subscriptions that were
> transferred during pg_upgrade on all databases and recreate them (using the
> existing slots on the publisher side obviously), allowing the initial
> connection.  But this approach only works in the exact scenario I mentioned
> (physical to logical replication, or at least a case where *all* the tables
> were logically replicated prior to the pg_upgrade), otherwise you have to
> recreate the follower node from scratch using logical replication.
>

I think if you dropped and recreated the subscriptions by retaining
old slots, the replication should resume from where it left off before
the upgrade. Which scenario are you concerned about?

> Is that indeed the current behavior, or did I miss something?
>
> Is this "resume logical replication on pg_upgraded node" something we want to
> support better?  I was thinking that we could add a new pg_dump mode (maybe
> only usable during pg_upgrade) that also restores the pg_subscription_rel
> content in each subscription or something like that.  If not, should pg_upgrade
> keep preserving the subscriptions as it doesn't seem safe to use them, or at
> least document the hazards (I didn't find anything about it in the
> documentation)?
>
>

There is a mention of this in pg_dump docs. See [1] (When dumping
logical replication subscriptions ...)

[1] - https://www.postgresql.org/docs/devel/app-pgdump.html

-- 
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Fri, Feb 17, 2023 at 04:12:54PM +0530, Amit Kapila wrote:
> On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > An easy workaround that I tried is to allow something like
> >
> > ALTER SUBSCRIPTION ...  ENABLE WITH (refresh = true, copy_data = false)
> >
> > so that the refresh internally happens before the apply worker is started and
> > you just keep consuming the delta, which works in a naive scenario.
> >
> > One concern I have with this approach is that the default values for both
> > "refresh" and "copy_data" for all other subcommands is "true, but we would
> > probably need a different default value in that exact scenario (as we know we
> > already have the data).  I think that it would otherwise be safe in my very
> > specific scenario, assuming that you created the slot beforehand and moved the
> > slot's LSN to the promotion point, as even if you add non-empty tables to the
> > publication you will only need the delta, whether those were initially empty or
> > not, given your initial physical replica state.
> >
>
> This point is not very clear. Why would one just need the delta even for new tables?

Because in my scenario I'm coming from physical replication, so I know that I
did replicate everything until the promotion LSN.  Any table later added in the
publication is either already fully replicated until that LSN on the upgraded
node, so only the delta is needed, or has been created after that LSN.  In the
latter case, the entirety of the table will be replicated with the logical
replication as a delta right?

> >  Any other scenario would make
> > this new option dangerous, if not entirely useless, but not more than any of
> > the current commands that lead to refreshing a subscription and have the same
> > options I guess.
> >
> > All in all, currently the only way to somewhat safely resume logical
> > replication after a pg_upgrade is to drop all the subscriptions that were
> > transferred during pg_upgrade on all databases and recreate them (using the
> > existing slots on the publisher side obviously), allowing the initial
> > connection.  But this approach only works in the exact scenario I mentioned
> > (physical to logical replication, or at least a case where *all* the tables
> > were logically replicated prior to the pg_upgrade), otherwise you have to
> > recreate the follower node from scratch using logical replication.
> >
>
> I think if you dropped and recreated the subscriptions by retaining
> old slots, the replication should resume from where it left off before
> the upgrade. Which scenario are you concerned about?

I'm concerned about people not coming from physical replication.  If you just
had some "normal" logical replication, you can't assume that you already have
all the data from the upstream subscription.  If it was modified and a
non-empty table was added, you might need to copy the data for some of the
tables while simply continuing replication for the rest.  It's hard to be sure
from a user's point of view, and even if you knew, you would have no way to
express it.

> > Is that indeed the current behavior, or did I miss something?
> >
> > Is this "resume logical replication on pg_upgraded node" something we want to
> > support better?  I was thinking that we could add a new pg_dump mode (maybe
> > only usable during pg_upgrade) that also restores the pg_subscription_rel
> > content in each subscription or something like that.  If not, should pg_upgrade
> > keep preserving the subscriptions as it doesn't seem safe to use them, or at
> > least document the hazards (I didn't find anything about it in the
> > documentation)?
> >
> >
>
> There is a mention of this in pg_dump docs. See [1] (When dumping
> logical replication subscriptions ...)

Indeed, but it's barely saying "It is then up to the user to reactivate the
subscriptions in a suitable way" and "It might also be appropriate to truncate
the target tables before initiating a new full table copy".  As I mentioned, I
don't think there's a suitable way to reactivate the subscription, at least if
you don't want to miss some records, so truncating all target tables is the
only fully safe way to proceed.  It seems quite silly to have to do so just
because pg_upgrade doesn't retain the list of relations per subscription.
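
In other words, the only fully safe reactivation today is something like the
following (subscription and table names hypothetical):

TRUNCATE replicated_t1, replicated_t2;  -- every subscribed table
ALTER SUBSCRIPTION sub1 ENABLE;
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION WITH (copy_data = true);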



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Fri, Feb 17, 2023 at 04:12:54PM +0530, Amit Kapila wrote:
> > On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > An easy workaround that I tried is to allow something like
> > >
> > > ALTER SUBSCRIPTION ...  ENABLE WITH (refresh = true, copy_data = false)
> > >
> > > so that the refresh internally happens before the apply worker is started and
> > > you just keep consuming the delta, which works in a naive scenario.
> > >
> > > One concern I have with this approach is that the default values for both
> > > "refresh" and "copy_data" for all other subcommands is "true, but we would
> > > probably need a different default value in that exact scenario (as we know we
> > > already have the data).  I think that it would otherwise be safe in my very
> > > specific scenario, assuming that you created the slot beforehand and moved the
> > > slot's LSN to the promotion point, as even if you add non-empty tables to the
> > > publication you will only need the delta, whether those were initially empty or
> > > not, given your initial physical replica state.
> > >
> >
> > This point is not very clear. Why would one just need the delta even for new tables?
>
> Because in my scenario I'm coming from physical replication, so I know that I
> did replicate everything until the promotion LSN.  Any table later added in the
> publication is either already fully replicated until that LSN on the upgraded
> node, so only the delta is needed, or has been created after that LSN.  In the
> latter case, the entirety of the table will be replicated with the logical
> replication as a delta right?
>

That makes sense to me.

> > >  Any other scenario would make
> > > this new option dangerous, if not entirely useless, but not more than any of
> > > the current commands that lead to refreshing a subscription and have the same
> > > options I guess.
> > >
> > > All in all, currently the only way to somewhat safely resume logical
> > > replication after a pg_upgrade is to drop all the subscriptions that were
> > > transferred during pg_upgrade on all databases and recreate them (using the
> > > existing slots on the publisher side obviously), allowing the initial
> > > connection.  But this approach only works in the exact scenario I mentioned
> > > (physical to logical replication, or at least a case where *all* the tables
> > > were logically replicated prior to the pg_upgrade), otherwise you have to
> > > recreate the follower node from scratch using logical replication.
> > >
> >
> > I think if you dropped and recreated the subscriptions by retaining
> > old slots, the replication should resume from where it left off before
> > the upgrade. Which scenario are you concerned about?
>
> I'm concerned about people not coming from physical replication.  If you just
> had some "normal" logical replication, you can't assume that you already have
> all the data from the upstream subscription.  If it was modified and a
> non-empty table was added, you might need to copy the data for some of the
> tables while simply continuing replication for the rest.  It's hard to be sure
> from a user's point of view, and even if you knew, you would have no way to
> express it.
>

Can't the user create a separate publication for such newly added
tables and a corresponding new subscription on the downstream node?
Now, I think it would be a bit tricky if the user already has a
publication defined with FOR ALL TABLES. In that case, we probably
need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
we currently don't have.
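
As a rough sketch of that suggestion (all names hypothetical):

-- on the publisher, covering only the newly added table:
CREATE PUBLICATION pub_new_tables FOR TABLE newly_added_table;
-- on the subscriber, with the default copy_data = true so that the
-- table's existing contents get copied:
CREATE SUBSCRIPTION sub_new_tables
    CONNECTION 'host=publisher dbname=mydb'
    PUBLICATION pub_new_tables;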

> > > Is that indeed the current behavior, or did I miss something?
> > >
> > > Is this "resume logical replication on pg_upgraded node" something we want to
> > > support better?  I was thinking that we could add a new pg_dump mode (maybe
> > > only usable during pg_upgrade) that also restores the pg_subscription_rel
> > > content in each subscription or something like that.  If not, should pg_upgrade
> > > keep preserving the subscriptions as it doesn't seem safe to use them, or at
> > > least document the hazards (I didn't find anything about it in the
> > > documentation)?
> > >
> > >
> >
> > There is a mention of this in pg_dump docs. See [1] (When dumping
> > logical replication subscriptions ...)
>
> Indeed, but it's barely saying "It is then up to the user to reactivate the
> subscriptions in a suitable way" and "It might also be appropriate to truncate
> the target tables before initiating a new full table copy".  As I mentioned, I
> don't think there's a suitable way to reactivate the subscription, at least if
> you don't want to miss some records, so truncating all target tables is the
> only fully safe way to proceed.  It seems quite silly to have to do so just
> because pg_upgrade doesn't retain the list of relations per subscription.
>

I also don't know if there is any other safe way for newly added
tables apart from the above suggestion to create separate publications
but that can work only in specific cases.

-- 
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Sat, Feb 18, 2023 at 09:31:30AM +0530, Amit Kapila wrote:
> On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > I'm concerned about people not coming from physical replication.  If you just
> > had some "normal" logical replication, you can't assume that you already have
> > all the data from the upstream subscription.  If it was modified and a
> > non-empty table was added, you might need to copy the data for some of the
> > tables while simply continuing replication for the rest.  It's hard to be sure
> > from a user's point of view, and even if you knew, you would have no way to
> > express it.
> >
>
> Can't the user create a separate publication for such newly added
> tables and a corresponding new subscription on the downstream node?

Yes, that seems like a safe way to go, but it relies on users being very careful
if they don't want to end up with a corrupted logical standby, and I think it's
impossible to run any check to make sure that the subscription is adequate?

> Now, I think it would be a bit tricky if the user already has a
> publication defined with FOR ALL TABLES. In that case, we probably
> need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
> we currently don't have.

Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
use case.

> >
> > Indeed, but it's barely saying "It is then up to the user to reactivate the
> > subscriptions in a suitable way" and "It might also be appropriate to truncate
> > the target tables before initiating a new full table copy".  As I mentioned, I
> > don't think there's a suitable way to reactivate the subscription, at least if
> > you don't want to miss some records, so truncating all target tables is the
> > only fully safe way to proceed.  It seems quite silly to have to do so just
> > because pg_upgrade doesn't retain the list of relations per subscription.
> >
>
> I also don't know if there is any other safe way for newly added
> tables apart from the above suggestion to create separate publications
> but that can work only in specific cases.

I might be missing something, but what could go wrong if pg_upgrade could emit
a bunch of commands like:

ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';

pg_upgrade already preserves relation oids, so we could restore the exact
original state, and then enabling the subscription would just work?

We could restrict this form to --binary-upgrade only so we don't provide a way
for users to mess up the data.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sat, Feb 18, 2023 at 09:31:30AM +0530, Amit Kapila wrote:
> > On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > I'm concerned about people not coming from physical replication.  If you just
> > > had some "normal" logical replication, you can't assume that you already have
> > > all the data from the upstream subscription.  If it was modified and a
> > > non-empty table was added, you might need to copy the data for some of the
> > > tables while simply continuing replication for the rest.  It's hard to be sure
> > > from a user's point of view, and even if you knew, you would have no way to
> > > express it.
> > >
> >
> > Can't the user create a separate publication for such newly added
> > tables and a corresponding new subscription on the downstream node?
>
> Yes, that seems like a safe way to go, but it relies on users being very careful
> if they don't want to end up with a corrupted logical standby, and I think it's
> impossible to run any check to make sure that the subscription is adequate?
>

I can't think of any straightforward way, but one can probably take a
dump of the data on both nodes using pg_dump and then compare them.

> > Now, I think it would be a bit tricky if the user already has a
> > publication defined with FOR ALL TABLES. In that case, we probably
> > need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
> > we currently don't have.
>
> Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
> use case.
>

Okay, but if we had functionality like EXCEPT (list of tables),
one could do ALTER PUBLICATION .. before doing REFRESH on the
subscriber side.

> > >
> > > Indeed, but it's barely saying "It is then up to the user to reactivate the
> > > subscriptions in a suitable way" and "It might also be appropriate to truncate
> > > the target tables before initiating a new full table copy".  As I mentioned, I
> > > don't think there's a suitable way to reactivate the subscription, at least if
> > > you don't want to miss some records, so truncating all target tables is the
> > > only fully safe way to proceed.  It seems quite silly to have to do so just
> > > because pg_upgrade doesn't retain the list of relations per subscription.
> > >
> >
> > I also don't know if there is any other safe way for newly added
> > tables apart from the above suggestion to create separate publications
> > but that can work only in specific cases.
>
> I might be missing something, but what could go wrong if pg_upgrade could emit
> a bunch of commands like:
>
> ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
>

How will we know the STATE and LSN of each relation? But even if we
know that, what is the guarantee that the publisher side has still
retained the corresponding slots?

-- 
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Sat, Feb 18, 2023 at 04:12:52PM +0530, Amit Kapila wrote:
> On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > > Now, I think it would be a bit tricky if the user already has a
> > > publication defined with FOR ALL TABLES. In that case, we probably
> > > need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
> > > we currently don't have.
> >
> > Yes, and note that I rely on FOR ALL TABLES for my original physical to logical
> > use case.
> >
>
> Okay, but if we had functionality like EXCEPT (list of tables),
> one could do ALTER PUBLICATION .. before doing REFRESH on the
> subscriber side.

Honestly I'm not a huge fan of this approach.  It feels hacky to have such a
feature, and doesn't even solve the problem on its own as you still lose
records when reactivating the subscription unless you also provide an ALTER
SUBSCRIPTION ENABLE WITH (refresh = true, copy_data = false), which will
probably require different defaults than the rest of the ALTER SUBSCRIPTION
subcommands that handle a refresh.

> > > > Indeed, but it's barely saying "It is then up to the user to reactivate the
> > > > subscriptions in a suitable way" and "It might also be appropriate to truncate
> > > > the target tables before initiating a new full table copy".  As I mentioned, I
> > > > don't think there's a suitable way to reactivate the subscription, at least if
> > > > you don't want to miss some records, so truncating all target tables is the
> > > > only fully safe way to proceed.  It seems quite silly to have to do so just
> > > > because pg_upgrade doesn't retain the list of relations per subscription.
> > > >
> > >
> > > I also don't know if there is any other safe way for newly added
> > > tables apart from the above suggestion to create separate publications
> > > but that can work only in specific cases.
> >
> > I might be missing something, but what could go wrong if pg_upgrade could emit
> > a bunch of commands like:
> >
> > ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
> >
>
> How will we know the STATE and LSN of each relation?

In the pg_subscription_rel catalog of the upgraded server?  I didn't look in
detail at how that information is updated, but I'm assuming that if logical
replication survives a database restart it shouldn't be a problem to also
fully dump it during pg_upgrade.
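
For instance, a query along these lines should expose everything such a dump
would need:

SELECT s.subname, sr.srrelid::regclass AS relation,
       sr.srsubstate AS state, sr.srsublsn AS lsn
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid;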

> But even if we
> know that, what is the guarantee that the publisher side has still
> retained the corresponding slots?

No guarantee, but if you're just doing a pg_upgrade of a logical replica why
would you drop the replication slot?  In any case the warning you mentioned in
the pg_dump documentation would still apply and you would have to re-enable
things as needed; the only difference is that you would actually be able to
keep your logical replication after a pg_upgrade if you need to.  If you
dropped the replication slot on the publisher side, then simply remove the
subscriptions on the upgraded node too, or create new ones, exactly as you
would do with the current pg_upgrade workflow.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Sun, Feb 19, 2023 at 5:31 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sat, Feb 18, 2023 at 04:12:52PM +0530, Amit Kapila wrote:
> > > > >
> > > >
> > > > I also don't know if there is any other safe way for newly added
> > > > tables apart from the above suggestion to create separate publications
> > > > but that can work only in specific cases.
> > >
> > > I might be missing something, but what could go wrong if pg_upgrade could emit
> > > a bunch of commands like:
> > >
> > > ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
> > >
> >
> > How will we know the STATE and LSN of each relation?
>
> In the pg_subscription_rel catalog of the upgraded server?  I didn't look in
> detail at how that information is updated, but I'm assuming that if logical
> replication survives a database restart it shouldn't be a problem to also
> fully dump it during pg_upgrade.
>
> > But even if we
> > know that, what is the guarantee that the publisher side has still
> > retained the corresponding slots?
>
> No guarantee, but if you're just doing a pg_upgrade of a logical replica why
> would you drop the replication slot?  In any case the warning you mentioned in
> the pg_dump documentation would still apply and you would have to re-enable
> things as needed; the only difference is that you would actually be able to
> keep your logical replication after a pg_upgrade if you need to.  If you
> dropped the replication slot on the publisher side, then simply remove the
> subscriptions on the upgraded node too, or create new ones, exactly as you
> would do with the current pg_upgrade workflow.
>

I think the current mechanism tries to provide more flexibility to the
users. OTOH, in cases where users don't want to change anything in the
logical replication setup after the upgrade (both upstream and downstream
keep functioning as-is), they need to do more work. I think ideally there
should be some option in pg_dump that allows us to dump the contents of
pg_subscription_rel as well, so that it is easier for users to continue
replication after the upgrade. We can then use it for binary-upgrade mode
as well.

-- 
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:
> On Sun, Feb 19, 2023 at 5:31 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > > >
> > > > I might be missing something, but what could go wrong if pg_upgrade could emit
> > > > a bunch of commands like:
> > > >
> > > > ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
> > > >
> > >
> > > How will we know the STATE and LSN of each relation?
> >
> > In the pg_subscription_rel catalog of the upgraded server?  I didn't look in
> > detail at how that information is updated, but I'm assuming that if logical
> > replication survives a database restart it shouldn't be a problem to also
> > fully dump it during pg_upgrade.
> >
> > > But even if we
> > > know that, what is the guarantee that the publisher side has still
> > > retained the corresponding slots?
> >
> > No guarantee, but if you're just doing a pg_upgrade of a logical replica why
> > would you drop the replication slot?  In any case the warning you mentioned in
> > the pg_dump documentation would still apply and you would have to re-enable
> > things as needed; the only difference is that you would actually be able to
> > keep your logical replication after a pg_upgrade if you need to.  If you
> > dropped the replication slot on the publisher side, then simply remove the
> > subscriptions on the upgraded node too, or create new ones, exactly as you
> > would do with the current pg_upgrade workflow.
> >
> 
> I think the current mechanism tries to provide more flexibility to the
> users. OTOH, in cases where users don't want to change anything in the
> logical replication setup after the upgrade (both upstream and downstream
> keep functioning as-is), they need to do more work. I think ideally there
> should be some option in pg_dump that allows us to dump the contents of
> pg_subscription_rel as well, so that it is easier for users to continue
> replication after the upgrade. We can then use it for binary-upgrade mode
> as well.

Is there really a use case for dumping the content of pg_subscription_rel
outside of pg_upgrade?  I'm not particularly worried about the publisher going
away or changing while pg_upgrade is running, but for a normal pg_dump /
pg_restore I don't really see how anyone would actually want to resume logical
replication from a pg_dump, especially since it's almost guaranteed that the
node will already have consumed data from the publication that won't be in the
dump in the first place.

Are you ok with the suggested syntax above (probably with extra parens to avoid
adding new keywords), or do you have some better suggestion?  I'm a bit worried
about adding some O(n) commands, as it can add some noticeable slow-down for
pg_upgrade-ing a logical replica, but I don't really see how to avoid that.  Note
that if we make this option available to end-users, we will have to use the
relation name rather than its oid, which will make this option even more
expensive when restoring due to the extra lookups.

For the pg_upgrade use-case, do you see any reason to not restore the
pg_subscription_rel by default?  Maybe having an option to not restore it would
make sense if it indeed adds noticeable overhead when publications have a lot of
tables?



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Mon, Feb 20, 2023 at 03:07:37PM +0800, Julien Rouhaud wrote:
> On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:
> >
> > I think the current mechanism tries to provide more flexibility to the
> > users. OTOH, in cases where users don't want to change anything in the
> > logical replication setup after the upgrade (both upstream and downstream
> > keep functioning as-is), they need to do more work. I think ideally there
> > should be some option in pg_dump that allows us to dump the contents of
> > pg_subscription_rel as well, so that it is easier for users to continue
> > replication after the upgrade. We can then use it for binary-upgrade mode
> > as well.
>
> Is there really a use case for dumping the content of pg_subscription_rel
> outside of pg_upgrade?  I'm not particularly worried about the publisher going
> away or changing while pg_upgrade is running, but for a normal pg_dump /
> pg_restore I don't really see how anyone would actually want to resume logical
> replication from a pg_dump, especially since it's almost guaranteed that the
> node will already have consumed data from the publication that won't be in the
> dump in the first place.
>
> Are you ok with the suggested syntax above (probably with extra parens to avoid
> adding new keywords), or do you have some better suggestion?  I'm a bit worried
> about adding some O(n) commands, as it can add some noticeable slow-down for
> pg_upgrade-ing a logical replica, but I don't really see how to avoid that.  Note
> that if we make this option available to end-users, we will have to use the
> relation name rather than its oid, which will make this option even more
> expensive when restoring due to the extra lookups.
>
> For the pg_upgrade use-case, do you see any reason to not restore the
> pg_subscription_rel by default?  Maybe having an option to not restore it would
> make sense if it indeed adds noticeable overhead when publications have a lot of
> tables?

Since I didn't hear any objection I worked on a POC patch with this approach.

For now, when pg_dump is invoked with --binary-upgrade, it will always emit extra
commands to restore the relation list.  This command is only allowed when the
server is started in binary upgrade mode.

The new command is of the form

ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')

with the lsn part being optional.  I'm not sure if there should be some new
regression test for that, as it would be a bit costly.  Note that pg_upgrade of
a logical replica isn't covered by any regression test that I could find.

I did test it manually though, and it fixes my original problem, allowing me to
safely resume logical replication by just re-enabling it.  I didn't do any
benchmarking to see how much overhead it adds.
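
For reference, the commands emitted by the patched pg_dump would look
something like this (oids and LSNs made up):

ALTER SUBSCRIPTION sub1 ADD TABLE (relid = 16384, state = 'r', lsn = '0/30003A8');
ALTER SUBSCRIPTION sub1 ADD TABLE (relid = 16390, state = 'i');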

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Feb 22, 2023 at 12:13 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Mon, Feb 20, 2023 at 03:07:37PM +0800, Julien Rouhaud wrote:
> > On Mon, Feb 20, 2023 at 11:07:42AM +0530, Amit Kapila wrote:
> > >
> > > I think the current mechanism tries to provide more flexibility to the
> > > users. OTOH, in cases where users don't want to change anything in the
> > > logical replication setup after the upgrade (both upstream and downstream
> > > keep functioning as-is), they need to do more work. I think ideally there
> > > should be some option in pg_dump that allows us to dump the contents of
> > > pg_subscription_rel as well, so that it is easier for users to continue
> > > replication after the upgrade. We can then use it for binary-upgrade mode
> > > as well.
> >
> > Is there really a use case for dumping the content of pg_subscription_rel
> > outside of pg_upgrade?

I think the users who want to take a dump and restore the entire
cluster may need it there for the same reason as pg_upgrade needs it.
TBH, I have not seen such a request but this is what I imagine one
would expect if we provide this functionality via pg_upgrade.

> >  I'm not particularly worried about the publisher going
> > away or changing while pg_upgrade is running, but for a normal pg_dump /
> > pg_restore I don't really see how anyone would actually want to resume logical
> > replication from a pg_dump, especially since it's almost guaranteed that the
> > node will already have consumed data from the publication that won't be in the
> > dump in the first place.
> >
> > Are you ok with the suggested syntax above (probably with extra parens to avoid
> > adding new keywords), or do you have some better suggestion?  I'm a bit worried
> > about adding some O(n) commands, as it can add some noticeable slow-down for
> > pg_upgrade-ing a logical replica, but I don't really see how to avoid that.  Note
> > that if we make this option available to end-users, we will have to use the
> > relation name rather than its oid, which will make this option even more
> > expensive when restoring due to the extra lookups.
> >
> > For the pg_upgrade use-case, do you see any reason to not restore the
> > pg_subscription_rel by default?

As I said earlier, one can very well say that giving more flexibility
(in terms of where the publications will be after restore) after a
restore is a better idea. Also, we have been doing the same till now
without any major complaints, so it makes sense to keep the current
behavior as the default.

> >  Maybe having an option to not restore it would
> > make sense if it indeed adds noticeable overhead when publications have a lot of
> > tables?

Yeah, that could be another reason to not do it default.

>
> Since I didn't hear any objection I worked on a POC patch with this approach.
>
> For now, when pg_dump is invoked with --binary-upgrade, it will always emit extra
> commands to restore the relation list.  This command is only allowed when the
> server is started in binary upgrade mode.
>
> The new command is of the form
>
> ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')
>
> with the lsn part being optional.
>

BTW, do we restore the origin and its LSN after the upgrade? Because
without that this won't be sufficient, as that is required for the apply
worker to ensure that it is in sync with table sync workers.

-- 
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Sat, Feb 25, 2023 at 11:24:17AM +0530, Amit Kapila wrote:
> On Wed, Feb 22, 2023 at 12:13 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > > Is there really a use case for dumping the content of pg_subscription_rel
> > > outside of pg_upgrade?
>
> I think the users who want to take a dump and restore the entire
> cluster may need it there for the same reason as pg_upgrade needs it.
> TBH, I have not seen such a request but this is what I imagine one
> would expect if we provide this functionality via pg_upgrade.

But the pg_subscription_rel data are only needed if you want to resume logical
replication from the exact previous state, otherwise you can always refresh the
subscription and it will retrieve the list of relations automatically (dealing
with initial sync and so on).  It's hard to see how that could happen with
a plain pg_dump.

The only usable scenario I can see would be to disable all subscriptions on the
logical replica, maybe make sure that no one writes to those tables if you
want to eventually switch over to the restored node, do a pg_dump(all), restore
it and then resume the logical replication / subscription(s) on the restored
server.  That's a lot of constraints for something that pg_upgrade deals with
so much more efficiently.  Maybe one plausible use case would be to split a
single logical replica to N servers, one per database / publication or
something like that.  In that case pg_upgrade won't be that useful and if each
target subset is small enough a pg_dump/pg_restore may be a viable option.  But
if that's a viable option then surely creating the logical replica from scratch
using normal logical table sync should be an even better option.

I'm really worried that it's going to be a giant foot-gun that any user should
really avoid.

> > > For the pg_upgrade use-case, do you see any reason to not restore the
> > > pg_subscription_rel by default?
>
> As I said earlier, one can very well say that giving more flexibility
> (in terms of where the publications will be after restore) after a
> restore is a better idea. Also, we have been doing the same till now
> without any major complaints, so it makes sense to keep the current
> behavior as the default.

I'm a bit dubious that anyone actually tried to run pg_upgrade on a logical
replica and then kept using logical replication, as it's currently impossible
to safely resume replication without truncating all target relations.

As I mentioned before, if we keep the current behavior as a default there
should be an explicit warning in the documentation stating that you need to
truncate all target relations before resuming logical replication, as otherwise
you are guaranteed to lose data.

> > >  Maybe having an option to not restore it would
> > > make sense if it indeed add noticeable overhead when publications have a lot of
> > > tables?
>
> Yeah, that could be another reason to not do it default.

I will do some benchmarks with various numbers of relations, from high to
unreasonable.

> >
> > Since I didn't hear any objection I worked on a POC patch with this approach.
> >
> > For now, when pg_dump is invoked with --binary-upgrade, it will always emit extra
> > commands to restore the relation list.  This command is only allowed when the
> > server is started in binary upgrade mode.
> >
> > The new command is of the form
> >
> > ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')
> >
> > with the lsn part being optional.
> >
>
> BTW, do we restore the origin and its LSN after the upgrade? Because
> without that this won't be sufficient, as that is required for the apply
> worker to ensure that it is in sync with table sync workers.

We currently don't, which is yet another sign that no one actually tried to
resume logical replication after a pg_upgrade.  That being said, trying to
pg_upgrade a node that's currently syncing relations seems like a bad idea
(I didn't even think to try), but I guess it should also be supported.  I will
work on that too.  Assuming we add a new option for controlling either plain
pg_dump and/or pg_upgrade behavior, should this option control both
pg_subscription_rel and replication origins and their data or do we need more
granularity?
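
For what it's worth, the origin progress is at least visible and can in
principle be restored with the existing functions, so a sketch could be
(origin name and LSN hypothetical; subscription origins follow the
pg_<subscription oid> naming convention):

-- on the old cluster, record each origin's progress
SELECT external_id, remote_lsn FROM pg_replication_origin_status;
-- on the upgraded cluster, advance the recreated origin to that point
SELECT pg_replication_origin_advance('pg_16399', '0/30003A8');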



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Sun, Feb 26, 2023 at 8:35 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sat, Feb 25, 2023 at 11:24:17AM +0530, Amit Kapila wrote:
> > >
> > > The new command is of the form
> > >
> > > ALTER SUBSCRIPTION name ADD TABLE (relid = X, state = 'Y', lsn = 'Z/Z')
> > >
> > > with the lsn part being optional.
> > >
> >
> > BTW, do we restore the origin and its LSN after the upgrade? Because
> > without that this won't be sufficient, as that is required for the apply
> > worker to ensure that it is in sync with table sync workers.
>
> We currently don't, which is yet another sign that no one actually tried to
> resume logical replication after a pg_upgrade.  That being said, trying to
> pg_upgrade a node that's currently syncing relations seems like a bad idea
> (I didn't even think to try), but I guess it should also be supported.  I will
> work on that too.  Assuming we add a new option for controlling either plain
> pg_dump and/or pg_upgrade behavior, should this option control both
> pg_subscription_rel and replication origins and their data or do we need more
> granularity?
>

My vote would be to have one option for both. BTW, thinking some more
on this, how will we allow replication to continue after upgrading the
publisher? During the upgrade, we don't retain slots, so the replication
won't continue. I think after upgrading the subscriber node, the user will
need to upgrade the publisher as well.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Mon, Feb 27, 2023 at 03:39:18PM +0530, Amit Kapila wrote:
>
> BTW, thinking some more
> on this, how will we allow replication to continue after upgrading the
> publisher? During the upgrade, we don't retain slots, so the replication
> won't continue. I think after upgrading the subscriber node, the user will
> need to upgrade the publisher as well.

The scenario I'm interested in is to rely on logical replication only for the
upgrade, so the end state (and start state) is to go back to physical
replication.  In that case, I would just create a new physical replica from the
pg_upgrade'd server and failover to that node, or rsync the previous publisher
node to make it a physical replica.

But even if you want to only rely on logical replication, I'm not sure why you
would want to keep the publisher node as a publisher node?  I think that doing
it this way will lead to a longer downtime compared to doing a failover on the
pg_upgrade'd node, make it a publisher and then move the former publisher node
to a subscriber.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Feb 28, 2023 at 7:55 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Mon, Feb 27, 2023 at 03:39:18PM +0530, Amit Kapila wrote:
> >
> > BTW, thinking some more
> > on this, how will we allow replication to continue after upgrading the
> > publisher? During the upgrade, we don't retain slots, so the replication
> > won't continue. I think after upgrading the subscriber node, the user will
> > need to upgrade the publisher as well.
>
> The scenario I'm interested in is to rely on logical replication only for the
> upgrade, so the end state (and start state) is to go back to physical
> replication.  In that case, I would just create a new physical replica from the
> pg_upgrade'd server and failover to that node, or rsync the previous publisher
> node to make it a physical replica.
>
> But even if you want to only rely on logical replication, I'm not sure why you
> would want to keep the publisher node as a publisher node?  I think that doing
> it this way will lead to a longer downtime compared to doing a failover on the
> pg_upgrade'd node, make it a publisher and then move the former publisher node
> to a subscriber.
>

I am not sure if this is what everyone usually follows because it sounds
like a lot of work to me. IIUC, to achieve this, one needs to recreate
all the publications and subscriptions after changing the roles of
publisher and subscriber. Can you please write steps to show exactly
what you have in mind to avoid any misunderstanding?

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Tue, Feb 28, 2023 at 08:56:37AM +0530, Amit Kapila wrote:
> On Tue, Feb 28, 2023 at 7:55 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> >
> > The scenario I'm interested in is to rely on logical replication only for the
> > upgrade, so the end state (and start state) is to go back to physical
> > replication.  In that case, I would just create a new physical replica from the
> > pg_upgrade'd server and failover to that node, or rsync the previous publisher
> > node to make it a physical replica.
> >
> > But even if you want to only rely on logical replication, I'm not sure why you
> > would want to keep the publisher node as a publisher node?  I think that doing
> > it this way will lead to a longer downtime compared to doing a failover on the
> > pg_upgrade'd node, make it a publisher and then move the former publisher node
> > to a subscriber.
> >
>
> I am not sure if this is what everyone usually follows because it sounds
> like a lot of work to me. IIUC, to achieve this, one needs to recreate
> all the publications and subscriptions after changing the roles of
> publisher and subscriber. Can you please write steps to show exactly
> what you have in mind to avoid any misunderstanding?

Well, as I mentioned I'm *not* interested in a logical-replication-only
scenario.  Logical replication is nice but it will always be less efficient
than physical replication, and some workloads also don't really play well with
it.  So while it can be a huge asset in some cases, I'm for now looking at
leveraging logical replication only for the purpose of a major upgrade of a
physical replication cluster, so the publications and subscriptions are only
temporary and trashed after use.

That being said I was only saying that if I had to do a major upgrade of a
logical replication cluster this is probably how I would try to do it, to
minimize downtime, even if there are probably *a lot* of difficulties to
overcome.



Re: pg_upgrade and logical replication

From
Nikolay Samokhvalov
Date:
On Fri, Feb 17, 2023 at 7:35 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
>  Any table later added in the
> publication is either already fully replicated until that LSN on the upgraded
> node, so only the delta is needed, or has been created after that LSN.  In the
> latter case, the entirety of the table will be replicated with the logical
> replication as a delta right?


What if we consider a slightly adjusted procedure?

0. Temporarily, forbid running any DDL on the source cluster.
1. On the source, create publication, replication slot and remember
the LSN for it
2. Restore the target cluster to that LSN using recovery_target_lsn (PITR)
3. Run pg_upgrade on the target cluster
4. Only now, create subscription to target
5. Wait until logical replication catches up
6. Perform a switchover to the new cluster taking care of lags in sequences, etc
7. Resume DDL when needed

Do you see any data loss happening in this approach?



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Tue, Feb 28, 2023 at 08:02:13AM -0800, Nikolay Samokhvalov wrote:
> On Fri, Feb 17, 2023 at 7:35 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> >  Any table later added in the
> > publication is either already fully replicated until that LSN on the upgraded
> > node, so only the delta is needed, or has been created after that LSN.  In the
> > latter case, the entirety of the table will be replicated with the logical
> > replication as a delta right?
>
> What if we consider a slightly adjusted procedure?
>
> 0. Temporarily, forbid running any DDL on the source cluster.

This is (at least for me) a non-starter, as I want an approach that doesn't
impact the primary node, at least not too much.

Also, how would you do that?  If you need some new infrastructure it means that
you can only upgrade nodes starting from pg16+, while my approach can upgrade
any node that supports publications as long as the target version is pg16+.

It also raises some concerns: why prevent any DDL while e.g. creating a
temporary table shouldn't be a problem, same for renaming some underlying
object, adding indexes...  You would have to curate a list of what exactly is
allowed, which is never great.

Also, how exactly would you ensure that DDL had indeed been forbidden since a
long enough point in time rather than just "currently" forbidden at the time you do
some check?



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Tue, Feb 28, 2023 at 08:56:37AM +0530, Amit Kapila wrote:
> > On Tue, Feb 28, 2023 at 7:55 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > >
> > > The scenario I'm interested in is to rely on logical replication only for the
> > > upgrade, so the end state (and start state) is to go back to physical
> > > replication.  In that case, I would just create a new physical replica from the
> > > pg_upgrade'd server and failover to that node, or rsync the previous publisher
> > > node to make it a physical replica.
> > >
> > > But even if you want to only rely on logical replication, I'm not sure why you
> > > would want to keep the publisher node as a publisher node?  I think that doing
> > > it this way will lead to a longer downtime compared to doing a failover on the
> > > pg_upgrade'd node, make it a publisher and then move the former publisher node
> > > to a subscriber.
> > >
> >
> > I am not sure if this is what everyone usually follows because it sounds
> > like a lot of work to me. IIUC, to achieve this, one needs to recreate
> > all the publications and subscriptions after changing the roles of
> > publisher and subscriber. Can you please write steps to show exactly
> > what you have in mind to avoid any misunderstanding?
>
> Well, as I mentioned I'm *not* interested in a logical-replication-only
> scenario.  Logical replication is nice but it will always be less efficient
> than physical replication, and some workloads also don't really play well with
> it.  So while it can be a huge asset in some cases, I'm for now looking at
> leveraging logical replication only for the purpose of a major upgrade of a
> physical replication cluster, so the publications and subscriptions are only
> temporary and trashed after use.
>
> That being said I was only saying that if I had to do a major upgrade of a
> logical replication cluster this is probably how I would try to do it, to
> minimize downtime, even if there are probably *a lot* of difficulties to
> overcome.
>

Okay, but it would be better if you list out your detailed steps. It
would be useful to support the new mechanism in this area if others
also find your steps to upgrade useful.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Wed, Mar 01, 2023 at 11:51:49AM +0530, Amit Kapila wrote:
> On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > Well, as I mentioned I'm *not* interested in a logical-replication-only
> > scenario.  Logical replication is nice but it will always be less efficient
> > than physical replication, and some workloads also don't really play well with
> > it.  So while it can be a huge asset in some cases, I'm for now looking at
> > leveraging logical replication only for the purpose of a major upgrade of a
> > physical replication cluster, so the publications and subscriptions are only
> > temporary and trashed after use.
> >
> > That being said I was only saying that if I had to do a major upgrade of a
> > logical replication cluster this is probably how I would try to do it, to
> > minimize downtime, even if there are probably *a lot* of difficulties to
> > overcome.
> >
>
> Okay, but it would be better if you list out your detailed steps. It
> would be useful to support the new mechanism in this area if others
> also find your steps to upgrade useful.

Sure.  Here are the overly detailed steps (a condensed sketch of the key
commands follows the list):

 1) set up a normal physical replication cluster (pg_basebackup, restoring a PITR backup,
    whatever), let's call the primary node "A" and replica node "B"
 2) ensure WAL level is "logical" on the primary node A
 3) create a logical replication slot on every (connectable) database (or just
    the one you're interested in if you don't want to preserve everything) on A
 4) create a FOR ALL TABLES publication (again for every database or just the
    one you're interested in)
 5) wait for replication to be reasonably, if not entirely, up to date
 6) promote the standby node B
 7) retrieve the promotion LSN (from the XXXXXXXX.history file,
    pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
 8) call pg_replication_slot_advance() with that LSN for all previously created
    logical replication slots on A
 9) create a normal subscription on all wanted databases on the promoted node
10) wait for it to catch up if needed on B
11) stop the node B
12) run pg_upgrade on B, creating the new node C
13) start C, run the global ANALYZE and any sanity check needed (hopefully you
    would have validated that your application is compatible with that new
    version before this point)
14) re-enable the subscription on C.  This is currently not possible without
    losing data, the patch fixes that
15) wait for it to catch up if needed
16) create any missing relations and do the ALTER SUBSCRIPTION ... REFRESH if
    needed
17) trash B
18) create new nodes D, E... as physical replicas from C if needed, possibly
    using a cheaper approach like pg_start_backup() / rsync / pg_stop_backup
19) switch over to C and trash A (or convert it to another replica if you want)
20) trash the publications on C on all databases
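
A condensed sketch of the key commands above (names, connection string and
LSNs are hypothetical):

-- steps 3 and 4, on A, in every database to preserve:
SELECT pg_create_logical_replication_slot('mig_slot', 'pgoutput');
CREATE PUBLICATION mig_pub FOR ALL TABLES;

-- step 8, on A, once B is promoted and the promotion LSN is known:
SELECT pg_replication_slot_advance('mig_slot', '5/1A2B3C40');

-- step 9, on B: reuse the existing slot and skip the initial copy, since
-- physical replication already shipped the data
CREATE SUBSCRIPTION mig_sub
    CONNECTION 'host=nodeA dbname=mydb'
    PUBLICATION mig_pub
    WITH (create_slot = false, slot_name = 'mig_slot', copy_data = false);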

As noted, step 14 is currently problematic, and is also problematic in any
variation of that scenario that doesn't require you to entirely recreate the
node C from scratch using logical replication, which is what I want to avoid.

This isn't terribly complicated but requires you to be really careful if you don't
want to end up with an incorrect node C.  This approach is also currently not
entirely ideal, but hopefully logical replication of sequences and DDL will
remove the main sources of downtime when upgrading using logical replication.

My ultimate goal is to provide some tooling to do that in a much simpler way.
Maybe a new "promote to logical" action that would take care of steps 2 to 9.
Users would therefore only have to do this "promotion to logical", and then run
pg_upgrade and create a new physical replication cluster if they want.



Re: pg_upgrade and logical replication

From
Nikolay Samokhvalov
Date:
On Tue, Feb 28, 2023 at 4:43 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Tue, Feb 28, 2023 at 08:02:13AM -0800, Nikolay Samokhvalov wrote:
> > 0. Temporarily, forbid running any DDL on the source cluster.
>
> This is (at least for me) a non starter, as I want an approach that doesn't
> impact the primary node, at least not too much.
...
> Also, how exactly would you ensure that indeed DDL were forbidden since a long
> enough point in time rather than just "currently" forbidden at the time you do
> some check?

Thanks for your response. I didn't expect that DDL part would attract
attention, my message was not about DDL... – the DDL part was there
just to show that the recipe I described is possible for any PG
version that supports logical replication.

Usually, people perform upgrades involving logical replication using full
initialization at the logical level – at least, all the posts and articles I
could find talk about that. Meanwhile, on one hand, for large DBs, logical
copying is hard (slow, holding xmin horizon, etc.), and on the other
hand, a physical replica can be transformed to logical (using the trick
with recovery_target_lsn, syncing the state with the slot's LSN) and
initialization at physical level works much better for large
databases. But there is a problem with logical replication when we run
pg_upgrade – as discussed in this thread. So I just wanted to mention
that if we change the order of actions and first run pg_upgrade, and
only then create publication, there should not be a problem anymore.
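
For reference, the trick boils down to recovering the physical node up to the
slot's LSN before promoting it, something like this (a sketch only; the LSN is
whatever the slot on the primary reports, e.g. its confirmed_flush_lsn in
pg_replication_slots):

# recovery settings on the node being transformed
recovery_target_lsn = '0/12345678'
recovery_target_action = 'promote'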



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Wed, Mar 01, 2023 at 07:56:47AM -0800, Nikolay Samokhvalov wrote:
> On Tue, Feb 28, 2023 at 4:43 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > On Tue, Feb 28, 2023 at 08:02:13AM -0800, Nikolay Samokhvalov wrote:
> > > 0. Temporarily, forbid running any DDL on the source cluster.
> >
> > This is (at least for me) a non starter, as I want an approach that doesn't
> > impact the primary node, at least not too much.
> ...
> > Also, how exactly would you ensure that indeed DDL were forbidden since a long
> > enough point in time rather than just "currently" forbidden at the time you do
> > some check?
>
> Thanks for your response. I didn't expect that DDL part would attract
> attention, my message was not about DDL... – the DDL part was there
> just to show that the recipe I described is possible for any PG
> version that supports logical replication.

Well, yes, but I already mentioned that in my original email: "dropping all
subscriptions and recreating them" is obviously the same as simply creating
them later.  I don't even think that preventing DDL is necessary.

One really important detail you forgot though is that you need to create the
subscription using "copy_data = false".  Not hard to do, but that's not the
default, so it's yet another trap users can fall into when doing a major
version upgrade, one that can lead to a corrupted logical replica.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Mar 1, 2023 at 12:25 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Wed, Mar 01, 2023 at 11:51:49AM +0530, Amit Kapila wrote:
> > On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> >
> > Okay, but it would be better if you list out your detailed steps. It
> > would be useful to support the new mechanism in this area if others
> > also find your steps to upgrade useful.
>
> Sure.  Here are the overly detailed steps:
>
>  1) setup a normal physical replication cluster (pg_basebackup, restoring PITR,
>     whatever), let's call the primary node "A" and replica node "B"
>  2) ensure WAL level is "logical" on the primary node A
>  3) create a logical replication slot on every (connectable) database (or just
>     the one you're interested in if you don't want to preserve everything) on A
> >  4) create a FOR ALL TABLES publication (again for every database or just
>     one you're interested in)
>  5) wait for replication to be reasonably if not entirely up to date
>  6) promote the standby node B
>  7) retrieve the promotion LSN (from the XXXXXXXX.history file,
>     pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
>  8) call pg_replication_slot_advance() with that LSN for all previously created
>     logical replication slots on A
>

How are these slots used? Do subscriptions use these slots?

>  9) create a normal subscription on all wanted databases on the promoted node
> 10) wait for it to catchup if needed on B
> 12) stop the node B
> 13) run pg_upgrade on B, creating the new node C
> 14) start C, run the global ANALYZE and any sanity check needed (hopefully you
>     would have validated that your application is compatible with that new
>     version before this point)
> 15) re-enable the subscription on C.  This is currently not possible without
> >     losing data; the patch fixes that
> 16) wait for it to catchup if needed
> 17) create any missing relation and do the ALTER SUBSCRIPTION ... REFRESH if
>     needed
> 18) trash B
> 19) create new nodes D, E... as physical replica from C if needed, possibly
> using cheaper approach like pg_start_backup() / rsync / pg_stop_backup if
> needed
> 20) switchover to C and trash A (or convert it to another replica if you want)
> 21) trash the publications on C on all databases
>
> As noted, step 15 is currently problematic, and is also problematic in any
> variation of that scenario that doesn't require you to entirely recreate the
> node C from scratch using logical replication, which is what I want to avoid.
>
> This isn't terribly complicated but requires you to be really careful if you don't
> want to end up with an incorrect node C.  This approach is also currently not
> entirely ideal, but hopefully logical replication of sequences and DDL will
> remove the main sources of downtime when upgrading using logical replication.
>

I think there are good chances that one can make mistakes following
all the above steps unless she is an expert.

> My ultimate goal is to provide some tooling to do that in a much simpler way.
> Maybe a new "promote to logical" action that would take care of steps 2 to 9.
> Users would therefore only have to do this "promotion to logical", and then run
> pg_upgrade and create a new physical replication cluster if they want.
>

Why don't we try to support the direct upgrade of logical replication
nodes? Have you tried to analyze what are the obstacles and whether we
can have solutions for those? For example, one of the challenges is to
support the upgrade of slots, can we copy (from the old cluster) and
recreate them in the new cluster by resetting LSNs? We can also reset
origins during the upgrade of subscribers and recommend to first
upgrade the subscriber node.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Thu, Mar 02, 2023 at 03:47:53PM +0530, Amit Kapila wrote:
> On Wed, Mar 1, 2023 at 12:25 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> >  1) setup a normal physical replication cluster (pg_basebackup, restoring PITR,
> >     whatever), let's call the primary node "A" and replica node "B"
> >  2) ensure WAL level is "logical" on the primary node A
> >  3) create a logical replication slot on every (connectable) database (or just
> >     the one you're interested in if you don't want to preserve everything) on A
> >  4) create a FOR ALL TABLES publication (again for every database or just
> >     one you're interested in)
> >  5) wait for replication to be reasonably if not entirely up to date
> >  6) promote the standby node B
> >  7) retrieve the promotion LSN (from the XXXXXXXX.history file,
> >     pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
> >  8) call pg_replication_slot_advance() with that LSN for all previously created
> >     logical replication slots on A
> >
>
> How are these slots used? Do subscriptions use these slots?

Yes, as this is the only way to make sure that you replicate everything since
the promotion, and only once.  To be more precise, something like this:

CREATE SUBSCRIPTION db_xxx_subscription
   CONNECTION 'dbname=db_xxx user=...'
   PUBLICATION sub_for_db_xxx
   WITH (create_slot = false,
         slot_name = 'slot_for_db_xxx',
         copy_data = false);

> >  9) create a normal subscription on all wanted databases on the promoted node
> > 10) wait for it to catchup if needed on B
> > 12) stop the node B
> > 13) run pg_upgrade on B, creating the new node C
> > 14) start C, run the global ANALYZE and any sanity check needed (hopefully you
> >     would have validated that your application is compatible with that new
> >     version before this point)
> > 15) re-enable the subscription on C.  This is currently not possible without
> >     losing data; the patch fixes that
> > 16) wait for it to catchup if needed
> > 17) create any missing relation and do the ALTER SUBSCRIPTION ... REFRESH if
> >     needed
> > 18) trash B
> > 19) create new nodes D, E... as physical replica from C if needed, possibly
> > using cheaper approach like pg_start_backup() / rsync / pg_stop_backup if
> > needed
> > 20) switchover to C and trash A (or convert it to another replica if you want)
> > 21) trash the publications on C on all databases
> >
> > As noted, step 15 is currently problematic, and is also problematic in any
> > variation of that scenario that doesn't require you to entirely recreate the
> > node C from scratch using logical replication, which is what I want to avoid.
> >
> > This isn't terribly complicated but requires you to be really careful if you don't
> > want to end up with an incorrect node C.  This approach is also currently not
> > entirely ideal, but hopefully logical replication of sequences and DDL will
> > remove the main sources of downtime when upgrading using logical replication.
> >
>
> I think there are good chances that one can make mistakes following
> all the above steps unless she is an expert.

Assuming we do fix pg_upgrade's behavior with subscriptions, there isn't much
room for error compared to other scenarios:

- pg_upgrade has been there for ages and contains a lot of sanity checks.
  People already use it and AFAIK it's not a major pain point, apart from the
  cases where it can be slow
- ALTER SUBSCRIPTION ... REFRESH will complain if tables are missing locally
- similarly, the logical replica will complain if you're missing some other DDL
  locally
- you only create replicas if you had some in the first place, so it's something
  you should already know how to do.  If not, you didn't have any before the
  upgrade and you still won't have any after

> > My ultimate goal is to provide some tooling to do that in a much simpler way.
> > Maybe a new "promote to logical" action that would take care of steps 2 to 9.
> > Users would therefore only have to do this "promotion to logical", and then run
> > pg_upgrade and create a new physical replication cluster if they want.
> >
>
> Why don't we try to support the direct upgrade of logical replication
> nodes? Have you tried to analyze what are the obstacles and whether we
> can have solutions for those? For example, one of the challenges is to
> support the upgrade of slots, can we copy (from the old cluster) and
> recreate them in the new cluster by resetting LSNs? We can also reset
> origins during the upgrade of subscribers and recommend to first
> upgrade the subscriber node.

I'm not sure I get your question.  This whole thread is about direct upgrade of
logical replication nodes, at least the subscribers, and what is currently
preventing it.

For the publisher nodes, that may be something nice to support (I'm assuming it
could be useful for more complex replication setups) but I'm not interested in
that at the moment as my goal is to reduce downtime for major upgrade of
physical replica, thus *not* doing pg_upgrade of the primary node, whether
physical or logical.  I don't see why it couldn't be done later on, if/when
someone has a use case for it.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Mar 2, 2023 at 4:21 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Mar 02, 2023 at 03:47:53PM +0530, Amit Kapila wrote:
> >
> > Why don't we try to support the direct upgrade of logical replication
> > nodes? Have you tried to analyze what are the obstacles and whether we
> > can have solutions for those? For example, one of the challenges is to
> > support the upgrade of slots, can we copy (from the old cluster) and
> > recreate them in the new cluster by resetting LSNs? We can also reset
> > origins during the upgrade of subscribers and recommend to first
> > upgrade the subscriber node.
>
> I'm not sure I get your question.  This whole thread is about direct upgrade of
> logical replication nodes, at least the subscribers, and what is currently
> preventing it.
>

It is only about subscribers and nothing about publishers.

> For the publisher nodes, that may be something nice to support (I'm assuming it
> could be useful for more complex replication setups) but I'm not interested in
> that at the moment as my goal is to reduce downtime for major upgrade of
> physical replica, thus *not* doing pg_upgrade of the primary node, whether
> physical or logical.  I don't see why it couldn't be done later on, if/when
> someone has a use case for it.
>

I thought there is value if we provide a way to upgrade both publisher
and subscriber. Now, you came up with a use case linking it to a
physical replica where allowing an upgrade of only subscriber nodes is
useful. It is possible that users find your steps easy to perform and
don't find them error-prone, but it may be better to get some
validation of the same. I haven't yet analyzed all the steps in
detail but let's see what others think.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Sat, 4 Mar 2023, 14:13 Amit Kapila, <amit.kapila16@gmail.com> wrote:

> For the publisher nodes, that may be something nice to support (I'm assuming it
> could be useful for more complex replication setups) but I'm not interested in
> that at the moment as my goal is to reduce downtime for major upgrade of
> physical replica, thus *not* doing pg_upgrade of the primary node, whether
> physical or logical.  I don't see why it couldn't be done later on, if/when
> someone has a use case for it.
>

> I thought there is value if we provide a way to upgrade both publisher
> and subscriber.

It's still unclear to me whether it's actually achievable on the publisher
side, as running pg_upgrade leaves a "hole" in the WAL stream and resets the
timeline, among other possible difficulties.  Now I don't know much about
logical replication internals so I'm clearly not the best person to answer
those questions.

> Now, you came up with a use case linking it to a
> physical replica where allowing an upgrade of only subscriber nodes is
> useful. It is possible that users find your steps easy to perform and
> don't find them error-prone, but it may be better to get some
> validation of the same. I haven't yet analyzed all the steps in
> detail but let's see what others think.

It's been quite some time since then and no one has chimed in or objected.
IMO doing a major version upgrade with limited downtime (so something faster
than stopping postgres and running pg_upgrade) has always been difficult and
that never prevented anyone from doing it, so I don't think that it should be
a blocker for what I'm suggesting here, especially since the current behavior
of pg_upgrade on a subscriber node is IMHO broken.

Is there something that can be done for pg16?  I was thinking that having a
fix for the normal and easy case could be acceptable: only allowing pg_upgrade
to optionally, and not by default, preserve the subscription relations IFF all
subscriptions only have tables in ready state.  Different states should be
transient, and it's easy to check as a user beforehand and also easy to check
during pg_upgrade, so it seems like an acceptable limitation (which I
personally see as a good sanity check, but YMMV).  It could be lifted in later
releases if wanted anyway.
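
For reference, that precondition is trivial to check beforehand: something
like this (a sketch) should only return rows with srsubstate = 'r' in each
database:

SELECT srsubstate, count(*) FROM pg_subscription_rel GROUP BY srsubstate;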

It's unclear to me whether this limited scope would also require preserving
the replication origins, but having looked at the code I don't think it would
be much of a problem as the local LSN doesn't have to be preserved.  In both
cases I would prefer a single option (e.g. --preserve-logical-subscription-state
or something like that) to avoid too many complications.  Similarly, I still
don't see any sensible use case for allowing such an option in a normal
pg_dump so I'd rather not expose that.

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Mar 8, 2023 at 12:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sat, 4 Mar 2023, 14:13 Amit Kapila, <amit.kapila16@gmail.com> wrote:
>>
>>
>> > For the publisher nodes, that may be something nice to support (I'm assuming it
>> > could be useful for more complex replication setups) but I'm not interested in
>> > that at the moment as my goal is to reduce downtime for major upgrade of
>> > physical replica, thus *not* doing pg_upgrade of the primary node, whether
>> > physical or logical.  I don't see why it couldn't be done later on, if/when
>> > someone has a use case for it.
>> >
>>
>> I thought there is value if we provide a way to upgrade both publisher
>> and subscriber.
>
>
> It's still unclear to me whether it's actually achievable on the publisher
> side, as running pg_upgrade leaves a "hole" in the WAL stream and resets the
> timeline, among other possible difficulties.  Now I don't know much about
> logical replication internals so I'm clearly not the best person to answer
> those questions.
>

I think that is the part we need to analyze and see what the challenges
there are. One part of the challenge is that we need to preserve slots
that have some WAL locations like restart_lsn and confirmed_flush, and we
need WAL from those locations for decoding. I haven't analyzed this but
isn't it possible that, on a clean shutdown, we confirm that all the WAL
has been sent and confirmed by the logical subscriber, in which case I
think truncating WAL in pg_upgrade shouldn't be a problem?

>> Now, you came up with a use case linking it to a
>> physical replica where allowing an upgrade of only subscriber nodes is
>> useful. It is possible that users find your steps easy to perform and
>> don't find them error-prone, but it may be better to get some
>> validation of the same. I haven't yet analyzed all the steps in
>> detail but let's see what others think.
>
>
> It's been quite some time since then and no one has chimed in or objected.
> IMO doing a major version upgrade with limited downtime (so something faster
> than stopping postgres and running pg_upgrade) has always been difficult and
> that never prevented anyone from doing it, so I don't think that it should be
> a blocker for what I'm suggesting here, especially since the current behavior
> of pg_upgrade on a subscriber node is IMHO broken.
>
> Is there something that can be done for pg16?  I was thinking that having a
> fix for the normal and easy case could be acceptable: only allowing pg_upgrade
> to optionally, and not by default, preserve the subscription relations IFF all
> subscriptions only have tables in ready state.  Different states should be
> transient, and it's easy to check as a user beforehand and also easy to check
> during pg_upgrade, so it seems like an acceptable limitation (which I
> personally see as a good sanity check, but YMMV).  It could be lifted in later
> releases if wanted anyway.
>
> It's unclear to me whether this limited scope would also require preserving
> the replication origins, but having looked at the code I don't think it would
> be much of a problem as the local LSN doesn't have to be preserved.
>

I think we need to preserve replication origins as they help us to
determine the WAL location from where to start the streaming after the
upgrade. If we don't preserve those then from which location will the
subscriber start streaming? We don't want to replicate the WAL which
has already been sent.
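
For illustration, preserving an origin would amount to something like this (a
sketch only; subscription origins are named pg_<subscription oid>, and the
name and LSN below are placeholders):

-- on the old cluster, capture the position of each origin
SELECT external_id, remote_lsn FROM pg_replication_origin_status;
-- on the new cluster, recreate the origin and move it forward
SELECT pg_replication_origin_create('pg_16389');
SELECT pg_replication_origin_advance('pg_16389', '0/12345678');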

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Thu, Mar 09, 2023 at 12:05:36PM +0530, Amit Kapila wrote:
> On Wed, Mar 8, 2023 at 12:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > Is there something that can be done for pg16? I was thinking that having a
> > fix for the normal and easy case could be acceptable: only allowing
> > pg_upgrade to optionally, and not by default, preserve the subscription
> > relations IFF all subscriptions only have tables in ready state. Different
> > states should be transient, and it's easy to check as a user beforehand and
> > also easy to check during pg_upgrade, so it seems like an acceptable
> > limitations (which I personally see as a good sanity check, but YMMV). It
> > could be lifted in later releases if wanted anyway.
> >
> > It's unclear to me whether this limited scope would also require to
> > preserve the replication origins, but having looked at the code I don't
> > think it would be much of a problem as the local LSN doesn't have to be
> > preserved.
> >
>
> I think we need to preserve replication origins as they help us to
> determine the WAL location from where to start the streaming after the
> upgrade. If we don't preserve those then from which location will the
> subscriber start streaming?

It would start from the slot's information on the publisher side, but I guess
there's no guarantee that this will be accurate in all cases.

> We don't want to replicate the WAL which
> has already been sent.

Yeah I agree.  I added support to also preserve the subscription's replication
origin information, a new documented --preserve-subscription-state option
(better naming welcome) for pg_upgrade to optionally ask for this new mode, and
a similar (but undocumented) option for pg_dump that only works with
--binary-upgrade, plus a check in pg_upgrade that all relations are in 'r'
(ready) state.  Patch v2 attached.

Attachment

Re: pg_upgrade and logical replication

From
Masahiko Sawada
Date:
On Wed, Mar 1, 2023 at 3:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Wed, Mar 01, 2023 at 11:51:49AM +0530, Amit Kapila wrote:
> > On Tue, Feb 28, 2023 at 10:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > Well, as I mentioned I'm *not* interested in a logical-replication-only
> > > scenario.  Logical replication is nice but it will always be less efficient
> > > than physical replication, and some workloads also don't really play well with
> > > it.  So while it can be a huge asset in some cases I'm for now looking at
> > > leveraging logical replication for the purpose of major upgrade only for a
> > > physical replication cluster, so the publications and subscriptions are only
> > > temporary and trashed after use.
> > >
> > > That being said I was only saying that if I had to do a major upgrade of a
> > > logical replication cluster this is probably how I would try to do it, to
> > > minimize downtime, even if there are probably *a lot* difficulties to
> > > overcome.
> > >
> >
> > Okay, but it would be better if you list out your detailed steps. It
> > would be useful to support the new mechanism in this area if others
> > also find your steps to upgrade useful.
>
> Sure.  Here are the overly detailed steps:
>
>  1) setup a normal physical replication cluster (pg_basebackup, restoring PITR,
>     whatever), let's call the primary node "A" and replica node "B"
>  2) ensure WAL level is "logical" on the primary node A
>  3) create a logical replication slot on every (connectable) database (or just
>     the one you're interested in if you don't want to preserve everything) on A
>  4) create a FOR ALL TABLES publication (again for every database or just
>     one you're interested in)
>  5) wait for replication to be reasonably if not entirely up to date
>  6) promote the standby node B
>  7) retrieve the promotion LSN (from the XXXXXXXX.history file,
>     pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn()...)
>  8) call pg_replication_slot_advance() with that LSN for all previously created
>     logical replication slots on A
>  9) create a normal subscription on all wanted databases on the promoted node
> 10) wait for it to catchup if needed on B
> 12) stop the node B
> 13) run pg_upgrade on B, creating the new node C
> 14) start C, run the global ANALYZE and any sanity check needed (hopefully you
>     would have validated that your application is compatible with that new
>     version before this point)

I might be missing something but is there any reason why you created a
subscription before pg_upgrade?

Couldn't steps like doing pg_upgrade, then creating missing tables, and then
creating a subscription (with copy_data = false) be an alternative way to
support upgrading the server from the physical standby?
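
For illustration, reusing the example Julien posted earlier, that alternative
ordering would be something like this (a sketch; any missing table has to be
created first):

CREATE SUBSCRIPTION db_xxx_subscription
    CONNECTION 'dbname=db_xxx user=...'
    PUBLICATION sub_for_db_xxx
    WITH (create_slot = false,
          slot_name = 'slot_for_db_xxx',
          copy_data = false);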

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Thu, Mar 23, 2023 at 04:27:28PM +0900, Masahiko Sawada wrote:
>
> I might be missing something but is there any reason why you created a
> subscription before pg_upgrade?
>
> Steps like doing pg_upgrade, then creating missing tables, and then
> creating a subscription (with copy_data = false) could be an
> alternative way to support upgrading the server from the physical
> standby?

As I already answered Nikolay, and explained in my very first email, yes
it's possible to create the subscriptions after running pg_upgrade.  I
personally prefer to do it first to make sure that the logical replication is
actually functional, so I can still easily do a pg_rewind or something to fix
things without having to trash the newly built (and promoted) replica.

But that exact scenario is a corner case, as in any other scenario pg_upgrade
leaves the subscription in an unrecoverable state, where you have to truncate
all the underlying tables first and start from scratch doing an initial sync.
This kind of defeats the purpose of pg_upgrade.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Thu, Mar 09, 2023 at 04:34:56PM +0800, Julien Rouhaud wrote:
> 
> Yeah I agree.  I added support to also preserve the subscription's replication
> origin information, a new documented --preserve-subscription-state option
> (better naming welcome) for pg_upgrade to optionally ask for this new mode, and
> a similar (but undocumented) option for pg_dump that only works with
> --binary-upgrade, plus a check in pg_upgrade that all relations are in 'r'
> (ready) state.  Patch v2 attached.

I'm attaching a v3 to fix a recent conflict with pg_dump due to a563c24c9574b7
(Allow pg_dump to include/exclude child tables automatically).  While at it I
also tried to improve the documentation, explaining how that option can be
useful and what the drawback of not using it is (linking to the pg_dump note
about the same) if you plan to reactivate subscription(s) after an upgrade.

Attachment

RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Julien,

> I'm attaching a v3 to fix a recent conflict with pg_dump due to a563c24c9574b7
> (Allow pg_dump to include/exclude child tables automatically).

Thank you for making the patch.
FYI - it could not be applied due to recent commits: SUBOPT_* and attributes
in SubscriptionInfo were added recently.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Thu, Apr 06, 2023 at 04:49:59AM +0000, Hayato Kuroda (Fujitsu) wrote:
> Dear Julien,
>
> > I'm attaching a v3 to fix a recent conflict with pg_dump due to a563c24c9574b7
> > (Allow pg_dump to include/exclude child tables automatically).
>
> Thank you for making the patch.
> FYI - it could not be applied due to recent commits: SUBOPT_* and attributes
> in SubscriptionInfo were added recently.

Thanks a lot for warning me!

While rebasing and testing the patch, I realized that I forgot to git-add a
chunk, so I went ahead and added some minimal TAP tests to make sure that the
feature and various checks work as expected.  They also demonstrate that you
can safely resume, after running pg_upgrade, a logical replication setup where
only some of the tables are added to a publication, and where new rows and new
tables are added to the publication while pg_upgrade is running (for the new
table you obviously need to make sure that the same relation exists on the
subscriber side, but that's orthogonal to this patch).

While doing so, I also realized that the subscription's underlying replication
origin remote LSN is only set once some activity is seen *after* the initial
sync, so I also added a new check in pg_upgrade to make sure that all remote
origins tied to a subscription have a valid remote_lsn when the new option is
used.  Documentation is updated to cover that, same for the TAP tests.
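
For reference, such origins can be spotted beforehand with something like this
(a sketch; subscription origins are named pg_<subscription oid>):

SELECT external_id, remote_lsn
FROM pg_replication_origin_status
WHERE remote_lsn IS NULL OR remote_lsn = '0/0';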

v4 attached.

Attachment

RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Julien,

Thank you for updating the patch. I checked yours.
Followings are general or non-minor questions:

1.
Feature freeze for PG16 has already come, so I think there is no reason to rush
making the patch. Based on the above, could you allow upgrading while
synchronizing data? Personally I think it can be added as a 0002 patch which
extends the feature. Or have you already found any problem?

2.
I have a questions about the SQL interface:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

Here the oid of the table is directly specified, but is it really kept between
the old and new nodes? The similar command ALTER PUBLICATION requires the name
of the table, not the oid.

3.
Currently getSubscriptionRels() is called from getSubscriptions(), but I could
not find the reason why we must do it like that. Other functions like
getPublicationTables() are directly called from getSchemaData(), so that
pattern should be followed. Additionally, I found a few problems.

* Only tables that are to be dumped should be included. See getPublicationTables().
* A dropStmt for subscription relations seems not to be needed.
* Maybe security labels and comments should be also dumped.

Followings are minor comments.


4. parse_subscription_options

```
+                       opts->state = defGetString(defel)[0];
```

[0] is not needed.

5. AlterSubscription

```
+                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+                               parse_subscription_options(pstate, stmt->options,
+                                                                                  supported_opts, &opts);
+
+                               /* relid and state should always be provided. */
+                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
```

SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
reject it?

6. dumpSubscription()

```
+       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+               subinfo->suboriginremotelsn)
+       {
+               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+       }
```

{} is not needed.

7. pg_dump.h

```
+/*
+ * The SubRelInfo struct is used to represent subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+       Oid             srrelid;
+       char    srsubstate;
+       char   *srsublsn;
+} SubRelInfo;
```

This typedef must be added to typedefs.list.

8. check_for_subscription_state

```
            nb = atooid(PQgetvalue(res, 0, 0));
            if (nb != 0)
            {
                is_error = true;
                pg_log(PG_WARNING,
                       "\nWARNING:  %d subscription have invalid remote_lsn",
                       nb);
            }
```

I think there is no need to use atooid. Additionally, isn't it better to show
the names of the subscriptions which have an invalid remote_lsn?

```
        nb = atooid(PQgetvalue(res, 0, 0));
        if (nb != 0)
        {
            is_error = true;
            pg_log(PG_WARNING,
                   "\nWARNING: database \"%s\" has %d subscription "
                   "relations(s) in non-ready state", active_db->db_name, nb);
        }
```

Same as above.

9. parseCommandLine

```
+       user_opts.preserve_subscriptions = false;
```

I think this initialization is not needed because it is the default.

And maybe you forgot to run pgindent.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v4-0001 (not the test code)

(There are some overlaps here with what Kuroda-san already posted
yesterday because we were looking at the same patch code. Also, a few
of my comments might become moot points if refactoring will be done
according to Kuroda-san's "general" questions).

======
Commit message

1.
To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional commands to be able to restore the content of pg_subscription_rel,
and addition LSN parameter in the subscription creation to restore the
underlying replication origin remote LSN.  The LSN parameter is only accepted
in CREATE SUBSCRIPTION in binary upgrade mode.

~

SUGGESTION
To fix this problem, this patch teaches pg_dump in binary upgrade mode
to emit additional ALTER SUBSCRIPTION commands to facilitate restoring
the content of pg_subscription_rel, and provides an additional LSN
parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN. The new ALTER SUBSCRIPTION syntax and
new LSN parameter are not exposed to the user -- they are only
accepted in binary upgrade mode.

======
src/sgml/ref/pgupgrade.sgml

2.
+     <varlistentry>
+      <term><option>--preserve-subscription-state</option></term>
+      <listitem>
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origin with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactived.
+        If that option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+        If this option is used and any of the subscription on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>
+      </listitem>
+     </varlistentry>

~

2a.
"If that option isn't used" --> "If this option isn't used"

~

2b.
The link renders strangely. It just says:

See the subscription part in the [section called "Notes"] for more information.

Maybe the link part can be rewritten so that it renders more nicely,
and also makes mention of pg_dump.

~

2c.
Maybe it is more readable to have the "isn't used" and "is used" parts
as separate paragraphs?

~

2d.
Typo /reactived/reactivated/ ??

======
src/backend/commands/subscriptioncmds.c

3.
+#define SUBOPT_RELID 0x00008000
+#define SUBOPT_STATE 0x00010000

Maybe 'SUBOPT_RELSTATE' is a better name for this per-relation state option?

~~~

4. SubOpts

+ Oid relid;
+ char state;
 } SubOpts;

(similar to #3)

Maybe 'relstate' is a better name for this per-relation state?

~~~

5. parse_subscription_options

+ else if (IsSet(supported_opts, SUBOPT_STATE) &&
+ strcmp(defel->defname, "state") == 0)
+ {

(similar to #3)

Maybe call this option "relstate".

~

6.
+ if (strlen(state_str) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid relation state used")));

IIUC this syntax is not supposed to be reachable by user input. Maybe
there is some merit in making the errors look similar to those for the normal
options, but OTOH it could also be misleading.

This might as well just be: Assert(strlen(state_str) == 1 &&
*state_str == SUBREL_STATE_READY);
or even simply: Assert(IsBinaryUpgrade);

~~~

7. CreateSubscription

+ if(IsBinaryUpgrade)
+ supported_opts |= SUBOPT_LSN;
  parse_subscription_options(pstate, stmt->options, supported_opts, &opts);

7a.
Missing whitespace after the "if".

~

7b.
I wonder if this was deserving of a comment something like "The LSN
option is for internal use only"...

~~~

8. CreateSubscription

+ originid = replorigin_create(originname);
+
+ if (IsBinaryUpgrade && IsSet(opts.lsn, SUBOPT_LSN))
+ replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
+ false /* backward */ ,
+ false /* WAL log */ );

I think the  'IsBinaryUpgrade' check is redundant here because
SUBOPT_LSN is not possible to be set unless that is true anyhow.

~~~

9. AlterSubscription

+ AddSubscriptionRelState(subid, opts.relid, opts.state,
+ opts.lsn);

This line wrapping of AddSubscriptionRelState seems unnecessary.

======
src/bin/pg_dump/pg_backup.h

10.
+
+ bool preserve_subscriptions;
 } DumpOptions;


Maybe name this field "preserve_subscription_state" for consistency
with the option name.

======
src/bin/pg_dump/pg_dump.c

11. dumpSubscription

  if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
+ {
+ for (i = 0; i < subinfo->nrels; i++)
+ {
+ appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+ "(relid = %u, state = '%c'",
+ qsubname,
+ subinfo->subrels[i].srrelid,
+ subinfo->subrels[i].srsubstate);
+
+ if (subinfo->subrels[i].srsublsn[0] != '\0')
+ appendPQExpBuffer(query, ", LSN = '%s'",
+   subinfo->subrels[i].srsublsn);
+
+ appendPQExpBufferStr(query, ");");
+ }
+

Maybe I misunderstood something -- Shouldn't this new ALTER
SUBSCRIPTION TABLE cmd only be happening when the option
dopt->preserve_subscriptions is true?

======
src/bin/pg_dump/pg_dump.h

12. SubRelInfo

+/*
+ * The SubRelInfo struct is used to represent subscription relation.
+ */
+typedef struct _SubRelInfo
+{
+ Oid srrelid;
+ char srsubstate;
+ char   *srsublsn;
+} SubRelInfo;
+

12a.
"represent subscription relation" --> "represent a subscription relation"

~

12b.
Should include the indent file typdefs.list in the patch, and add this
new typedef to it.

======
src/bin/pg_upgrade/check.c

13. check_for_subscription_state

+/*
+ * check_for_subscription_state()
+ *
+ * Verify that all subscriptions have a valid remote_lsn and doesn't contain
+ * any table in a state different than ready.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

SUGGESTION
Verify that all subscriptions have a valid remote_lsn and do not
contain any tables with srsubstate other than READY ('r').

~~~

14.
+ /* No subscription before pg10. */
+ if (GET_MAJOR_VERSION(cluster->major_version < 1000))
+ return;

14a.
The existing checking code seems slightly different to this because
the other check_XXX calls are guarded by the GET_MAJOR_VERSION before
being called.

~

14b.
Furthermore, I was confused about the combination when the source is < PG10 and
user_opts.preserve_subscriptions is true. Since this is just a return
(not an error) won't the subsequent pg_dump still attempt to use that
option (--preserve-subscriptions) even though we already know it
cannot work?

Would it be better to give an ERROR saying -preserve-subscriptions is
incompatible with the old PG version?

~~~

15.

+ pg_log(PG_WARNING,
+    "\nWARNING:  %d subscription have invalid remote_lsn",
+    nb);

15a.
"have invalid" --> "has invalid"

~

15b.
I guess it would be more useful if the message can include the names
of the failing subscription and/or the relation that was in the wrong
state. Maybe that means moving all this checking logic into the
pg_dump code?

======
src/bin/pg_upgrade/option.c

16. parseCommandLine

  user_opts.transfer_mode = TRANSFER_MODE_COPY;
+ user_opts.preserve_subscriptions = false;

This initial assignment is not needed because user_opts is static.

======
src/bin/pg_upgrade/pg_upgrade.h

17.
  char    *socketdir; /* directory to use for Unix sockets */
+ bool preserve_subscriptions; /* fully transfer subscription state */
 } UserOpts;

Maybe name this field 'preserve_subscription_state' to match the option.

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Wed, Apr 12, 2023 at 09:48:15AM +0000, Hayato Kuroda (Fujitsu) wrote:
>
> Thank you for updating the patch. I checked yours.
> Followings are general or non-minor questions:

Thanks!

> 1.
> Feature freeze for PG16 has already come, so I think there is no reason to rush
> making the patch. Based on the above, could you allow upgrading while
> synchronizing data? Personally I think it can be added as a 0002 patch which
> extends the feature. Or have you already found any problem?

I didn't really look into it, mostly because I don't think it's a sensible
use case.  Logical sync of a relation is a heavy and time consuming operation
that requires retaining the xmin horizon for quite some time.  This can
already lead to some bad effects on the publisher, so adding a pg_upgrade in
the middle of that would just make things worse.  Upgrading a subscriber is a
rare event that has to be well planned (you need to test your application with
the new version and so on), and initial sync of relations shouldn't happen
continually, so having to wait for the sync to be finished doesn't seem like a
source of problems but might instead avoid some for users who may not fully
realize the implications.

If someone has a scenario where running pg_upgrade in the middle of a logical
sync is mandatory I can try to look at it, but for now I just don't see a good
reason to add even more complexity to this part of the code, especially since
adding regression tests seems a bit troublesome.

> 2.
> I have a questions about the SQL interface:
>
> ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
>
> Here the oid of the table is directly specified, but is it really kept between
> the old and new nodes?

Yes, pg_upgrade does need to preserve relations' oids.

> The similar command ALTER PUBLICATION requires the name of the table,
> not the oid.

Yes, but those are user facing commands, while ALTER SUBSCRIPTION name ADD
TABLE is only used internally for pg_upgrade.  My goal is to make this command
a bit faster by avoiding an extra cache lookup each time, relying on
pg_upgrade's existing requirements.  If that's really a problem I can use the
name instead but I didn't hear any argument against it for now.

> 3.
> Currently getSubscriptionRels() is called from getSubscriptions(), but I could
> not find the reason why we must do it like that. Other functions like
> getPublicationTables() are directly called from getSchemaData(), so that
> pattern should be followed.

I think you're right, doing a single getSubscriptionRels() rather than once
per subscription should be more efficient.

> Additionaly, I found two problems.
>
> * Only tables that are to be dumped should be included. See getPublicationTables().

This is only done during pg_upgrade where all tables are dumped, so there
shouldn't be any need to filter the list.

> * A dropStmt for subscription relations seems not to be needed.

I'm not sure I understand this one.  I agree that a dropStmt isn't needed, and
there's no such thing in the patch.  Are you saying that you agree with it?

> * Maybe security labels and comments should be also dumped.

Subscriptions' security labels and comments are already dumped (well, should be
dumped: AFAICS pg_dump was never taught to look at shared security labels on
objects other than databases but still tries to emit them; pg_dumpall instead
handles pg_authid and pg_tablespace), and we can't add a security label or
comment on a subscription's relations, so I don't think this patch is missing
something.

So unless I'm missing something it looks like shared security label handling is
partly broken, but that's orthogonal to this patch.

> Followings are minor comments.
>
>
> 4. parse_subscription_options
>
> ```
> +                       opts->state = defGetString(defel)[0];
> ```
>
> [0] is not needed.

It still needs to be dereferenced; I personally find [0] a bit clearer in that
situation but I'm not opposed to a plain *.

> 5. AlterSubscription
>
> ```
> +                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
> +                               parse_subscription_options(pstate, stmt->options,
> +                                                                                  supported_opts, &opts);
> +
> +                               /* relid and state should always be provided. */
> +                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
> +                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
> +
> ```
>
> SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
> reject it?

If you mean having an Assert for that, I agree.  It's not supposed to be used
by users so I don't think having a non-debug check is sensible, as any user
provided value has no reason to be correct anyway.

> 6. dumpSubscription()
>
> ```
> +       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
> +               subinfo->suboriginremotelsn)
> +       {
> +               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
> +       }
> ```
>
> {} is not needed.

Yes, but with the condition being on two lines it makes it more readable.  I
think a lot of code already uses curly braces in similar cases.

> 7. pg_dump.h
>
> ```
> +/*
> + * The SubRelInfo struct is used to represent subscription relation.
> + */
> +typedef struct _SubRelInfo
> +{
> +       Oid             srrelid;
> +       char    srsubstate;
> +       char   *srsublsn;
> +} SubRelInfo;
> ```
>
> This typedef must be added to typedefs.list.

Right!

> 8. check_for_subscription_state
>
> ```
>             nb = atooid(PQgetvalue(res, 0, 0));
>             if (nb != 0)
>             {
>                 is_error = true;
>                 pg_log(PG_WARNING,
>                        "\nWARNING:  %d subscription have invalid remote_lsn",
>                        nb);
>             }
> ```
>
> I think there is no need to use atooid. Additionally, isn't it better to show
> the names of the subscriptions which have an invalid remote_lsn?

Agreed.

> ```
>         nb = atooid(PQgetvalue(res, 0, 0));
>         if (nb != 0)
>         {
>             is_error = true;
>             pg_log(PG_WARNING,
>                    "\nWARNING: database \"%s\" has %d subscription "
>                    "relations(s) in non-ready state", active_db->db_name, nb);
>         }
> ```
>
> Same as above.

Agreed.

> 9. parseCommandLine
>
> ```
> +       user_opts.preserve_subscriptions = false;
> ```
>
> I think this initialization is not needed because it is the default.

It's not strictly needed because of C rules but I think it doesn't really hurt
to make it explicit and not have to remember what the standard says.

> And maybe you forgot to run pgindent.

I indeed haven't.  There will probably be a global pgindent done soon so I will
do one for this patch afterwards.



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for the v4-0001 test code only.

======

1.
All the comments look alike, so it is hard to know what is going on.
If each of the main test parts could be highlighted then the test code
would be easier to read IMO.

Something like below:

# ==========
# TEST CASE: Check that pg_upgrade refuses to upgrade a subscription when
# the replication origin is not set.
#
# replication origin's remote_lsn isn't set if data was not replicated after the
# initial sync.

...

# ==========
# TEST CASE: Check that pg_upgrade refuses to upgrade a subscription with
# non-ready tables.

...

# ==========
# TEST CASE: Check that pg_upgrade works when all subscription tables are ready.

...

# ==========
# TEST CASE: Change the publication while the old subscriber is offline.
#
# Stop the old subscriber, insert a row in each table while it's down, and add
# t2 to the publication.

...

# ==========
# TEST CASE: Enable the subscription.

...

# ==========
# TEST CASE: Refresh the subscription to get the newly published table t2.
#
# Only the missing row on t2 should be replicated.

~~~

2.
+# replication origin's remote_lsn isn't set if not data is replicated after the
+# initial sync

wording:
/if not data is replicated/if data is not replicated/

~~~

3.
# Make sure the replication origin is set

I was not sure if all of the SELECT COUNT(*) checking is needed
because it just seems normal pub/sub functionality. There is no
pg_upgrade happening, so really it seemed the purpose of this part was
mainly to set the origin so that it will not be a blocker for
ready-state tests that follow this code. Maybe this can just be
incorporated into the following test part.

~~~

4.
# There should be no new replicated rows before enabling the subscription
$result = $new_sub->safe_psql('postgres',
    "SELECT count(*) FROM t1");
is ($result, qq(2), "Table t1 should still have 2 rows on the new subscriber");

4a.
TBH, I felt it might be easier to follow if the SQL was checking for
WHERE (text = "while old_sub is down") etc, rather than just using
SELECT COUNT(*), and then trusting the comments to describe what the
different counts mean.

~

4b.
All these messages like "Table t1 should still have 2 rows on the new
subscriber" don't seem very helpful. e.g. They are not saying anything
about WHAT this is testing or WHY it should still have 2 rows.

~~~

5.
# Refresh the subscription, only the missing row on t2 show be replicated

/show/should/

------
Kind Regards,
Peter Smith.
Fujitsu Australia.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
On Thu, Apr 13, 2023 at 10:51:10AM +0800, Julien Rouhaud wrote:
>
> On Wed, Apr 12, 2023 at 09:48:15AM +0000, Hayato Kuroda (Fujitsu) wrote:
> >
> > 5. AlterSubscription
> >
> > ```
> > +                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
> > +                               parse_subscription_options(pstate, stmt->options,
> > +                                                                                  supported_opts, &opts);
> > +
> > +                               /* relid and state should always be provided. */
> > +                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
> > +                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
> > +
> > ```
> >
> > SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
> > reject it?
>
> If you mean having an Assert for that, I agree.  It's not supposed to be used
> by users so I don't think having a non-debug check is sensible, as any user
> provided value has no reason to be correct anyway.

After looking at the code I remember that I kept the lsn optional in ALTER
SUBSCRIPTION name ADD TABLE command processing.  For now pg_upgrade checks that
all subscriptions have a valid remote_lsn so there should indeed always be a
value different from InvalidLSN/none specified, but it's still unclear to me
whether this check will eventually be weakened or not, so for now I think it's
better to keep AlterSubscription accept this case, here and in all other code
paths.

If there's a hard objection I will just make the lsn mandatory.

> > 9. parseCommandLine
> >
> > ```
> > +       user_opts.preserve_subscriptions = false;
> > ```
> >
> > I think this initialization is not needed because it is the default.
>
> It's not strictly needed because of C rules but I think it doesn't really hurt
> to make it explicit and not have to remember what the standard says.

So I looked at nearby code and other options do rely on zero-initialized global
variables, so I agree that this initialization should be removed.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Thu, Apr 13, 2023 at 12:42:05PM +1000, Peter Smith wrote:
> Here are some review comments for patch v4-0001 (not the test code)

Thanks!

>
> (There are some overlaps here with what Kuroda-san already posted
> yesterday because we were looking at the same patch code. Also, a few
> of my comments might become moot points if refactoring will be done
> according to Kuroda-san's "general" questions).

Ok, for the record, the parts I don't reply to are things I fully agree with
and already changed locally.

> ======
> Commit message
>
> 1.
> To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
> additional commands to be able to restore the content of pg_subscription_rel,
> and addition LSN parameter in the subscription creation to restore the
> underlying replication origin remote LSN.  The LSN parameter is only accepted
> in CREATE SUBSCRIPTION in binary upgrade mode.
>
> ~
>
> SUGGESTION
> To fix this problem, this patch teaches pg_dump in binary upgrade mode
> to emit additional ALTER SUBSCRIPTION commands to facilitate restoring
> the content of pg_subscription_rel, and provides an additional LSN
> parameter for CREATE SUBSCRIPTION to restore the underlying
> replication origin remote LSN. The new ALTER SUBSCRIPTION syntax and
> new LSN parameter are not exposed to the user -- they are only
> accepted in binary upgrade mode.

Thanks, I eventually adapted the suggested wording a bit more:

To fix this problem, this patch teaches pg_dump in binary upgrade mode to emit
additional ALTER SUBSCRIPTION subcommands that will restore the content of
pg_subscription_rel, and also provides an additional LSN parameter for CREATE
SUBSCRIPTION to restore the underlying replication origin remote LSN.  The new
ALTER SUBSCRIPTION subcommand and the new LSN parameter are not exposed to
users and only accepted in binary upgrade mode.

The new ALTER SUBSCRIPTION subcommand has the following syntax:

ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

> 2b.
> The link renders strangely. It just says:
>
> See the subscription part in the [section called "Notes"] for more information.
>
> Maybe the link part can be rewritten so that it renders more nicely,
> and also makes mention of pg_dump.

Yes I saw that.  I didn't try to look at it yet but that's indeed what I wanted
to do eventually.

> ======
> src/backend/commands/subscriptioncmds.c
>
> 3.
> +#define SUBOPT_RELID 0x00008000
> +#define SUBOPT_STATE 0x00010000
>
> Maybe 'SUBOPT_RELSTATE' is a better name for this per-relation state option?

I looked at it but part of the existing code is already using state as a
variable name, to be consistent with pg_subscription_rel.srsubstate.  I think
it's better to use the same pattern in this patch.

> 6.
> + if (strlen(state_str) != 1)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("invalid relation state used")));
>
> IIUC this syntax is not supposed to be reachable by user input. Maybe
> there is some merit in making the errors look similar to those for the normal
> options, but OTOH it could also be misleading.

It doesn't cost much and may be helpful for debugging so I will use error
messages similar to the user facing ones.

> This might as well just be: Assert(strlen(state_str) == 1 &&
> *state_str == SUBREL_STATE_READY);
> or even simply: Assert(IsBinaryUpgrade);

As I mentioned in a previous email, it's still unclear to me whether the
restriction on the srsubstate will be weakened or not, so I prefer to keep such
part of the code generic and have the restriction centralized in the pg_upgrade
check.

I added some Assert(IsBinaryUpgrade) in those code path as it may not be
evident in this place that it's a requirement.


> 7. CreateSubscription
>
> + if(IsBinaryUpgrade)
> + supported_opts |= SUBOPT_LSN;
>   parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
> 7b.
> I wonder if this was deserving of a comment something like "The LSN
> option is for internal use only"...

I was thinking that being valid only for IsBinaryUpgrade would be enough?

> 8. CreateSubscription
>
> + originid = replorigin_create(originname);
> +
> + if (IsBinaryUpgrade && IsSet(opts.lsn, SUBOPT_LSN))
> + replorigin_advance(originid, opts.lsn, InvalidXLogRecPtr,
> + false /* backward */ ,
> + false /* WAL log */ );
>
> I think the  'IsBinaryUpgrade' check is redundant here because
> SUBOPT_LSN is not possible to be set unless that is true anyhow.

It's indeed redundant for now, but it also acts as a safeguard in case the
surrounding code changes.  Maybe just having an Assert(IsBinaryUpgrade) would
be better though.

While looking at it I noticed that this code was never reached, as I should
have checked IsSet(opts.specified_opts, ...).  I fixed that and added a TAP
test to make sure that the restored remote_lsn is the same as on the old
subscription node.
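
For context, with that fix a binary-upgrade dump ends up emitting something
along these lines (subscription name, connection string and LSN are purely
illustrative):

CREATE SUBSCRIPTION sub1 CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub1 WITH (connect = false, lsn = '0/12345678');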

> 9. AlterSubscription
>
> + AddSubscriptionRelState(subid, opts.relid, opts.state,
> + opts.lsn);
>
> This line wrapping of AddSubscriptionRelState seems unnecessary.

Without it the line reaches 81 characters :(

> ======
> src/bin/pg_dump/pg_backup.h
>
> 10.
> +
> + bool preserve_subscriptions;
>  } DumpOptions;
>
>
> Maybe name this field "preserve_subscription_state" for consistency
> with the option name.

That's what I thought when I first wrote that code but I quickly had to use a
shorter name to avoid bloating the line length everywhere.

> ======
> src/bin/pg_dump/pg_dump.c
>
> 11. dumpSubscription
>
>   if (subinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
> + {
> + for (i = 0; i < subinfo->nrels; i++)
> + {
> + appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
> + "(relid = %u, state = '%c'",
> + qsubname,
> + subinfo->subrels[i].srrelid,
> + subinfo->subrels[i].srsubstate);
> +
> + if (subinfo->subrels[i].srsublsn[0] != '\0')
> + appendPQExpBuffer(query, ", LSN = '%s'",
> +   subinfo->subrels[i].srsublsn);
> +
> + appendPQExpBufferStr(query, ");");
> + }
> +
>
> Maybe I misunderstood something -- Shouldn't this new ALTER
> SUBSCRIPTION TABLE cmd only be happening when the option
> dopt->preserve_subscriptions is true?

It indirectly is, as in that case subinfo->nrels is guaranteed to be 0.  I just
tried to keep the code simpler and avoid too many nested conditions.
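
For reference, when the subscription state is preserved that loop emits one
statement per relation, along these lines (relid and LSN values are
illustrative):

ALTER SUBSCRIPTION sub1 ADD TABLE (relid = 16384, state = 'r', LSN = '0/12345678');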

> 12b.
> Should include the indent file typdefs.list in the patch, and add this
> new typedef to it.

FTR I checked and there wasn't too much noise when running pgindent on the
touched files, so I already added the new typedef locally and ran pgindent.

> 14.
> + /* No subscription before pg10. */
> + if (GET_MAJOR_VERSION(cluster->major_version) < 1000)
> + return;
>
> 14a.
> The existing checking code seems slightly different to this because
> the other check_XXX calls are guarded by the GET_MAJOR_VERSION before
> being called.

No opinion on that, so I moved all the checks to the caller side.


> 14b.
> Furthermore, I was confused about the combination when the < PG10 and
> user_opts.preserve_subscriptions is true. Since this is just a return
> (not an error) won't the subsequent pg_dump still attempt to use that
> option (--preserve-subscriptions) even though we already know it
> cannot work?

Will it error out though?  I haven't tried but I think it will just silently do
nothing, which maybe isn't ideal, but may be somewhat expected if you try to
preserve something that doesn't exist.

> Would it be better to give an ERROR saying -preserve-subscriptions is
> incompatible with the old PG version?

I'm not opposed to adding some error, but I don't really know where it would
be most suitable.  Maybe the same code path could explicitly error out if the
preserve-subscription option is used with a pre-PG10 source server?

> 15b.
> I guess it would be more useful if the message can include the names
> of the failing subscription and/or the relation that was in the wrong
> state. Maybe that means moving all this checking logic into the
> pg_dump code?

I think it's better to have the checks only once, so in pg_upgrade, but I'm
not strongly opposed to duplicating those tests if there's any complaint.  In
the meantime I rephrased the warning to give the name of the problematic
subscription (but not the list of relations, as it's likely to be a long list,
and it's easy to check manually afterwards and/or wait for all syncs to
finish).
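
For what it's worth, the manual check afterwards is a single catalog query,
run in each database:

SELECT s.subname, sr.srrelid::regclass, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE sr.srsubstate <> 'r';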



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Julien,

> I didn't really look into it, mostly because I don't think it's a sensible
> use case.  Logical sync of a relation is a heavy and time-consuming operation
> that requires retaining the xmin for quite some time.  This can already lead
> to some bad effects on the publisher, so adding a pg_upgrade in the middle of
> that would just make things worse.  Upgrading a subscriber is a rare event
> that has to be well planned (you need to test your application with the new
> version and so on), and the initial sync of relations shouldn't happen
> continually, so having to wait for the sync to finish doesn't seem like a
> source of problems but might instead avoid some for users who may not fully
> realize the implications.
>
> If someone has a scenario where running pg_upgrade in the middle of a logical
> sync is mandatory I can try to look at it, but for now I just don't see a good
> reason to add even more complexity to this part of the code, especially since
> adding regression tests seems a bit troublesome.

I do not have any scenario that runs pg_upgrade while synchronization is in
progress, because I agree that upgrading can be well planned. So it may be OK
not to add it, in order to keep the patch simpler.

> > Here the oid of the table is directly specified, but is it really kept between
> > old and new node?
>
> Yes, pg_upgrade does need to preserve relation OIDs.

I confirmed and agree. dumpTableSchema() emits an additional call to
pg_catalog.binary_upgrade_set_next_heap_pg_class_oid() before each CREATE
TABLE statement. The function forces the table to have the specified OID.
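
For instance, a binary-upgrade dump contains sequences like the following
(OID and table definition are illustrative):

SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('16384'::pg_catalog.oid);
CREATE TABLE public.t1 (a integer);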

> > Similar command ALTER PUBLICATION requires the name of table,
> > not the oid.
>
> Yes, but those are user facing commands, while ALTER SUBSCRIPTION name
> ADD
> TABLE is only used internally for pg_upgrade.  My goal is to make this command
> a bit faster by avoiding an extra cache lookup each time, relying on pg_upgrade
> existing requirements.  If that's really a problem I can use the name instead
> but I didn't hear any argument against it for now.

OK, makes sense.

>
> > 3.
> > Currently getSubscriptionRels() is called from the getSubscriptions(), but I
> could
> > not find the reason why we must do like that. Other functions like
> > getPublicationTables() is directly called from getSchemaData(), so they should
> > be followed.
>
> I think you're right, doing a single getSubscriptionRels() rather than once
> per subscription should be more efficient.

Yes, we do not have to divide reading pg_subscription_rel per subscription.
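
A single window-function query over the catalog is enough for that, e.g.
something like what the patch ends up using:

SELECT srsubid, srrelid, srsubstate, srsublsn,
       count(*) OVER (PARTITION BY srsubid) AS nrels
FROM pg_subscription_rel
ORDER BY srsubid;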

> > Additionaly, I found two problems.
> >
> > * Only tables that to be dumped should be included. See getPublicationTables().
>
> This is only done during pg_upgrade where all tables are dumped, so there
> shouldn't be any need to filter the list.
>
> > * dropStmt for subscription relations seems not to be needed.
>
> I'm not sure I understand this one.  I agree that a dropStmt isn't needed, and
> there's no such thing in the patch.  Are you saying that you agree with it?

Sorry for the unclear suggestion. I meant to say that we could keep the current
style even if getSubscriptionRels() is called separately. Your understanding
that it is not needed is right.

> > * Maybe security label and comments should be also dumped.
>
> Subscription security labels and comments are already dumped (well, should
> be dumped: AFAICS pg_dump was never taught to look at shared security labels
> on objects other than databases but still tries to emit them, while
> pg_dumpall handles pg_authid and pg_tablespace), and we can't add security
> labels or comments on a subscription's relations, so I don't think this patch
> is missing something?
>
> So unless I'm missing something it looks like shared security label handling is
> partly broken, but that's orthogonal to this patch.
>
> > Followings are minor comments.
> >
> >
> > 4. parse_subscription_options
> >
> > ```
> > +                       opts->state = defGetString(defel)[0];
> > ```
> >
> > [0] is not needed.
>
> It still needs to be dereferenced, I personally find [0] a bit clearer in that
> situation but I'm not opposed to a plain *.

Sorry, I was confused. You are right.

> > 5. AlterSubscription
> >
> > ```
> > +    supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
> > +    parse_subscription_options(pstate, stmt->options,
> > +                               supported_opts, &opts);
> > +
> > +    /* relid and state should always be provided. */
> > +    Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
> > +    Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
> > +
> > ```
> >
> > SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
> > reject it?
>
> If you mean having an Assert for that, I agree.  It's not supposed to be used
> by users so I don't think a non-debug check is sensible, as any user-provided
> value has no reason to be correct anyway.

Yes, I meant to request to add an Assert. Maybe you can add:
Assert(IsSet(opts.specified_opts, SUBOPT_LSN) && !XLogRecPtrIsInvalid(opts.lsn));

> After looking at the code I remember that I kept the lsn optional in ALTER
> SUBSCRIPTION name ADD TABLE command processing.  For now pg_upgrade checks
> that all subscriptions have a valid remote_lsn, so there should indeed always
> be a value different from InvalidLSN/none specified, but it's still unclear
> to me whether this check will eventually be weakened or not, so for now I
> think it's better to keep AlterSubscription accepting this case, here and in
> all other code paths.
>
> If there's a hard objection I will just make the lsn mandatory.

I have tested it, but srsublsn becomes NULL if copy_data is specified as off.
This is because when copy_data is false, all tuples in pg_subscription_rel are
created with state = 'r' and srsublsn = NULL, and tablesync workers never
start.  See CreateSubscription().
Doesn't that mean there is a possibility that the LSN option is not specified
in ALTER SUBSCRIPTION ... ADD TABLE?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Julien,

I found a cfbot failure on macOS [1]. According to the log,
"SELECT count(*) FROM t2" was executed before synchronization was done.

```
[09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
```

With the patch present, wait_for_catchup() is executed after REFRESH, but
it may not be sufficient because it does not check pg_subscription_rel.
wait_for_subscription_sync() seems better for the purpose.


[1]: https://cirrus-ci.com/task/6563827802701824

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Thu, Apr 13, 2023 at 03:26:56PM +1000, Peter Smith wrote:
>
> 1.
> All the comments look alike, so it is hard to know what is going on.
> If each of the main test parts could be highlighted then the test code
> would be easier to read IMO.
>
> Something like below:
> [...]

I added a few more comments about what is being tested.  I'm not sure that a
big TEST CASE prefix is necessary, as these aren't really multiple separate
test cases, and other stuff can be tested in between.  Also, AFAICT no other
TAP test currently needs this kind of banner, even those testing more complex
scenarios.

> 2.
> +# replication origin's remote_lsn isn't set if not data is replicated after the
> +# initial sync
>
> wording:
> /if not data is replicated/if data is not replicated/

I actually mean "if no data", which is a bit different from what you suggest.
Fixed.

> 3.
> # Make sure the replication origin is set
>
> I was not sure if all of the SELECT COUNT(*) checking is needed
> because it just seems normal pub/sub functionality. There is no
> pg_upgrade happening, so really it seemed the purpose of this part was
> mainly to set the origin so that it will not be a blocker for
> ready-state tests that follow this code. Maybe this can just be
> incorporated into the following test part.

Since this patch transfers internal details about subscriptions, I prefer to
be thorough about what is tested, when data is actually being replicated and
so on, so that if something is broken (a relation added to the wrong
subscription, a wrong oid or similar) it immediately shows what's happening.

> 4a.
> TBH, I felt it might be easier to follow if the SQL was checking for
> WHERE (text = "while old_sub is down") etc, rather than just using
> SELECT COUNT(*), and then trusting the comments to describe what the
> different counts mean.

I prefer the plain count as it's a simple way to make sure that the state is
exactly what's wanted.  If for some reason the patch led to a previous row
being replicated again, a test like the one you suggest wouldn't reveal it.
Sure, the count could be fooled if one old row were replicated twice and the
new row weren't replicated at all, but that seems so unlikely that I don't
think testing the whole table content is necessary.

> 4b.
> All these messages like "Table t1 should still have 2 rows on the new
> subscriber" don't seem very helpful. e.g. They are not saying anything
> about WHAT this is testing or WHY it should still have 2 rows.

I don't think that those messages are supposed to say what or why something is
tested, just give a quick context / reference on the test in case it's broken.
The comments are there to explain in more detail what is tested and/or why.

> 5.
> # Refresh the subscription, only the missing row on t2 show be replicated
>
> /show/should/

Fixed.



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Fri, Apr 14, 2023 at 04:19:35AM +0000, Hayato Kuroda (Fujitsu) wrote:
>
> I have tested, but srsublsn became NULL if copy_data was specified as off.
> This is because when copy_data is false, all tuples in pg_subscription_rels are filled
> as state = 'r' and srsublsn = NULL, and tablesync workers will never boot.
> See CreateSubscription().
> Doesn't it mean that there is a possibility that LSN option is not specified while
> ALTER SUBSCRIPTION ADD TABLE?

That shouldn't happen for now, as pg_upgrade first checks whether there's an
invalid remote_lsn and refuses to proceed if so.  Also, the remote_lsn should
be set as soon as some data is replicated, so unless you add a table that's
never modified to a publication you should be able to run pg_upgrade at some
point, once there's replicated DML on such a table.
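
For reference, the condition being checked boils down to something like the
query below (a sketch only, the patch's actual query may differ; a
subscription's replication origin is internally named pg_<suboid>):

SELECT s.subname
FROM pg_subscription s
LEFT JOIN pg_replication_origin_status st ON st.external_id = 'pg_' || s.oid
WHERE st.remote_lsn IS NULL OR st.remote_lsn = '0/0';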

I'm personally fine with the current restrictions, but I don't really use
logical replication in any project so maybe I'm not objective enough.  For now
I'd rather keep things as-is, and later improve on it if some people want to
lift such restrictions (and such restrictions can actually be lifted).



Re: pg_upgrade and logical replication

From
Julien Rouhaud
Date:
Hi,

On Tue, Apr 18, 2023 at 01:40:51AM +0000, Hayato Kuroda (Fujitsu) wrote:
>
> I found a cfbot failure on macOS [1]. According to the log,
> "SELECT count(*) FROM t2" was executed before synchronization was done.
>
> ```
> [09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
> ```
>
> With the patch present, wait_for_catchup() is executed after REFRESH, but
> it may not be sufficient because it does not check pg_subscription_rel.
> wait_for_subscription_sync() seems better for the purpose.

Fixed, thanks!

v5 attached with all previously mentioned fixes.


RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Julien,

Thank you for updating the patch! Followings are my comments.

01. documentation

This page lists the steps to upgrade a server with pg_upgrade.  Should we
write something about subscribers?  IIUC it is sufficient to just add to "Run
pg_upgrade" something like "Apart from streaming replication standbys, a
subscriber node can be upgraded via pg_upgrade.  In that case we strongly
recommend using --preserve-subscription-state".

02. AlterSubscription

I agree that the oid must be preserved between nodes, but I'm still afraid
that the given oid is unconditionally trusted and added to
pg_subscription_rel.  I think we can check the existence of the relation via
SearchSysCache1(RELOID, ObjectIdGetDatum(relid)).  Of course the check is
optional, so it could be executed only when USE_ASSERT_CHECKING is on.
Thoughts?

03. main

Currently --preserve-subscription-state and --no-subscriptions can be used
together, but that combination is quite unnatural. Shouldn't we make them
mutually exclusive?

04. getSubscriptionTables


```
+       SubRelInfo *rels = NULL;
```

The variable is used only inside the loop, so its definition should be moved there as well.

05. getSubscriptionTables

```
+                       nrels = atooid(PQgetvalue(res, i, i_nrels));
```

atoi() should be used instead of atooid().

06. getSubscriptionTables

```
+                       subinfo = findSubscriptionByOid(cur_srsubid);
+
+                       nrels = atooid(PQgetvalue(res, i, i_nrels));
+                       rels = pg_malloc(nrels * sizeof(SubRelInfo));
+
+                       subinfo->subrels = rels;
+                       subinfo->nrels = nrels;
```

Maybe it never occurs, but findSubscriptionByOid() can return NULL, in which
case accessing its attributes will lead to a segfault. Some handling is needed.

07. dumpSubscription

Hmm, SubRelInfos are still dumped in dumpSubscription(). I think this style
breaks pg_dump's conventions; another dump function is needed. Please see
dumpPublicationTable() and dumpPublicationNamespace(). If you have a reason
to use this style, a comment describing it is needed.

08. _SubRelInfo

If you address the above comment, a DumpableObject must be added as a new attribute.

09. check_for_subscription_state

```
+                       for (int i = 0; i < ntup; i++)
+                       {
+                               is_error = true;
+                               pg_log(PG_WARNING,
+                                          "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+                                          PQgetvalue(res, 0, 0));
+                       }
```

The second argument of PQgetvalue() should be i, so that the right
subscription name is reported when there is more than one.

10. 003_subscription.pl

```
$old_sub->wait_for_subscription_sync($publisher, 'sub');

my $result = $old_sub->safe_psql('postgres',
    "SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
```

I think there is a possibility of a timing issue, because the SELECT may be
executed before srsubstate has changed from 's' to 'r'. Maybe
poll_query_until() can be used instead.

11. 003_subscription.pl

```
command_ok(
    [
        'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
        '-D',         $new_sub->data_dir, '-b', $bindir,
        '-B',         $bindir,            '-s', $new_sub->host,
        '-p',         $old_sub->port,     '-P', $new_sub->port,
        $mode,
        '--preserve-subscription-state',
        '--check',
    ],
    'run of pg_upgrade --check for old instance with correct sub rel');
```

Missing check of pg_upgrade_output.d?

And maybe you forgot to run pgperltidy.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for the v5-0001 patch code.

======
General

1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])

I was a bit confused by this relation 'state' mentioned in multiple
places. IIUC the pg_upgrade logic is going to reject anything with a
non-READY (not 'r') state anyhow, so what is the point of having all
the extra grammar/parse_subscription_options etc to handle setting the
state when only possible value must be 'r'?

~~~

2. state V relstate

I still feel code readability suffers a bit by calling some fields/vars
a generic 'state' instead of the more descriptive 'relstate'. Maybe
it's just me.

Previously commented same (see [1]#3, #4, #5)

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+       <para>
+        Fully preserve the logical subscription state if any.  That includes
+        the underlying replication origin with their remote LSN and the list of
+        relations in each subscription so that replication can be simply
+        resumed if the subscriptions are reactivated.
+       </para>

I think the "if any" part is not necessary. If you remove those words,
then the rest of the sentence can be simplified.

SUGGESTION
Fully preserve the logical subscription state, which includes the
underlying replication origin's remote LSN, and the list of relations
in each subscription. This allows replication to simply resume when
the subscriptions are reactivated.

~~~

4.
+       <para>
+        If this option isn't used, it is up to the user to reactivate the
+        subscriptions in a suitable way; see the subscription part in <xref
+        linkend="pg-dump-notes"/> for more information.
+       </para>

The link still renders strangely as previously reported (see [1]#2b).

~~~

5.
+       <para>
+        If this option is used and any of the subscription on the old cluster
+        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
+        in a state different from <literal>r</literal> (ready), the
+        <application>pg_upgrade</application> run will error.
+       </para>

5a.
/subscription/subscriptions/

~

5b
"has any relation in a state different from r" --> "has any relation
with state other than r"

======
src/backend/commands/subscriptioncmds.c

6.
+ if (strlen(state_str) != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid relation state: %s", state_str)));

Is this relation state validation overly simplistic, by only checking
for length 1? Shouldn't this just be asserting the relstate must be
'r'?

======
src/bin/pg_dump/pg_dump.c

7. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   get information about the given subscription's relations
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ SubscriptionInfo *subinfo;
+ SubRelInfo *rels = NULL;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int i_nrels;
+ int i,
+ cur_rel = 0,
+ ntups,
+ last_srsubid = InvalidOid;

Why are some of the above single int declarations and some compound ones?
Why not make them all consistent?

~

8.
+ appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn,"
+   " count(*) OVER (PARTITION BY srsubid) AS nrels"
+   " FROM pg_subscription_rel"
+   " ORDER BY srsubid");

Should this SQL be schema-qualified like pg_catalog.pg_subscription_rel?

~

9.
+ for (i = 0; i < ntups; i++)
+ {
+ int cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));

Should 'cur_srsubid' be declared Oid to match the atooid?

~~~

10. getSubscriptions

+ if (PQgetisnull(res, i, i_suboriginremotelsn))
+ subinfo[i].suboriginremotelsn = NULL;
+ else
+ subinfo[i].suboriginremotelsn =
+ pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
+
+ /*
+ * For now assume there's no relation associated with the
+ * subscription. Later code might update this field and allocate
+ * subrels as needed.
+ */
+ subinfo[i].nrels = 0;

The wording "For now assume there's no" kind of gives an ambiguous
interpretation for this comment. IMO it sounds like this is the
"current" logic but some future PG version may behave differently - I
don't think that is the intended meaning at all.

SUGGESTION.
Here we just initialize nrels to say there are 0 relations associated
with the subscription. If necessary, subsequent logic will update this
field and allocate the subrels.

~~~

11. dumpSubscription

+ for (i = 0; i < subinfo->nrels; i++)
+ {
+ appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
+   "(relid = %u, state = '%c'",
+   qsubname,
+   subinfo->subrels[i].srrelid,
+   subinfo->subrels[i].srsubstate);
+
+ if (subinfo->subrels[i].srsublsn[0] != '\0')
+ appendPQExpBuffer(query, ", LSN = '%s'",
+   subinfo->subrels[i].srsublsn);
+
+ appendPQExpBufferStr(query, ");");
+ }

I previously asked ([1]#11) about how can this ALTER SUBSCRIPTION
TABLE code happen unless 'preserve_subscriptions' is true, and you
confirmed "It indirectly is, as in that case subinfo->nrels is
guaranteed to be 0. I just tried to keep the code simpler and avoid
too many nested conditions."

~

If you are worried about too many nested conditions then a simple
Assert(dopt->preserve_subscriptions); might be good to have here.

======
src/bin/pg_upgrade/check.c

12. check_and_dump_old_cluster

+ /* PG 10 introduced subscriptions. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 &&
+ user_opts.preserve_subscriptions)
+ {
+ check_for_subscription_state(&old_cluster);
+ }

12a.
All the other checks in this function seem to be in decreasing order
of PG version so maybe this check should be moved to follow that same
pattern.

~

12b.
Also won't it be better to give some error or notice of some kind if
the option/version are incompatible? I think this was mentioned in a
previous review.

e.g.

if (user_opts.preserve_subscriptions)
{
    if (GET_MAJOR_VERSION(old_cluster.major_version) < 1000)
        <pg_log or pg_fatal goes here...>;
    check_for_subscription_state(&old_cluster);
}

~~~

13. check_for_subscription_state

+ for (int i = 0; i < ntup; i++)
+ {
+ is_error = true;
+ pg_log(PG_WARNING,
+    "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
+    PQgetvalue(res, 0, 0));
+ }

13a.
This WARNING does not mention the database, but a similar warning
later about the non-ready state does mention the database. Probably
they should be consistent.

~

13b.
Something seems amiss. Here is_error is assigned true, but later when you
test is_error it is for logging the ready-state problem. Isn't there another
missing pg_fatal for this invalid remote_lsn case?

======
src/bin/pg_upgrade/option.c

14. usage

+ printf(_("  --preserve-subscription-state  preserve the subscription state fully\n"));

Why say "fully"? How is "preserve the subscription state fully"
different to "preserve the subscription state" from the user's POV?

------
[1] My previous v4 code review -
https://www.postgresql.org/message-id/CAHut%2BPuThBY%3DMSYHRgUa6iv6tyCmnqU78itZ%2Bf4rMM2b124vqQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
On Mon, Apr 24, 2023 at 4:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Thu, Apr 13, 2023 at 03:26:56PM +1000, Peter Smith wrote:
> >
> > 1.
> > All the comments look alike, so it is hard to know what is going on.
> > If each of the main test parts could be highlighted then the test code
> > would be easier to read IMO.
> >
> > Something like below:
> > [...]
>
> I added a bit more comments about what's is being tested.  I'm not sure that a
> big TEST CASE prefix is necessary, as it's not really multiple separated test
> cases and other stuff can be tested in between.  Also AFAICT no other TAP test
> current needs this kind of banner, even if they're testing more complex
> scenario.

Hmm, I think there are plenty of examples of subscription TAP tests
having some kind of highlighted comments as suggested, for better
readability.

e.g. See src/test/subscription
t/014_binary.pl
t/015_stream.pl
t/016_stream_subxact.pl
t/018_stream_subxact_abort.pl
t/021_twophase.pl
t/022_twophase_cascade.pl
t/023_twophase_stream.pl
t/028_row_filter.pl
t/030_origin.pl
t/031_column_list.pl
t/032_subscribe_use_index.pl

A simple #################### to separate the main test parts is all
that is needed.


> > 4b.
> > All these messages like "Table t1 should still have 2 rows on the new
> > subscriber" don't seem very helpful. e.g. They are not saying anything
> > about WHAT this is testing or WHY it should still have 2 rows.
>
> I don't think that those messages are supposed to say what or why something is
> tested, just give a quick context / reference on the test in case it's broken.
> The comments are there to explain in more details what is tested and/or why.
>

But, why can’t they do both? They can be a quick reference *and* at
the same time give some more meaning to the error log.  Otherwise,
these messages might as well just say ‘ref1’, ‘ref2’, ‘ref3’...

------
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 24 Apr 2023 at 12:52, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Tue, Apr 18, 2023 at 01:40:51AM +0000, Hayato Kuroda (Fujitsu) wrote:
> >
> > I found a cfbot failure on macOS [1]. According to the log,
> > "SELECT count(*) FROM t2" was executed before synchronization was done.
> >
> > ```
> > [09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
> > ```
> >
> > With the patch present, wait_for_catchup() is executed after REFRESH, but
> > it may not be sufficient because it does not check pg_subscription_rel.
> > wait_for_subscription_sync() seems better for the purpose.
>
> Fixed, thanks!

I had a high level look at the patch, few comments:
1) New ereport style can be used by removing the brackets around errcode:
1.a)
+                               ereport(ERROR,
+
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                                errmsg("invalid
relation identifier used: %s", rel_str)));
+                       }

1.b)
+                       if (strlen(state_str) != 1)
+                               ereport(ERROR,
+
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                                errmsg("invalid
relation state: %s", state_str)));

1.c)
+               case ALTER_SUBSCRIPTION_ADD_TABLE:
+                       {
+                               if (!IsBinaryUpgrade)
+                                       ereport(ERROR,
+
(errcode(ERRCODE_SYNTAX_ERROR)),
+                                                       errmsg("ALTER
SUBSCRIPTION ... ADD TABLE is not supported"));


2) Since this is a single statement, the braces are not required in this case:
2.a)
+       if (!fout->dopt->binary_upgrade || !fout->dopt->preserve_subscriptions ||
+               fout->remoteVersion < 100000)
+       {
+               return;
+       }

2.b) Similarly here too
+       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+               subinfo->suboriginremotelsn)
+       {
+               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+       }

3) Since this comment is a very short comment, this can be changed
into a single line comment:
+       /*
+        * Get subscription relation fields.
+        */

4) Since cur_rel will be initialized in "if (cur_srsubid != last_srsubid)",
it need not be initialized here:
+       int                     i,
+                               cur_rel = 0,
+                               ntups,

5) SubRelInfo should be placed above SubRemoveRels:
+++ b/src/tools/pgindent/typedefs.list
@@ -2647,6 +2647,7 @@ SubqueryScan
 SubqueryScanPath
 SubqueryScanState
 SubqueryScanStatus
+SubRelInfo
 SubscriptExecSetup

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 24 Apr 2023 at 12:52, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Tue, Apr 18, 2023 at 01:40:51AM +0000, Hayato Kuroda (Fujitsu) wrote:
> >
> > I found a cfbot failure on macOS [1]. According to the log,
> > "SELECT count(*) FROM t2" was executed before synchronization was done.
> >
> > ```
> > [09:24:21.018](0.132s) not ok 18 - Table t2 should now have 3 rows on the new subscriber
> > ```
> >
> > With the patch present, wait_for_catchup() is executed after REFRESH, but
> > it may not be sufficient because it does not check pg_subscription_rel.
> > wait_for_subscription_sync() seems better for the purpose.
>
> Fixed, thanks!
>
> v5 attached with all previously mentioned fixes.

A few comments:
1) Should we document this command:
+    case ALTER_SUBSCRIPTION_ADD_TABLE:
+        {
+            if (!IsBinaryUpgrade)
+                ereport(ERROR,
+                        (errcode(ERRCODE_SYNTAX_ERROR)),
+                        errmsg("ALTER SUBSCRIPTION ... ADD TABLE is not supported"));
+
+            supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
+            parse_subscription_options(pstate, stmt->options,
+                                       supported_opts, &opts);
+
+            /* relid and state should always be provided. */
+            Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
+            Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
+
+            AddSubscriptionRelState(subid, opts.relid, opts.state,
+                                    opts.lsn);
+

Should we document something like:
This command is for use by in-place upgrade utilities. Its use for
other purposes is not recommended or supported. The behavior of the
option may change in future releases without notice.

2) Similarly in pg_dump too:
@@ -431,6 +431,7 @@ main(int argc, char **argv)
                {"table-and-children", required_argument, NULL, 12},
                {"exclude-table-and-children", required_argument, NULL, 13},
                {"exclude-table-data-and-children", required_argument,
NULL, 14},
+               {"preserve-subscription-state", no_argument,
&dopt.preserve_subscriptions, 1},


Should we document something like:
This command is for use by in-place upgrade utilities. Its use for
other purposes is not recommended or supported. The behavior of the
option may change in future releases without notice.

3) The same error is possible for a ready-state table with an invalid
remote_lsn; should we include that too in the error message:
+       if (is_error)
+               pg_fatal("--preserve-subscription-state is incompatible with "
+                                "subscription relations in non-ready state");
+
+       check_ok();
+}

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:
> 1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
>
> I was a bit confused by this relation 'state' mentioned in multiple
> places. IIUC the pg_upgrade logic is going to reject anything with a
> non-READY (not 'r') state anyhow, so what is the point of having all
> the extra grammar/parse_subscription_options etc to handle setting the
> state when only possible value must be 'r'?

We are just talking about the handling of an extra DefElem in an
extensible grammar pattern, so adding the state field does not
represent much maintenance work.  I'm OK with the addition of this
field in the data set dumped, FWIW, on the ground that it can be
useful for debugging purposes when looking at --binary-upgrade dumps,
and because we aim at copying catalog contents from one cluster to
another.

Anyway, I am not convinced that we have any need for a parse-able
grammar at all, because anything that's presented on this thread is
aimed at being used only for the internal purpose of an upgrade in a
--binary-upgrade dump with a direct catalog copy in mind, and having a
grammar would encourage abuses of it outside of this context.  I think
that we should aim for simpler than what's proposed by the patch,
actually, with either a single SQL function à-la-binary_upgrade() that
adds the contents of a relation.  Or we can be crazier and just create
INSERT queries for pg_subscription_rel to provide an exact copy of the
catalog contents.  A SQL function would be more consistent with other
objects types that use similar tricks, see
binary_upgrade_create_empty_extension() that does something similar
for some pg_extension records.  So, this function would require in
input 4 arguments:
- The subscription name or OID.
- The relation OID.
- Its LSN.
- Its sync state.
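
In a dump that would end up as one call per relation, e.g. (the function name
and exact argument order here are hypothetical):

SELECT pg_catalog.binary_upgrade_create_sub_rel_state('sub1', 16384, '0/12345678', 'r');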

> 2. state V relstate
>
> I still feel code readbility suffers a bit by calling some fields/vars
> a generic 'state' instead of the more descriptive 'relstate'. Maybe
> it's just me.
>
> Previously commented same (see [1]#3, #4, #5)

Agreed to be more careful with the naming here.
--
Michael


Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Jul 19, 2023 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:
> > 1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
> >
> > I was a bit confused by this relation 'state' mentioned in multiple
> > places. IIUC the pg_upgrade logic is going to reject anything with a
> > non-READY (not 'r') state anyhow, so what is the point of having all
> > the extra grammar/parse_subscription_options etc to handle setting the
> > state when only possible value must be 'r'?
>
> We are just talking about the handling of an extra DefElem in an
> extensible grammar pattern, so adding the state field does not
> represent much maintenance work.  I'm OK with the addition of this
> field in the data set dumped, FWIW, on the ground that it can be
> useful for debugging purposes when looking at --binary-upgrade dumps,
> and because we aim at copying catalog contents from one cluster to
> another.
>
> Anyway, I am not convinced that we have any need for a parse-able
> grammar at all, because anything that's presented on this thread is
> aimed at being used only for the internal purpose of an upgrade in a
> --binary-upgrade dump with a direct catalog copy in mind, and having a
> grammar would encourage abuses of it outside of this context.  I think
> that we should aim for simpler than what's proposed by the patch,
> actually, with either a single SQL function à-la-binary_upgrade() that
> adds the contents of a relation.  Or we can be crazier and just create
> INSERT queries for pg_subscription_rel to provide an exact copy of the
> catalog contents.  A SQL function would be more consistent with other
> objects types that use similar tricks, see
> binary_upgrade_create_empty_extension() that does something similar
> for some pg_extension records.  So, this function would require in
> input 4 arguments:
> - The subscription name or OID.
> - The relation OID.
> - Its LSN.
> - Its sync state.
>

+1 for doing it via function (something like
binary_upgrade_create_sub_rel_state). We already have the internal
function AddSubscriptionRelState() that can do the core work.

Like the publisher-side upgrade patch [1], I think we should allow upgrading
subscriptions by default instead of requiring a flag like
--preserve-subscription-state. If required, we can introduce an --exclude
option for upgrade. Having it just for pg_dump sounds reasonable to me.

[1] -
https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Apr 27, 2023 at 1:18 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> 03. main
>
> Currently --preserve-subscription-state and --no-subscriptions can be used
> together, but the situation is quite unnatural. Shouldn't we exclude them?
>

Right, that makes sense to me.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Sep 04, 2023 at 11:51:14AM +0530, Amit Kapila wrote:
> +1 for doing it via function (something like
> binary_upgrade_create_sub_rel_state). We already have the internal
> function AddSubscriptionRelState() that can do the core work.

It is one of those patches that I have left aside for too long, and it
solves a use-case of its own.  I think that I could hack that pretty
quickly given that Julien has done a bunch of the ground work.  Would
you agree with that?

> Like the publisher-side upgrade patch [1], I think we should allow
> upgrading subscriptions by default instead of requiring a flag like
> --preserve-subscription-state. If required, we can introduce an --exclude
> option for upgrade. Having it just for pg_dump sounds reasonable to me.
>
> [1] -
https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Is the interface of the publisher for pg_upgrade agreed on and set in
stone?  I certainly agree to have a consistent upgrade experience for
the two sides of logical replication, publications and subscriptions.
Also, I'd rather have a filtering option at the same time as the
upgrade option to give more control to users from the start.
--
Michael


Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Sep 4, 2023 at 11:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:
> > > 1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
> > >
> > > I was a bit confused by this relation 'state' mentioned in multiple
> > > places. IIUC the pg_upgrade logic is going to reject anything with a
> > > non-READY (not 'r') state anyhow, so what is the point of having all
> > > the extra grammar/parse_subscription_options etc to handle setting the
> > > state when only possible value must be 'r'?
> >
> > We are just talking about the handling of an extra DefElem in an
> > extensible grammar pattern, so adding the state field does not
> > represent much maintenance work.  I'm OK with the addition of this
> > field in the data set dumped, FWIW, on the ground that it can be
> > useful for debugging purposes when looking at --binary-upgrade dumps,
> > and because we aim at copying catalog contents from one cluster to
> > another.
> >
> > Anyway, I am not convinced that we have any need for a parse-able
> > grammar at all, because anything that's presented on this thread is
> > aimed at being used only for the internal purpose of an upgrade in a
> > --binary-upgrade dump with a direct catalog copy in mind, and having a
> > grammar would encourage abuses of it outside of this context.  I think
> > that we should aim for simpler than what's proposed by the patch,
> > actually, with either a single SQL function à-la-binary_upgrade() that
> > adds the contents of a relation.  Or we can be crazier and just create
> > INSERT queries for pg_subscription_rel to provide an exact copy of the
> > catalog contents.  A SQL function would be more consistent with other
> > objects types that use similar tricks, see
> > binary_upgrade_create_empty_extension() that does something similar
> > for some pg_extension records.  So, this function would require in
> > input 4 arguments:
> > - The subscription name or OID.
> > - The relation OID.
> > - Its LSN.
> > - Its sync state.
> >
>
> +1 for doing it via function (something like
> binary_upgrade_create_sub_rel_state). We already have the internal
> function AddSubscriptionRelState() that can do the core work.
>

One more related point:
@@ -4814,9 +4923,31 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
  if (strcmp(subinfo->subpasswordrequired, "t") != 0)
  appendPQExpBuffer(query, ", password_required = false");

+ if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
+ subinfo->suboriginremotelsn)
+ {
+ appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
+ }

Even during Create Subscription, we can use an existing function
(pg_replication_origin_advance()) or a set of functions to advance the
origin instead of introducing a new option.
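
For example, the dump could instead emit something like this (origin name and
LSN are illustrative; a subscription's origin is named pg_<suboid>):

SELECT pg_replication_origin_advance('pg_16399', '0/12345678');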

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Sep 4, 2023 at 12:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Sep 04, 2023 at 11:51:14AM +0530, Amit Kapila wrote:
> > +1 for doing it via function (something like
> > binary_upgrade_create_sub_rel_state). We already have the internal
> > function AddSubscriptionRelState() that can do the core work.
>
> It is one of those patches that I have left aside for too long, and it
> solves a use-case of its own.  I think that I could hack that pretty
> quickly given that Julien has done a bunch of the ground work.  Would
> you agree with that?
>

Yeah, I agree that could be hacked quickly but note I haven't reviewed
in detail if there are other design issues in this patch. Note that we
thought first to support the upgrade of the publisher node, otherwise,
immediately after upgrading the subscriber and publisher, the
subscriptions won't work and start giving errors as they are dependent
on slots in the publisher. One other point that needs some thought is
that the LSN positions we are going to copy in the catalog may no
longer be valid after the upgrade (of the publisher) because we reset
WAL. Does that need some special consideration or are we okay with
that in all cases?  As of now, things are quite safe as documented in
pg_dump doc page that it will be the user's responsibility to set up
replication after dump/restore. I think it would be really helpful if
you could share your thoughts on the publisher-side matter as we are
facing a few tricky questions to be answered. For example, see a new
thread [1].

> > Like the publisher-side upgrade patch [1], I think we should allow
> > upgrading subscriptions by default instead of requiring a flag like
> > --preserve-subscription-state. If required, we can introduce an --exclude
> > option for upgrade. Having it just for pg_dump sounds reasonable to me.
> >
> > [1] -
https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com
>
> Is the interface of the publisher for pg_upgrade agreed on and set in
> stone?  I certainly agree to have a consistent upgrade experience for
> the two sides of logical replication, publications and subscriptions.
> Also, I'd rather have a filtering option at the same time as the
> upgrade option to give more control to users from the start.
>

The point raised by Jonathan for not having an option for pg_upgrade
is that it will be easy for users, otherwise, users always need to
enable this option. Consider a replication setup, wouldn't users want
by default it to be upgraded? Asking them to do that via an option
would be an inconvenience. So, that was the reason, we wanted to have
an --exclude option and by default allow slots to be upgraded. I think
the same theory applies here.

[1] - https://www.postgresql.org/message-id/CAA4eK1LV3%2B76CSOAk0h8Kv0AKb-OETsJHe6Sq6172-7DZXf0Qg%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 19 Jul 2023 at 12:47, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:
> > 1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
> >
> > I was a bit confused by this relation 'state' mentioned in multiple
> > places. IIUC the pg_upgrade logic is going to reject anything with a
> > non-READY (not 'r') state anyhow, so what is the point of having all
> > the extra grammar/parse_subscription_options etc to handle setting the
> > state when only possible value must be 'r'?
>
> We are just talking about the handling of an extra DefElem in an
> extensible grammar pattern, so adding the state field does not
> represent much maintenance work.  I'm OK with the addition of this
> field in the data set dumped, FWIW, on the ground that it can be
> useful for debugging purposes when looking at --binary-upgrade dumps,
> and because we aim at copying catalog contents from one cluster to
> another.
>
> Anyway, I am not convinced that we have any need for a parse-able
> grammar at all, because anything that's presented on this thread is
> aimed at being used only for the internal purpose of an upgrade in a
> --binary-upgrade dump with a direct catalog copy in mind, and having a
> grammar would encourage abuses of it outside of this context.  I think
> that we should aim for simpler than what's proposed by the patch,
> actually, with either a single SQL function à-la-binary_upgrade() that
> adds the contents of a relation.  Or we can be crazier and just create
> INSERT queries for pg_subscription_rel to provide an exact copy of the
> catalog contents.  A SQL function would be more consistent with other
> objects types that use similar tricks, see
> binary_upgrade_create_empty_extension() that does something similar
> for some pg_extension records.  So, this function would require in
> input 4 arguments:
> - The subscription name or OID.
> - The relation OID.
> - Its LSN.
> - Its sync state.

Added a SQL function to handle the insertion and removed the "ALTER
SUBSCRIPTION ... ADD TABLE" command that was added.
Attached patch has the changes for the same.

Regards,
Vignesh


Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Sep 04, 2023 at 02:12:58PM +0530, Amit Kapila wrote:
> Yeah, I agree that could be hacked quickly but note I haven't reviewed
> in detail if there are other design issues in this patch. Note that we
> thought first to support the upgrade of the publisher node, otherwise,
> immediately after upgrading the subscriber and publisher, the
> subscriptions won't work and start giving errors as they are dependent
> on slots in the publisher. One other point that needs some thought is
> that the LSN positions we are going to copy in the catalog may no
> longer be valid after the upgrade (of the publisher) because we reset
> WAL. Does that need some special consideration or are we okay with
> that in all cases?

In pg_upgrade, copy_xact_xlog_xid() puts the new node ahead of the old
cluster by 8 segments on TLI 1, so how would it be a problem if the
subscribers keep a remote confirmed LSN lower than that in their
catalogs?  (You've mentioned that to me offline, but I forgot the
details in the code.)

> As of now, things are quite safe as documented in
> pg_dump doc page that it will be the user's responsibility to set up
> replication after dump/restore. I think it would be really helpful if
> you could share your thoughts on the publisher-side matter as we are
> facing a few tricky questions to be answered. For example, see a new
> thread [1].

In my experience, users are quite used to upgrade standbys *first*,
even in simple scenarios like minor upgrades, because that's the only
way to do things safely.  For example, updating and/or upgrading
primaries before the standbys could be a problem if an update
introduces a slight change in the WAL record format that could be
generated by the primary but not be processed by a standby, and we've
done such tweaks in some records in the past for some bug fixes that
had to be backpatched to stable branches.

IMO, the upgrade of subscriber nodes and the upgrade of publisher
nodes need to be treated as two independent processing problems, dealt
with separately.

As you have mentioned to me earlier offline, these two have, from what I
understand, one dependency: during a publisher upgrade we need to make
sure that there are no invalid slots when beginning to run pg_upgrade,
and that the confirmed LSN of all the slots used by the subscribers
match with the shutdown checkpoint's LSN, ensuring that the
subscribers would not lose any data because everything's already been
consumed by them when the publisher gets to be upgraded.

> The point raised by Jonathan for not having an option for pg_upgrade
> is that it will be easy for users, otherwise, users always need to
> enable this option. Consider a replication setup, wouldn't users want
> by default it to be upgraded? Asking them to do that via an option
> would be an inconvenience. So, that was the reason, we wanted to have
> an --exclude option and by default allow slots to be upgraded. I think
> the same theory applies here.
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1LV3%2B76CSOAk0h8Kv0AKb-OETsJHe6Sq6172-7DZXf0Qg%40mail.gmail.com

I saw this thread, and have some thoughts to share.  Will reply there.
--
Michael


Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 27 Apr 2023 at 13:18, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Julien,
>
> Thank you for updating the patch! Followings are my comments.
>
> 01. documentation
>
> In this page steps to upgrade server with pg_upgrade is aligned. Should we write
> down about subscriber? IIUC, it is sufficient to just add to "Run pg_upgrade",
> like "Apart from streaming replication standby, subscriber node can be upgrade
> via pg_upgrade. At that time we strongly recommend to use --preserve-subscription-state".

Now this option has been removed and its behavior made the default.

> 02. AlterSubscription
>
> I agreed that oid must be preserved between nodes, but I'm still afraid that
> given oid is unconditionally trusted and added to pg_subscription_rel.
> I think we can check the existenec of the relation via SearchSysCache1(RELOID,
> ObjectIdGetDatum(relid)). Of cource the check is optional, so it should be
> executed only when USE_ASSERT_CHECKING is on. Thought?

Modified

> 03. main
>
> Currently --preserve-subscription-state and --no-subscriptions can be used
> together, but the situation is quite unnatural. Shouldn't we exclude them?

This option has been removed now, so this scenario will not happen.

> 04. getSubscriptionTables
>
>
> ```
> +       SubRelInfo *rels = NULL;
> ```
>
> The variable is used only inside the loop, so the definition should be also moved.

This logic has changed slightly, so the variable needs to be kept outside.

> 05. getSubscriptionTables
>
> ```
> +                       nrels = atooid(PQgetvalue(res, i, i_nrels));
> ```
>
> atoi() should be used instead of atooid().

Modified

> 06. getSubscriptionTables
>
> ```
> +                       subinfo = findSubscriptionByOid(cur_srsubid);
> +
> +                       nrels = atooid(PQgetvalue(res, i, i_nrels));
> +                       rels = pg_malloc(nrels * sizeof(SubRelInfo));
> +
> +                       subinfo->subrels = rels;
> +                       subinfo->nrels = nrels;
> ```
>
> Maybe it never occurs, but findSubscriptionByOid() can return NULL. At that time
> accesses to their attributes will lead the Segfault. Some handling is needed.

This should not happen; I added a fatal error for this case.

> 07. dumpSubscription
>
> Hmm, SubRelInfos are still dumped at the dumpSubscription(). I think this style
> breaks the manner of pg_dump. I think another dump function is needed. Please
> see dumpPublicationTable() and dumpPublicationNamespace(). If you have a reason
> to use the style, some comments to describe it is needed.

Modified

> 08. _SubRelInfo
>
> If you will address above comment, DumpableObject must be added as new attribute.

Modified

> 09. check_for_subscription_state
>
> ```
> +                       for (int i = 0; i < ntup; i++)
> +                       {
> +                               is_error = true;
> +                               pg_log(PG_WARNING,
> +                                          "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
> +                                          PQgetvalue(res, 0, 0));
> +                       }
> ```
>
> The second argument should be i to report the name of subscription more than 2.

Modified

> 10. 003_subscription.pl
>
> ```
> $old_sub->wait_for_subscription_sync($publisher, 'sub');
>
> my $result = $old_sub->safe_psql('postgres',
>     "SELECT COUNT(*) FROM pg_subscription_rel WHERE srsubstate != 'r'");
> is ($result, qq(0), "All tables in pg_subscription_rel should be in ready state");
> ```
>
> I think there is a possibility to cause a timing issue, because the SELECT may
> be executed before srsubstate is changed from 's' to 'r'. Maybe poll_query_until()
> can be used instead.

Modified

> 11. 003_subscription.pl
>
> ```
> command_ok(
>         [
>                 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
>                 '-D',         $new_sub->data_dir, '-b', $bindir,
>                 '-B',         $bindir,            '-s', $new_sub->host,
>                 '-p',         $old_sub->port,     '-P', $new_sub->port,
>                 $mode,
>                 '--preserve-subscription-state',
>                 '--check',
>         ],
>         'run of pg_upgrade --check for old instance with correct sub rel');
> ```
>
> Missing check of pg_upgrade_output.d?

Modified

> And maybe you missed to run pgperltidy.

It has been run for the new patch.

The attached v7 patch has the changes for the same.

Regards,
Vignesh


Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 10 May 2023 at 13:29, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for the v5-0001 patch code.
>
> ======
> General
>
> 1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
>
> I was a bit confused by this relation 'state' mentioned in multiple
> places. IIUC the pg_upgrade logic is going to reject anything with a
> non-READY (not 'r') state anyhow, so what is the point of having all
> the extra grammar/parse_subscription_options etc to handle setting the
> state when the only possible value must be 'r'?
>

This command has been removed, along with the code that handled it.

>
> 2. state V relstate
>
> I still feel code readability suffers a bit by calling some fields/vars
> a generic 'state' instead of the more descriptive 'relstate'. Maybe
> it's just me.
>
> Previously commented same (see [1]#3, #4, #5)

Some of that code has been removed; I have modified the rest wherever possible.

> ======
> doc/src/sgml/ref/pgupgrade.sgml
>
> 3.
> +       <para>
> +        Fully preserve the logical subscription state if any.  That includes
> +        the underlying replication origin with their remote LSN and the list of
> +        relations in each subscription so that replication can be simply
> +        resumed if the subscriptions are reactivated.
> +       </para>
>
> I think the "if any" part is not necessary. If you remove those words,
> then the rest of the sentence can be simplified.
>
> SUGGESTION
> Fully preserve the logical subscription state, which includes the
> underlying replication origin's remote LSN, and the list of relations
> in each subscription. This allows replication to simply resume when
> the subscriptions are reactivated.
>
This has been removed now.

>
> 4.
> +       <para>
> +        If this option isn't used, it is up to the user to reactivate the
> +        subscriptions in a suitable way; see the subscription part in <xref
> +        linkend="pg-dump-notes"/> for more information.
> +       </para>
>
> The link still renders strangely as previously reported (see [1]#2b).
>
This has been removed now.
>
> 5.
> +       <para>
> +        If this option is used and any of the subscription on the old cluster
> +        has an unknown <varname>remote_lsn</varname> (0/0), or has any relation
> +        in a state different from <literal>r</literal> (ready), the
> +        <application>pg_upgrade</application> run will error.
> +       </para>
>
> 5a.
> /subscription/subscriptions/

Modified

> 5b
> "has any relation in a state different from r" --> "has any relation
> with state other than r"

Modified slightly

> ======
> src/backend/commands/subscriptioncmds.c
>
> 6.
> + if (strlen(state_str) != 1)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("invalid relation state: %s", state_str)));
>
> Is this relation state validation overly simplistic, by only checking
> for length 1? Shouldn't this just be asserting the relstate must be
> 'r'?

This code has been removed

> ======
> src/bin/pg_dump/pg_dump.c
>
> 7. getSubscriptionTables
>
> +/*
> + * getSubscriptionTables
> + *   get information about the given subscription's relations
> + */
> +void
> +getSubscriptionTables(Archive *fout)
> +{
> + SubscriptionInfo *subinfo;
> + SubRelInfo *rels = NULL;
> + PQExpBuffer query;
> + PGresult   *res;
> + int i_srsubid;
> + int i_srrelid;
> + int i_srsubstate;
> + int i_srsublsn;
> + int i_nrels;
> + int i,
> + cur_rel = 0,
> + ntups,
> + last_srsubid = InvalidOid;
>
> Why some above are single int declarations and some are compound int
> declarations? Why not make them all consistent?

Modified

> ~
>
> 8.
> + appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn,"
> +   " count(*) OVER (PARTITION BY srsubid) AS nrels"
> +   " FROM pg_subscription_rel"
> +   " ORDER BY srsubid");
>
> Should this SQL be schema-qualified like pg_catalog.pg_subscription_rel?

Modified
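The modified query now reads roughly like this (a sketch; see the attached
patch for the exact query):

```sql
SELECT srsubid, srrelid, srsubstate, srsublsn,
       count(*) OVER (PARTITION BY srsubid) AS nrels
FROM pg_catalog.pg_subscription_rel
ORDER BY srsubid;
```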

> ~
>
> 9.
> + for (i = 0; i < ntups; i++)
> + {
> + int cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
>
> Should 'cur_srsubid' be declared Oid to match the atooid?

Modified

> ~~~
>
> 10. getSubscriptions
>
> + if (PQgetisnull(res, i, i_suboriginremotelsn))
> + subinfo[i].suboriginremotelsn = NULL;
> + else
> + subinfo[i].suboriginremotelsn =
> + pg_strdup(PQgetvalue(res, i, i_suboriginremotelsn));
> +
> + /*
> + * For now assume there's no relation associated with the
> + * subscription. Later code might update this field and allocate
> + * subrels as needed.
> + */
> + subinfo[i].nrels = 0;
>
> The wording "For now assume there's no" kind of gives an ambiguous
> interpretation for this comment. IMO it sounds like this is the
> "current" logic but some future PG version may behave differently - I
> don't think that is the intended meaning at all.
>
> SUGGESTION.
> Here we just initialize nrels to say there are 0 relations associated
> with the subscription. If necessary, subsequent logic will update this
> field and allocate the subrels.

This part of the logic has been removed now as it is no longer required.

> ~~~
>
> 11. dumpSubscription
>
> + for (i = 0; i < subinfo->nrels; i++)
> + {
> + appendPQExpBuffer(query, "\nALTER SUBSCRIPTION %s ADD TABLE "
> +   "(relid = %u, state = '%c'",
> +   qsubname,
> +   subinfo->subrels[i].srrelid,
> +   subinfo->subrels[i].srsubstate);
> +
> + if (subinfo->subrels[i].srsublsn[0] != '\0')
> + appendPQExpBuffer(query, ", LSN = '%s'",
> +   subinfo->subrels[i].srsublsn);
> +
> + appendPQExpBufferStr(query, ");");
> + }
>
> I previously asked ([1]#11) about how can this ALTER SUBSCRIPTION
> TABLE code happen unless 'preserve_subscriptions' is true, and you
> confirmed "It indirectly is, as in that case subinfo->nrels is
> guaranteed to be 0. I just tried to keep the code simpler and avoid
> too many nested conditions."

I have added the same check that is used to get the subscription tables, to
avoid confusion.
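
For reference, the code quoted above emits commands of this shape (the values
are illustrative):

```sql
ALTER SUBSCRIPTION sub ADD TABLE (relid = 16384, state = 'r', LSN = '0/1570D50');
```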

> ~
>
> If you are worried about too many nested conditions then a simple
> Assert(dopt->preserve_subscriptions); might be good to have here.
>
> ======
> src/bin/pg_upgrade/check.c
>
> 12. check_and_dump_old_cluster
>
> + /* PG 10 introduced subscriptions. */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1000 &&
> + user_opts.preserve_subscriptions)
> + {
> + check_for_subscription_state(&old_cluster);
> + }
>
> 12a.
> All the other checks in this function seem to be in decreasing order
> of PG version so maybe this check should be moved to follow that same
> pattern.

Modified

> ~
>
> 12b.
> Also won't it be better to give some error or notice of some kind if
> the option/version are incompatible? I think this was mentioned in a
> previous review.
>
> e.g.
>
> if (user_opts.preserve_subscriptions)
> {
>     if (GET_MAJOR_VERSION(old_cluster.major_version) < 1000)
>         <pg_log or pg_fatal goes here...>;
>     check_for_subscription_state(&old_cluster);
> }

This has been removed now

> ~~~
>
> 13. check_for_subscription_state
>
> + for (int i = 0; i < ntup; i++)
> + {
> + is_error = true;
> + pg_log(PG_WARNING,
> +    "\nWARNING:  subscription \"%s\" has an invalid remote_lsn",
> +    PQgetvalue(res, 0, 0));
> + }
>
> 13a.
> This WARNING does not mention the database, but a similar warning
> later about the non-ready state does mention the database. Probably
> they should be consistent.

Modified

> ~
>
> 13b.
> Something seems amiss. Here the is_error is assigned true; But later
> when you test is_error that is for logging the ready-state problem.
> Isn't there another missing pg_fatal for this invalid remote_lsn case?

Modified

> ======
> src/bin/pg_upgrade/option.c
>
> 14. usage
>
> + printf(_("  --preserve-subscription-state preserve the subscription
> state fully\n"));
>
> Why say "fully"? How is "preserve the subscription state fully"
> different to "preserve the subscription state" from the user's POV?

This has been removed now

These are handled as part of v7 posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 4 Sept 2023 at 13:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 4, 2023 at 11:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 19, 2023 at 12:47 PM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Wed, May 10, 2023 at 05:59:24PM +1000, Peter Smith wrote:
> > > > 1. ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
> > > >
> > > > I was a bit confused by this relation 'state' mentioned in multiple
> > > > places. IIUC the pg_upgrade logic is going to reject anything with a
> > > > non-READY (not 'r') state anyhow, so what is the point of having all
> > > > the extra grammar/parse_subscription_options etc to handle setting the
> > > > state when the only possible value must be 'r'?
> > >
> > > We are just talking about the handling of an extra DefElem in an
> > > extensible grammar pattern, so adding the state field does not
> > > represent much maintenance work.  I'm OK with the addition of this
> > > field in the data set dumped, FWIW, on the ground that it can be
> > > useful for debugging purposes when looking at --binary-upgrade dumps,
> > > and because we aim at copying catalog contents from one cluster to
> > > another.
> > >
> > > Anyway, I am not convinced that we have any need for a parse-able
> > > grammar at all, because anything that's presented on this thread is
> > > aimed at being used only for the internal purpose of an upgrade in a
> > > --binary-upgrade dump with a direct catalog copy in mind, and having a
> > > grammar would encourage abuses of it outside of this context.  I think
> > > that we should aim for simpler than what's proposed by the patch,
> > > actually, with either a single SQL function à-la-binary_upgrade() that
> > > adds the contents of a relation.  Or we can be crazier and just create
> > > INSERT queries for pg_subscription_rel to provide an exact copy of the
> > > catalog contents.  A SQL function would be more consistent with other
> > > objects types that use similar tricks, see
> > > binary_upgrade_create_empty_extension() that does something similar
> > > for some pg_extension records.  So, this function would require in
> > > input 4 arguments:
> > > - The subscription name or OID.
> > > - The relation OID.
> > > - Its LSN.
> > > - Its sync state.
> > >
> >
> > +1 for doing it via function (something like
> > binary_upgrade_create_sub_rel_state). We already have the internal
> > function AddSubscriptionRelState() that can do the core work.
> >

Modified

> One more related point:
> @@ -4814,9 +4923,31 @@ dumpSubscription(Archive *fout, const
> SubscriptionInfo *subinfo)
>   if (strcmp(subinfo->subpasswordrequired, "t") != 0)
>   appendPQExpBuffer(query, ", password_required = false");
>
> + if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
> + subinfo->suboriginremotelsn)
> + {
> + appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
> + }
>
> Even during Create Subscription, we can use an existing function
> (pg_replication_origin_advance()) or a set of functions to advance the
> origin instead of introducing a new option.

Added a function binary_upgrade_sub_replication_origin_advance which
will: a) check if the subscription exists, b) get the replication name
for subscription and c) advance the replication origin.

These are handled as part of v7 posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 10 May 2023 at 13:39, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, Apr 24, 2023 at 4:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > Hi,
> >
> > On Thu, Apr 13, 2023 at 03:26:56PM +1000, Peter Smith wrote:
> > >
> > > 1.
> > > All the comments look alike, so it is hard to know what is going on.
> > > If each of the main test parts could be highlighted then the test code
> > > would be easier to read IMO.
> > >
> > > Something like below:
> > > [...]
> >
> > I added a bit more comments about what's is being tested.  I'm not sure that a
> > big TEST CASE prefix is necessary, as it's not really multiple separated test
> > cases and other stuff can be tested in between.  Also AFAICT no other TAP test
> > current needs this kind of banner, even if they're testing more complex
> > scenario.
>
> Hmm, I think there are plenty of examples of subscription TAP tests
> having some kind of highlighted comments as suggested, for better
> readability.
>
> e.g. See src/test/subscription
> t/014_binary.pl
> t/015_stream.pl
> t/016_stream_subxact.pl
> t/018_stream_subxact_abort.pl
> t/021_twophase.pl
> t/022_twophase_cascade.pl
> t/023_twophase_stream.pl
> t/028_row_filter.pl
> t/030_origin.pl
> t/031_column_list.pl
> t/032_subscribe_use_index.pl
>
> A simple #################### to separate the main test parts is all
> that is needed.

Modified

>
> > > 4b.
> > > All these messages like "Table t1 should still have 2 rows on the new
> > > subscriber" don't seem very helpful. e.g. They are not saying anything
> > > about WHAT this is testing or WHY it should still have 2 rows.
> >
> > I don't think that those messages are supposed to say what or why something is
> > tested, just give a quick context / reference on the test in case it's broken.
> > The comments are there to explain in more details what is tested and/or why.
> >
>
> But, why can’t they do both? They can be a quick reference *and* at
> the same time give some more meaning to the error log.  Otherwise,
> these messages might as well just say ‘ref1’, ‘ref2’, ‘ref3’...

Modified

These are handled as part of v7 posted at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg%40mail.gmail.com

Regards,
Vignesh



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vignesh,

Thank you for updating the patch! Here are some comments.

Sorry if there are duplicate comments - the thread revived recently so I might
have lost some context.

01. General

Is there a possibility that the apply worker on the old cluster connects to the
publisher during the upgrade? Regarding pg_upgrade on the publisher, we refuse
TCP/IP connections from remote hosts and the port number is also changed, so we
can assume that the subscriber does not connect. But IIUC such settings may not
affect the connection source, so the apply worker may still try to connect to
the publisher. Also, are there any hazards if that happens?

02. Upgrade functions

Two functions - binary_upgrade_create_sub_rel_state and binary_upgrade_sub_replication_origin_advance
should be located at pg_upgrade_support.c. Also, CHECK_IS_BINARY_UPGRADE() macro
can be used.

03. Parameter combinations

IIUC getSubscriptionTables() should exit quickly if --no-subscriptions is
specified, whereas binary_upgrade_create_sub_rel_state() fails.


04. My test failed

I executed the attached script but the upgrade failed:

```
Restoring database schemas in the new cluster                 
  postgres                                                    
*failure*

Consult the last few lines of "data_N3/pg_upgrade_output.d/20230912T054546.320/log/pg_upgrade_dump_5.log" for
the probable cause of the failure.
Failure, exiting
```

I checked the log and found that binary_upgrade_create_sub_rel_state() does not
support skipping the fourth argument:

```
pg_restore: from TOC entry 4059; 16384 16387 SUBSCRIPTION TABLE sub sub postgres
pg_restore: error: could not execute query: ERROR:  function binary_upgrade_create_sub_rel_state(unknown, integer, unknown) does not exist
LINE 1: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r'...
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
Command was: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r');
```

IIUC if we allow to skip arguments, we must define wrappers like pg_copy_logical_replication_slot_*.
Another approach is that pg_dump always dumps srsublsn even if it is NULL.
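
With the latter approach, the dump would always contain four-argument calls
like these (a sketch; the OIDs and LSN are illustrative):

```sql
SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'i', NULL);
SELECT binary_upgrade_create_sub_rel_state('sub', 16385, 'r', '0/1570D50');
```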


Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Attachment

RE: pg_upgrade and logical replication

From
"Zhijie Hou (Fujitsu)"
Date:
On Monday, September 11, 2023 6:32 PM vignesh C <vignesh21@gmail.com> wrote:
> 
> 
> The attached v7 patch has the changes for the same.

Thanks for updating the patch, here are a few comments:


1.

+/*
+ * binary_upgrade_sub_replication_origin_advance
+ *
+ * Update the remote_lsn for the subscriber's replication origin.
+ */
+Datum
+binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
+{

Is there any usage apart from pg_upgrade for this function? If not, I think
we'd better move this function to pg_upgrade_support.c. If yes, it may be
better to rename it to something more general.

2.

+ * Verify that all subscriptions have a valid remote_lsn and don't contain
+ * any table in srsubstate different than ready ('r').
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

I think we'd better follow the same style of
check_for_isn_and_int8_passing_mismatch() to record the invalid things in a
file.


3.

+        if (fout->dopt->binary_upgrade && fout->remoteVersion >= 100000)
+        {
+            appendPQExpBuffer(query,
+                              "SELECT binary_upgrade_create_sub_rel_state('%s', %u, '%c'",
+                              subrinfo->dobj.name,

I think we'd better consider using appendStringLiteral or a related function
for the dobj.name here to make sure the string conversion is safe.
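
For example, a subscription name containing a single quote would otherwise
produce broken SQL; with proper escaping the generated call would look like
this (illustrative):

```sql
SELECT binary_upgrade_create_sub_rel_state('my''sub', 16384, 'r');
```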


4.

The following commit message may need update:
"binary_upgrade_create_sub_rel_state SQL function, and also provides an
additional LSN parameter for CREATE SUBSCRIPTION to restore the underlying
replication origin remote LSN. "

I think we have changed to another approach which doesn't add a new parameter
to the DDL.


5. 
+    /* Fetch the existing tuple. */
+    tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
+                              CStringGetDatum(subname));

Since we don't modify the tuple here, SearchSysCache2 seems enough.


6. 
+                                    "LEFT JOIN pg_catalog.pg_database d"
+                                    "  ON d.oid = s.subdbid "
+                                    "WHERE coalesce(remote_lsn, '0/0') = '0/0'");

For the subscriptions that were just created and finished the table sync but
haven't applied any changes, their remote_lsn will also be 0/0. Do we
need to report ERROR in this case?
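
(For reference, a standalone version of that check might look like the
following sketch, reconstructed from the fragment above; remote_lsn comes from
pg_replication_origin_status, and subscription origins are named
'pg_' || suboid:)

```sql
SELECT d.datname, s.subname
FROM pg_catalog.pg_subscription s
LEFT JOIN pg_catalog.pg_replication_origin_status os
       ON os.external_id = 'pg_' || s.oid
LEFT JOIN pg_catalog.pg_database d
       ON d.oid = s.subdbid
WHERE coalesce(os.remote_lsn, '0/0') = '0/0';
```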

Best Regards,
Hou zj

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Sep 11, 2023 at 05:19:27PM +0530, vignesh C wrote:
> Added a function binary_upgrade_sub_replication_origin_advance which
> will: a) check if the subscription exists, b) get the replication name
> for subscription and c) advance the replication origin.
>
> These are handled as part of v7 posted at [1].
> [1] - https://www.postgresql.org/message-id/CALDaNm1ZrbHaWpJwwNhDTJocRKWd3rEkgJazuDdZ9Z-WdvonFg%40mail.gmail.com

Thanks.  I can see that some of the others have already provided
comments about this version.  I have some comments on top of that.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Tue, Sep 12, 2023 at 01:22:50PM +0000, Zhijie Hou (Fujitsu) wrote:
> +/*
> + * binary_upgrade_sub_replication_origin_advance
> + *
> + * Update the remote_lsn for the subscriber's replication origin.
> + */
> +Datum
> +binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
> +{
>
> Is there any usage apart from pg_upgrade for this function? If not, I think
> we'd better move this function to pg_upgrade_support.c. If yes, it may be
> better to rename it to something more general.

I was equally surprised by the choice of the patch regarding the
location of these functions, so I agree with your point that these
functions should be in pg_upgrade_support.c.  All the sub-routines
these two functions rely on are defined in some headers already, so
there seems to be nothing new required for pg_upgrade_support.c.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Vignesh,
>
> Thank you for updating the patch! Here are some comments.
>
> Sorry if there are duplicate comments - the thread revived recently so I might
> have lost some context.
>
> 01. General
>
> Is there a possibility that the apply worker on the old cluster connects to the
> publisher during the upgrade? Regarding pg_upgrade on the publisher, we refuse
> TCP/IP connections from remote hosts and the port number is also changed, so we
> can assume that the subscriber does not connect. But IIUC such settings may not
> affect the connection source, so the apply worker may still try to connect to
> the publisher. Also, are there any hazards if that happens?

Yes, there is a possibility that the apply worker gets started and new
transaction data is synced from the publisher. I have made a fix not to
start the launcher process in binary upgrade mode as we don't want the
launcher to start apply workers during the upgrade.

> 02. Upgrade functions
>
> Two functions - binary_upgrade_create_sub_rel_state and binary_upgrade_sub_replication_origin_advance
> should be located at pg_upgrade_support.c. Also, CHECK_IS_BINARY_UPGRADE() macro
> can be used.

Modified

> 03. Parameter combinations
>
> IIUC getSubscriptionTables() should exit quickly if --no-subscriptions is
> specified, whereas binary_upgrade_create_sub_rel_state() fails.

Modified

>
> 04. My test failed
>
> I executed the attached script but the upgrade failed:
>
> ```
> Restoring database schemas in the new cluster
>   postgres
> *failure*
>
> Consult the last few lines of "data_N3/pg_upgrade_output.d/20230912T054546.320/log/pg_upgrade_dump_5.log" for
> the probable cause of the failure.
> Failure, exiting
> ```
>
> I checked the log and found that binary_upgrade_create_sub_rel_state() does not
> support skipping the fourth argument:
>
> ```
> pg_restore: from TOC entry 4059; 16384 16387 SUBSCRIPTION TABLE sub sub postgres
> pg_restore: error: could not execute query: ERROR:  function binary_upgrade_create_sub_rel_state(unknown, integer, unknown) does not exist
> LINE 1: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r'...
>                ^
> HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
> Command was: SELECT binary_upgrade_create_sub_rel_state('sub', 16384, 'r');
> ```
>
> IIUC if we allow to skip arguments, we must define wrappers like pg_copy_logical_replication_slot_*.
> Another approach is that pg_dump always dumps srsublsn even if it is NULL.
Modified

The attached v8 version patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 12 Sept 2023 at 18:52, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Monday, September 11, 2023 6:32 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > The attached v7 patch has the changes for the same.
>
> Thanks for updating the patch, here are a few comments:
>
>
> 1.
>
> +/*
> + * binary_upgrade_sub_replication_origin_advance
> + *
> + * Update the remote_lsn for the subscriber's replication origin.
> + */
> +Datum
> +binary_upgrade_sub_replication_origin_advance(PG_FUNCTION_ARGS)
> +{
>
> Is there any usage apart from pg_upgrade for this function? If not, I think
> we'd better move this function to pg_upgrade_support.c. If yes, it may be
> better to rename it to something more general.

Moved to pg_upgrade_support.c and renamed to binary_upgrade_replorigin_advance

> 2.
>
> + * Verify that all subscriptions have a valid remote_lsn and don't contain
> + * any table in srsubstate different than ready ('r').
> + */
> +static void
> +check_for_subscription_state(ClusterInfo *cluster)
>
> I think we'd better follow the same style of
> check_for_isn_and_int8_passing_mismatch() to record the invalid things in a
> file.

Modified

>
> 3.
>
> +               if (fout->dopt->binary_upgrade && fout->remoteVersion >= 100000)
> +               {
> +                       appendPQExpBuffer(query,
> +                                                         "SELECT binary_upgrade_create_sub_rel_state('%s', %u, '%c'",
> +                                                         subrinfo->dobj.name,
>
> I think we'd better consider using appendStringLiteral or a related function
> for the dobj.name here to make sure the string conversion is safe.
>

Modified

> 4.
>
> The following commit message may need update:
> "binary_upgrade_create_sub_rel_state SQL function, and also provides an
> additional LSN parameter for CREATE SUBSCRIPTION to restore the underlying
> replication origin remote LSN. "
>
> I think we have changed to another approach which doesn't add a new parameter
> to the DDL.

Modified

>
> 5.
> +       /* Fetch the existing tuple. */
> +       tup = SearchSysCacheCopy2(SUBSCRIPTIONNAME, MyDatabaseId,
> +                                                         CStringGetDatum(subname));
>
> Since we don't modify the tuple here, SearchSysCache2 seems enough.
>
>
> 6.
> +                                                                       "LEFT JOIN pg_catalog.pg_database d"
> +                                                                       "  ON d.oid = s.subdbid "
> +                                                                       "WHERE coalesce(remote_lsn, '0/0') = '0/0'");
>
> For the subscriptions that were just created and finished the table sync but
> haven't applied any changes, their remote_lsn will also be 0/0. Do we
> need to report ERROR in this case?
I will handle this in the next version.

Thanks for the comments, the v8 patch attached at [1] has the changes
for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm1JzqTreCUrhNu5E1gq7Q8r_u3%2BFrisyT7moOED%3DUdoCg%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 15 Sept 2023 at 15:08, vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Vignesh,
> >
> > Thank you for updating the patch! Here are some comments.
> >
> > Sorry if there are duplicate comments - the thread revived recently so I might
> > have lost some context.
> >
> > 01. General
> >
> > Is there a possibility that the apply worker on the old cluster connects to the
> > publisher during the upgrade? Regarding pg_upgrade on the publisher, we refuse
> > TCP/IP connections from remote hosts and the port number is also changed, so we
> > can assume that the subscriber does not connect. But IIUC such settings may not
> > affect the connection source, so the apply worker may still try to connect to
> > the publisher. Also, are there any hazards if that happens?
>
> Yes, there is a possibility that the apply worker gets started and new
> transaction data is synced from the publisher. I have made a fix not to
> start the launcher process in binary upgrade mode as we don't want the
> launcher to start apply workers during the upgrade.

Another approach to solve this, as suggested by my colleague Hou-san,
would be to set max_logical_replication_workers = 0 while upgrading. I
will evaluate this and update the next version of the patch accordingly.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Fri, Sep 15, 2023 at 04:51:57PM +0530, vignesh C wrote:
> Another approach to solve this, as suggested by my colleague Hou-san,
> would be to set max_logical_replication_workers = 0 while upgrading. I
> will evaluate this and update the next version of the patch accordingly.

In the context of an upgrade, any node started is isolated with its
own port and a custom unix domain directory with connections allowed
only through this one.

Saying that, I don't see why forcing max_logical_replication_workers
to be 0 would be necessarily a bad thing to prevent unnecessary
activity on the backend.  This should be a separate patch built on
top of the main one, IMO.

Looking forward to seeing the rebased version you've mentioned, btw ;)
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 19 Sept 2023 at 11:49, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Sep 15, 2023 at 04:51:57PM +0530, vignesh C wrote:
> > Another approach to solve this, as suggested by my colleague Hou-san,
> > would be to set max_logical_replication_workers = 0 while upgrading. I
> > will evaluate this and update the next version of the patch accordingly.
>
> In the context of an upgrade, any node started is isolated with its
> own port and a custom unix domain directory with connections allowed
> only through this one.
>
> Saying that, I don't see why forcing max_logical_replication_workers
> to be 0 would be necessarily a bad thing to prevent unnecessary
> activity on the backend.  This should be a separate patch built on
> top of the main one, IMO.

Here is a patch to set max_logical_replication_workers to 0 when the
server is started, to prevent the launcher from being started. Since
this configuration has been present since v10, no version check is needed.
I have done upgrade tests for v10-master, v11-master, ... v16-master
and found it to be working fine.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Tue, Sep 19, 2023 at 07:14:49PM +0530, vignesh C wrote:
> Here is a patch to set max_logical_replication_workers to 0 when the
> server is started, to prevent the launcher from being started. Since
> this configuration has been present since v10, no version check is needed.
> I have done upgrade tests for v10-master, v11-master, ... v16-master
> and found it to be working fine.

The project policy is to support pg_upgrade for 10 years, and 9.6 was
released in 2016:
https://www.postgresql.org/docs/9.6/release-9-6.html

>      snprintf(cmd, sizeof(cmd),
> -             "\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s\" start",
> +             "\"%s/pg_ctl\" -w -l \"%s/%s\" -D \"%s\" -o \"-p %d -b%s %s%s%s\" start",
>               cluster->bindir,
>               log_opts.logdir,
>               SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
>               (cluster == &new_cluster) ?
>               " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
> +             " -c max_logical_replication_workers=0",
>               cluster->pgopts ? cluster->pgopts : "", socket_string);
>
>      /*

And this code path is used to start postmaster instances for old and
new clusters.  So it seems to me that it is incorrect if this is not
conditional based on the cluster version.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:
>
> The attached v8 version patch has the changes for the same.
>

Is the check to ensure remote_lsn is valid correct in function
check_for_subscription_state()? How about the case where the apply
worker didn't receive any change but just marked the relation as
'ready'?
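
(That scenario can be spotted with a query along these lines — a sketch that
lists subscriptions whose tables are all ready together with their origin's
remote_lsn, which may still be NULL; subscription origins are named
'pg_' || suboid:)

```sql
SELECT s.subname, os.remote_lsn
FROM pg_catalog.pg_subscription s
LEFT JOIN pg_catalog.pg_replication_origin_status os
       ON os.external_id = 'pg_' || s.oid
WHERE NOT EXISTS (SELECT 1
                  FROM pg_catalog.pg_subscription_rel sr
                  WHERE sr.srsubid = s.oid
                    AND sr.srsubstate <> 'r');
```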

Also, the patch seems to be allowing subscription relations from PG
>=10 to be migrated but how will that work if the corresponding
publisher is also upgraded without slots? Won't the corresponding
workers start failing as soon as you restart the upgrade server? Do we
need to document the steps for users?

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:
> Also, the patch seems to be allowing subscription relations from PG
> >=10 to be migrated but how will that work if the corresponding
> publisher is also upgraded without slots? Won't the corresponding
> workers start failing as soon as you restart the upgraded server? Do we
> need to document the steps for users?

Hmm?  How is that related to the upgrade of the subscribers?  And how
is that different from the case where a subscriber tries to connect
back to a publisher where a slot has been dropped?  There is no need
for pg_upgrade to reach such a state:
ERROR:  could not start WAL streaming: ERROR:  replication slot "popo" does not exist
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Fri, Sep 15, 2023 at 03:08:21PM +0530, vignesh C wrote:
> On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
>> Is there a possibility that the apply worker on the old cluster connects to the
>> publisher during the upgrade? Regarding pg_upgrade on the publisher, we refuse
>> TCP/IP connections from remote hosts and the port number is also changed, so we
>> can assume that the subscriber does not connect. But IIUC such settings may not
>> affect the connection source, so the apply worker may still try to connect to
>> the publisher. Also, are there any hazards if that happens?
>
> Yes, there is a possibility that the apply worker gets started and new
> transaction data is synced from the publisher. I have made a fix not to
> start the launcher process in binary upgrade mode as we don't want the
> launcher to start apply workers during the upgrade.

Hmm.  I was wondering if 0001 is the right way to handle this case,
but at the end I'm OK to paint one extra isBinaryUpgrade in the code
path where apply launchers are registered.  I don't think that the
patch is complete, though.  A comment should be added in pg_upgrade's
server.c, exactly start_postmaster(), to tell that -b also stops apply
workers.  I am attaching an updated version that I'd be OK to apply.

I don't really think that we need to worry about a subscriber
connecting back to a publisher in this case, though?  I mean, each
postmaster instance started by pg_upgrade restricts the access to the
instance with unix_socket_directories set to a custom path and
permissions at 0700, and a subscription's connection string does not
know the unix path used by pg_upgrade.  I certainly agree that
stopping these processes could lead to inconsistencies in the data the
subscribers have been holding though, if we are not careful, so
preventing them from running is a good practice anyway.

I have also reviewed 0002.  As a whole, I think that I'm OK with the
main approach of the patch in pg_dump to use a new type of dumpable
object for subscription relations that are dumped with their upgrade
functions after.  This still needs more work, and more documentation.
Also, perhaps we should really have an option to control if this part
of the copy happens or not.  With a --no-subscription-relations for
pg_dump at least?

+{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',

During a development cycle, any new function added needs to use an OID
in range 8000-9999.  Running unused_oids will suggest new random OIDs.

FWIW, I am not convinced that there is a need for two functions to add
an entry to pg_subscription_rel, with the sole difference between them being
the handling of a valid or invalid LSN.  We should have only one function
that's able to handle NULL for the LSN.  So let's remove rel_state_a
and rel_state_b, and have a single rel_state().  The description of
the SQL functions is inconsistent with the other binary upgrade ones,
I would suggest for the two functions:
"for use by pg_upgrade (relation for pg_subscription_rel)"
"for use by pg_upgrade (remote_lsn for origin)"

+   i_srsublsn = PQfnumber(res, "srsublsn");
[...]
+       subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));

In getSubscriptionTables(), this should check for PQgetisnull()
because we would have a NULL value for InvalidXLogRecPtr in the
catalog.  Using a char* for srsublsn is OK, but just assign NULL to
it, then just pass a hardcoded NULL value to the function as we do in
other places.  So I don't quite get why this is not the same handling
as suboriginremotelsn.

getSubscriptionTables() is entirely skipped if we don't want any
subscriptions, if we deal with a server of 9.6 or older or if we don't
do binary upgrades, which is OK.

+/*
+ * getSubscriptionTables
+ *      get information about subscription membership for dumpable tables.
+ */
This comment is slightly misleading and should mention that this is an
upgrade-only path?

The code for dumpSubscriptionTable() is a copy-paste of
dumpPublicationTable(), but a lot of what you are doing here is
actually pointless if we are not in binary mode?  Why should this code
path not be taken only under dataOnly?  I mean, this is a code path we
should never take except if we are in binary mode.  This should have
at least a cross-check to make sure that we never have a
DO_SUBSCRIPTION_REL in this code path if we are in non-binary mode.

+    if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
+    {
+        appendPQExpBufferStr(query,
+                             "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
+        appendStringLiteralAH(query, subinfo->dobj.name, fout);
+        appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
+    }

Hmm..  Could it be actually useful even for debugging to still have
this query if suboriginremotelsn is an InvalidXLogRecPtr?  I think
that this should have a comment of the kind "\n-- For binary upgrade,
blah".  At least it would not be a bad thing to enforce a correct
state from the start, removing the NULL check for the second argument
in binary_upgrade_replorigin_advance().
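
(For reference, the statement dumped by the code above looks like this, with
an illustrative subscription name and LSN:)

```sql
SELECT pg_catalog.binary_upgrade_replorigin_advance('sub', '0/1570D50');
```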

+    /* We need to check for pg_replication_origin_status only once. */
Perhaps it would be better to explain why?

+                       "WHERE coalesce(remote_lsn, '0/0') = '0/0'"
Why a COALESCE here?  Cannot this stuff just use NULL?

+    fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
Could it be possible to include the schema of the relation in this log?

+static void check_for_subscription_state(ClusterInfo *cluster);
I'd be tempted to move that into a patch on its own, actually, for a
cleaner history.

+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
New as of 2023.

+# Check that after upgradation of the subscriber server, the incremental
+# changes added to the publisher are replicated.
[..]
+   For upgradation of the subscriptions, all the subscriptions on the old
+   cluster must have a valid <varname>remote_lsn</varname>, and all the

Upgradation?  I think that this should be reworded:
"All the subscriptions of an old cluster require a valid remote_lsn
during an upgrade."

A CI run is reporting the following compilation warnings:
[04:21:15.290] pg_dump.c: In function ‘getSubscriptionTables’:
[04:21:15.290] pg_dump.c:4655:29: error: ‘subinfo’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
[04:21:15.290]  4655 |   subrinfo[cur_rel].subinfo = subinfo;

+ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
+    "pg_upgrade_output.d/ not removed after pg_upgrade failure");
Not sure that there's a need for this check.  Okay, that's cheap.

And, err.  We are going to need an option to control if the slot data
is copied, and a bit more documentation in pg_upgrade to explain how
things happen when the copy happens.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:
> Is the check to ensure remote_lsn is valid correct in function
> check_for_subscription_state()? How about the case where the apply
> worker didn't receive any change but just marked the relation as
> 'ready'?

I may be missing, of course, but a relation is switched to
SUBREL_STATE_READY only once a sync happened and its state was
SUBREL_STATE_SYNCDONE, implying that SubscriptionRelState->lsn is
never InvalidXLogRecPtr, no?

For instance, nothing happens when a
Assert(!XLogRecPtrIsInvalid(rstate->lsn)) is added in
process_syncing_tables_for_apply().
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Sep 21, 2023 at 11:37 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:
> > Is the check to ensure remote_lsn is valid correct in function
> > check_for_subscription_state()? How about the case where the apply
> > worker didn't receive any change but just marked the relation as
> > 'ready'?
>
> I may be missing, of course, but a relation is switched to
> SUBREL_STATE_READY only once a sync happened and its state was
> SUBREL_STATE_SYNCDONE, implying that SubscriptionRelState->lsn is
> never InvalidXLogRecPtr, no?
>

The check in the patch is about the logical replication worker's
origin's LSN. The value of SubscriptionRelState->lsn won't matter for
the check.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Sep 21, 2023 at 4:39 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Sep 20, 2023 at 04:54:36PM +0530, Amit Kapila wrote:
> > Also, the patch seems to be allowing subscription relations from PG
> > >=10 to be migrated but how will that work if the corresponding
> > publisher is also upgraded without slots? Won't the corresponding
> > workers start failing as soon as you restart the upgraded server? Do we
> > need to document the steps for users?
>
> Hmm?  How is that related to the upgrade of the subscribers?
>

It is because after upgrade of both publisher and subscriber, the
subscriptions won't work. Both publisher and subscriber should work,
otherwise, the logical replication set up won't work. I think we can
probably do this, if we can document clearly how the user can make
their logical replication set up work after upgrade.

>
>  And how
> is that different from the case where a subscriber tries to connect
> back to a publisher where a slot has been dropped?
>

It is different because we don't drop slots automatically anywhere else.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:
> It is because after upgrade of both publisher and subscriber, the
> subscriptions won't work. Both publisher and subscriber should work,
> otherwise, the logical replication set up won't work. I think we can
> probably do this, if we can document clearly how the user can make
> their logical replication set up work after upgrade.

Yeah, well, this comes back to my original point that the upgrade of
publisher nodes and subscriber nodes should be treated as two
different problems or we're mixing apples and oranges (and a node
could have both subscriber and publishers).  While being able to
support both is a must, it is going to be a two-step process at the
end, with the subscribers done first and the publishers done after.
That's also kind of the point that Julien makes in top message of this
thread.

I agree that docs are lacking in the proposed patch in terms of
restrictions, assumptions and process flow, but taken in isolation the
problem of the publishers is not something that this patch has to take
care of.  I'd certainly agree that it should mention, at least and if
merged first, to be careful when upgrading the publishers as their slots
are currently removed.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Sep 20, 2023 at 09:38:56AM +0900, Michael Paquier wrote:
> And this code path is used to start postmaster instances for old and
> new clusters.  So it seems to me that it is incorrect if this is not
> conditional based on the cluster version.

Avoiding the startup of bgworkers during pg_upgrade is something that
worries me a bit, actually, as it could be useful in some cases like
monitoring?  That would be fancy, for sure..  For now and seeing a
lack of consensus on this larger matter, I'd like to propose a check
for IsBinaryUpgrade into ApplyLauncherRegister() instead as it makes
no real sense to start apply workers in this context.  That would be
equivalent to max_logical_replication_workers = 0.

Amit, Vignesh, would the attached be OK for both of you?

(Vignesh has posted a slightly different version of this patch on a
different thread, but the subscriber part should be part of this
thread with the subscribers, I assume.)
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Sep 22, 2023 at 4:36 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:
> > It is because after upgrade of both publisher and subscriber, the
> > subscriptions won't work. Both publisher and subscriber should work,
> > otherwise, the logical replication set up won't work. I think we can
> > probably do this, if we can document clearly how the user can make
> > their logical replication set up work after upgrade.
>
> Yeah, well, this comes back to my original point that the upgrade of
> publisher nodes and subscriber nodes should be treated as two
> different problems or we're mixing apples and oranges (and a node
> could have both subscriber and publishers).  While being able to
> support both is a must, it is going to be a two-step process at the
> end, with the subscribers done first and the publishers done after.
> That's also kind of the point that Julien makes in top message of this
> thread.
>
> I agree that docs are lacking in the proposed patch in terms of
> restrictions, assumptions and process flow, but taken in isolation the
> problem of the publishers is not something that this patch has to take
> care of.
>

I also don't think that this patch has to solve the problem of
publishers in any way but as per my understanding, if due to some
reason we are not able to do the upgrade of publishers, this can add
more steps for users than they have to do now for logical replication
set up after upgrade. This is because now after restoring the
subscription rel's and origin, as soon as we start replication after
creating the slots on the publisher, we will never be able to
guarantee data consistency. So, they need to drop the entire
subscription setup including truncating the relations, and then set it
up from scratch which also means they need to somehow remember or take
a dump of the current subscription setup. According to me, the key
point is to have a mechanism to set up slots correctly to allow
replication (or subscriptions) to work after the upgrade. Without
that, it appears to me that we are restoring a subscription where it
can start from some random LSN and can easily lead to data consistency
issues where it can miss some of the updates.

This is the primary reason why I prioritized to work on the publisher
side before getting this patch done, otherwise, the solution for this
patch was relatively clear. I am not sure but I guess this could be
the reason why originally we left it in the current state, otherwise,
restoring subscription rel's or origin doesn't seem to be too much of
an additional effort than what we are doing now.

--
With Regards,
Amit Kapila.



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Michael,

> I'd like to propose a check
> for IsBinaryUpgrade into ApplyLauncherRegister() instead as it makes
> no real sense to start apply workers in this context.  That would be
> equivalent to max_logical_replication_workers = 0.

Personally, I prefer to change max_logical_replication_workers. Mainly there are
two reasons:

1. Your approach must be back-patched to older versions which support the
   logical replication feature, but the oldest one (PG10) is already out of
   support.  We should not modify such a branch.
2. Also, "max_logical_replication_workers = 0" approach would be consistent
   with what we are doing now and for upgrade of publisher patch.
   Please see the previous discussion [1].

[1]: https://www.postgresql.org/message-id/CAA4eK1%2BWBphnmvMpjrxceymzuoMuyV2_pMGaJq-zNODiJqAa7Q%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Sep 25, 2023 at 05:35:18AM +0000, Hayato Kuroda (Fujitsu) wrote:
> Personally, I prefer to change max_logical_replication_workers. Mainly there are
> two reasons:
>
> 1. Your approach must be back-patched to older versions which support the
>    logical replication feature, but the oldest one (PG10) is already out of
>    support.  We should not modify such a branch.

This suggestion would be only for HEAD as it changes the behavior of -b.

> 2. Also, "max_logical_replication_workers = 0" approach would be consistent
>    with what we are doing now and for upgrade of publisher patch.
>    Please see the previous discussion [1].

Yeah, you're right.  Consistency would be good across the board, and
we'd need to take care of the old clusters as well, so the GUC
enforcement would be needed as well.  It does not strike me that this
extra IsBinaryUpgrade would hurt anyway?  Forcing the hand of the
backend has the merit of allowing the removal of the tweak with
max_logical_replication_workers at some point in the future.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Sep 25, 2023 at 10:05:41AM +0530, Amit Kapila wrote:
> I also don't think that this patch has to solve the problem of
> publishers in any way but as per my understanding, if due to some
> reason we are not able to do the upgrade of publishers, this can add
> more steps for users than they have to do now for logical replication
> set up after upgrade. This is because now after restoring the
> subscription rel's and origin, as soon as we start replication after
> creating the slots on the publisher, we will never be able to
> guarantee data consistency. So, they need to drop the entire
> subscription setup including truncating the relations, and then set it
> up from scratch which also means they need to somehow remember or take
> a dump of the current subscription setup. According to me, the key
> point is to have a mechanism to set up slots correctly to allow
> replication (or subscriptions) to work after the upgrade. Without
> that, it appears to me that we are restoring a subscription where it
> can start from some random LSN and can easily lead to data consistency
> issues where it can miss some of the updates.

Sure, that's assuming that the publisher side is upgraded.  FWIW, my
take is that there's room to move forward with this patch anyway in
favor of cases like rollover upgrades to the subscriber.

> This is the primary reason why I prioritized to work on the publisher
> side before getting this patch done, otherwise, the solution for this
> patch was relatively clear. I am not sure but I guess this could be
> the reason why originally we left it in the current state, otherwise,
> restoring subscription rel's or origin doesn't seem to be too much of
> an additional effort than what we are doing now.

By "additional effort", you are referring to what the patch is doing,
with the binary dump of pg_subscription_rel, right?
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Sep 25, 2023 at 10:05:41AM +0530, Amit Kapila wrote:
> > I also don't think that this patch has to solve the problem of
> > publishers in any way but as per my understanding, if due to some
> > reason we are not able to do the upgrade of publishers, this can add
> > more steps for users than they have to do now for logical replication
> > set up after upgrade. This is because now after restoring the
> > subscription rel's and origin, as soon as we start replication after
> > creating the slots on the publisher, we will never be able to
> > guarantee data consistency. So, they need to drop the entire
> > subscription setup including truncating the relations, and then set it
> > up from scratch which also means they need to somehow remember or take
> > a dump of the current subscription setup. According to me, the key
> > point is to have a mechanism to set up slots correctly to allow
> > replication (or subscriptions) to work after the upgrade. Without
> > that, it appears to me that we are restoring a subscription where it
> > can start from some random LSN and can easily lead to data consistency
> > issues where it can miss some of the updates.
>
> Sure, that's assuming that the publisher side is upgraded.
>

At some point, the user needs to upgrade the publisher, and the subscriber
could itself have some publications defined, which means the downstream
subscribers will have the same problem.

>  FWIW, my
> take is that there's room to move forward with this patch anyway in
> favor of cases like rollover upgrades to the subscriber.
>
> > This is the primary reason why I prioritized to work on the publisher
> > side before getting this patch done, otherwise, the solution for this
> > patch was relatively clear. I am not sure but I guess this could be
> > the reason why originally we left it in the current state, otherwise,
> > restoring subscription rel's or origin doesn't seem to be too much of
> > an additional effort than what we are doing now.
>
> By "additional effort", you are referring to what the patch is doing,
> with the binary dump of pg_subscription_rel, right?
>

Yes.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 20 Sept 2023 at 16:54, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The attached v8 version patch has the changes for the same.
> >
>
> Is the check to ensure remote_lsn is valid correct in function
> check_for_subscription_state()? How about the case where the apply
> worker didn't receive any change but just marked the relation as
> 'ready'?

I agree that remote_lsn will not be valid in the case when all the
tables are in ready state and there are no changes to be sent by the
walsender to the worker. I was not sure if this check is required in
this case in the check_for_subscription_state function. I was thinking
that this check could be removed.
I'm also looking into the check in the same function that requires the
tables to be in ready state, to see whether we can support upgrades when
the tables are in syncdone state. I will post my analysis once I have
finished checking this.

Regards,
Vignesh



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Michael,

> > 1. Your approach must be back-patched to older versions which support the
> >    logical replication feature, but the oldest one (PG10) is already out of
> >    support.  We should not modify such a branch.
>
> This suggestion would be only for HEAD as it changes the behavior of -b.
>
> > 2. Also, "max_logical_replication_workers = 0" approach would be consistent
> >    with what we are doing now and for upgrade of publisher patch.
> >    Please see the previous discussion [1].
>
> Yeah, you're right.  Consistency would be good across the board, and
> we'd need to take care of the old clusters as well, so the GUC
> enforcement would be needed as well.  It does not strike me that this
> extra IsBinaryUpgrade would hurt anyway?  Forcing the hand of the
> backend has the merit of allowing the removal of the tweak with
> max_logical_replication_workers at some point in the future.

Hmm, our initial motivation is to suppress registering the launcher, and adding
a GUC setting is sufficient for that. Indeed, registering a launcher may be
harmful, but that does not seem to be the goal of this thread (changing the -b
workflow in HEAD alone is not sufficient for the issue). I'm not sure it should
be included in the patch sets here.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Tue, Sep 26, 2023 at 09:40:48AM +0530, Amit Kapila wrote:
> On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier <michael@paquier.xyz> wrote:
>> Sure, that's assuming that the publisher side is upgraded.
>
> At some point, the user needs to upgrade the publisher, and the subscriber
> could itself have some publications defined, which means the downstream
> subscribers will have the same problem.

Not always.  I take it as a valid case that one may want to create a
logical setup only for the sake of an upgrade, and trash the publisher
after a failover to an upgraded subscriber node, once the latter has
synced up the data that was added to the relations tracked by the
publications while the subscriber was pg_upgrade'd.

>>> This is the primary reason why I prioritized to work on the publisher
>>> side before getting this patch done, otherwise, the solution for this
>>> patch was relatively clear. I am not sure but I guess this could be
>>> the reason why originally we left it in the current state, otherwise,
>>> restoring subscription rel's or origin doesn't seem to be too much of
>>> an additional effort than what we are doing now.
>>
>> By "additional effort", you are referring to what the patch is doing,
>> with the binary dump of pg_subscription_rel, right?
>>
>
> Yes.

Okay.  I'd like to move on with this stuff, then.  At least it helps
in maintaining data integrity when doing an upgrade with a logical
setup.  The patch still needs more polishing, though..
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 26 Sept 2023 at 10:58, vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 20 Sept 2023 at 16:54, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > The attached v8 version patch has the changes for the same.
> > >
> >
> > Is the check to ensure remote_lsn is valid correct in function
> > check_for_subscription_state()? How about the case where the apply
> > worker didn't receive any change but just marked the relation as
> > 'ready'?
>
> I agree that remote_lsn will not be valid in the case when all the
> tables are in ready state and there are no changes for the walsender
> to send to the worker. I was not sure if this check is required in
> that case in the check_for_subscription_state function; I was thinking
> that it could be removed.
> I'm also checking the other check in the same function, that the
> tables must be in ready state, to see whether we can support upgrades
> when the tables are in syncdone state. I will post my analysis once I
> have finished checking.

Once a table is in SUBREL_STATE_SYNCDONE state, the apply worker
checks whether some WAL records still need to be applied to reach the
LSN of the table. Once the required WAL is applied, the table state is
changed from SUBREL_STATE_SYNCDONE to SUBREL_STATE_READY. Since there
is a chance that the apply worker still has to apply some transactions
to get all the tables into READY state, I felt the minimum requirement
should be that all the tables are in READY state for the upgrade of
the subscriber.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Sep 27, 2023 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 26 Sept 2023 at 10:58, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 20 Sept 2023 at 16:54, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Sep 15, 2023 at 3:08 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > The attached v8 version patch has the changes for the same.
> > > >
> > >
> > > Is the check to ensure remote_lsn is valid correct in function
> > > check_for_subscription_state()? How about the case where the apply
> > > worker didn't receive any change but just marked the relation as
> > > 'ready'?
> >
> > I agree that remote_lsn will not be valid in the case when all the
> > tables are in ready state and there are no changes for the walsender
> > to send to the worker. I was not sure if this check is required in
> > that case in the check_for_subscription_state function; I was thinking
> > that it could be removed.
> > I'm also checking the other check in the same function, that the
> > tables must be in ready state, to see whether we can support upgrades
> > when the tables are in syncdone state. I will post my analysis once I
> > have finished checking.
>
> Once a table is in SUBREL_STATE_SYNCDONE state, the apply worker
> checks whether some WAL records still need to be applied to reach the
> LSN of the table. Once the required WAL is applied, the table state is
> changed from SUBREL_STATE_SYNCDONE to SUBREL_STATE_READY. Since there
> is a chance that the apply worker still has to apply some transactions
> to get all the tables into READY state, I felt the minimum requirement
> should be that all the tables are in READY state for the upgrade of
> the subscriber.
>

I don't think this theory is completely correct because the pending
WAL can be applied even after an upgrade.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 25 Sept 2023 at 10:05, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 22, 2023 at 4:36 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Thu, Sep 21, 2023 at 02:35:55PM +0530, Amit Kapila wrote:
> > > It is because after upgrade of both publisher and subscriber, the
> > > subscriptions won't work. Both publisher and subscriber should work,
> > > otherwise, the logical replication setup won't work. I think we can
> > > probably do this, if we can document clearly how the user can make
> > > their logical replication setup work after upgrade.
> >
> > Yeah, well, this comes back to my original point that the upgrade of
> > publisher nodes and subscriber nodes should be treated as two
> > different problems or we're mixing apples and oranges (and a node
> > could have both subscriber and publishers).  While being able to
> > support both is a must, it is going to be a two-step process at the
> > end, with the subscribers done first and the publishers done after.
> > That's also kind of the point that Julien makes in the top message of
> > this thread.
> >
> > I agree that docs are lacking in the proposed patch in terms of
> > restrictions, assumptions and process flow, but taken in isolation the
> > problem of the publishers is not something that this patch has to take
> > care of.
> >
>
> I also don't think that this patch has to solve the problem of
> publishers in any way, but as per my understanding, if for some
> reason we are not able to do the upgrade of publishers, this can add
> more steps for users than they have to do now for logical replication
> set up after upgrade. This is because now after restoring the
> subscription rel's and origin, as soon as we start replication after
> creating the slots on the publisher, we will never be able to
> guarantee data consistency. So, they need to drop the entire
> subscription setup including truncating the relations, and then set it
> up from scratch which also means they need to somehow remember or take
> a dump of the current subscription setup. According to me, the key
> point is to have a mechanism to set up slots correctly to allow
> replication (or subscriptions) to work after the upgrade. Without
> that, it appears to me that we are restoring a subscription where it
> can start from some random LSN and can easily lead to data consistency
> issues where it can miss some of the updates.
>
> This is the primary reason why I prioritized to work on the publisher
> side before getting this patch done, otherwise, the solution for this
> patch was relatively clear. I am not sure but I guess this could be
> the reason why originally we left it in the current state, otherwise,
> restoring subscription rel's or origin doesn't seem to be too much of
> an additional effort than what we are doing now.

I have tried to analyze the steps for upgrading the subscriber with
HEAD and with the upgrade patches. Here are the steps for each case:
Current steps to upgrade the subscriber in HEAD (steps 3-6 are
sketched in SQL below):
1) Upgrade the subscriber server
2) Start the subscriber server
3) Truncate the tables
4) Alter the subscriptions to point to new slots in the subscriber
5) Enable the subscriptions
6) Alter the subscriptions to refresh the publications
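
As a rough SQL sketch of steps 3-6 above (assuming a subscription
"sub1" on table "t1", with a new slot "sub1_new" created beforehand on
the publisher):

TRUNCATE t1;                                           -- step 3
ALTER SUBSCRIPTION sub1 SET (slot_name = 'sub1_new');  -- step 4
ALTER SUBSCRIPTION sub1 ENABLE;                        -- step 5
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;           -- step 6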

Steps to upgrade if we commit only the subscriber upgrade patch:
1) Upgrade the subscriber server
2) Start the subscriber server
3) Truncate the tables
Note: We will have to drop the subscriptions as we have made changes
to pg_subscription_rel.
4) But DROP SUBSCRIPTION will throw an error:
postgres=# DROP SUBSCRIPTION test1 cascade;
ERROR:  could not drop replication slot "test1" on publisher: ERROR:
replication slot "test1" does not exist
5) Alter the subscription to set slot_name to NONE (see the sketch
after this list)
6) Make a note of all the subscriptions that are present
7) Drop the subscriptions
8) Create the subscriptions

The number of steps will increase in this case.
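
For reference, step 5 corresponds to:

ALTER SUBSCRIPTION test1 SET (slot_name = NONE);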

Steps to upgrade if we commit the publisher upgrade patch first and
then the subscriber upgrade patch:
1) Upgrade the subscriber server
2) Start the subscriber server
3) Enable the subscriptions
4) Alter the subscriptions to refresh the publications

Based on the above, I also feel it is better to get the publisher
upgrade patch committed first, as a) it will reduce the data copying
time (as truncate is not required), b) the number of steps will be
reduced, and c) all the use cases will be handled.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Sep 27, 2023 at 07:31:41PM +0530, Amit Kapila wrote:
> On Wed, Sep 27, 2023 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote:
>> Once a table is in SUBREL_STATE_SYNCDONE state, the apply worker
>> checks whether some WAL records still need to be applied to reach the
>> LSN of the table. Once the required WAL is applied, the table state is
>> changed from SUBREL_STATE_SYNCDONE to SUBREL_STATE_READY. Since there
>> is a chance that the apply worker still has to apply some transactions
>> to get all the tables into READY state, I felt the minimum requirement
>> should be that all the tables are in READY state for the upgrade of
>> the subscriber.
>
> I don't think this theory is completely correct because the pending
> WAL can be applied even after an upgrade.

Yeah, agreed that putting a pre-check about the state of the relations
stored in pg_subscription_rel when handling the upgrade of a
subscriber is not necessary.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Sep 27, 2023 at 9:14 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Sep 26, 2023 at 09:40:48AM +0530, Amit Kapila wrote:
> > On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> Sure, that's assuming that the publisher side is upgraded.
> >
> > At some point, the user needs to upgrade the publisher, and the
> > subscriber could itself have some publications defined, which means
> > the downstream subscribers will have the same problem.
>
> Not always.  I take it as a valid case that one may want to create a
> logical setup only for the sake of an upgrade, and trash the
> publisher after a failover to an upgraded subscriber node, after the
> latter has done a sync-up of the data that's been added to the
> relations tracked by the publications while the subscriber was
> pg_upgrade'd.
>

Such a use case is possible to achieve even without this patch.
Sawada-San has already given an alternative that slightly tweaks the
steps mentioned by Julien to achieve it. Also, there are other ways to
achieve it by slightly changing the steps. OTOH, it will create a
problem for a normal logical replication setup after upgrade, as
discussed.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Fri, Sep 29, 2023 at 05:32:52PM +0530, Amit Kapila wrote:
> Such a use case is possible to achieve even without this patch.
> Sawada-San has already given an alternative that slightly tweaks the
> steps mentioned by Julien to achieve it. Also, there are other ways to
> achieve it by slightly changing the steps. OTOH, it will create a
> problem for a normal logical replication setup after upgrade, as
> discussed.

So, now that 29d0a77fa6 has been applied to the tree, would it be time
to brush up what's been discussed on this thread for subscribers?  I'm
OK to spend time on it.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 21 Sept 2023 at 11:27, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Sep 15, 2023 at 03:08:21PM +0530, vignesh C wrote:
> > On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> >> Is there a possibility that the apply worker on the old cluster connects to
> >> the publisher during the upgrade? Regarding the pg_upgrade on the publisher,
> >> we refuse TCP/IP connections from remotes and the port number is also changed,
> >> so we can assume that the subscriber does not connect to it. But IIUC such
> >> settings may not affect the connection source, so the apply worker may try to
> >> connect to the publisher. Also, are there any hazards if that happens?
> >
> > Yes, there is a possibility that the apply worker gets started and new
> > transaction data is being synced from the publisher. I have made a fix
> > not to start the launcher process in binary upgrade mode as we don't
> > want the launcher to start apply worker during upgrade.
>
> Hmm.  I was wondering if 0001 is the right way to handle this case,
> but at the end I'm OK to paint one extra isBinaryUpgrade in the code
> path where apply launchers are registered.  I don't think that the
> patch is complete, though.  A comment should be added in pg_upgrade's
> server.c, exactly start_postmaster(), to tell that -b also stops apply
> workers.  I am attaching a version updated as of the attached, that
> I'd be OK to apply.

I have added comments

> I don't really think that we need to worry about a subscriber
> connecting back to a publisher in this case, though?  I mean, each
> postmaster instance started by pg_upgrade restricts the access to the
> instance with unix_socket_directories set to a custom path and
> permissions at 0700, and a subscription's connection string does not
> know the unix path used by pg_upgrade.  I certainly agree that
> stopping these processes could lead to inconsistencies in the data the
> subscribers have been holding though, if we are not careful, so
> preventing them from running is a good practice anyway.

I have made the fix similar to how the publisher upgrade did it, to
keep it consistent.

> I have also reviewed 0002.  As a whole, I think that I'm OK with the
> main approach of the patch in pg_dump to use a new type of dumpable
> object for subscription relations that are dumped with their upgrade
> functions after.  This still needs more work, and more documentation.

Added documentation

> Also, perhaps we should really have an option to control if this part
> of the copy happens or not.  With a --no-subscription-relations for
> pg_dump at least?

Currently this is done by default in binary upgrade mode; I will add a
separate patch a little later to allow skipping the dump of
subscription relations in pg_upgrade and pg_dump.

>
> +{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',
>
> During a development cycle, any new function added needs to use an OID
> in range 8000-9999.  Running unused_oids will suggest new random OIDs.

Modified

> FWIW, I am not convinced that there is a need for two functions to add
> an entry to pg_subscription_rel, with sole difference between both the
> handling of a valid or invalid LSN.  We should have only one function
> that's able to handle NULL for the LSN.  So let's remove rel_state_a
> and rel_state_b, and have a single rel_state().  The description of
> the SQL functions is inconsistent with the other binary upgrade ones,
> I would suggest for the two functions
> "for use by pg_upgrade (relation for pg_subscription_rel)"
> "for use by pg_upgrade (remote_lsn for origin)"

Removed rel_state_a and rel_state_b and updated the description accordingly
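
With that, the dump can pass NULL for an invalid LSN; an invocation
would look roughly like this (a sketch using the function name from
the later patch versions, with illustrative values):

SELECT pg_catalog.binary_upgrade_create_sub_rel_state('sub1',
    'public.t1'::regclass, 'r', NULL);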

> +   i_srsublsn = PQfnumber(res, "srsublsn");
> [...]
> +       subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
>
> In getSubscriptionTables(), this should check for PQgetisnull()
> because we would have a NULL value for InvalidXLogRecPtr in the
> catalog.  Using a char* for srsublsn is OK, but just assign NULL to
> it, then just pass a hardcoded NULL value to the function as we do in
> other places.  So I don't quite get why this is not the same handling
> as suboriginremotelsn.

Modified

>
> getSubscriptionTables() is entirely skipped if we don't want any
> subscriptions, if we deal with a server of 9.6 or older or if we don't
> do binary upgrades, which is OK.
>
> +/*
> + * getSubscriptionTables
> + *       get information about subscription membership for dumpable tables.
> + */
> This comment is slightly misleading and should mention that this is an
> upgrade-only path?

Modified

>
> The code for dumpSubscriptionTable() is a copy-paste of
> dumpPublicationTable(), but a lot of what you are doing here is
> actually pointless if we are not in binary mode?  Why should this code
> path not be taken only under dataOnly?  I mean, this is a code path we
> should never take except if we are in binary mode.  This should have
> at least a cross-check to make sure that we never have a
> DO_SUBSCRIPTION_REL in this code path if we are in non-binary mode.

I have added an assert for this case, as it is not expected to reach
here in non-binary mode.

> +    if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
> +    {
> +        appendPQExpBufferStr(query,
> +                             "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
> +        appendStringLiteralAH(query, subinfo->dobj.name, fout);
> +        appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
> +    }
>
> Hmm..  Could it be actually useful even for debugging to still have
> this query if suboriginremotelsn is an InvalidXLogRecPtr?  I think
> that this should have a comment of the kind "\n-- For binary upgrade,
> blah".  At least it would not be a bad thing to enforce a correct
> state from the start, removing the NULL check for the second argument
> in binary_upgrade_replorigin_advance().

Modified
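
To be concrete, the dump would then emit the call along these lines
(illustrative values, per the snippet quoted above):

-- For binary upgrade, set up the subscription's replication origin.
SELECT pg_catalog.binary_upgrade_replorigin_advance('sub1', '0/12345678');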

> +    /* We need to check for pg_replication_origin_status only once. */
> Perhaps it would be better to explain why?

This remote_lsn code change is actually not required; I have removed it now.

>
> +                       "WHERE coalesce(remote_lsn, '0/0') = '0/0'"
> Why a COALESCE here?  Cannot this stuff just use NULL?

This remote_lsn code change is actually not required; I have removed it now.

> +    fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
> Could it be possible to include the schema of the relation in this log?

Modified

> +static void check_for_subscription_state(ClusterInfo *cluster);
> I'd be tempted to move that into a patch on its own, actually, for a
> cleaner history.

As of now I have kept it together; I will change it later based on
further feedback from others.

> +# Copyright (c) 2022-2023, PostgreSQL Global Development Group
> New as of 2023.

Modified

> +# Check that after upgradation of the subscriber server, the incremental
> +# changes added to the publisher are replicated.
> [..]
> +   For upgradation of the subscriptions, all the subscriptions on the old
> +   cluster must have a valid <varname>remote_lsn</varname>, and all the
>
> Upgradation?  I think that this should be reworded:
> "All the subscriptions of an old cluster require a valid remote_lsn
> during an upgrade."

This remote_lsn code change is actually not required; I have removed it now.

>
> A CI run is reporting the following compilation warnings:
> [04:21:15.290] pg_dump.c: In function ‘getSubscriptionTables’:
> [04:21:15.290] pg_dump.c:4655:29: error: ‘subinfo’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
> [04:21:15.290]  4655 |   subrinfo[cur_rel].subinfo = subinfo;

I have initialized it and checked with -Werror=maybe-uninitialized;
let me check in the next cfbot run.


> +ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
> +       "pg_upgrade_output.d/ not removed after pg_upgrade failure");
> Not sure that there's a need for this check.  Okay, that's cheap.

Modified

> And, err.  We are going to need an option to control if the slot data
> is copied, and a bit more documentation in pg_upgrade to explain how
> things happen when the copy happens.
Added documentation for this. We will copy the slot data by default;
I will add a separate patch a little later to allow skipping the dump
of subscription relations / replication slots in pg_upgrade and
pg_dump.

The attached v9 version patch has the changes for the same.

Apart from this, I'm still checking whether the old cluster's
subscription relations must be in READY state; there is a possibility
that SYNCDONE or FINISHEDCOPY could work too. This needs more thought
before concluding which is the correct state to check. Let's handle
this in the upcoming version.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Oct 27, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Apart from this, I'm still checking whether the old cluster's
> subscription relations must be in READY state; there is a possibility
> that SYNCDONE or FINISHEDCOPY could work too. This needs more thought
> before concluding which is the correct state to check. Let's handle
> this in the upcoming version.
>

I was analyzing this part and it seems it could be tricky to upgrade
in FINISHEDCOPY state, because the system would expect the subscriber
to know the old slot name from the old cluster, which it can drop at
SYNCDONE state. Now, as sync_slot_name is generated based on subid and
relid, which could be different in the new cluster, the generated
slot name would be different after the upgrade. OTOH, if the relstate
is INIT, then I think the sync could be performed even after the
upgrade.

Shouldn't we at least ensure that replication origins do exist in the
old cluster corresponding to each of the subscriptions? Otherwise,
later the query to get remote_lsn for origin in getSubscriptions()
would fail.
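
Something like the following (a sketch; a subscription's origin is
named "pg_<subid>") would report subscriptions whose origin is
missing:

SELECT s.subname
FROM pg_subscription s
WHERE NOT EXISTS (SELECT 1 FROM pg_replication_origin o
                  WHERE o.roname = 'pg_' || s.oid);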

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 27 Oct 2023 at 12:09, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 21 Sept 2023 at 11:27, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Fri, Sep 15, 2023 at 03:08:21PM +0530, vignesh C wrote:
> > > On Tue, 12 Sept 2023 at 14:25, Hayato Kuroda (Fujitsu)
> > > <kuroda.hayato@fujitsu.com> wrote:
> > >> Is there a possibility that the apply worker on the old cluster connects to
> > >> the publisher during the upgrade? Regarding the pg_upgrade on the publisher,
> > >> we refuse TCP/IP connections from remotes and the port number is also changed,
> > >> so we can assume that the subscriber does not connect to it. But IIUC such
> > >> settings may not affect the connection source, so the apply worker may try to
> > >> connect to the publisher. Also, are there any hazards if that happens?
> > >
> > > Yes, there is a possibility that the apply worker gets started and new
> > > transaction data is being synced from the publisher. I have made a fix
> > > not to start the launcher process in binary upgrade mode as we don't
> > > want the launcher to start apply worker during upgrade.
> >
> > Hmm.  I was wondering if 0001 is the right way to handle this case,
> > but at the end I'm OK to paint one extra isBinaryUpgrade in the code
> > path where apply launchers are registered.  I don't think that the
> > patch is complete, though.  A comment should be added in pg_upgrade's
> > server.c, exactly start_postmaster(), to tell that -b also stops apply
> > workers.  I am attaching a version updated as of the attached, that
> > I'd be OK to apply.
>
> I have added comments
>
> > I don't really think that we need to worry about a subscriber
> > connecting back to a publisher in this case, though?  I mean, each
> > postmaster instance started by pg_upgrade restricts the access to the
> > instance with unix_socket_directories set to a custom path and
> > permissions at 0700, and a subscription's connection string does not
> > know the unix path used by pg_upgrade.  I certainly agree that
> > stopping these processes could lead to inconsistencies in the data the
> > subscribers have been holding though, if we are not careful, so
> > preventing them from running is a good practice anyway.
>
> I have made the fix similar to how the publisher upgrade did it, to
> keep it consistent.
>
> > I have also reviewed 0002.  As a whole, I think that I'm OK with the
> > main approach of the patch in pg_dump to use a new type of dumpable
> > object for subscription relations that are dumped with their upgrade
> > functions after.  This still needs more work, and more documentation.
>
> Added documentation
>
> > Also, perhaps we should really have an option to control if this part
> > of the copy happens or not.  With a --no-subscription-relations for
> > pg_dump at least?
>
> Currently this is done by default in binary upgrade mode; I will add a
> separate patch a little later to allow skipping the dump of
> subscription relations in pg_upgrade and pg_dump.
>
> >
> > +{ oid => '4551', descr => 'add a relation with the specified relation state to pg_subscription_rel table',
> >
> > During a development cycle, any new function added needs to use an OID
> > in range 8000-9999.  Running unused_oids will suggest new random OIDs.
>
> Modified
>
> > FWIW, I am not convinced that there is a need for two functions to add
> > an entry to pg_subscription_rel, with sole difference between both the
> > handling of a valid or invalid LSN.  We should have only one function
> > that's able to handle NULL for the LSN.  So let's remove rel_state_a
> > and rel_state_b, and have a single rel_state().  The description of
> > the SQL functions is inconsistent with the other binary upgrade ones,
> > I would suggest for the two functions
> > "for use by pg_upgrade (relation for pg_subscription_rel)"
> > "for use by pg_upgrade (remote_lsn for origin)"
>
> Removed rel_state_a and rel_state_b and updated the description accordingly
>
> > +   i_srsublsn = PQfnumber(res, "srsublsn");
> > [...]
> > +       subrinfo[cur_rel].srsublsn = pg_strdup(PQgetvalue(res, i, i_srsublsn));
> >
> > In getSubscriptionTables(), this should check for PQgetisnull()
> > because we would have a NULL value for InvalidXLogRecPtr in the
> > catalog.  Using a char* for srsublsn is OK, but just assign NULL to
> > it, then just pass a hardcoded NULL value to the function as we do in
> > other places.  So I don't quite get why this is not the same handling
> > as suboriginremotelsn.
>
> Modified
>
> >
> > getSubscriptionTables() is entirely skipped if we don't want any
> > subscriptions, if we deal with a server of 9.6 or older or if we don't
> > do binary upgrades, which is OK.
> >
> > +/*
> > + * getSubscriptionTables
> > + *       get information about subscription membership for dumpable tables.
> > + */
> > This comment is slightly misleading and should mention that this is an
> > upgrade-only path?
>
> Modified
>
> >
> > The code for dumpSubscriptionTable() is a copy-paste of
> > dumpPublicationTable(), but a lot of what you are doing here is
> > actually pointless if we are not in binary mode?  Why should this code
> > path not be taken only under dataOnly?  I mean, this is a code path we
> > should never take except if we are in binary mode.  This should have
> > at least a cross-check to make sure that we never have a
> > DO_SUBSCRIPTION_REL in this code path if we are in non-binary mode.
>
> I have added an assert for this case, as it is not expected to reach
> here in non-binary mode.
>
> > +    if (dopt->binary_upgrade && subinfo->suboriginremotelsn)
> > +    {
> > +        appendPQExpBufferStr(query,
> > +                             "SELECT pg_catalog.binary_upgrade_replorigin_advance(");
> > +        appendStringLiteralAH(query, subinfo->dobj.name, fout);
> > +        appendPQExpBuffer(query, ", '%s');\n", subinfo->suboriginremotelsn);
> > +    }
> >
> > Hmm..  Could it be actually useful even for debugging to still have
> > this query if suboriginremotelsn is an InvalidXLogRecPtr?  I think
> > that this should have a comment of the kind "\n-- For binary upgrade,
> > blah".  At least it would not be a bad thing to enforce a correct
> > state from the start, removing the NULL check for the second argument
> > in binary_upgrade_replorigin_advance().
>
> Modified
>
> > +    /* We need to check for pg_replication_origin_status only once. */
> > Perhaps it would be better to explain why?
>
> This remote_lsn code change is actually not required; I have removed it now.
>
> >
> > +                       "WHERE coalesce(remote_lsn, '0/0') = '0/0'"
> > Why a COALESCE here?  Cannot this stuff just use NULL?
>
> This remote_lsn code change is actually not required; I have removed it now.
>
> > +    fprintf(script, "database:%s subscription:%s relation:%s in non-ready state\n",
> > Could it be possible to include the schema of the relation in this log?
>
> Modified
>
> > +static void check_for_subscription_state(ClusterInfo *cluster);
> > I'd be tempted to move that into a patch on its own, actually, for a
> > cleaner history.
>
> As of now I have kept it together; I will change it later based on
> further feedback from others.
>
> > +# Copyright (c) 2022-2023, PostgreSQL Global Development Group
> > New as of 2023.
>
> Modified
>
> > +# Check that after upgradation of the subscriber server, the incremental
> > +# changes added to the publisher are replicated.
> > [..]
> > +   For upgradation of the subscriptions, all the subscriptions on the old
> > +   cluster must have a valid <varname>remote_lsn</varname>, and all the
> >
> > Upgradation?  I think that this should be reworded:
> > "All the subscriptions of an old cluster require a valid remote_lsn
> > during an upgrade."
>
> This remote_lsn code change is actually not required; I have removed it now.
>
> >
> > A CI run is reporting the following compilation warnings:
> > [04:21:15.290] pg_dump.c: In function ‘getSubscriptionTables’:
> > [04:21:15.290] pg_dump.c:4655:29: error: ‘subinfo’ may be used
> > uninitialized in this function [-Werror=maybe-uninitialized]
> > [04:21:15.290]  4655 |   subrinfo[cur_rel].subinfo = subinfo;
>
> I have initialized it and checked with -Werror=maybe-uninitialized;
> let me check in the next cfbot run.
>
>
> > +ok(-d $new_sub->data_dir . "/pg_upgrade_output.d",
> > +       "pg_upgrade_output.d/ not removed after pg_upgrade failure");
> > Not sure that there's a need for this check.  Okay, that's cheap.
>
> Modified
>
> > And, err.  We are going to need an option to control if the slot data
> > is copied, and a bit more documentation in pg_upgrade to explain how
> > things happen when the copy happens.
> Added documentation for this. We will copy the slot data by default;
> I will add a separate patch a little later to allow skipping the dump
> of subscription relations / replication slots in pg_upgrade and
> pg_dump.
>
> The attached v9 version patch has the changes for the same.
>
> Apart from this, I'm still checking whether the old cluster's
> subscription relations must be in READY state; there is a possibility
> that SYNCDONE or FINISHEDCOPY could work too. This needs more thought
> before concluding which is the correct state to check. Let's handle
> this in the upcoming version.

The patch was not applying because of recent commits. Here is a
rebased version of the patches.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Fri, Oct 27, 2023 at 05:05:39PM +0530, Amit Kapila wrote:
> I was analyzing this part and it seems it could be tricky to upgrade
> in FINISHEDCOPY state, because the system would expect the subscriber
> to know the old slot name from the old cluster, which it can drop at
> SYNCDONE state. Now, as sync_slot_name is generated based on subid and
> relid, which could be different in the new cluster, the generated
> slot name would be different after the upgrade. OTOH, if the relstate
> is INIT, then I think the sync could be performed even after the
> upgrade.

TBH, I am really wondering if there is any need to go down to being
able to handle anything other than READY for the relation states in
pg_subscription_rel.  One reason is that it makes it much easier to
think about how to handle these in parallel with a node with
publications that also needs to go through an upgrade, because as READY
relations they don't require any tracking.  IMO, this makes it simpler
to think about cases where a node holds both subscriptions and
publications.

FWIW, my take is that it feels natural to do the upgrades of
subscriptions first, creating a similarity with the case of minor
updates with physical replication setups.

> Shouldn't we at least ensure that replication origins do exist in the
> old cluster corresponding to each of the subscriptions? Otherwise,
> later the query to get remote_lsn for origin in getSubscriptions()
> would fail.

You mean in the shape of a pre-upgrade check making sure that
pg_replication_origin_status has entries for all the subscriptions we
expect to see during the upgrade?  Makes sense to me.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Oct 30, 2023 at 03:05:09PM +0530, vignesh C wrote:
> The patch was not applying because of recent commits. Here is a
> rebased version of the patches.

+     * We don't want the launcher to run while upgrading because it may start
+     * apply workers which could start receiving changes from the publisher
+     * before the physical files are put in place, causing corruption on the
+     * new cluster upgrading to, so setting max_logical_replication_workers=0
+     * to disable launcher.
      */
     if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
-        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");

At least that's consistent with the other side of the coin with
publications.  So 0001 looks basically OK seen from here.

The indentation of 0002 seems off in a few places.

+    <para>
+     Verify that all the subscription tables in the old subscriber are in
+     <literal>r</literal> (ready) state. Setup the
+     <link linkend="logical-replication-config-subscriber"> subscriber
+     configurations</link> in the new subscriber.
[...]
+    <para>
+     There is a prerequisites that all the subscription tables should be in
+     <literal>r</literal> (ready) state for
+     <application>pg_upgrade</application> to be able to upgrade the
+     subscriber. If this is not met an error will be reported.
+    </para>

This part is repeated.  Globally, this documentation addition does not
seem really helpful for the end-user as it describes the checks that
are done during the upgrade.  Shouldn't this part of the docs,
similarly to the publication part, focus on providing a check list of
actions to take to achieve a clean upgrade, with a list of commands
and configurations required?  The good part is that information about
what's copied is provided (pg_subscription_rel and the origin status),
still this could be improved.

+    <para>
+     Enable the subscriptions by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
+    </para>

This is something users can act on, but how does this operation help
with the upgrade?  Should this happen for all the described
subscriptions?  Or do you mean that this is something that needs to be
run after the upgrade?

+    <para>
+     Create all the new tables that were created in the publication and
+     refresh the publication by executing
+     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+    </para>

What does "new tables" refer to in this case?  Are you referring to
the case where new relations have been added on a publication node
after an upgrade and need to be copied?  Does one need to DISABLE the
subscriptions on the subscriber node before running the upgrade, or is
a REFRESH enough?  The test only uses a REFRESH, so the docs and the
code don't entirely agree with each other.

+  <para>
+   For upgradation of the subscriptions, all the subscription tables should be
+   in <literal>r</literal> (ready) state, or else the
+   <application>pg_upgrade</application> run will error.
+  </para>

"Upgradation"?

+# Set tables to 'i' state
+$old_sub->safe_psql(
+    'postgres',
+    "UPDATE pg_subscription_rel
+        SET srsubstate = 'i' WHERE srsubstate = 'r'");

I am not sure that doing catalog manipulation in the TAP test itself
is a good idea, because this can finish by being unpredictible in the
long-term for the test maintenance.  I think that this portion of the
test should just be removed.  poll_query_until() or wait queries
making sure that all the relations are in the state we want them to be
before the beginning of the upgrade is enough in terms of test
coverage, IMO.
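
For instance, a wait query along these lines would be enough as the
poll condition (a sketch):

SELECT count(*) = 0 FROM pg_subscription_rel WHERE srsubstate <> 'r';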

+$result = $new_sub->safe_psql('postgres',
+    "SELECT remote_lsn FROM pg_replication_origin_status");

This assumes one row, but perhaps this had better do a match based on
external_id and/or local_id?
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 27 Oct 2023 at 17:05, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 27, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Apart from this, I'm still checking whether the old cluster's
> > subscription relations must be in READY state; there is a possibility
> > that SYNCDONE or FINISHEDCOPY could work too. This needs more thought
> > before concluding which is the correct state to check. Let's handle
> > this in the upcoming version.
> >
>
> I was analyzing this part and it seems it could be tricky to upgrade
> in FINISHEDCOPY state, because the system would expect the subscriber
> to know the old slot name from the old cluster, which it can drop at
> SYNCDONE state. Now, as sync_slot_name is generated based on subid and
> relid, which could be different in the new cluster, the generated
> slot name would be different after the upgrade. OTOH, if the relstate
> is INIT, then I think the sync could be performed even after the
> upgrade.

I had analyzed all the subscription relation states further; here is
my analysis:
The following states are ok, as either the replication slot is not yet
created or the replication slot is already dropped and the required
WAL files will be present in the publisher:
a) SUBREL_STATE_SYNCDONE b) SUBREL_STATE_READY c) SUBREL_STATE_INIT
The following states are not ok, as the worker has a dependency on the
replication slot/origin in these cases:
a) SUBREL_STATE_DATASYNC: The tablesync worker will try to drop the
replication slot, but since the replication slots were created with
the old subscription id on the publisher, the upgraded subscriber
will not be able to clean them up.
b) SUBREL_STATE_FINISHEDCOPY: The tablesync worker expects the origin
to already exist, but since the origin was created with the old
subscription id, the tablesync worker will not be able to find it.
c) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
SUBREL_STATE_UNKNOWN: These states are not stored in the catalog, so
we need not allow them.
I modified it to support the relation states accordingly.
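
A pre-upgrade check along these lines (a sketch) can report any
problematic entries:

SELECT s.subname, sr.srrelid::regclass AS relation, sr.srsubstate
FROM pg_subscription_rel sr
    JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE sr.srsubstate NOT IN ('i', 's', 'r');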

> Shouldn't we at least ensure that replication origins do exist in the
> old cluster corresponding to each of the subscriptions? Otherwise,
> later the query to get remote_lsn for origin in getSubscriptions()
> would fail.
Added a check for the same.

The attached v10 version patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v10-0001

======
Commit message

1.
The chance of being able to do so should be small as pg_upgrade uses its
own port and unix domain directory (customizable as well with
--socketdir), but just preventing the launcher to start is safer at the
end, because we are then sure that no changes would ever be applied.

~

"safer at the end" (??)

======
src/bin/pg_upgrade/server.c

2.
+ * We don't want the launcher to run while upgrading because it may start
+ * apply workers which could start receiving changes from the publisher
+ * before the physical files are put in place, causing corruption on the
+ * new cluster upgrading to, so setting max_logical_replication_workers=0
+ * to disable launcher.
  */
  if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
- appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");

2a.
The comment is one big long sentence. IMO it will be better to break it up.

~

2b.
Add a blank line between this comment note and the previous one.

~~~

2c.
In a recent similar thread [1], they chose to implement a guc_hook to
prevent a user from overriding this via the command line option during
the upgrade. Shouldn't this patch do the same thing, for consistency?

~~~

2d.
If you do implement such a guc_hook (per #2c above), then should the
patch also include a test case for getting an ERROR if the user tries
to override that GUC?

======
[1] https://www.postgresql.org/message-id/20231027.115759.2206827438943188717.horikyota.ntt%40gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Thu, Nov 02, 2023 at 04:35:26PM +1100, Peter Smith wrote:
> The chance of being able to do so should be small as pg_upgrade uses its
> own port and unix domain directory (customizable as well with
> --socketdir), but just preventing the launcher to start is safer at the
> end, because we are then sure that no changes would ever be applied.
> ~
> "safer at the end" (??)

Well, just safer.

> 2a.
> The comment is one big long sentence. IMO it will be better to break it up.
> 2b.
> Add a blank line between this comment note and the previous one.

Yes, I found that equally confusing when looking at this patch, so
I've edited the patch this way when I was looking at it today.  This
is enough to do the job, so I have applied it for now, before moving
on with the second one of this thread.

> 2c.
> In a recent similar thread [1], they chose to implement a guc_hook to
> prevent a user from overriding this via the command line option during
> the upgrade. Shouldn't this patch do the same thing, for consistency?
> 2d.
> If you do implement such a guc_hook (per #2c above), then should the
> patch also include a test case for getting an ERROR if the user tries
> to override that GUC?

Yeah, that may be something to do, but I am not sure that it is worth
complicating the backend code for the remote case where one enforces
an option while we are already setting a GUC in the upgrade path:
https://www.postgresql.org/message-id/CAA4eK1Lh9J5VLypSQugkdD+H=_5-6p3rOocjo7JbTogcxA2hxg@mail.gmail.com

That feels like a lot of extra facility for cases that should never
happen.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Nov 1, 2023 at 8:33 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Oct 27, 2023 at 05:05:39PM +0530, Amit Kapila wrote:
> > I was analyzing this part and it seems it could be tricky to upgrade
> > in FINISHEDCOPY state. Because the system would expect that subscriber
> > would know the old slotname from oldcluster which it can drop at
> > SYNCDONE state. Now, as sync_slot_name is generated based on subid,
> > relid which could be different in the new cluster, the generated
> > slotname would be different after the upgrade. OTOH, if the relstate
> > is INIT, then I think the sync could be performed even after the
> > upgrade.
>
> TBH, I am really wondering if there is any need to go down to being
> able to handle anything else than READY for the relation states in
> pg_subscription_rel.  One reason is that it makes it much easier to
> think about how to handle these in parallel of a node with
> publications that also need to go through an upgrade, because as READY
> relations they don't require any tracking.  IMO, this makes it simpler
> to think about cases where a node holds both subscriptions and
> publications.
>

But that poses needless restrictions on users. For example, there
appears to be no harm in upgrading even when the relation is in
SUBREL_STATE_INIT state. Users should be able to continue replication
after the upgrade.

> FWIW, my take is that it feels natural to do the upgrades of
> subscriptions first, creating a similarity with the case of minor
> updates with physical replication setups.
>
> > Shouldn't we at least ensure that replication origins do exist in the
> > old cluster corresponding to each of the subscriptions? Otherwise,
> > later the query to get remote_lsn for origin in getSubscriptions()
> > would fail.
>
> You mean in the shape of a pre-upgrade check making sure that
> pg_replication_origin_status has entries for all the subscriptions we
> expect to see during the upgrade?
>

Yes.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 1 Nov 2023 at 10:13, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Oct 30, 2023 at 03:05:09PM +0530, vignesh C wrote:
> > The patch was not applying because of recent commits. Here is a
> > rebased version of the patches.
>
> +     * We don't want the launcher to run while upgrading because it may start
> +     * apply workers which could start receiving changes from the publisher
> +     * before the physical files are put in place, causing corruption on the
> +     * new cluster upgrading to, so setting max_logical_replication_workers=0
> +     * to disable launcher.
>       */
>      if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
> -        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
> +        appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1 -c max_logical_replication_workers=0");
>
> At least that's consistent with the other side of the coin with
> publications.  So 0001 looks basically OK seen from here.
>
> The indentation of 0002 seems off in a few places.

I fixed the indentation wherever possible in the documentation and
also ran pgindent and pgperltidy.

> +    <para>
> +     Verify that all the subscription tables in the old subscriber are in
> +     <literal>r</literal> (ready) state. Setup the
> +     <link linkend="logical-replication-config-subscriber"> subscriber
> +     configurations</link> in the new subscriber.
> [...]
> +    <para>
> +     There is a prerequisites that all the subscription tables should be in
> +     <literal>r</literal> (ready) state for
> +     <application>pg_upgrade</application> to be able to upgrade the
> +     subscriber. If this is not met an error will be reported.
> +    </para>
>
> This part is repeated.

Removed the duplicate contents.

> Globally, this documentation addition does not
> seem really helpful for the end-user as it describes the checks that
> are done during the upgrade.  Shouldn't this part of the docs,
> similarly to the publication part, focus on providing a check list of
> actions to take to achieve a clean upgrade, with a list of commands
> and configurations required?  The good part is that information about
> what's copied is provided (pg_subscription_rel and the origin status),
> still this could be improved.

I have slightly modified it now and also made it consistent with the
replication slot upgrade, but I was not sure if we need to add
anything more. Let me know if anything else needs to be added. I will
add it.

> +    <para>
> +     Enable the subscriptions by executing
> +     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... ENABLE</command></link>.
> +    </para>
>
> This is something users can act on, but how does this operation help
> with the upgrade?  Should this happen for all the described
> subscriptions?  Or do you mean that this is something that needs to be
> run after the upgrade?

The subscriptions will be upgraded in disabled mode. Users must enable
the subscriptions after the upgrade is completed. I have mentioned the
same to avoid confusion.

> +    <para>
> +     Create all the new tables that were created in the publication and
> +     refresh the publication by executing
> +     <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
> +    </para>
>
> What does "new tables" refer to in this case?  Are you referring to
> the case where new relations have been added on a publication node
> after an upgrade and need to be copied?  Does one need to DISABLE the
> subscriptions on the subscriber node before running the upgrade, or is
> a REFRESH enough?  The test only uses a REFRESH, so the docs and the
> code don't entirely agree with each other.

Yes, "new tables" refers to the new tables created in the publisher
when the upgrade is in progress. No need to disable the subscription
before upgrade, during upgrade the subscriptions will be copied in
disabled mode, they should be enabled after the upgrade. Mentioned all
these accordingly.
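
That is, after the upgrade, for each subscription (for example
"sub1"):

ALTER SUBSCRIPTION sub1 ENABLE;
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;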

> +  <para>
> +   For upgradation of the subscriptions, all the subscription tables should be
> +   in <literal>r</literal> (ready) state, or else the
> +   <application>pg_upgrade</application> run will error.
> +  </para>
>
> "Upgradation"?

I have removed this content since we have added this in the
prerequisite section now.

> +# Set tables to 'i' state
> +$old_sub->safe_psql(
> +       'postgres',
> +       "UPDATE pg_subscription_rel
> +               SET srsubstate = 'i' WHERE srsubstate = 'r'");
>
> I am not sure that doing catalog manipulation in the TAP test itself
> is a good idea, because this can finish by being unpredictible in the
> long-term for the test maintenance.  I think that this portion of the
> test should just be removed.  poll_query_until() or wait queries
> making sure that all the relations are in the state we want them to be
> before the beginning of the upgrade is enough in terms of test
> coverage, IMO.

Changed the scenario to use a primary key failure instead.

> +$result = $new_sub->safe_psql('postgres',
> +       "SELECT remote_lsn FROM pg_replication_origin_status");
>
> This assumes one row, but perhaps this had better do a match based on
> external_id and/or local_id?

Modified

The attached v11 version patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Nov 2, 2023 at 3:41 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have slightly modified it now and also made it consistent with the
> replication slot upgrade, but I was not sure if we need to add
> anything more. Let me know if anything else needs to be added. I will
> add it.
>

I think it is important for users to know how to upgrade their
multi-node setup. Say a two-node setup where replication is working
both ways (aka each node has both publications and subscriptions);
similarly, how does one upgrade if there are multiple nodes involved?

One more thing I was thinking about with this patch: here, unlike
the publication's slot information, we can't ensure with the origin's
remote_lsn that all the WAL has been received and applied before
allowing the upgrade. I can't think of any problem due to this at the
moment, but it is still a point worth giving some thought.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 2 Nov 2023 at 11:05, Peter Smith <smithpb2250@gmail.com> wrote:
>
> ~~~
>
> 2c.
> In a recent similar thread [1], they chose to implement a guc_hook to
> prevent a user from overriding this via the command line option during
> the upgrade. Shouldn't this patch do the same thing, for consistency?

Added GUC hook for consistency.

> ~~~
>
> 2d.
> If you do implement such a guc_hook (per #2c above), then should the
> patch also include a test case for getting an ERROR if the user tries
> to override that GUC?

Added a test for the same.

We can use this patch if we are planning to go ahead with guc_hooks
for max_slot_wal_keep_size as discussed at [1].
The attached patch has the changes for the same.

[1] -
https://www.postgresql.org/message-id/CAHut%2BPsTrB%3DmjBA-Y-%2BW4kK63tao9%3DXBsMXG9rkw4g_m9WatwA%40mail.gmail.com


Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Thu, Nov 02, 2023 at 05:00:55PM +0530, Amit Kapila wrote:
> I think it is important for users to know how to upgrade their
> multi-node setup. Say a two-node setup where replication is working
> both ways (aka each node has both publications and subscriptions);
> similarly, how does one upgrade if there are multiple nodes involved?

+1.  My next remarks also apply to the thread where publishers are
handled in upgrades, but I'd like to think that at the end of the
release cycle it would be nice to have the basic features in, with
also a set of regression tests for logical upgrade scenarios that we'd
expect to work.  Two "basic" ones coming into mind:
- Cascading logical setup, with one node in the middle having both
publisher(s) and subscriber(s).
- Two-way replication, with two nodes.

> One more thing I was thinking about with this patch: here, unlike
> the publication's slot information, we can't ensure with the origin's
> remote_lsn that all the WAL has been received and applied before
> allowing the upgrade. I can't think of any problem due to this at the
> moment, but it is still a point worth giving some thought.

Yeah, that may be an itchy point, which is also related to my concerns
about trying to allow more sync states than READY when beginning the
upgrade: READY is at least a point where we are sure that a relation
was up to date.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v11-0001

======
Commit message

1.
The subscription's replication origin are needed to ensure
that we don't replicate anything twice.

~

/are needed/is needed/

~~~

2.
Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud

~

Include Vignesh as another author.

======
doc/src/sgml/ref/pgupgrade.sgml

3.
+     <application>pg_upgrade</application> attempts to migrate subscription
+     dependencies which includes the subscription tables information present in
+     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
+     system table and the subscription replication origin which
+     will help in continuing logical replication from where the old subscriber
+     was replicating. This helps in avoiding the need for setting up the

I became a bit lost reading this paragraph due to the multiple 'which'...

SUGGESTION
pg_upgrade attempts to migrate subscription dependencies which
includes the subscription table information present in
pg_subscription_rel system
catalog and also the subscription replication origin. This allows
logical replication on the new subscriber to continue from where the
old subscriber was up to.

~~~

4.
+     was replicating. This helps in avoiding the need for setting up the
+     subscription objects manually which requires truncating all the
+     subscription tables and setting the logical replication slots. Migration

SUGGESTION
Having the ability to migrate subscription objects avoids the need to
set them up manually, which would require truncating all the
subscription tables and setting the logical replication slots.

~

TBH, I am wondering what is the purpose of this sentence. It seems
more like a justification for the patch, but does the user need to
know all this?

~~~

5.
+      <para>
+       All the subscription tables in the old subscriber should be in
+       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
+       <literal>s</literal> (synchronized). This can be verified by checking
+       <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
+      </para>

/should be in/should be in state/

~~~

6.
+      <para>
+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.
+      </para>

missing words?

/This can be checking/This can be found by checking/
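
For example, a sketch of such a check, assuming the internal
'pg_' || <subscription oid> naming of subscription origins used
elsewhere in this thread:

SELECT s.subname
FROM pg_subscription s
LEFT JOIN pg_replication_origin o ON o.roname = 'pg_' || s.oid
WHERE o.roident IS NULL;

Any row returned would be a subscription without a replication origin.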

~~~

7.
+    <para>
+     The subscriptions will be migrated to new cluster in disabled state, they
+     can be enabled after upgrade by following the steps:
+    </para>

The first bullet also says "Enable the subscription..." so I think
this paragraph should be worded like the below.

SUGGESTION
The subscriptions will be migrated to the new cluster in a disabled
state. After migration, do this:

======
src/backend/catalog/pg_subscription.c

8.
 #include "nodes/makefuncs.h"
+#include "replication/origin.h"
+#include "replication/worker_internal.h"
 #include "storage/lmgr.h"

Why does this change need to be in the patch when there are no other
code changes in this file?

======
src/backend/utils/adt/pg_upgrade_support.c

9. binary_upgrade_create_sub_rel_state

IMO a better name for this function would be
'binary_upgrade_add_sub_rel_state' (because it delegates to
AddSubscriptionRelState).

Then it would obey the same name pattern as the other function
'binary_upgrade_replorigin_advance' (which delegates to
replorigin_advance).

~~~

10.
+/*
+ * binary_upgrade_create_sub_rel_state
+ *
+ * Add the relation with the specified relation state to pg_subscription_rel
+ * table.
+ */
+Datum
+binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;

10a.
/to pg_subscription_rel table./to pg_subscription_rel catalog./

~

10b.
Maybe it would be helpful if the function arguments were documented
up-front in the function-comment, or in the variable declarations.

SUGGESTION
char      *subname;  /* ARG0 = subscription name */
Oid        relid;    /* ARG1 = relation Oid */
char       relstate; /* ARG2 = subrel state */
XLogRecPtr sublsn;   /* ARG3 (optional) = subscription lsn */

~~~

11.
if (PG_ARGISNULL(3))
sublsn = InvalidXLogRecPtr;
else
sublsn = PG_GETARG_LSN(3);
FWIW, I'd write that as a one-line ternary assignment allowing all the
args to be grouped nicely together.

SUGGESTION
sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);

~~~

12. binary_upgrade_replorigin_advance

/*
 * binary_upgrade_replorigin_advance
 *
 * Update the remote_lsn for the subscriber's replication origin.
 */
Datum
binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
{
Relation rel;
HeapTuple tup;
Oid subid;
Form_pg_subscription form;
char    *subname;
XLogRecPtr sublsn;
char originname[NAMEDATALEN];
RepOriginId originid;
~

Similar to previous comment #10b. Maybe it would be helpful if the
function arguments were documented up-front in the function-comment, or
in the variable declarations.

SUGGESTION
char         originname[NAMEDATALEN];
RepOriginId  originid;
char        *subname; /* ARG0 = subscription name */
XLogRecPtr   sublsn;  /* ARG1 = subscription lsn */

~~~

13.
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+ if (PG_ARGISNULL(1))
+ sublsn = InvalidXLogRecPtr;
+ else
+ sublsn = PG_GETARG_LSN(1);

Similar to previous comment #11. FWIW, I'd write that as a one-line
ternary assignment allowing all the args to be grouped nicely
together.

SUGGESTION
subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);

======
src/bin/pg_dump/pg_dump.c

14. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   get information about subscription membership for dumpable tables, this
+ *    will be used only in binary-upgrade mode.
+ */

Should use multiple sentences.

SUGGESTION
Get information about subscription membership for dumpable tables.
This will be used only in binary-upgrade mode.

~~~

15.
+ /* Get subscription relation fields */
+ i_srsubid = PQfnumber(res, "srsubid");
+ i_srrelid = PQfnumber(res, "srrelid");
+ i_srsubstate = PQfnumber(res, "srsubstate");
+ i_srsublsn = PQfnumber(res, "srsublsn");

Might it be better to say "Get pg_subscription_rel attributes"?

~~~

16. getSubscriptions

+ appendPQExpBufferStr(query, "o.remote_lsn\n");
  appendPQExpBufferStr(query,
  "FROM pg_subscription s\n"
+ "LEFT JOIN pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n"
  "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
  "                   WHERE datname = current_database())");

~

16a.
Should that "remote_lsn" have an alias like "suboriginremotelsn" so
that it matches the later field assignment better?

~

16b.
Probably these catalogs should be qualified using "pg_catalog.".

~~~

17. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   dump the definition of the given subscription table mapping, this will be
+ *    used only for upgrade operation.
+ */

Make this comment consistent with the other one for getSubscriptionTables:
- split into multiple sentences
- use the same terminology "binary-upgrade mode" versus "upgrade operation".

~~~

18.
+ /*
+ * binary_upgrade_create_sub_rel_state will add the subscription
+ * relation to pg_subscripion_rel table, this is supported only for
+ * upgrade operation.
+ */

Split into multiple sentences.

======
src/bin/pg_dump/pg_dump_sort.c

19.
+ case DO_SUBSCRIPTION_REL:
+ snprintf(buf, bufsize,
+ "SUBSCRIPTION TABLE (ID %d)",
+ obj->dumpId);
+ return;

Should it include the OID (like for DO_PUBLICATION_TABLE)?

======
src/bin/pg_upgrade/check.c

20.
  check_for_reg_data_type_usage(&old_cluster);
  check_for_isn_and_int8_passing_mismatch(&old_cluster);

+ check_for_subscription_state(&old_cluster);
+

There seems no reason anymore for this check to be separated from all
the other checks. Just remove the blank line.

~~~

21. check_for_subscription_state

+/*
+ * check_for_subscription_state()
+ *
+ * Verify that each of the subscriptions have all their corresponding tables in
+ * ready state.
+ */
+static void
+check_for_subscription_state(ClusterInfo *cluster)

/have/has/

This comment only refers to 'ready' state, but perhaps it is
misleading (or not entirely correct) because later the SQL is testing
for more than just the READY state:

+ "WHERE srsubstate NOT IN ('i', 's', 'r') "

~~~

22.
+ res = executeQueryOrDie(conn,
+ "SELECT s.subname, c.relname, n.nspname "
+ "FROM pg_catalog.pg_subscription_rel r "
+ "LEFT JOIN pg_catalog.pg_subscription s"
+ " ON r.srsubid = s.oid "
+ "LEFT JOIN pg_catalog.pg_class c"
+ " ON r.srrelid = c.oid "
+ "LEFT JOIN pg_catalog.pg_namespace n"
+ " ON c.relnamespace = n.oid "
+ "WHERE srsubstate NOT IN ('i', 's', 'r') "
+ "ORDER BY s.subname");

If you are going to check 'i', 's', and 'r' then I thought this
statement should maybe have some comment about why those states.

~~~

23.
+ pg_fatal("Your installation contains subscription(s) with\n"
+ "Subscription not having origin and/or subscription relation(s) not
in ready state.\n"
+ "A list of subscription not having origin and/or\n"
+ "subscription relation(s) not in ready state is in the file: %s",
+ output_path);

23a.
This message seems to just be saying the same thing two times.

It should also use newlines and spaces more like the other similar
pg_fatals in this file (e.g. the %s is on the next line, etc.).

SUGGESTION
Your installation contains subscriptions without origin or having
relations not in a ready state.\n
A list of the problem subscriptions is in the file:\n
    %s

~

23b.
Same question about 'not in ready state'. Is that entirely correct?

======
src/bin/pg_upgrade/t/004_subscription.pl

24.
+sub insert_line
+{
+ my $payload = shift;
+
+ foreach ("t1", "t2")
+ {
+ $publisher->safe_psql('postgres',
+ "INSERT INTO " . $_ . " (val) VALUES('$payload')");
+ }
+}

For clarity, maybe call this function 'insert_line_at_pub'

~~~

25.
+# ------------------------------------------------------
+# Check that pg_upgrade is succesful when all tables are in ready state.
+# ------------------------------------------------------

/succesful/successful/

~~~

26.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+ '-D',         $new_sub->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub->host,
+ '-p',         $old_sub->port,     '-P', $new_sub->port,
+ $mode,        '--check',
+ ],
+ 'run of pg_upgrade --check for old instance with invalid remote_lsn');

This is the command for the "success" case. Why is the message part
referring to "invalid remote_lsn"?

~~~

27.
+$publisher->safe_psql('postgres',
+ "CREATE TABLE tab_primary_key(id serial, val text);");
+$old_sub->safe_psql('postgres',
+ "CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
+$publisher->safe_psql('postgres',


Maybe it is not necessary, but won't it be better if the publisher
table also has a primary key (so DDL matches its table name)?

~~~

28.
+# Add a row in subscriber so that the table sync will fail.
+$old_sub->safe_psql('postgres',
+ "INSERT INTO tab_primary_key values(1, 'before initial sync')");

The comment should be slightly more descriptive by saying the reason
it will fail is that you deliberately inserted the same PK value
again.

~~~

29.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die "Timed out while waiting for subscriber to synchronize data";

Since this cannot synchronize the table data, maybe the message should
be more like "Timed out while waiting for the table state to become
'd' (datasync)"


~~~

30.
+command_fails(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+ '-D',         $new_sub->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub->host,
+ '-p',         $old_sub->port,     '-P', $new_sub->port,
+ $mode,        '--check',
+ ],
+ 'run of pg_upgrade --check for old instance with incorrect sub rel');

/with incorrect sub rel/with incorrect sub rel state/ (??)

~~~

31.
+# ------------------------------------------------------
+# Check that pg_upgrade doesn't detect any problem once all the subscription's
+# relation are in 'r' (ready) state.
+# ------------------------------------------------------


31a.
/relation/relations/

~

31b.
Do you think that comment is correct? All you are doing here is
allowing the old_sub to proceed because there is no longer any
conflict -- but isn't that just normal pub/sub behaviour that has
nothing to do with pg_upgrade?

~~~

32.
+# Stop the old subscriber, insert a row in each table while it's down and add
+# t2 to the publication

/in each table/in each publisher table/

Also, it is not each table -- it's only t1 and t2; not tab_primary_key.

~~~

33.
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2), "There should be 2 rows in pg_subscription_rel");

/2 rows in pg_subscription_rel/2 rows in pg_subscription_rel
(representing t1 and tab_primary_key)/

======

34. binary_upgrade_create_sub_rel_state

+{ oid => '8404', descr => 'for use by pg_upgrade (relation for
pg_subscription_rel)',
+  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_create_sub_rel_state' },

As mentioned in a previous review comment #9, I felt this function
should have a different name: binary_upgrade_add_sub_rel_state.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 6 Nov 2023 at 07:51, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v11-0001
>
> ======
> Commit message
>
> 1.
> The subscription's replication origin are needed to ensure
> that we don't replicate anything twice.
>
> ~
>
> /are needed/is needed/

Modified

>
> 2.
> Author: Julien Rouhaud
> Reviewed-by: FIXME
> Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
>
> ~
>
> Include Vignesh as another author.

Modified

> ======
> doc/src/sgml/ref/pgupgrade.sgml
>
> 3.
> +     <application>pg_upgrade</application> attempts to migrate subscription
> +     dependencies which includes the subscription tables information present in
> +     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
> +     system table and the subscription replication origin which
> +     will help in continuing logical replication from where the old subscriber
> +     was replicating. This helps in avoiding the need for setting up the
>
> I became a bit lost reading this paragraph due to the multiple 'which'...
>
> SUGGESTION
> pg_upgrade attempts to migrate subscription dependencies which
> includes the subscription table information present in
> pg_subscription_rel system
> catalog and also the subscription replication origin. This allows
> logical replication on the new subscriber to continue from where the
> old subscriber was up to.

Modified

> ~~~
>
> 4.
> +     was replicating. This helps in avoiding the need for setting up the
> +     subscription objects manually which requires truncating all the
> +     subscription tables and setting the logical replication slots. Migration
>
> SUGGESTION
> Having the ability to migrate subscription objects avoids the need to
> set them up manually, which would require truncating all the
> subscription tables and setting the logical replication slots.

I have removed this

> ~
>
> TBH, I am wondering what is the purpose of this sentence. It seems
> more like a justification for the patch, but does the user need to
> know all this?
>
> ~~~
>
> 5.
> +      <para>
> +       All the subscription tables in the old subscriber should be in
> +       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
> +       <literal>s</literal> (synchronized). This can be verified by checking
> +       <link
linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
> +      </para>
>
> /should be in/should be in state/

Modified

> ~~~
>
> 6.
> +      <para>
> +       The replication origin entry corresponding to each of the subscriptions
> +       should exist in the old cluster. This can be checking
> +       <link linkend="catalog-pg-subscription">pg_subscription</link> and
> +       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
> +       system tables.
> +      </para>
>
> missing words?
>
> /This can be checking/This can be found by checking/

Modified

> ~~~
>
> 7.
> +    <para>
> +     The subscriptions will be migrated to new cluster in disabled state, they
> +     can be enabled after upgrade by following the steps:
> +    </para>
>
> The first bullet also says "Enable the subscription..." so I think
> this paragraph should be worded like the below.
>
> SUGGESTION
> The subscriptions will be migrated to the new cluster in a disabled
> state. After migration, do this:

Modified

> ======
> src/backend/catalog/pg_subscription.c
>
> 8.
>  #include "nodes/makefuncs.h"
> +#include "replication/origin.h"
> +#include "replication/worker_internal.h"
>  #include "storage/lmgr.h"
>
> Why does this change need to be in the patch when there are no other
> code changes in this file?

Modified

> ======
> src/backend/utils/adt/pg_upgrade_support.c
>
> 9. binary_upgrade_create_sub_rel_state
>
> IMO a better name for this function would be
> 'binary_upgrade_add_sub_rel_state' (because it delegates to
> AddSubscriptionRelState).
>
> Then it would obey the same name pattern as the other function
> 'binary_upgrade_replorigin_advance' (which delegates to
> replorigin_advance).

Modified

> ~~~
>
> 10.
> +/*
> + * binary_upgrade_create_sub_rel_state
> + *
> + * Add the relation with the specified relation state to pg_subscription_rel
> + * table.
> + */
> +Datum
> +binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
> +{
> + Relation rel;
> + HeapTuple tup;
> + Oid subid;
> + Form_pg_subscription form;
> + char    *subname;
> + Oid relid;
> + char relstate;
> + XLogRecPtr sublsn;
>
> 10a.
> /to pg_subscription_rel table./to pg_subscription_rel catalog./

Modified

> ~
>
> 10b.
> Maybe it would be helpful if the function arguments were documented
> up-front in the function-comment, or in the variable declarations.
>
> SUGGESTION
> char      *subname;  /* ARG0 = subscription name */
> Oid        relid;    /* ARG1 = relation Oid */
> char       relstate; /* ARG2 = subrel state */
> XLogRecPtr sublsn;   /* ARG3 (optional) = subscription lsn */

I felt the variables are self-explanatory in this case and also
consistent with other functions.

> ~~~
>
> 11.
> if (PG_ARGISNULL(3))
> sublsn = InvalidXLogRecPtr;
> else
> sublsn = PG_GETARG_LSN(3);
> FWIW, I'd write that as a one-line ternary assignment allowing all the
> args to be grouped nicely together.
>
> SUGGESTION
> sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);

Modified

> ~~~
>
> 12. binary_upgrade_replorigin_advance
>
> /*
>  * binary_upgrade_replorigin_advance
>  *
>  * Update the remote_lsn for the subscriber's replication origin.
>  */
> Datum
> binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
> {
> Relation rel;
> HeapTuple tup;
> Oid subid;
> Form_pg_subscription form;
> char    *subname;
> XLogRecPtr sublsn;
> char originname[NAMEDATALEN];
> RepOriginId originid;
> ~
>
> Similar to previous comment #10b. Maybe it would be helpful if the
> function arguments were documented up-front in the function-comment, or
> in the variable declarations.
>
> SUGGESTION
> char         originname[NAMEDATALEN];
> RepOriginId  originid;
> char        *subname; /* ARG0 = subscription name */
> XLogRecPtr   sublsn;  /* ARG1 = subscription lsn */

I felt the variables are self-explanatory in this case and also
consistent with other functions.

> ~~~
>
> 13.
> + subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
> +
> + if (PG_ARGISNULL(1))
> + sublsn = InvalidXLogRecPtr;
> + else
> + sublsn = PG_GETARG_LSN(1);
>
> Similar to previous comment #11. FWIW, I'd write that as a one-line
> ternary assignment allowing all the args to be grouped nicely
> together.
>
> SUGGESTION
> subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
> sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);

Modified

> ======
> src/bin/pg_dump/pg_dump.c
>
> 14. getSubscriptionTables
>
> +/*
> + * getSubscriptionTables
> + *   get information about subscription membership for dumpable tables, this
> + *    will be used only in binary-upgrade mode.
> + */
>
> Should use multiple sentences.
>
> SUGGESTION
> Get information about subscription membership for dumpable tables.
> This will be used only in binary-upgrade mode.

Modified

> ~~~
>
> 15.
> + /* Get subscription relation fields */
> + i_srsubid = PQfnumber(res, "srsubid");
> + i_srrelid = PQfnumber(res, "srrelid");
> + i_srsubstate = PQfnumber(res, "srsubstate");
> + i_srsublsn = PQfnumber(res, "srsublsn");
>
> Might it be better to say "Get pg_subscription_rel attributes"?

Modified

> ~~~
>
> 16. getSubscriptions
>
> + appendPQExpBufferStr(query, "o.remote_lsn\n");
>   appendPQExpBufferStr(query,
>   "FROM pg_subscription s\n"
> + "LEFT JOIN pg_replication_origin_status o \n"
> + "    ON o.external_id = 'pg_' || s.oid::text \n"
>   "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
>   "                   WHERE datname = current_database())");
>
> ~
>
> 16a.
> Should that "remote_lsn" have an alias like "suboriginremotelsn" so
> that it matches the later field assignment better?

Modified

> ~
>
> 16b.
> Probably these catalogs should be qualified using "pg_catalog.".

Modified

> ~~~
>
> 17. dumpSubscriptionTable
>
> +/*
> + * dumpSubscriptionTable
> + *   dump the definition of the given subscription table mapping, this will be
> + *    used only for upgrade operation.
> + */
>
> Make this comment consistent with the other one for getSubscriptionTables:
> - split into multiple sentences
> - use the same terminology "binary-upgrade mode" versus "upgrade operation".

Modified

> ~~~
>
> 18.
> + /*
> + * binary_upgrade_create_sub_rel_state will add the subscription
> + * relation to pg_subscripion_rel table, this is supported only for
> + * upgrade operation.
> + */
>
> Split into multiple sentences.

Modified

> ======
> src/bin/pg_dump/pg_dump_sort.c
>
> 19.
> + case DO_SUBSCRIPTION_REL:
> + snprintf(buf, bufsize,
> + "SUBSCRIPTION TABLE (ID %d)",
> + obj->dumpId);
> + return;
>
> Should it include the OID (like for DO_PUBLICATION_TABLE)?

Modified

> ======
> src/bin/pg_upgrade/check.c
>
> 20.
>   check_for_reg_data_type_usage(&old_cluster);
>   check_for_isn_and_int8_passing_mismatch(&old_cluster);
>
> + check_for_subscription_state(&old_cluster);
> +
>
> There seems no reason anymore for this check to be separated from all
> the other checks. Just remove the blank line.

Modified

> ~~~
>
> 21. check_for_subscription_state
>
> +/*
> + * check_for_subscription_state()
> + *
> + * Verify that each of the subscriptions have all their corresponding tables in
> + * ready state.
> + */
> +static void
> +check_for_subscription_state(ClusterInfo *cluster)
>
> /have/has/
>
> This comment only refers to 'ready' state, but perhaps it is
> misleading (or not entirely correct) because later the SQL is testing
> for more than just the READY state:
>
> + "WHERE srsubstate NOT IN ('i', 's', 'r') "

Modified

> ~~~
>
> 22.
> + res = executeQueryOrDie(conn,
> + "SELECT s.subname, c.relname, n.nspname "
> + "FROM pg_catalog.pg_subscription_rel r "
> + "LEFT JOIN pg_catalog.pg_subscription s"
> + " ON r.srsubid = s.oid "
> + "LEFT JOIN pg_catalog.pg_class c"
> + " ON r.srrelid = c.oid "
> + "LEFT JOIN pg_catalog.pg_namespace n"
> + " ON c.relnamespace = n.oid "
> + "WHERE srsubstate NOT IN ('i', 's', 'r') "
> + "ORDER BY s.subname");
>
> If you are going to check 'i', 's', and 'r' then I thought this
> statement should maybe have some comment about why those states.

Modified

> ~~~
>
> 23.
> + pg_fatal("Your installation contains subscription(s) with\n"
> + "Subscription not having origin and/or subscription relation(s) not
> in ready state.\n"
> + "A list of subscription not having origin and/or\n"
> + "subscription relation(s) not in ready state is in the file: %s",
> + output_path);
>
> 23a.
> This message seems to just be saying the same thing two times.
>
> It should also use newlines and spaces more like the other similar
> pg_fatals in this file (e.g. the %s is on the next line, etc.).
>
> SUGGESTION
> Your installation contains subscriptions without origin or having
> relations not in a ready state.\n
> A list of the problem subscriptions is in the file:\n
>     %s

Modified

> ~
>
> 23b.
> Same question about 'not in ready state'. Is that entirely correct?

Modified

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 24.
> +sub insert_line
> +{
> + my $payload = shift;
> +
> + foreach ("t1", "t2")
> + {
> + $publisher->safe_psql('postgres',
> + "INSERT INTO " . $_ . " (val) VALUES('$payload')");
> + }
> +}
>
> For clarity, maybe call this function 'insert_line_at_pub'

Modified

> ~~~
>
> 25.
> +# ------------------------------------------------------
> +# Check that pg_upgrade is succesful when all tables are in ready state.
> +# ------------------------------------------------------
>
> /succesful/successful/

Modified

> ~~~
>
> 26.
> +command_ok(
> + [
> + 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
> + '-D',         $new_sub->data_dir, '-b', $bindir,
> + '-B',         $bindir,            '-s', $new_sub->host,
> + '-p',         $old_sub->port,     '-P', $new_sub->port,
> + $mode,        '--check',
> + ],
> + 'run of pg_upgrade --check for old instance with invalid remote_lsn');
>
> This is the command for the "success" case. Why is the message part
> referring to "invalid remote_lsn"?

Modified

> ~~~
>
> 27.
> +$publisher->safe_psql('postgres',
> + "CREATE TABLE tab_primary_key(id serial, val text);");
> +$old_sub->safe_psql('postgres',
> + "CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
> +$publisher->safe_psql('postgres',
>
>
> Maybe it is not necessary, but won't it be better if the publisher
> table also has a primary key (so DDL matches its table name)?

Modified

> ~~~
>
> 28.
> +# Add a row in subscriber so that the table sync will fail.
> +$old_sub->safe_psql('postgres',
> + "INSERT INTO tab_primary_key values(1, 'before initial sync')");
>
> The comment should be slightly more descriptive by saying the reason
> it will fail is that you deliberately inserted the same PK value
> again.

Modified

> ~~~
>
> 29.
> +my $started_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
> +$old_sub->poll_query_until('postgres', $started_query)
> +  or die "Timed out while waiting for subscriber to synchronize data";
>
> Since this cannot synchronize the table data, maybe the message should
> be more like "Timed out while waiting for the table state to become
> 'd' (datasync)"

Modified

> ~~~
>
> 30.
> +command_fails(
> + [
> + 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
> + '-D',         $new_sub->data_dir, '-b', $bindir,
> + '-B',         $bindir,            '-s', $new_sub->host,
> + '-p',         $old_sub->port,     '-P', $new_sub->port,
> + $mode,        '--check',
> + ],
> + 'run of pg_upgrade --check for old instance with incorrect sub rel');
>
> /with incorrect sub rel/with incorrect sub rel state/ (??)

Modified

> ~~~
>
> 31.
> +# ------------------------------------------------------
> +# Check that pg_upgrade doesn't detect any problem once all the subscription's
> +# relation are in 'r' (ready) state.
> +# ------------------------------------------------------
>
>
> 31a.
> /relation/relations/
>

I have removed this comment

>
> 31b.
> Do you think that comment is correct? All you are doing here is
> allowing the old_sub to proceed because there is no longer any
> conflict -- but isn't that just normal pub/sub behaviour that has
> nothing to do with pg_upgrade?

I have removed this comment

> ~~~
>
> 32.
> +# Stop the old subscriber, insert a row in each table while it's down and add
> +# t2 to the publication
>
> /in each table/in each publisher table/
>
> Also, it is not each table -- it's only t1 and t2; not tab_primary_key.

Modified

> ~~~
>
> 33.
> +  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
> +is($result, qq(2), "There should be 2 rows in pg_subscription_rel");
>
> /2 rows in pg_subscription_rel/2 rows in pg_subscription_rel
> (representing t1 and tab_primary_key)/

Modified

> ======
>
> 34. binary_upgrade_create_sub_rel_state
>
> +{ oid => '8404', descr => 'for use by pg_upgrade (relation for
> pg_subscription_rel)',
> +  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'void',
> +  proargtypes => 'text oid char pg_lsn',
> +  prosrc => 'binary_upgrade_create_sub_rel_state' },
>
> As mentioned in a previous review comment #9, I felt this function
> should have a different name: binary_upgrade_add_sub_rel_state.

Modified

Thanks for the comments; the attached v12 patch has the changes for
the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 2 Nov 2023 at 17:01, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 2, 2023 at 3:41 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > I have slightly modified it now and also made it consistent with the
> > replication slot upgrade, but I was not sure if we need to add
> > anything more. Let me know if anything else needs to be added. I will
> > add it.
> >
>
> I think it is important for users to know how they upgrade their
> multi-node setup. Say a two-node setup where replication is working
> both ways (aka each node has both publications and subscriptions),
> similarly, how to upgrade, if there are multiple nodes involved?

I was thinking of documenting something like this:
Steps to upgrade logical replication clusters:
Warning:
Upgrading logical replication nodes requires multiple steps to be
performed. Because not all operations are transactional, the user is
advised to take backups.
Backups can be taken as described in
https://www.postgresql.org/docs/current/backup.html

Upgrading 2 node logical replication cluster:
1) Let's say publisher is in Node1 and subscriber is in Node2.
2) Stop the publisher server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the publisher node Node1 to Node1_new.
5) Start the publisher node Node1_new.
6) Stop the subscriber server in Node2.
7) Upgrade the subscriber node Node2 to Node2_new.
8) Start the subscriber node Node2_new.
9) Alter the subscription connections in Node2_new to point from Node1
to Node1_new.
10) Enable the subscriptions in Node2_new.
11) Create any tables that were created in Node1_new between step-5
and now and Refresh the publications.
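
For illustration, steps 3 and 9 to 11 could look like this on the
subscriber, as a sketch (the subscription name sub1 and the connection
string are made-up examples):

-- step 3, on Node2, before the upgrade
ALTER SUBSCRIPTION sub1 DISABLE;
-- steps 9 and 10, on Node2_new, after the upgrade
ALTER SUBSCRIPTION sub1 CONNECTION 'host=node1_new port=5432 dbname=postgres';
ALTER SUBSCRIPTION sub1 ENABLE;
-- step 11, after creating any tables added on the publisher meanwhile
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;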

Steps to upgrade cascaded logical replication clusters:
1) Let's say we have a cascaded logical replication setup
Node1->Node2->Node3. Here Node2 is subscribing to Node1 and Node3 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2 and Node3.
4) Upgrade the publisher node Node1 to Node1_new.
5) Start the publisher node Node1_new.
6) Stop the server in Node2.
7) Upgrade the subscriber node Node2 to Node2_new.
8) Start the subscriber node Node2_new.
9) Alter the subscription connections in Node2_new to point from Node1
to Node1_new.
10) Enable the subscriptions in Node2_new.
11) Create any tables that were created in Node1_new between step-5
and now and Refresh the publications.
12) Stop the server in Node3.
13) Upgrade the subscriber node Node3 to Node3_new.
14) Start the subscriber node Node3_new.
15) Alter the subscription connections in Node3_new to point from
Node2 to Node2_new.
16) Enable the subscriptions in Node3_new.
17) Create any tables that were created in Node2_new between step-8
and now and Refresh the publications.

Upgrading 2 node circular logical replication cluster:
1) Let's say we have a circular logical replication setup Node1->Node2
& Node2->Node1. Here Node2 is subscribing to Node1 and Node1 is
subscribing to Node2.
2) Stop the server in Node1.
3) Disable the subscriptions in Node2.
4) Upgrade the node Node1 to Node1_new.
5) Start the node Node1_new.
6) Enable the subscriptions in Node1_new.
7) Wait till all the incremental changes are synchronized.
8) Alter the subscription connections in Node2 to point from Node1 to Node1_new.
9) Create any tables that were created in Node2 between step-2 and now
and Refresh the publications.
10) Stop the server in Node2.
11) Disable the subscriptions in Node1_new.
12) Upgrade the node Node2 to Node2_new.
13) Start the subscriber node Node2_new.
14) Enable the subscriptions in Node2_new.
15) Alter the subscription connections in Node1_new to point from Node2 to
Node2_new.
16) Create any tables that were created in Node1_new between step-10
and now and Refresh the publications.

I have done basic testing with this; I will do further testing and
update it if I find any issues.
Let me know if this idea is OK or if we need something different.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Nov 08, 2023 at 10:52:29PM +0530, vignesh C wrote:
> Upgrading logical replication nodes requires multiple steps to be
> performed. Because not all operations are transactional, the user is
> advised to take backups.
> Backups can be taken as described in
> https://www.postgresql.org/docs/current/backup.html

There's a similar risk with --link if the upgrade fails after the new
cluster was started and the linked files began getting modified, so
that's something users would be OK with, I guess.

> Upgrading 2 node logical replication cluster:
> 1) Let's say publisher is in Node1 and subscriber is in Node2.
> 2) Stop the publisher server in Node1.
> 3) Disable the subscriptions in Node2.
> 4) Upgrade the publisher node Node1 to Node1_new.
> 5) Start the publisher node Node1_new.
> 6) Stop the subscriber server in Node2.
> 7) Upgrade the subscriber node Node2 to Node2_new.
> 8) Start the subscriber node Node2_new.
> 9) Alter the subscription connections in Node2_new to point from Node1
> to Node1_new.

Do they really need to do so in a pg_upgrade flow?  The connection
endpoint would likely be the same for transparency, no?

> 10) Enable the subscriptions in Node2_new.
> 11) Create any tables that were created in Node1_new between step-5
> and now and Refresh the publications.

How about the opposite stance, where an upgrade flow does the
subscriber first and then the publisher?  Would this be worth mentioning?
Case 3 touches that as nodes hold both publishers and subscribers.

> Steps to upgrade cascaded logical replication clusters:
> 1) Let's say we have a cascaded logical replication setup
> Node1->Node2->Node3. Here Node2 is subscribing to Node1 and Node3 is
> subscribing to Node2.
> 2) Stop the server in Node1.
> 3) Disable the subscriptions in Node2 and Node3.
> 4) Upgrade the publisher node Node1 to Node1_new.
> 5) Start the publisher node Node1_new.
> 6) Stop the server in Node2.
> 7) Upgrade the subscriber node Node2 to Node2_new.
> 8) Start the subscriber node Node2_new.
> 9) Alter the subscription connections in Node2_new to point from Node1
> to Node1_new.

Same here.

> 10) Enable the subscriptions in Node2_new.
> 11) Create any tables that were created in Node1_new between step-5
> and now and Refresh the publications.
> 12) Stop the server in Node3.
> 13) Upgrade the subscriber node Node3 to Node3_new.
> 14) Start the subscriber node Node3_new.
> 15) Alter the subscription connections in Node3_new to point from
> Node2 to Node2_new.
> 16) Enable the subscriptions in Node3_new.
> 17) Create any tables that were created in Node2_new between step-8
> and now and Refresh the publications.
>
> Upgrading 2 node circular logical replication cluster:
> 1) Let's say we have a circular logical replication setup Node1->Node2
> & Node2->Node1. Here Node2 is subscribing to Node1 and Node1 is
> subscribing to Node2.
> 2) Stop the server in Node1.
> 3) Disable the subscriptions in Node2.
> 4) Upgrade the node Node1 to Node1_new.
> 5) Start the node Node1_new.
> 6) Enable the subscriptions in Node1_new.
> 7) Wait till all the incremental changes are synchronized.
> 8) Alter the subscription connections in Node2 to point from Node1 to Node1_new.
> 9) Create any tables that were created in Node2 between step-2 and now
> and Refresh the publications.
> 10) Stop the server in Node2.
> 11) Disable the subscriptions in Node1_new.
> 12) Upgrade the node Node2 to Node2_new.
> 13) Start the subscriber node Node2_new.
> 14) Enable the subscriptions in Node2_new.
> 15) Alter the subscription connections in Node1_new to point from Node2 to
> Node2_new.
> 16) Create any tables that were created in Node1_new between step-10
> and now and Refresh the publications.
>
> I have done basic testing with this; I will do further testing and
> update it if I find any issues.
> Let me know if this idea is OK or if we need something different.

I have not tested, but having documentation along these lines is good
because it becomes clear what steps one needs to do.

Another thing that may be worth mentioning is the schema changes
that may happen.  We could just say that the schema should be fixed
while running an upgrade, which is kind of fair to expect in logical
setups for tables replicated anyway?

Do you think that there would be an issue in automating such tests
once support for the upgrade of subscribers is done (hopefully)?  The
first scenario may not need extra coverage if we already have
003_logical_slots.pl and a second file to test the subscriber part,
though.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Thanks for addressing my previous review comments.

I re-checked the latest patch v12-0001 and found the following:

======
Commit message

1.
The new SQL binary_upgrade_create_sub_rel_state function has the following
syntax:
SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid,
state char [,sublsn pg_lsn])

~

Looks like v12 accidentally forgot to update this to the modified
function name 'binary_upgrade_add_sub_rel_state'

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Nov 8, 2023 at 10:52 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Upgrading 2 node circular logical replication cluster:
> 1) Let's say we have a circular logical replication setup Node1->Node2
> & Node2->Node1. Here Node2 is subscribing to Node1 and Node1 is
> subscribing to Node2.
> 2) Stop the server in Node1.
> 3) Disable the subscriptions in Node2.
> 4) Upgrade the node Node1 to Node1_new.
> 5) Start the node Node1_new.
> 6) Enable the subscriptions in Node1_new.
> 7) Wait till all the incremental changes are synchronized.
> 8) Alter the subscription connections in Node2 to point from Node1 to Node1_new.
> 9) Create any tables that were created in Node2 between step-2 and now
> and Refresh the publications.
>

I haven't reviewed all the steps yet, but steps 7 and 9 here seem to
require some validation. How can incremental changes be synchronized
till all the new tables are created and synced before step 7?

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Thu, Nov 09, 2023 at 01:14:05PM +1100, Peter Smith wrote:
> Looks like v12 accidentally forgot to update this to the modified
> function name 'binary_upgrade_add_sub_rel_state'

This v12 is overall cleaner than its predecessors.  Nice to see.

+my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
+is($result, qq(1), "check initial t1 table data on publisher");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
+is($result, qq(1), "check initial t1 table data on the old subscriber");
+$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");

I'd argue that t1 and t2 should have less generic names.  t1 is used
to check that the upgrade process works, while t2 is added to the
publication after upgrading the subscriber.  Say something like
tab_upgraded or tab_not_upgraded?

+my $synced_query =
+  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";

Perhaps it would be safer to use a query that checks the number of
relations in 'r' state?  This query would return true if
pg_subscription_rel has no tuples.
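
Say something like this, as a sketch (assuming the test expects
exactly two relations to reach the 'r' state):

SELECT count(1) = 2 FROM pg_subscription_rel WHERE srsubstate = 'r';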

+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";

Relying on a pkey error to enforce an incorrect state is a good trick.
Nice.

+command_fails(
+    [
+        'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
+        '-D',         $new_sub->data_dir, '-b', $bindir,
+        '-B',         $bindir,            '-s', $new_sub->host,
+        '-p',         $old_sub->port,     '-P', $new_sub->port,
+        $mode,        '--check',
+    ],
+    'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state');
+rmtree($new_sub->data_dir . "/pg_upgrade_output.d");

Okay by me to not stop the cluster for the --check to shave a few
cycles.  It's a bit sad that we don't cross-check the contents of
subscription_state.txt before removing pg_upgrade_output.d.  Finding
the file is easy even if the subdir where it is included is not a
constant name.  Then it is possible to apply a regexp with the
contents consumed by a slurp_file().

+my $remote_lsn = $old_sub->safe_psql('postgres',
+    "SELECT remote_lsn FROM pg_replication_origin_status");
Perhaps you've not noticed, but this would be 0/0 most of the time.
However the intention is to check for a valid LSN to make sure that
the origin is set, no?
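
A sketch of a stricter check, which would only pass once the origin
has been set to a valid LSN (assuming a single origin row here):

SELECT remote_lsn <> '0/0'::pg_lsn FROM pg_replication_origin_status;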

I am wondering whether this should use a bit more data than just one
tuple, say at least two transactions, one of them with a multi-value
INSERT?

+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready state.
+# ------------------------------------------------------
This comment is a bit inconsistent with the states that are accepted,
but why not, at least that's predictable.

+        * relation to pg_subscripion_rel table. This will be used only in

Typo: s/pg_subscripion_rel/pg_subscription_rel/.

This needs some word-smithing to explain the reasons why a state is
not needed:

+        /*
+         * The subscription relation should be in either i (initialize),
+         * r (ready) or s (synchronized) state as either the replication slot
+         * is not created or the replication slot is already dropped and the
+         * required WAL files will be present in the publisher. The other
+         * states are not ok as the worker has dependency on the replication
+         * slot/origin in these case:

A slot not created yet refers to the 'i' state, while 'r' and 's'
refer to a slot created previously but already dropped, right?
Shouldn't this comment tell that rather than mixing the assumptions?

+         * a) SUBREL_STATE_DATASYNC: In this case, the table sync worker will
+         * try to drop the replication slot but as the replication slots will
+         * be created with old subscription id in the publisher and the
+         * upgraded subscriber will not be able to clean the slots in this
+         * case.

Proposal: A relation upgraded while in this state would retain a
replication slot, which could not be dropped by the sync worker
spawned after the upgrade because the subscription ID tracked by the
publisher does not match anymore.

Note: actually, this would be OK if we are able to keep the OIDs of
the subscribers consistent across upgrades?  I'm OK with not doing
anything about that in this patch, to keep it simpler.  Just asking in
passing.

+         * b) SUBREL_STATE_FINISHEDCOPY: In this case, the tablesync worker will
+         * expect the origin to be already existing as the origin is created
+         * with an old subscription id, tablesync worker will not be able to
+         * find the origin in this case.

Proposal: A tablesync worker spawned to work on a relation upgraded
while in this state would expect an origin ID with the OID of the
subscription used before the upgrade, causing it to fail.

+                "A list of problem subscriptions is in the file:\n"

Sounds a bit strange; perhaps use an extra "the", as in "the problem
subscriptions"?

Could it be worth mentioning in the docs that one could also DISABLE
the subscriptions before running the upgrade?

+       The replication origin entry corresponding to each of the subscriptions
+       should exist in the old cluster. This can be found by checking
+       <link linkend="catalog-pg-subscription">pg_subscription</link> and
+       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
+       system tables.

Hmm.  No need to mention pg_replication_origin_status?

If I may ask, how did you check that the given relation states were
OK or not OK?  Did you hardcode some wait points in tablesync.c up to
where a state is updated in pg_subscription_rel, then shut down the
cluster before the upgrade to maintain the catalog in this state?
Finally, after the upgrade, you've cross-checked the dependencies on
the slots and origins to see that the spawned sync workers turned
crazy because of the inconsistencies.  Right?
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Nov 09, 2023 at 01:14:05PM +1100, Peter Smith wrote:
> > Looks like v12 accidentally forgot to update this to the modified
> > function name 'binary_upgrade_add_sub_rel_state'
>
> This v12 is overall cleaner than its predecessors.  Nice to see.
>
> +my $result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t1");
> +is($result, qq(1), "check initial t1 table data on publisher");
> +$result = $publisher->safe_psql('postgres', "SELECT count(*) FROM t2");
> +is($result, qq(1), "check initial t1 table data on publisher");
> +$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t1");
> +is($result, qq(1), "check initial t1 table data on the old subscriber");
> +$result = $old_sub->safe_psql('postgres', "SELECT count(*) FROM t2");
>
> I'd argue that t1 and t2 should have less generic names.  t1 is used
> to check that the upgrade process works, while t2 is added to the
> publication after upgrading the subscriber.  Say something like
> tab_upgraded or tab_not_upgraded?

Modified

> +my $synced_query =
> +  "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r');";
>
> Perhaps it would be safer to use a query that checks the number of
> relations in 'r' state?  This query would return true if
> pg_subscription_rel has no tuples.

Modified

> +# Table will be in 'd' (data is being copied) state as table sync will fail
> +# because of primary key constraint error.
> +my $started_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
>
> Relying on a pkey error to enforce an incorrect state is a good trick.
> Nice.

That was a better way to get the datasync state without manually
changing the pg_subscription_rel catalog.

> +command_fails(
> +    [
> +        'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
> +        '-D',         $new_sub->data_dir, '-b', $bindir,
> +        '-B',         $bindir,            '-s', $new_sub->host,
> +        '-p',         $old_sub->port,     '-P', $new_sub->port,
> +        $mode,        '--check',
> +    ],
> +    'run of pg_upgrade --check for old instance with relation in \'d\' datasync(invalid) state');
> +rmtree($new_sub->data_dir . "/pg_upgrade_output.d");
>
> Okay by me to not stop the cluster for the --check to shave a few
> cycles.  It's a bit sad that we don't cross-check the contents of
> subscription_state.txt before removing pg_upgrade_output.d.  Finding
> the file is easy even if the subdir where it is included is not a
> constant name.  Then it is possible to apply a regexp with the
> contents consumed by a slurp_file().

Modified

> +my $remote_lsn = $old_sub->safe_psql('postgres',
> +    "SELECT remote_lsn FROM pg_replication_origin_status");
> Perhaps you've not noticed, but this would be 0/0 most of the time.
> However the intention is to check for a valid LSN to make sure that
> the origin is set, no?

I have added a few more inserts so that remote_lsn will not be 0/0.

> I am wondering whether this should use a bit more data than just one
> tuple, say at least two transactions, one of them with a multi-value
> INSERT?

Added one more multi-insert

> +# ------------------------------------------------------
> +# Check that pg_upgrade is successful when all tables are in ready state.
> +# ------------------------------------------------------
> This comment is a bit inconsistent with the states that are accepted,
> but why not, at least that's predictable.

The key test validation is mentioned in this style of comment

> +        * relation to pg_subscripion_rel table. This will be used only in
>
> Typo: s/pg_subscripion_rel/pg_subscription_rel/.

Modified

> This needs some word-smithing to explain the reasons why a state is
> not needed:
>
> +        /*
> +         * The subscription relation should be in either i (initialize),
> +         * r (ready) or s (synchronized) state as either the replication slot
> +         * is not created or the replication slot is already dropped and the
> +         * required WAL files will be present in the publisher. The other
> +         * states are not ok as the worker has dependency on the replication
> +         * slot/origin in these case:
>
> A slot not created yet refers to the 'i' state, while 'r' and 's'
> refer to a slot created previously but already dropped, right?
> Shouldn't this comment tell that rather than mixing the assumptions?

Modified

> +         * a) SUBREL_STATE_DATASYNC: In this case, the table sync worker will
> +         * try to drop the replication slot but as the replication slots will
> +         * be created with old subscription id in the publisher and the
> +         * upgraded subscriber will not be able to clean the slots in this
> +         * case.
>
> Proposal: A relation upgraded while in this state would retain a
> replication slot, which could not be dropped by the sync worker
> spawned after the upgrade because the subscription ID tracked by the
> publisher does not match anymore.

Modified

> Note: actually, this would be OK if we are able to keep the OIDs of
> the subscribers consistent across upgrades?  I'm OK with not doing
> anything about that in this patch, to keep it simpler.  Just asking
> in passing.

I will analyze this more and post the analysis in a subsequent mail.

> +         * b) SUBREL_STATE_FINISHEDCOPY: In this case, the tablesync worker will
> +         * expect the origin to be already existing as the origin is created
> +         * with an old subscription id, tablesync worker will not be able to
> +         * find the origin in this case.
>
> Proposal: A tablesync worker spawned to work on a relation upgraded
> while in this state would expect an origin ID with the OID of the
> subscription used before the upgrade, causing it to fail.

Modified

> +                "A list of problem subscriptions is in the file:\n"
>
> Sounds a bit strange; perhaps use an extra "the", as in "the problem
> subscriptions"?

Modified

> Could it be worth mentioning in the docs that one could also DISABLE
> the subscriptions before running the upgrade?

I felt that since the changes we are planning to make won't start the
apply workers during the upgrade, there will be no impact even if the
subscriptions are enabled, so there is no need to mention it unless we
plan to allow starting apply workers during the upgrade.

> +       The replication origin entry corresponding to each of the subscriptions
> +       should exist in the old cluster. This can be found by checking
> +       <link linkend="catalog-pg-subscription">pg_subscription</link> and
> +       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
> +       system tables.
>
> Hmm.  No need to mention pg_replication_origin_status?

When we create an origin, the origin status is created implicitly, so
I felt we need not check the replication origin status and also need
not mention it here.

> If I may ask, how did you check that the given relation states were
> OK or not OK?  Did you hardcode some wait points in tablesync.c up to
> where a state is updated in pg_subscription_rel, then shut down the
> cluster before the upgrade to maintain the catalog in this state?
> Finally, after the upgrade, you've cross-checked the dependencies on
> the slots and origins to see that the spawned sync workers turned
> crazy because of the inconsistencies.  Right?

I did testing along the same lines that you mentioned. Apart from that
I also reviewed the design around where the old subscription id is
used, e.g. the tablesync worker will replicate using the replication
slot and replication origin created with the old subscription id. I
also checked the impact of remote_lsn.
A few examples: in SUBREL_STATE_DATASYNC state we will try to drop the
replication slot once the worker is started, but since the slot was
created with the old subscription id we will not be able to drop the
replication slot, creating a leak. Similarly the problem exists with
SUBREL_STATE_FINISHEDCOPY, where we will not be able to drop the
origin created with the old sub id.

Thanks for the comments; the attached v13 patch has the changes for
the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 9 Nov 2023 at 07:44, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Thanks for addressing my previous review comments.
>
> I re-checked the latest patch v12-0001 and found the following:
>
> ======
> Commit message
>
> 1.
> The new SQL binary_upgrade_create_sub_rel_state function has the following
> syntax:
> SELECT binary_upgrade_create_sub_rel_state(subname text, relid oid,
> state char [,sublsn pg_lsn])
>
> ~
>
> Looks like v12 accidentally forgot to update this to the modified
> function name 'binary_upgrade_add_sub_rel_state'

This is handled in the v13 version patch posted at:
https://www.postgresql.org/message-id/CALDaNm0mGz6_69BiJTmEqC8Q0U0x2nMZOs3w9btKOHZZpfC2ow%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v13-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptionTables

+ int i_srsublsn;
+ int i;
+ int cur_rel = 0;
+ int ntups;

What is the difference between 'i' and 'cur_rel'?

AFAICT these represent the same tuple index, in which case you might
as well throw away 'cur_rel' and only keep 'i'.

~~~

2. getSubscriptionTables

+ for (i = 0; i < ntups; i++)
+ {
+ Oid cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
+ Oid relid = atooid(PQgetvalue(res, i, i_srrelid));
+ TableInfo  *tblinfo;

Since this is all new code, using C99 style for loop variable
declaration of 'i' will be better.

======
src/bin/pg_upgrade/check.c

3. check_for_subscription_state

+check_for_subscription_state(ClusterInfo *cluster)
+{
+ int dbnum;
+ FILE    *script = NULL;
+ char output_path[MAXPGPATH];
+ int ntup;
+
+ /* Subscription relations state can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
+ return;
+
+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subscription_state.txt");

I felt this filename ought to be more like
'subscriptions_with_bad_state.txt' because the current name looks like
a normal logfile with nothing to indicate that it is only for the
states of the "bad" subscriptions.

~~~

4.
+ for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ {

Since this is all new code, using C99 style for loop variable
declaration of 'dbnum' will be better.

~~~

5.
+ * a) SUBREL_STATE_DATASYNC:A relation upgraded while in this state
+ * would retain a replication slot, which could not be dropped by the
+ * sync worker spawned after the upgrade because the subscription ID
+ * tracked by the publisher does not match anymore.

missing whitespace

/SUBREL_STATE_DATASYNC:A relation/SUBREL_STATE_DATASYNC: A relation/

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Fri, Nov 10, 2023 at 07:26:18PM +0530, vignesh C wrote:
> I did testing along the same lines that you mentioned. Apart from
> that I also reviewed the design around where the old subscription id
> is used, e.g. the tablesync worker will replicate using the
> replication slot and replication origin created with the old
> subscription id. I also checked the impact of remote_lsn.
> A few examples: in SUBREL_STATE_DATASYNC state we will try to drop the
> replication slot once the worker is started, but since the slot was
> created with the old subscription id we will not be able to drop the
> replication slot, creating a leak. Similarly the problem exists with
> SUBREL_STATE_FINISHEDCOPY, where we will not be able to drop the
> origin created with the old sub id.

Yeah, I was playing a bit with these states and I can confirm that
leaving around a DATASYNC relation in pg_subscription_rel during
the upgrade would leave a slot on the publisher of the old cluster,
which is no good.  It would be an option to explore later what could
be improved, but I'm also looking forward to hearing from the users
first, as what you have here may be enough for the basic purposes we
are trying to cover.  FINISHEDCOPY is similarly not OK: I was able
to get an origin lying around after an upgrade.

Anyway, after a closer lookup, I think that your conclusions regarding
the states that are allowed in the patch during the upgrade have some
flaws.

First, are you sure that SYNCDONE is OK to keep?  This catalog state
is set in process_syncing_tables_for_sync(), and just after the code
opens a transaction to clean up the tablesync slot, followed by a
second transaction to clean up the origin.  However, imagine that
there is a failure in dropping the slot, the origin, or just in
transaction processing, cannot we finish in a state where the relation
is marked as SYNCDONE in the catalog but still has an origin and/or a
tablesync slot lying around?  Assuming that SYNCDONE is an OK state
seems incorrect to me.  I am pretty sure that injecting an error in a
code path after the slot is created would equally lead to an
inconsistency.

It seems to me that INIT cannot be relied on for a similar reason.
This state would be set for a new relation in
LogicalRepSyncTableStart(), and the relation would still be in INIT
state when creating the slot via walrcv_create_slot() in a second
transaction started a bit later.  However, if we have a failure after
the transaction that created the slot commits, then we'd have an INIT
relation in the catalog that got committed *and* a slot related to it
lying around.

The only state that I can see is possible to rely on safely is READY,
set in the same transaction as when the replication origin is dropped,
because that's the point where we are sure that there are no origin
and no tablesync slot: the READY state is visible in the catalog only
if the transaction dropping the slot succeeds.  Even with this one, I
was having the odd feeling that there's a code path where we could
leak something, though I have not seen a problem after a few
hours of looking at this area.
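
For context, the READY transition happens on the apply worker side
roughly like this (a simplified sketch of process_syncing_tables_for_apply()
in tablesync.c; error handling and surrounding logic elided):

StartTransactionCommand();

/* Drop the tablesync origin, tolerating its absence. */
ReplicationOriginNameForLogicalRep(MyLogicalRepWorker->subid, rstate->relid,
                                   originname, sizeof(originname));
replorigin_drop_by_name(originname, true /* missing_ok */ , false /* nowait */ );

/* Flip the relation to READY in the same transaction. */
UpdateSubscriptionRelState(MyLogicalRepWorker->subid, rstate->relid,
                           SUBREL_STATE_READY, rstate->lsn);

CommitTransactionCommand();

Because the origin drop and the state change commit together, a relation
seen as READY cannot still own a tablesync origin.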
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Nov 13, 2023 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> It seems to me that INIT cannot be relied on for a similar reason.
> This state would be set for a new relation in
> LogicalRepSyncTableStart(), and the relation would still be in INIT
> state when creating the slot via walrcv_create_slot() in a second
> transaction started a bit later.
>

Before creating a slot, we changed the state to DATASYNC.

>
>  However, if we have a failure after
> the transaction that created the slot commits, then we'd have an INIT
> relation in the catalog that got committed *and* a slot related to it
> lying around.
>

I don't think this can happen; otherwise this could be a problem even
without an upgrade, after a restart.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v13 version patch has the
> changes for the same.
>

+
+ ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
sizeof(originname));
+ originid = replorigin_by_name(originname, false);
+ replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
+    false /* backward */ ,
+    false /* WAL log */ );

This seems to update the origin state only in memory. Is it sufficient
to use this here? Anyway, I think using this requires us to first
acquire RowExclusiveLock on pg_replication_origin, something the patch
is doing for some other system table.
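
For reference, the locking being suggested would sit just before the
lookup, along these lines (a sketch based on the quoted patch fragment,
not its final form):

/* Lock to prevent the replication origin from vanishing */
LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);

originid = replorigin_by_name(originname, false /* missing_ok */ );

/* With wal_log = false this only updates the in-memory origin state. */
replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
                   false /* backward */ ,
                   false /* WAL log */ );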

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Nov 13, 2023 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v13 version patch has the
> > changes for the same.
> >
>
> +
> + ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
> sizeof(originname));
> + originid = replorigin_by_name(originname, false);
> + replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
> +    false /* backward */ ,
> +    false /* WAL log */ );
>
> This seems to update the origin state only in memory. Is it sufficient
> to use this here?
>

I think it is probably getting ensured by clean shutdown
(shutdown_checkpoint) which happens on the new cluster after calling
this function. We can probably try to add a comment for it. BTW, we
also need to ensure that max_replication_slots is configured to a
value higher than origins we are planning to create on the new
cluster.
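
A minimal sketch of such a check on the new cluster (the helper name
count_old_cluster_subscriptions() is an assumption at this point, and
max_replication_slots would be read from the new cluster's pg_settings):

int nsubs_on_old = count_old_cluster_subscriptions();

/*
 * Each subscription needs a replication origin on the new cluster, and
 * origins are bounded by max_replication_slots there.
 */
if (nsubs_on_old > max_replication_slots)
    pg_fatal("max_replication_slots (%d) must be greater than or equal to "
             "the number of subscriptions (%d) on the old cluster",
             max_replication_slots, nsubs_on_old);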

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Nov 13, 2023 at 04:02:27PM +0530, Amit Kapila wrote:
> On Mon, Nov 13, 2023 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:
>> It seems to me that INIT cannot be relied on for a similar reason.
>> This state would be set for a new relation in
>> LogicalRepSyncTableStart(), and the relation would still be in INIT
>> state when creating the slot via walrcv_create_slot() in a second
>> transaction started a bit later.
>
> Before creating a slot, we changed the state to DATASYNC.

Still, playing the devil's advocate, couldn't it be possible that a
server crashes just after the slot got created, then restarts with
max_logical_replication_workers=0?  This would keep the catalog in a
state authorized by the upgrade, still leak a replication slot on the
publication side if the node gets upgraded.  READY in the catalog
seems to be the only state where we are guaranteed that there is no
origin and no slot remaining around.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 13 Nov 2023 at 13:52, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Nov 10, 2023 at 07:26:18PM +0530, vignesh C wrote:
> > I did testing in the same lines that you mentioned. Apart from that I
> > also reviewed the design where it was using the old subscription id
> > like in case of table sync workers, the tables sync worker will use
> > replication using old subscription id. replication slot and
> > replication origin. I also checked the impact of remote_lsn's.
> > Few example: IN SUBREL_STATE_DATASYNC state we will try to drop the
> > replication slot once worker is started but since the slot will be
> > created with an old subscription, we will not be able to drop the
> > replication slot and create a leak. Similarly the problem exists with
> > SUBREL_STATE_FINISHEDCOPY where we will not be able to drop the origin
> > created with an old sub id.
>
> Yeah, I was playing a bit with these states and I can confirm that
> leaving around a DATASYNC relation in pg_subscription_rel during
> the upgrade would leave a slot on the publisher of the old cluster,
> which is no good.  It would be an option to explore later what could
> be improved, but I'm also looking forward at hearing from the users
> first, as what you have here may be enough for the basic purposes we
> are trying to cover.  FINISHEDCOPY, similarly, is not OK.  I was able
> to get an origin lying around after an upgrade.
>
> Anyway, after a closer lookup, I think that your conclusions regarding
> the states that are allowed in the patch during the upgrade have some
> flaws.
>
> First, are you sure that SYNCDONE is OK to keep?  This catalog state
> is set in process_syncing_tables_for_sync(), and just after the code
> opens a transaction to clean up the tablesync slot, followed by a
> second transaction to clean up the origin.  However, imagine that
> there is a failure in dropping the slot, the origin, or just in
> transaction processing, cannot we finish in a state where the relation
> is marked as SYNCDONE in the catalog but still has an origin and/or a
> tablesync slot lying around?  Assuming that SYNCDONE is an OK state
> seems incorrect to me.  I am pretty sure that injecting an error in a
> code path after the slot is created would equally lead to an
> inconsistency.

There are a couple of things happening here: a) In the first part we
take care of setting the subscription relation to SYNCDONE and dropping
the replication slot at the publisher node; the relation state will be
set to SYNCDONE only if dropping the replication slot is successful,
and if that fails the relation state will still be FINISHEDCOPY. So if
there is a failure in dropping the replication slot we will not have an
issue, as the tablesync worker will be in FINISHEDCOPY state and this
state is not allowed for upgrade. When the state is SYNCDONE the
tablesync slot will not be present. b) In the second part we drop the
replication origin; even if there is a chance that dropping the
replication origin fails for some reason, there will be no problem, as
we do not copy the tablesync replication origin to the new cluster
while upgrading. Since the tablesync replication origin is not copied
to the new cluster there will be no replication origin leaks.
I feel these issues will not be there in the SYNCDONE state.
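
For readers following along, part a) corresponds to roughly this
sequence in process_syncing_tables_for_sync() (a simplified sketch;
details elided):

StartTransactionCommand();

/* Not visible until commit. */
UpdateSubscriptionRelState(MyLogicalRepWorker->subid,
                           MyLogicalRepWorker->relid,
                           SUBREL_STATE_SYNCDONE, current_lsn);

/*
 * The tablesync slot is dropped inside the same transaction: if the
 * drop fails, the transaction aborts and the catalog still says
 * FINISHEDCOPY, a state the upgrade rejects.
 */
ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false);

CommitTransactionCommand();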

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Nov 14, 2023 at 5:52 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Nov 13, 2023 at 04:02:27PM +0530, Amit Kapila wrote:
> > On Mon, Nov 13, 2023 at 1:52 PM Michael Paquier <michael@paquier.xyz> wrote:
> >> It seems to me that INIT cannot be relied on for a similar reason.
> >> This state would be set for a new relation in
> >> LogicalRepSyncTableStart(), and the relation would still be in INIT
> >> state when creating the slot via walrcv_create_slot() in a second
> >> transaction started a bit later.
> >
> > Before creating a slot, we changed the state to DATASYNC.
>
> Still, playing the devil's advocate, couldn't it be possible that a
> server crashes just after the slot got created, then restarts with
> max_logical_replication_workers=0?  This would keep the catalog in a
> state authorized by the upgrade,
>

The state should be DATASYNC by that time, and I don't think that is a
state authorized by the upgrade.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 13 Nov 2023 at 13:52, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v13-0001
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 1. getSubscriptionTables
>
> + int i_srsublsn;
> + int i;
> + int cur_rel = 0;
> + int ntups;
>
> What is the difference between 'i' and 'cur_rel'?
>
> AFAICT these represent the same tuple index, in which case you might
> as well throw away 'cur_rel' and only keep 'i'.

Modified

> ~~~
>
> 2. getSubscriptionTables
>
> + for (i = 0; i < ntups; i++)
> + {
> + Oid cur_srsubid = atooid(PQgetvalue(res, i, i_srsubid));
> + Oid relid = atooid(PQgetvalue(res, i, i_srrelid));
> + TableInfo  *tblinfo;
>
> Since this is all new code, using C99 style for loop variable
> declaration of 'i' will be better.

Modified

> ======
> src/bin/pg_upgrade/check.c
>
> 3. check_for_subscription_state
>
> +check_for_subscription_state(ClusterInfo *cluster)
> +{
> + int dbnum;
> + FILE    *script = NULL;
> + char output_path[MAXPGPATH];
> + int ntup;
> +
> + /* Subscription relations state can be migrated since PG17. */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
> + return;
> +
> + prep_status("Checking for subscription state");
> +
> + snprintf(output_path, sizeof(output_path), "%s/%s",
> + log_opts.basedir,
> + "subscription_state.txt");
>
> I felt this filename ought to be more like
> 'subscriptions_with_bad_state.txt' because the current name looks like
> a normal logfile with nothing to indicate that it is only for the
> states of the "bad" subscriptions.

I have kept the file name intentionally short as we noticed that
when the upgrade of the publisher patch used a longer name there were
some buildfarm failures because of longer names.

> ~~~
>
> 4.
> + for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> + {
>
> Since this is all new code, using C99 style for loop variable
> declaration of 'dbnum' will be better.

Modified

> ~~~
>
> 5.
> + * a) SUBREL_STATE_DATASYNC:A relation upgraded while in this state
> + * would retain a replication slot, which could not be dropped by the
> + * sync worker spawned after the upgrade because the subscription ID
> + * tracked by the publisher does not match anymore.
>
> missing whitespace
>
> /SUBREL_STATE_DATASYNC:A relation/SUBREL_STATE_DATASYNC: A relation/

Modified

Also added a couple of missing test cases. The attached v14 version
patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 13 Nov 2023 at 17:02, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v13 version patch has the
> > changes for the same.
> >
>
> +
> + ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
> sizeof(originname));
> + originid = replorigin_by_name(originname, false);
> + replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
> +    false /* backward */ ,
> +    false /* WAL log */ );
>
> This seems to update the origin state only in memory. Is it sufficient
> to use this here? Anyway, I think using this requires us to first
> acquire RowExclusiveLock on pg_replication_origin, something the patch
> is doing for some other system table.

Added the lock.

The attached v14 patch at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm20%3DBk_w9jDZXEqkJ3_NUAxOBswCn4jR-tmh-MqNpPZYw%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 13 Nov 2023 at 17:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 13, 2023 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Nov 10, 2023 at 7:26 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for the comments, the attached v13 version patch has the
> > > changes for the same.
> > >
> >
> > +
> > + ReplicationOriginNameForLogicalRep(subid, InvalidOid, originname,
> > sizeof(originname));
> > + originid = replorigin_by_name(originname, false);
> > + replorigin_advance(originid, sublsn, InvalidXLogRecPtr,
> > +    false /* backward */ ,
> > +    false /* WAL log */ );
> >
> > This seems to update the origin state only in memory. Is it sufficient
> > to use this here?
> >
>
> I think it is probably getting ensured by clean shutdown
> (shutdown_checkpoint) which happens on the new cluster after calling
> this function. We can probably try to add a comment for it. BTW, we
> also need to ensure that max_replication_slots is configured to a
> value higher than origins we are planning to create on the new
> cluster.

Added comments and also added the check for max_replication_slots.

The attached v14 patch at [1] has the changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm20%3DBk_w9jDZXEqkJ3_NUAxOBswCn4jR-tmh-MqNpPZYw%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v14-0001

======
src/backend/utils/adt/pg_upgrade_support.c

1. binary_upgrade_replorigin_advance

+ /* lock to prevent the replication origin from vanishing */
+ LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
+ originid = replorigin_by_name(originname, false);

Use uppercase for the lock comment.

======
src/bin/pg_upgrade/check.c

2. check_for_subscription_state

> > + prep_status("Checking for subscription state");
> > +
> > + snprintf(output_path, sizeof(output_path), "%s/%s",
> > + log_opts.basedir,
> > + "subscription_state.txt");
> >
> > I felt this filename ought to be more like
> > 'subscriptions_with_bad_state.txt' because the current name looks like
> > a normal logfile with nothing to indicate that it is only for the
> > states of the "bad" subscriptions.
>
> I have kept the file name intentionally short as we noticed that
> when the upgrade of the publisher patch used a longer name there were
> some buildfarm failures because of longer names.

OK, but how about some other short meaningful name like 'subs_invalid.txt'?

I also thought "state" in the original name was misleading because
this file contains not only subscriptions with bad 'state' but also
subscriptions with missing 'origin'.

~~~

3. check_new_cluster_logical_replication_slots

  int nslots_on_old;
  int nslots_on_new;
+ int nsubs_on_old = old_cluster.subscription_count;

I felt it might be better to make both these quantities 'unsigned' to
make it more obvious that there are no special meanings for negative
numbers.

~~~

4. check_new_cluster_logical_replication_slots

nslots_on_old = count_old_cluster_logical_slots();

~

IMO the 'nsubs_on_old' should be coded the same as above. AFAICT, this
is the only code where you are interested in the number of
subscribers, and furthermore, it seems you only care about that count
in the *old* cluster. This means the current implementation of
get_subscription_count() seems more generic than it needs to be and
that results in more unnecessary patch code. (I will repeat this same
review comment in the other relevant places).

SUGGESTION
nslots_on_old = count_old_cluster_logical_slots();
nsubs_on_old = count_old_cluster_subscriptions();

~~~

5.
+ /*
+ * Quick return if there are no logical slots and subscriptions to be
+ * migrated.
+ */
+ if (nslots_on_old == 0 && nsubs_on_old == 0)
  return;

/and subscriptions/and no subscriptions/

~~~

6.
- if (nslots_on_old > max_replication_slots)
+ if (nslots_on_old && nslots_on_old > max_replication_slots)
  pg_fatal("max_replication_slots (%d) must be greater than or equal
to the number of "
  "logical replication slots (%d) on the old cluster",
  max_replication_slots, nslots_on_old);

Neither nslots_on_old nor max_replication_slots can be < 0, so I don't
see why the additional check is needed here.
AFAICT "if (nslots_on_old > max_replication_slots)" acheives the same
thing that you want.

~~~

7.
+ if (nsubs_on_old && nsubs_on_old > max_replication_slots)
+ pg_fatal("max_replication_slots (%d) must be greater than or equal
to the number of "
+ "subscriptions (%d) on the old cluster",
+ max_replication_slots, nsubs_on_old);

Neither nsubs_on_old nor max_replication_slots can be < 0, so I don't
see why the additional check is needed here.
AFAICT "if (nsubs_on_old > max_replication_slots)" achieves the same
thing that you want.

======
src/bin/pg_upgrade/info.c

8. get_db_rel_and_slot_infos

+ if (cluster == &old_cluster)
+ get_subscription_count(cluster);
+

I felt this is unnecessary because you only want to know the
nsubs_on_old in one place and then only for the old cluster, so
calling this to set a generic attribute for the cluster is overkill.

~~~

9.
+/*
+ * Get the number of subscriptions in the old cluster.
+ */
+static void
+get_subscription_count(ClusterInfo *cluster)
+{
+ PGconn    *conn;
+ PGresult   *res;
+
+ if (GET_MAJOR_VERSION(cluster->major_version) < 1700)
+ return;
+
+ conn = connectToServer(cluster, "template1");
+ res = executeQueryOrDie(conn,
+   "SELECT oid FROM pg_catalog.pg_subscription");
+
+ cluster->subscription_count = PQntuples(res);
+
+ PQclear(res);
+ PQfinish(conn);
+}

9a.
Currently, this is needed only for the old_cluster (like the function
comment implies), so the parameter is not required.

Also, AFAICT this number is only needed in one place
(check_new_cluster_logical_replication_slots) so IMO it would be
better to make lots of changes to simplify this code:
- change the function name to be like the other one. e.g.
count_old_cluster_subscriptions()
- function to return unsigned

SUGGESTION (something like this...)

unsigned
count_old_cluster_subscriptions(void)
{
  unsigned nsubs = 0;

  if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
  {
    PGconn *conn = connectToServer(&old_cluster, "template1");
    PGresult *res = executeQueryOrDie(conn,
                            "SELECT oid FROM pg_catalog.pg_subscription");
    nsubs = PQntuples(res);
    PQclear(res);
    PQfinish(conn);
  }

  return nsubs;
}

~

9b.
This function is returning 0 (aka not assigning
cluster->subscription_count) for clusters before PG17. IIUC this is
effectively the same behaviour as count_old_cluster_logical_slots()
but probably the function comment needs to explain why it is like this.

======
src/bin/pg_upgrade/pg_upgrade.h

10.
  const char *tablespace_suffix; /* directory specification */
+ int subscription_count; /* number of subscriptions */
 } ClusterInfo;

I felt this is not needed because you only need to know the
nsubs_on_old in one place, so you can just call the counting function
from there. Making this a generic attribute for the cluster seems
overkill.

======
src/bin/pg_upgrade/t/004_subscription.pl

11. TEST: Check that pg_upgrade is successful when the table is in init state.

+$synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
+$old_sub1->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";

But it doesn't get to "synchronize data", so should that message say
something more like "Timed out while waiting for the table to reach INIT state"?

~

12.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync',        '-d', $old_sub1->data_dir,
+ '-D',         $new_sub1->data_dir, '-b', $bindir,
+ '-B',         $bindir,            '-s', $new_sub1->host,
+ '-p',         $old_sub1->port,     '-P', $new_sub1->port,
+ $mode,
+ ],
+ 'run of pg_upgrade --check for old instance when the subscription
tables are in ready state'
+);

Should that message say "init state" instead of "ready state"?

~~~

13. TEST: when the subscription's replication origin does not exist.

+$old_sub2->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub2 disable");

/disable/DISABLE/

~~~

14.
+my $subid = $old_sub2->safe_psql('postgres',
+ "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
+my $reporigin = 'pg_'.qq($subid);
+$old_sub2->safe_psql('postgres',
+ "SELECT pg_replication_origin_drop('$reporigin')"
+);

Maybe this part needs a comment to say the reason why the origin does
not exist -- it's because you found and explicitly dropped it.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vignesh,

Thanks for updating the patch! Here are some comments.
They are mainly cosmetic because I have not read your patch closely these days.

01. binary_upgrade_add_sub_rel_state()

```
+    /* We must check these things before dereferencing the arguments */
+    if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+        elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed")
```

But the fourth argument can be NULL, right? I know you copied from other functions,
but they do not accept NULL for all arguments. One approach is that pg_dump explicitly
writes InvalidXLogRecPtr as the fourth argument.

02. binary_upgrade_add_sub_rel_state()

```
+    if (!OidIsValid(relid))
+        ereport(ERROR,
+                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("invalid relation identifier used: %u", relid));
+
+    tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+    if (!HeapTupleIsValid(tup))
+        ereport(ERROR,
+                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("relation %u does not exist", relid))
```

I'm not sure they should be ereport(). Isn't it that they will never occur?
Other upgrade funcs do not have ereport(), and I think it does not have to be
translated.

03. binary_upgrade_replorigin_advance()

IIUC this function is very similar to pg_replication_origin_advance(). Can we
extract a common part of them? I think pg_replication_origin_advance() will be
just a wrapper, and binary_upgrade_replorigin_advance() will get the name of
origin and pass to it.

04. binary_upgrade_replorigin_advance()

Even if you do not accept 03, some variable names could follow that function.

05. getSubscriptions()

```
+    appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n")
```

Hmm, this value is taken anyway, but will be dumped only when the cluster is PG17+.
Should we avoid getting the value like subrunasowner and subpasswordrequired?
Not sure...

06. dumpSubscriptionTable()

Can we assert that remote version is PG17+?

07. check_for_subscription_state()

IIUC, this function is used only for the old cluster. Should we follow
check_old_cluster_for_valid_slots()?

08. check_for_subscription_state()

```
+            fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
+                    active_db->db_name,
+                    PQgetvalue(res, i, 0),
+                    PQgetvalue(res, i, 1),
+                    PQgetvalue(res, i, 2),
+                    PQgetvalue(res, i, 3));
```

IIRC, format strings should be double-quoted.

09. check_new_cluster_logical_replication_slots()

Checks for replication origin were added in check_new_cluster_logical_replication_slots(),
but I felt it became a super function. Can we divide?

10. check_new_cluster_logical_replication_slots()

Even if you reject above, it should be renamed.

11. pg_upgrade.h

```
+    int         subscription_count; /* number of subscriptions */
```

Based on other structs, it should be "nsubscriptions".

12. 004_subscription.pl

```
+use File::Path qw(rmtree);
```

I think this is not used.

13. 004_subscription.pl

```
+my $bindir = $new_sub->config_data('--bindir');
```
For extensibility, it might be better to have separate old/new bindirs.

14. 004_subscription.pl

```
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for subscriber to synchronize data";
```

Actually, I'm not sure it is really needed. wait_for_subscription_sync() in line 163
ensures that sync are done? Are there any holes around here?

15. 004_subscription.pl

```
+# Check the number of rows for each table on each server
+my $result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50), "check initial tab_upgraded table data on publisher");
+$result =
+  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(1), "check initial tab_upgraded table data on publisher");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
+is($result, qq(50),
+    "check initial tab_upgraded table data on the old subscriber");
+$result =
+  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
+is($result, qq(0),
+    "check initial tab_not_upgraded table data on the old subscriber");
```

I'm not sure they are really needed. At the time pg_upgrade --check is called,
it won't change the state of the clusters.

16. pg_proc.dat

```
+{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
+  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text oid char pg_lsn',
+  prosrc => 'binary_upgrade_add_sub_rel_state' },
+{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
+  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'void',
+  proargtypes => 'text pg_lsn',
+  prosrc => 'binary_upgrade_replorigin_advance' },
```

Based on other functions, the descr should just be "for use by pg_upgrade".

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael@paquier.xyz> wrote:
> >
>
> > Note: actually, this would be OK if we are able to keep the OIDs of
> > the subscribers consistent across upgrades?  I'm OK to not do anything
> > about that in this patch, to keep it simpler.  Just asking in passing.
>
> I will analyze more on this and post the analysis in the subsequent mail.

I analyzed further and felt that retaining the subscription oid would be
cleaner, as subscription/subscription_rel/replication_origin/replication_origin_status
will all be using the same oid as earlier, and it would also probably
help in supporting the upgrade of subscriptions in more scenarios later.
Here is a patch to handle the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Sun, 19 Nov 2023 at 06:52, vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, 9 Nov 2023 at 12:23, Michael Paquier <michael@paquier.xyz> wrote:
> > >
> >
> > > Note: actually, this would be OK if we are able to keep the OIDs of
> > > the subscribers consistent across upgrades?  I'm OK to not do anything
> > > about that in this patch, to keep it simpler.  Just asking in passing.
> >
> > I will analyze more on this and post the analysis in the subsequent mail.
>
> I analyzed further and felt that retaining the subscription oid would be
> cleaner, as subscription/subscription_rel/replication_origin/replication_origin_status
> will all be using the same oid as earlier, and it would also probably
> help in supporting the upgrade of subscriptions in more scenarios later.
> Here is a patch to handle the same.

Sorry I had attached the older patch, here is the correct updated one.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 16 Nov 2023 at 07:45, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v14-0001
>
> ======
> src/backend/utils/adt/pg_upgrade_support.c
>
> 1. binary_upgrade_replorigin_advance
>
> + /* lock to prevent the replication origin from vanishing */
> + LockRelationOid(ReplicationOriginRelationId, RowExclusiveLock);
> + originid = replorigin_by_name(originname, false);
>
> Use uppercase for the lock comment.

Modified

> ======
> src/bin/pg_upgrade/check.c
>
> 2. check_for_subscription_state
>
> > > + prep_status("Checking for subscription state");
> > > +
> > > + snprintf(output_path, sizeof(output_path), "%s/%s",
> > > + log_opts.basedir,
> > > + "subscription_state.txt");
> > >
> > > I felt this filename ought to be more like
> > > 'subscriptions_with_bad_state.txt' because the current name looks like
> > > a normal logfile with nothing to indicate that it is only for the
> > > states of the "bad" subscriptions.
> >
> > I have kept the file name intentionally short as we noticed that
> > when the upgrade of the publisher patch used a longer name there were
> > some buildfarm failures because of longer names.
>
> OK, but how about some other short meaningful name like 'subs_invalid.txt'?
>
> I also thought "state" in the original name was misleading because
> this file contains not only subscriptions with bad 'state' but also
> subscriptions with missing 'origin'.

Modified

> ~~~
>
> 3. check_new_cluster_logical_replication_slots
>
>   int nslots_on_old;
>   int nslots_on_new;
> + int nsubs_on_old = old_cluster.subscription_count;
>
> I felt it might be better to make both these quantities 'unsigned' to
> make it more obvious that there are no special meanings for negative
> numbers.

I have used int itself, as all the others also use int, as in the case
of logical slots. I tried making the changes, but the code was not
consistent, so I used int as is used for the others.


> ~~~
>
> 4. check_new_cluster_logical_replication_slots
>
> nslots_on_old = count_old_cluster_logical_slots();
>
> ~
>
> IMO the 'nsubs_on_old' should be coded the same as above. AFAICT, this
> is the only code where you are interested in the number of
> subscribers, and furthermore, it seems you only care about that count
> in the *old* cluster. This means the current implementation of
> get_subscription_count() seems more generic than it needs to be and
> that results in more unnecessary patch code. (I will repeat this same
> review comment in the other relevant places).
>
> SUGGESTION
> nslots_on_old = count_old_cluster_logical_slots();
> nsubs_on_old = count_old_cluster_subscriptions();

Modified to keep it similar to logical slot implementation.

> ~~~
>
> 5.
> + /*
> + * Quick return if there are no logical slots and subscriptions to be
> + * migrated.
> + */
> + if (nslots_on_old == 0 && nsubs_on_old == 0)
>   return;
>
> /and subscriptions/and no subscriptions/

Modified

> ~~~
>
> 6.
> - if (nslots_on_old > max_replication_slots)
> + if (nslots_on_old && nslots_on_old > max_replication_slots)
>   pg_fatal("max_replication_slots (%d) must be greater than or equal
> to the number of "
>   "logical replication slots (%d) on the old cluster",
>   max_replication_slots, nslots_on_old);
>
> Neither nslots_on_old nor max_replication_slots can be < 0, so I don't
> see why the additional check is needed here.
> AFAICT "if (nslots_on_old > max_replication_slots)" acheives the same
> thing that you want.

This part of code is changed now

> ~~~
>
> 7.
> + if (nsubs_on_old && nsubs_on_old > max_replication_slots)
> + pg_fatal("max_replication_slots (%d) must be greater than or equal
> to the number of "
> + "subscriptions (%d) on the old cluster",
> + max_replication_slots, nsubs_on_old);
>
> Neither nsubs_on_old nor max_replication_slots can be < 0, so I don't
> see why the additional check is needed here.
> AFAICT "if (nsubs_on_old > max_replication_slots)" achieves the same
> thing that you want.

This part of code is changed now

> ======
> src/bin/pg_upgrade/info.c
>
> 8. get_db_rel_and_slot_infos
>
> + if (cluster == &old_cluster)
> + get_subscription_count(cluster);
> +
>
> I felt this is unnecessary because you only want to know the
> nsubs_on_old in one place and then only for the old cluster, so
> calling this to set a generic attribute for the cluster is overkill.

We need to do this here because when we do the validation of the new
cluster the old cluster will not be running. I have made the flow
similar to logical slots now.

> ~~~
>
> 9.
> +/*
> + * Get the number of subscriptions in the old cluster.
> + */
> +static void
> +get_subscription_count(ClusterInfo *cluster)
> +{
> + PGconn    *conn;
> + PGresult   *res;
> +
> + if (GET_MAJOR_VERSION(cluster->major_version) < 1700)
> + return;
> +
> + conn = connectToServer(cluster, "template1");
> + res = executeQueryOrDie(conn,
> +   "SELECT oid FROM pg_catalog.pg_subscription");
> +
> + cluster->subscription_count = PQntuples(res);
> +
> + PQclear(res);
> + PQfinish(conn);
> +}
>
> 9a.
> Currently, this is needed only for the old_cluster (like the function
> comment implies), so the parameter is not required.
>
> Also, AFAICT this number is only needed in one place
> (check_new_cluster_logical_replication_slots) so IMO it would be
> better to make lots of changes to simplify this code:
> - change the function name to be like the other one. e.g.
> count_old_cluster_subscriptions()
> - function to return unsigned
>
> SUGGESTION (something like this...)
>
> unsigned
> count_old_cluster_subscriptions(void)
> {
>   unsigned nsubs = 0;
>
>   if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
>   {
>     PGconn *conn = connectToServer(&old_cluster, "template1");
>     PGresult *res = executeQueryOrDie(conn,
>                             "SELECT oid FROM pg_catalog.pg_subscription");
>     nsubs = PQntuples(res);
>     PQclear(res);
>     PQfinish(conn);
>   }
>
>   return nsubs;
> }

This function is not needed anymore; the logic is now similar to that of
the logical slots.

> ~
>
> 9b.
> This function is returning 0 (aka not assigning
> cluster->subscription_count) for clusters before PG17. IIUC this is
> effectively the same behaviour as count_old_cluster_logical_slots()
> but probably the function comment needs to explain why it is like this.

This function is not needed anymore; the logic is now similar to that of
the logical slots.

> ======
> src/bin/pg_upgrade/pg_upgrade.h
>
> 10.
>   const char *tablespace_suffix; /* directory specification */
> + int subscription_count; /* number of subscriptions */
>  } ClusterInfo;
>
> I felt this is not needed because you only need to know the
> nsubs_on_old in one place, so you can just call the counting function
> from there. Making this a generic attribute for the cluster seems
> overkill.

We need to do this here because when we do the validation of a new
cluster the old cluster will not be running. I have made the flow
similar to logical slots now.

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 11. TEST: Check that pg_upgrade is successful when the table is in init state.
>
> +$synced_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'";
> +$old_sub1->poll_query_until('postgres', $synced_query)
> +  or die "Timed out while waiting for subscriber to synchronize data";
>
> But it doesn't get to "synchronize data", so should that message say
> something more like "Timed out while waiting for the table to reach INIT state"?

Modified

> ~
>
> 12.
> +command_ok(
> + [
> + 'pg_upgrade', '--no-sync',        '-d', $old_sub1->data_dir,
> + '-D',         $new_sub1->data_dir, '-b', $bindir,
> + '-B',         $bindir,            '-s', $new_sub1->host,
> + '-p',         $old_sub1->port,     '-P', $new_sub1->port,
> + $mode,
> + ],
> + 'run of pg_upgrade --check for old instance when the subscription
> tables are in ready state'
> +);
>
> Should that message say "init state" instead of "ready state"?

Modified

> ~~~
>
> 13. TEST: when the subscription's replication origin does not exist.
>
> +$old_sub2->safe_psql('postgres',
> + "ALTER SUBSCRIPTION regress_sub2 disable");
>
> /disable/DISABLE/

Modified

> ~~~
>
> 14.
> +my $subid = $old_sub2->safe_psql('postgres',
> + "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub2'");
> +my $reporigin = 'pg_'.qq($subid);
> +$old_sub2->safe_psql('postgres',
> + "SELECT pg_replication_origin_drop('$reporigin')"
> +);
>
> Maybe this part needs a comment to say the reason why the origin does
> not exist -- it's because you found and explicitly dropped it.

Modified

The attached v15 version patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 16 Nov 2023 at 18:25, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Vignesh,
>
> Thanks for updating the patch! Here are some comments.
> They are mainly cosmetic because I have not read your patch closely these days.
>
> 01. binary_upgrade_add_sub_rel_state()
>
> ```
> +    /* We must check these things before dereferencing the arguments */
> +    if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
> +        elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is not allowed")
> ```
>
> But the fourth argument can be NULL, right? I know you copied from other functions,
> but they do not accept NULL for all arguments. One approach is that pg_dump explicitly
> writes InvalidXLogRecPtr as the fourth argument.

I did not find any problem with this approach: if the lsn is valid,
as in the ready state, we will send a valid lsn; if the lsn is not
valid, as in the init state, we will pass NULL. This approach was also
suggested at [1].
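
For illustration, the dump side could emit the call roughly like this
(a sketch using pg_dump's string helpers; the SubRelInfo field names
here are approximations of the patch's, not its exact code):

appendPQExpBufferStr(query,
                     "SELECT pg_catalog.binary_upgrade_add_sub_rel_state(");
appendStringLiteralAH(query, subinfo->dobj.name, fout); /* subscription name */
appendPQExpBuffer(query, ", %u, '%c'",
                  tblinfo->dobj.catId.oid, subrinfo->srsubstate);
if (subrinfo->srsublsn && subrinfo->srsublsn[0] != '\0')
    appendPQExpBuffer(query, ", '%s'", subrinfo->srsublsn); /* e.g. ready */
else
    appendPQExpBufferStr(query, ", NULL");                  /* e.g. init */
appendPQExpBufferStr(query, ");\n");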

> 02. binary_upgrade_add_sub_rel_state()
>
> ```
> +    if (!OidIsValid(relid))
> +        ereport(ERROR,
> +                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                errmsg("invalid relation identifier used: %u", relid));
> +
> +    tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
> +    if (!HeapTupleIsValid(tup))
> +        ereport(ERROR,
> +                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                errmsg("relation %u does not exist", relid))
> ```
>
> I'm not sure they should be ereport(). Isn't it that they will never occur?
> Other upgrade funcs do not have ereport(), and I think it does not have to be
> translated.

I have removed the first check and retained the second one for a sanity check.

> 03. binary_upgrade_replorigin_advance()
>
> IIUC this function is very similar to pg_replication_origin_advance(). Can we
> extract a common part of them? I think pg_replication_origin_advance() will be
> just a wrapper, and binary_upgrade_replorigin_advance() will get the name of
> origin and pass to it.

We would be able to reduce hardly 4 lines; I felt the existing code is better.

> 04. binary_upgrade_replorigin_advance()
>
> Even if you do not accept 03, some variable names could follow that function.

Modified

> 05. getSubscriptions()
>
> ```
> +    appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n")
> ```
>
> Hmm, this value is taken anyway, but will be dumped only when the cluster is PG17+.
> Should we avoid getting the value like subrunasowner and subpasswordrequired?
> Not sure...

Modified

> 06. dumpSubscriptionTable()
>
> Can we assert that remote version is PG17+?

Modified

> 07. check_for_subscription_state()
>
> IIUC, this function is used only for the old cluster. Should we follow
> check_old_cluster_for_valid_slots()?

Modified

> 08. check_for_subscription_state()
>
> ```
> +            fprintf(script, "database:%s subscription:%s schema:%s relation:%s state:%s not in required state\n",
> +                    active_db->db_name,
> +                    PQgetvalue(res, i, 0),
> +                    PQgetvalue(res, i, 1),
> +                    PQgetvalue(res, i, 2),
> +                    PQgetvalue(res, i, 3));
> ```
>
> IIRC, format strings should be double-quoted.

Modified

> 09. check_new_cluster_logical_replication_slots()
>
> Checks for replication origin were added in check_new_cluster_logical_replication_slots(),
> but I felt it became a super function. Can we divide?

Modified

> 10. check_new_cluster_logical_replication_slots()
>
> Even if you reject above, it should be renamed.

Since the previous is handled, this is not valid.

> 11. pg_upgrade.h
>
> ```
> +    int         subscription_count; /* number of subscriptions */
> ```
>
> Based on other structs, it should be "nsubscriptions".

Modified

> 12. 004_subscription.pl
>
> ```
> +use File::Path qw(rmtree);
> ```
>
> I think this is not used.

Modified

> 13. 004_subscription.pl
>
> ```
> +my $bindir = $new_sub->config_data('--bindir');
> ```
> For extensibility, it might be better to have separate old/new bindirs.

Modified

> 14. 004_subscription.pl
>
> ```
> +my $synced_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
> +$old_sub->poll_query_until('postgres', $synced_query)
> +  or die "Timed out while waiting for subscriber to synchronize data";
> ```
>
> Actually, I'm not sure it is really needed. wait_for_subscription_sync() in line 163
> ensures that sync are done? Are there any holes around here?

wait_for_subscription_sync will check if the table is in syncdone or in
ready state; since we are allowing the syncdone state, I have removed
this part.

> 15. 004_subscription.pl
>
> ```
> +# Check the number of rows for each table on each server
> +my $result =
> +  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
> +is($result, qq(50), "check initial tab_upgraded table data on publisher");
> +$result =
> +  $publisher->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
> +is($result, qq(1), "check initial tab_upgraded table data on publisher");
> +$result =
> +  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_upgraded");
> +is($result, qq(50),
> +    "check initial tab_upgraded table data on the old subscriber");
> +$result =
> +  $old_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded");
> +is($result, qq(0),
> +    "check initial tab_not_upgraded table data on the old subscriber");
> ```
>
> I'm not sure they are really needed. At the time pg_upgrade --check is called,
> it won't change the state of the clusters.

In the newer version the check has been removed now, so these are required.

> 16. pg_proc.dat
>
> ```
> +{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)',
> +  proname => 'binary_upgrade_add_sub_rel_state', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'void',
> +  proargtypes => 'text oid char pg_lsn',
> +  prosrc => 'binary_upgrade_add_sub_rel_state' },
> +{ oid => '8405', descr => 'for use by pg_upgrade (remote_lsn for origin)',
> +  proname => 'binary_upgrade_replorigin_advance', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'void',
> +  proargtypes => 'text pg_lsn',
> +  prosrc => 'binary_upgrade_replorigin_advance' },
> ```
>
> Based on other functions, the descr should just be "for use by pg_upgrade".

This was improved based on one of the earlier comments at [1].
The v15 version attached at [2] has the changes for the comments.

[1] - https://www.postgresql.org/message-id/ZQvbV2sdzBY6WEBl%40paquier.xyz
[2] - https://www.postgresql.org/message-id/CALDaNm2ssmSFs4bjpfxbkfUbPE%3DxFSGqxFoip87kF259FG%3DX2g%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Sun, Nov 19, 2023 at 06:56:05AM +0530, vignesh C wrote:
> On Sun, 19 Nov 2023 at 06:52, vignesh C <vignesh21@gmail.com> wrote:
>> On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:
>>> I will analyze more on this and post the analysis in the subsequent mail.
>>
>> I analyzed further and felt that retaining the subscription oid would be
>> cleaner, as subscription/subscription_rel/replication_origin/replication_origin_status
>> will all be using the same oid as earlier, and it would also probably
>> help in supporting the upgrade of subscriptions in more scenarios later.
>> Here is a patch to handle the same.
>
> Sorry I had attached the older patch, here is the correct updated one.

Thanks for digging into that.  I think that we should consider that
once the main patch is merged and stable in the tree for v17 to get a
more consistent experience.  Shouldn't this include a test in the new
TAP test for the upgrade of subscriptions?  It should be as simple as
cross-checking the OIDs of the subscriptions before and after the
upgrade.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 13 Nov 2023 at 13:52, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > Anyway, after a closer lookup, I think that your conclusions regarding
> > the states that are allowed in the patch during the upgrade have some
> > flaws.
> >
> > First, are you sure that SYNCDONE is OK to keep?  This catalog state
> > is set in process_syncing_tables_for_sync(), and just after the code
> > opens a transaction to clean up the tablesync slot, followed by a
> > second transaction to clean up the origin.  However, imagine that
> > there is a failure in dropping the slot, the origin, or just in
> > transaction processing, cannot we finish in a state where the relation
> > is marked as SYNCDONE in the catalog but still has an origin and/or a
> > tablesync slot lying around?  Assuming that SYNCDONE is an OK state
> > seems incorrect to me.  I am pretty sure that injecting an error in a
> > code path after the slot is created would equally lead to an
> > inconsistency.
>
> There are a couple of things happening here: a) In the first part we
> take care of setting the subscription relation to SYNCDONE and dropping
> the replication slot at the publisher node; the relation state will be
> set to SYNCDONE only if dropping the replication slot is successful,
> and if that fails the relation state will still be FINISHEDCOPY. So if
> there is a failure in dropping the replication slot we will not have an
> issue, as the tablesync worker will be in FINISHEDCOPY state and this
> state is not allowed for upgrade. When the state is SYNCDONE the
> tablesync slot will not be present. b) In the second part we drop the
> replication origin; even if there is a chance that dropping the
> replication origin fails for some reason, there will be no problem, as
> we do not copy the tablesync replication origin to the new cluster
> while upgrading. Since the tablesync replication origin is not copied
> to the new cluster there will be no replication origin leaks.
>

And, this will work because in the SYNCDONE state, while removing the
origin, we are okay with missing origins. It seems not copying the
origin for tablesync workers in this state (SYNCDONE) relies on the
fact that currently, we don't use those origins once the system
reaches the SYNCDONE state, but I am not sure it is a good idea to have
such a dependency, and an upgrade assuming such things doesn't seem
ideal to me. Personally, I think allowing an upgrade in 'i'
(initialize) state or 'r' (ready) state seems safe because in those
states either slots/origins don't exist or are dropped. What do you
think?
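
To make that concrete, the old-cluster check could boil down to a
catalog query along these lines (a sketch, not the patch's exact text):

res = executeQueryOrDie(conn,
                        "SELECT s.subname, c.relname, r.srsubstate "
                        "FROM pg_catalog.pg_subscription_rel r "
                        "JOIN pg_catalog.pg_subscription s ON r.srsubid = s.oid "
                        "JOIN pg_catalog.pg_class c ON r.srrelid = c.oid "
                        "WHERE r.srsubstate NOT IN ('i', 'r')");

/* Any row here is a relation whose slot/origin may still exist. */
if (PQntuples(res) != 0)
    pg_fatal("subscription relations found in states other than 'i' or 'r'");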

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v15-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptions

+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
+ else
+ appendPQExpBufferStr(query, "NULL AS suboriginremotelsn\n");
+

There should be preceding spaces in those append strings to match the
other ones.

~~~

2. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = subrinfo->subinfo;
+ PQExpBuffer query;
+ char    *tag;
+
+ /* Do nothing in data-only dump */
+ if (dopt->dataOnly)
+ return;
+
+ Assert(fout->dopt->binary_upgrade || fout->remoteVersion >= 170000);

The function comment says this is only for binary-upgrade mode, so why
does the Assert use || (OR)?

======
src/bin/pg_upgrade/check.c

3. check_and_dump_old_cluster

+ /*
+ * Subscription dependencies can be migrated since PG17. See comments atop
+ * get_old_cluster_subscription_count().
+ */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+ check_old_cluster_subscription_state(&old_cluster);
+

Should this be combined with the other adjacent check so there is only
one "if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)"
needed?

~~~

4. check_new_cluster

  check_new_cluster_logical_replication_slots();
+
+ check_new_cluster_subscription_configuration();

When checking the old cluster, the subscription was checked before the
slots, but here for the new cluster, the slots are checked before the
subscription. Maybe it makes no difference but it might be tidier to
do these old/new checks in the same order.

~~~

5. check_new_cluster_logical_replication_slots

- /* Quick return if there are no logical slots to be migrated. */
+ /* Quick return if there are no logical slots to be migrated */

Change is not relevant for this patch.

~~~

6.

+ res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
+ "WHERE name IN ('max_replication_slots') "
+ "ORDER BY name DESC;");

Using IN and ORDER BY in this SQL seems unnecessary when you are only
searching for one name.

======
src/bin/pg_upgrade/info.c

7. statics

-
+static void get_old_cluster_subscription_count(DbInfo *dbinfo);

This change also removes an existing blank line -- not sure if that
was intentional

~~~

8.
@@ -365,7 +369,6 @@ get_template0_info(ClusterInfo *cluster)
  PQfinish(conn);
 }

-
 /*
  * get_db_infos()
  *

This blank line change (before get_db_infos) should not be part of this patch.

~~~

9. get_old_cluster_subscription_count

It seems a slightly misleading function name because this is a PER-DB
count, not a cluster count.

~~~


10.
+ /* Subscriptions can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO it is better to compare < 1700 instead of <= 1600. It keeps the
code more aligned with the comment.

~~~

11. count_old_cluster_subscriptions

+/*
+ * count_old_cluster_subscriptions()
+ *
+ * Returns the number of subscription for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because we gather subscriptions only for cluster versions greater than or
+ * equal to PG17. See get_old_cluster_subscription_count().
+ */
+int
+count_old_cluster_subscriptions(void)
+{
+ int nsubs = 0;
+
+ for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+ nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
+
+ return nsubs;
+}

11a.
/subscription/subscriptions/

~

11b.
The code is now consistent with the slots code which looks good. OTOH
I thought that 'pg_catalog.pg_subscription' is shared across all
databases of the cluster, so isn't this code inefficient to be
querying again and again for every database (if there are many of
them) instead of querying just once for the whole cluster?

======
src/bin/pg_upgrade/t/004_subscription.pl

12.
It is difficult to keep track of all the tables (upgraded and not
upgraded) at each step of these tests. Maybe the comments can be more
explicit along the way. e.g

BEFORE
+# Add tab_not_upgraded1 to the publication

SUGGESTION
+# Add tab_not_upgraded1 to the publication. Now publication has <blah blah>

and

BEFORE
+# Subscription relations should be preserved

SUGGESTION
+# Subscription relations should be preserved. The upgraded subscriber won't know
about 'tab_not_upgraded1' because <blah blah>

etc.

~~~

13.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
+is($result, qq(0),
+ "no change in table tab_not_upgraded1 afer enable subscription which
is not part of the publication"

/afer/after/

~~~

14.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run a) if there's a subscription with tables
+# in a state different than 'r' (ready), 'i' (init) and 's' (synchronized)
+# and/or b) if the subscription does not have a replication origin.
+# ------------------------------------------------------

14a,
/does not have a/has no/

~

14b.
Maybe put a) and b) on newlines to be more readable.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Nov 20, 2023 at 09:49:41AM +0530, Amit Kapila wrote:
> On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
>> There are a couple of things happening here: a) In the first part we
>> take care of setting the subscription relation to SYNCDONE and dropping
>> the replication slot at the publisher node; the relation state will be
>> set to SYNCDONE only if dropping the replication slot is successful,
>> and if that fails the relation state will still be FINISHEDCOPY. So if
>> there is a failure in dropping the replication slot we will not have an
>> issue, as the tablesync worker will be in FINISHEDCOPY state and this
>> state is not allowed for upgrade. When the state is SYNCDONE the
>> tablesync slot will not be present. b) In the second part we drop the
>> replication origin; even if there is a chance that dropping the
>> replication origin fails for some reason, there will be no problem, as
>> we do not copy the tablesync replication origin to the new cluster
>> while upgrading. Since the tablesync replication origin is not copied
>> to the new cluster there will be no replication origin leaks.
>
> And, this will work because in the SYNCDONE state, while removing the
> origin, we are okay with missing origins. It seems not copying the
> origin for tablesync workers in this state (SYNCDONE) relies on the
> fact that currently, we don't use those origins once the system
> reaches the SYNCDONE state, but I am not sure it is a good idea to have
> such a dependency, and an upgrade assuming such things doesn't seem
> ideal to me.

Hmm, yeah, you mean the replorigin_drop_by_name() calls in
tablesync.c.  I did not pay much attention to that in the code, but
your point sounds sensible.

(I have not been able to complete an analysis of the risks behind 's'
to convince myself that it is entirely safe, but leaks are scary as
hell if this gets automated across a large fleet of nodes..)

> Personally, I think allowing an upgrade in  'i'
> (initialize) state or 'r' (ready) state seems safe because in those
> states either slots/origins don't exist or are dropped. What do you
> think?

I share a similar impression about 's'.  From a design point of view,
making the conditions stricter in the first implementation
makes the user experience stricter, but that's safer regarding leaks
and it is still possible to relax these choices in the future
depending on the improvement pieces we are able to figure out.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 20 Nov 2023 at 10:44, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v15-0001
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 1. getSubscriptions
>
> + if (fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query, "o.remote_lsn AS suboriginremotelsn\n");
> + else
> + appendPQExpBufferStr(query, "NULL AS suboriginremotelsn\n");
> +
>
> There should be preceding spaces in those append strings to match the
> other ones.

Modified

> ~~~
>
> 2. dumpSubscriptionTable
>
> +/*
> + * dumpSubscriptionTable
> + *   Dump the definition of the given subscription table mapping. This will be
> + *    used only in binary-upgrade mode.
> + */
> +static void
> +dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
> +{
> + DumpOptions *dopt = fout->dopt;
> + SubscriptionInfo *subinfo = subrinfo->subinfo;
> + PQExpBuffer query;
> + char    *tag;
> +
> + /* Do nothing in data-only dump */
> + if (dopt->dataOnly)
> + return;
> +
> + Assert(fout->dopt->binary_upgrade || fout->remoteVersion >= 170000);
>
> The function comment says this is only for binary-upgrade mode, so why
> does the Assert use || (OR)?

Added comments

> ======
> src/bin/pg_upgrade/check.c
>
> 3. check_and_dump_old_cluster
>
> + /*
> + * Subscription dependencies can be migrated since PG17. See comments atop
> + * get_old_cluster_subscription_count().
> + */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
> + check_old_cluster_subscription_state(&old_cluster);
> +
>
> Should this be combined with the other adjacent check so there is only
> one "if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)"
> needed?

Modified

> ~~~
>
> 4. check_new_cluster
>
>   check_new_cluster_logical_replication_slots();
> +
> + check_new_cluster_subscription_configuration();
>
> When checking the old cluster, the subscription was checked before the
> slots, but here for the new cluster, the slots are checked before the
> subscription. Maybe it makes no difference but it might be tidier to
> do these old/new checks in the same order.

Modified

> ~~~
>
> 5. check_new_cluster_logical_replication_slots
>
> - /* Quick return if there are no logical slots to be migrated. */
> + /* Quick return if there are no logical slots to be migrated */
>
> Change is not relevant for this patch.

Removed it

> ~~~
>
> 6.
>
> + res = executeQueryOrDie(conn, "SELECT setting FROM pg_settings "
> + "WHERE name IN ('max_replication_slots') "
> + "ORDER BY name DESC;");
>
> Using IN and ORDER BY in this SQL seems unnecessary when you are only
> searching for one name.

Modified
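
For the record, the simplified form would presumably be just (a sketch):

SELECT setting FROM pg_settings WHERE name = 'max_replication_slots';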

> ======
> src/bin/pg_upgrade/info.c
>
> 7. statics
>
> -
> +static void get_old_cluster_subscription_count(DbInfo *dbinfo);
>
> This change also removes an existing blank line -- not sure if that
was intentional.

Modified

> ~~~
>
> 8.
> @@ -365,7 +369,6 @@ get_template0_info(ClusterInfo *cluster)
>   PQfinish(conn);
>  }
>
> -
>  /*
>   * get_db_infos()
>   *
>
> This blank line change (before get_db_infos) should not be part of this patch.

Modified

> ~~~
>
> 9. get_old_cluster_subscription_count
>
> It seems a slightly misleading function name because this is a PER-DB
> count, not a cluster count.

Modified

> ~~~
>
>
> 10.
> + /* Subscriptions can be migrated since PG17. */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
> + return;
>
> IMO it is better to compare < 1700 instead of <= 1600. It keeps the
> code more aligned with the comment.

Modified

> ~~~
>
> 11. count_old_cluster_subscriptions
>
> +/*
> + * count_old_cluster_subscriptions()
> + *
> + * Returns the number of subscription for all databases.
> + *
> + * Note: this function always returns 0 if the old_cluster is PG16 and prior
> + * because we gather subscriptions only for cluster versions greater than or
> + * equal to PG17. See get_old_cluster_subscription_count().
> + */
> +int
> +count_old_cluster_subscriptions(void)
> +{
> + int nsubs = 0;
> +
> + for (int dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
> + nsubs += old_cluster.dbarr.dbs[dbnum].nsubs;
> +
> + return nsubs;
> +}
>
> 11a.
> /subscription/subscriptions/

Modified

> ~
>
> 11b.
> The code is now consistent with the slots code which looks good. OTOH
> I thought that 'pg_catalog.pg_subscription' is shared across all
> databases of the cluster, so isn't this code inefficient to be
> querying again and again for every database (if there are many of
> them) instead of just querying 1 time only for the whole cluster?

My earlier version was like that; I changed it to keep the code
consistent with the logical replication slots handling.
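
For what it's worth, since pg_subscription is a shared catalog, the
per-database counts could in principle be gathered from a single
connection, e.g. (a sketch of the alternative being discussed, not what
the patch does):

-- pg_subscription is shared across the cluster, so one connection
-- can see the subscription count of every database.
SELECT d.datname, count(*) AS nsubs
FROM pg_catalog.pg_subscription s
JOIN pg_catalog.pg_database d ON d.oid = s.subdbid
GROUP BY d.datname;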

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 12.
> It is difficult to keep track of all the tables (upgraded and not
> upgraded) at each step of these tests. Maybe the comments can be more
explicit along the way, e.g.
>
> BEFORE
> +# Add tab_not_upgraded1 to the publication
>
> SUGGESTION
> +# Add tab_not_upgraded1 to the publication. Now publication has <blah blah>
>
> and
>
> BEFORE
> +# Subscription relations should be preserved
>
> SUGGESTION
> +# Subscription relations should be preserved. The upgraded won't know
> about 'tab_not_upgraded1' because <blah blah>
>
> etc.

Modified

> ~~~
>
> 13.
> +$result =
> +  $new_sub->safe_psql('postgres', "SELECT count(*) FROM tab_not_upgraded1");
> +is($result, qq(0),
> + "no change in table tab_not_upgraded1 afer enable subscription which
> is not part of the publication"
>
> /afer/after/

Modified

> ~~~
>
> 14.
> +# ------------------------------------------------------
> +# Check that pg_upgrade refuses to run a) if there's a subscription with tables
> +# in a state different than 'r' (ready), 'i' (init) and 's' (synchronized)
> +# and/or b) if the subscription does not have a replication origin.
> +# ------------------------------------------------------
>
> 14a,
> /does not have a/has no/

Modified

> ~
>
> 14b.
> Maybe put a) and b) on newlines to be more readable.

Modified

The attached v16 patch has these changes.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Thanks for addressing my past review comments.

Here are some more review comments for patch v16-0001

======
doc/src/sgml/ref/pgupgrade.sgml

1.
+      <para>
+       Create all the new tables that were created in the publication during
+       upgrade and refresh the publication by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
+      </para>

"Create all ... that were created" sounds a bit strange.

SUGGESTION (maybe like this or similar?)
Create equivalent subscriber tables for anything that became newly
part of the publication during the upgrade and....

======
src/bin/pg_dump/pg_dump.c

2. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = NULL;
+ SubRelInfo *subrinfo;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int ntups;
+ Oid last_srsubid = InvalidOid;
+
+ if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+ fout->remoteVersion < 170000)
+ return;

This function comment says "used only in binary-upgrade mode." and the
Assert says the same. But, is this compatible with the other function
dumpSubscriptionTable() where it says "used only in binary-upgrade
mode and for PG17 or later versions"?

======
src/bin/pg_upgrade/check.c

3. check_new_cluster_subscription_configuration

+static void
+check_new_cluster_subscription_configuration(void)
+{
+ PGresult   *res;
+ PGconn    *conn;
+ int nsubs_on_old;
+ int max_replication_slots;
+
+ /* Logical slots can be migrated since PG17. */
+ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+ return;

IMO it is better to say < 1700 in this check, instead of <= 1600.

~~~

4.
+ /* Quick return if there are no subscriptions to be migrated */
+ if (nsubs_on_old == 0)
+ return;

Missing period in comment.

~~~

5.
+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize), r (ready) or s (synchronized) state.
+ */
+static void
+check_old_cluster_subscription_state(ClusterInfo *cluster)

This function is only for the old cluster (hint: the function name) so
there is no need to pass the 'cluster' parameter here. Just directly
use old_cluster in the function body.

======
src/bin/pg_upgrade/t/004_subscription.pl

6.
+# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");

Typo in comment. You added tab_not_upgraded2, not tab_not_upgraded1

~~

7.
+# Subscription relations should be preserved. The upgraded won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.

Typo or missing word in comment?

"The upgraded" ??

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Shlok Kyal
Date:
On Wed, 22 Nov 2023 at 06:48, Peter Smith <smithpb2250@gmail.com> wrote:
> ======
> doc/src/sgml/ref/pgupgrade.sgml
>
> 1.
> +      <para>
> +       Create all the new tables that were created in the publication during
> +       upgrade and refresh the publication by executing
> +       <link linkend="sql-altersubscription"><command>ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION</command></link>.
> +      </para>
>
> "Create all ... that were created" sounds a bit strange.
>
> SUGGESTION (maybe like this or similar?)
> Create equivalent subscriber tables for anything that became newly
> part of the publication during the upgrade and....

Modified

> ======
> src/bin/pg_dump/pg_dump.c
>
> 2. getSubscriptionTables
>
> +/*
> + * getSubscriptionTables
> + *   Get information about subscription membership for dumpable tables. This
> + *    will be used only in binary-upgrade mode.
> + */
> +void
> +getSubscriptionTables(Archive *fout)
> +{
> + DumpOptions *dopt = fout->dopt;
> + SubscriptionInfo *subinfo = NULL;
> + SubRelInfo *subrinfo;
> + PQExpBuffer query;
> + PGresult   *res;
> + int i_srsubid;
> + int i_srrelid;
> + int i_srsubstate;
> + int i_srsublsn;
> + int ntups;
> + Oid last_srsubid = InvalidOid;
> +
> + if (dopt->no_subscriptions || !dopt->binary_upgrade ||
> + fout->remoteVersion < 170000)
> + return;
>
> This function comment says "used only in binary-upgrade mode." and the
> Assert says the same. But, is this compatible with the other function
> dumpSubscriptionTable() where it says "used only in binary-upgrade
> mode and for PG17 or later versions"?
>
Modified

> ======
> src/bin/pg_upgrade/check.c
>
> 3. check_new_cluster_subscription_configuration
>
> +static void
> +check_new_cluster_subscription_configuration(void)
> +{
> + PGresult   *res;
> + PGconn    *conn;
> + int nsubs_on_old;
> + int max_replication_slots;
> +
> + /* Logical slots can be migrated since PG17. */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
> + return;
>
> IMO it is better to say < 1700 in this check, instead of <= 1600.
>
Modified

> ~~~
>
> 4.
> + /* Quick return if there are no subscriptions to be migrated */
> + if (nsubs_on_old == 0)
> + return;
>
> Missing period in comment.
>
Modified

> ~~~
>
> 5.
> +/*
> + * check_old_cluster_subscription_state()
> + *
> + * Verify that each of the subscriptions has all their corresponding tables in
> + * i (initialize), r (ready) or s (synchronized) state.
> + */
> +static void
> +check_old_cluster_subscription_state(ClusterInfo *cluster)
>
> This function is only for the old cluster (hint: the function name) so
> there is no need to pass the 'cluster' parameter here. Just directly
> use old_cluster in the function body.
>
Modified

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 6.
> +# Add tab_not_upgraded1 to the publication. Now publication has tab_upgraded1
> +# and tab_upgraded2 tables.
> +$publisher->safe_psql('postgres',
> + "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
>
> Typo in comment. You added tab_not_upgraded2, not tab_not_upgraded1
>
Modified

> ~~
>
> 7.
> +# Subscription relations should be preserved. The upgraded won't know
> +# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
>
> Typo or missing word in comment?
>
> "The upgraded" ??
>
Modified

Attached is the v17 patch, which has these changes.

Thanks,
Shlok Kumar Kyal

Attachment

Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v17-0001

======
src/bin/pg_dump/pg_dump.c

1. getSubscriptionTables

+/*
+ * getSubscriptionTables
+ *   Get information about subscription membership for dumpable tables. This
+ *    will be used only in binary-upgrade mode and for PG17 or later versions.
+ */
+void
+getSubscriptionTables(Archive *fout)
+{
+ DumpOptions *dopt = fout->dopt;
+ SubscriptionInfo *subinfo = NULL;
+ SubRelInfo *subrinfo;
+ PQExpBuffer query;
+ PGresult   *res;
+ int i_srsubid;
+ int i_srrelid;
+ int i_srsubstate;
+ int i_srsublsn;
+ int ntups;
+ Oid last_srsubid = InvalidOid;
+
+ if (dopt->no_subscriptions || !dopt->binary_upgrade ||
+ fout->remoteVersion < 170000)
+ return;

I still felt that the function comment ("used only in binary-upgrade
mode and for PG17 or later") was misleading. IMO that sounds like it
would be OK for PG17 regardless of the binary mode, but the code says
otherwise.

Assuming the code is correct, perhaps the comment should say:
"... used only in binary-upgrade mode for PG17 or later versions."

~~~

2. dumpSubscriptionTable

+/*
+ * dumpSubscriptionTable
+ *   Dump the definition of the given subscription table mapping. This will be
+ *    used only in binary-upgrade mode and for PG17 or later versions.
+ */
+static void
+dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)

(this is the same as the previous review comment #1)

Assuming the code is correct, perhaps the comment should say:
"... used only in binary-upgrade mode for PG17 or later versions."

======
src/bin/pg_upgrade/check.c

3.
+static void
+check_old_cluster_subscription_state()
+{
+ FILE    *script = NULL;
+ char output_path[MAXPGPATH];
+ int ntup;
+ ClusterInfo *cluster = &old_cluster;
+
+ prep_status("Checking for subscription state");
+
+ snprintf(output_path, sizeof(output_path), "%s/%s",
+ log_opts.basedir,
+ "subs_invalid.txt");
+ for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+ {
+ PGresult   *res;
+ DbInfo    *active_db = &cluster->dbarr.dbs[dbnum];
+ PGconn    *conn = connectToServer(cluster, active_db->db_name);

There seems no need for an extra variable ('cluster') here when you
can just reference 'old_cluster' directly in the code, the same as
other functions in this file do all the time.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 21 Nov 2023 at 07:11, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Nov 20, 2023 at 09:49:41AM +0530, Amit Kapila wrote:
> > On Tue, Nov 14, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
> >> There are a couple of things happening here: a) In the first part we
> >> take care of setting subscription relation to SYNCDONE and dropping
> >> the replication slot at publisher node, only if drop replication slot
> >> is successful the relation state will be set to SYNCDONE; if drop
> >> replication slot fails the relation state will still be in
> >> FINISHEDCOPY. So if there is a failure in the drop replication slot we
> >> will not have an issue as the tablesync worker will be in
> >> FINISHEDCOPY state and this state is not allowed for upgrade. When the
> >> state is in SYNCDONE the tablesync slot will not be present. b) In the
> >> second part we drop the replication origin, even if there is a chance
> >> that drop replication origin fails due to some reason, there will be
> >> no problem as we do not copy the table sync replication origin to the
> >> new cluster while upgrading. Since the table sync replication origin
> >> is not copied to the new cluster there will be no replication origin
> >> leaks.
> >
> > And, this will work because in the SYNCDONE state, while removing the
> > origin, we are okay with missing origins. It seems not copying the
> > origin for tablesync workers in this state (SYNCDONE) relies on the
> > fact that currently, we don't use those origins once the system
> > reaches the SYNCDONE state, but I am not sure it is a good idea to have
> > such a dependency, and an upgrade assuming such things doesn't seem
> > ideal to me.
>
> Hmm, yeah, you mean the replorigin_drop_by_name() calls in
> tablesync.c.  I did not pay much attention to that in the code, but
> your point sounds sensible.
>
> (I have not been able to complete an analysis of the risks behind 's'
> to convince myself that it is entirely safe, but leaks are scary as
> hell if this gets automated across a large fleet of nodes..)
>
> > Personally, I think allowing an upgrade in 'i'
> > (initialize) state or 'r' (ready) state seems safe because in those
> > states either slots/origins don't exist or are dropped. What do you
> > think?
>
> I share a similar impression about 's'.  From a design point of view,
> making the conditions harder to satisfy in the first implementation
> makes the user experience stricter, but that's safer regarding leaks
> and it is still possible to relax these choices in the future
> depending on the improvement pieces we are able to figure out.

Based on the suggestion to allow only the safe 'init' and 'ready'
states, I have made the changes to handle this in the attached v18
patch.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 23 Nov 2023 at 05:56, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v17-0001
>
> ======
> src/bin/pg_dump/pg_dump.c
>
> 1. getSubscriptionTables
>
> +/*
> + * getSubscriptionTables
> + *   Get information about subscription membership for dumpable tables. This
> + *    will be used only in binary-upgrade mode and for PG17 or later versions.
> + */
> +void
> +getSubscriptionTables(Archive *fout)
> +{
> + DumpOptions *dopt = fout->dopt;
> + SubscriptionInfo *subinfo = NULL;
> + SubRelInfo *subrinfo;
> + PQExpBuffer query;
> + PGresult   *res;
> + int i_srsubid;
> + int i_srrelid;
> + int i_srsubstate;
> + int i_srsublsn;
> + int ntups;
> + Oid last_srsubid = InvalidOid;
> +
> + if (dopt->no_subscriptions || !dopt->binary_upgrade ||
> + fout->remoteVersion < 170000)
> + return;
>
> I still felt that the function comment ("used only in binary-upgrade
> mode and for PG17 or later") was misleading. IMO that sounds like it
> would be OK for PG17 regardless of the binary mode, but the code says
> otherwise.
>
> Assuming the code is correct, perhaps the comment should say:
> "... used only in binary-upgrade mode for PG17 or later versions."

Modified

> ~~~
>
> 2. dumpSubscriptionTable
>
> +/*
> + * dumpSubscriptionTable
> + *   Dump the definition of the given subscription table mapping. This will be
> + *    used only in binary-upgrade mode and for PG17 or later versions.
> + */
> +static void
> +dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo)
>
> (this is the same as the previous review comment #1)
>
> Assuming the code is correct, perhaps the comment should say:
> "... used only in binary-upgrade mode for PG17 or later versions."

Modified

> ======
> src/bin/pg_upgrade/check.c
>
> 3.
> +static void
> +check_old_cluster_subscription_state()
> +{
> + FILE    *script = NULL;
> + char output_path[MAXPGPATH];
> + int ntup;
> + ClusterInfo *cluster = &old_cluster;
> +
> + prep_status("Checking for subscription state");
> +
> + snprintf(output_path, sizeof(output_path), "%s/%s",
> + log_opts.basedir,
> + "subs_invalid.txt");
> + for (int dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
> + {
> + PGresult   *res;
> + DbInfo    *active_db = &cluster->dbarr.dbs[dbnum];
> + PGconn    *conn = connectToServer(cluster, active_db->db_name);
>
> There seems no need for an extra variable ('cluster') here when you
> can just reference 'old_cluster' directly in the code, the same as
> other functions in this file do all the time.

Modified

The v18 patch attached at [1] has these changes.
[1] - https://www.postgresql.org/message-id/CALDaNm3wyYY5ywFpCwUVW1_Di1af3WxeZggGEDQEu8qa58a7FQ%40mail.gmail.com



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
I have only trivial review comments for patch v18-0001

======
src/bin/pg_upgrade/check.c

1. check_new_cluster_subscription_configuration

+ /*
+ * A slot not created yet refers to the 'i' (initialize) state, while
+ * 'r' (ready) state refer to a slot created previously but already
+ * dropped. These states are supported states for upgrade. The other
+ * states listed below are not ok:
+ *
+ * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
+ * would retain a replication slot, which could not be dropped by the
+ * sync worker spawned after the upgrade because the subscription ID
+ * tracked by the publisher does not match anymore.
+ *
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.
+ *
+ * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
+ * a relation upgraded while in this state would expect an origin ID
+ * with the OID of the subscription used before the upgrade, causing
+ * it to fail.
+ *
+ * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
+ * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
+ * so we need not allow these states.
+ */

1a.
/while 'r' (ready) state refer to a slot/while 'r' (ready) state
refers to a slot/

1b.
/These states are supported states for upgrade./These states are
supported for pg_upgrade./

1c
/The other states listed below are not ok./The other states listed
below are not supported./

======
src/bin/pg_upgrade/t/004_subscription.pl

2.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state different than
+#    'r' (ready) or 'i' (init) state and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------

/if there's a subscription with tables in a state different than 'r'
(ready) or 'i' (init) state and/if there's a subscription with tables
in a state other than 'r' (ready) or 'i' (init) and/

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 24 Nov 2023 at 07:00, Peter Smith <smithpb2250@gmail.com> wrote:
>
> I have only trivial review comments for patch v18-0001
>
> ======
> src/bin/pg_upgrade/check.c
>
> 1. check_new_cluster_subscription_configuration
>
> + /*
> + * A slot not created yet refers to the 'i' (initialize) state, while
> + * 'r' (ready) state refer to a slot created previously but already
> + * dropped. These states are supported states for upgrade. The other
> + * states listed below are not ok:
> + *
> + * a) SUBREL_STATE_DATASYNC: A relation upgraded while in this state
> + * would retain a replication slot, which could not be dropped by the
> + * sync worker spawned after the upgrade because the subscription ID
> + * tracked by the publisher does not match anymore.
> + *
> + * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
> + * would retain the replication origin in certain cases.
> + *
> + * c) SUBREL_STATE_FINISHEDCOPY: A tablesync worker spawned to work on
> + * a relation upgraded while in this state would expect an origin ID
> + * with the OID of the subscription used before the upgrade, causing
> + * it to fail.
> + *
> + * d) SUBREL_STATE_SYNCWAIT, SUBREL_STATE_CATCHUP and
> + * SUBREL_STATE_UNKNOWN: These states are not stored in the catalog,
> + * so we need not allow these states.
> + */
>
> 1a.
> /while 'r' (ready) state refer to a slot/while 'r' (ready) state
> refers to a slot/

Modified

> 1b.
> /These states are supported states for upgrade./These states are
> supported for pg_upgrade./

Modified

> 1c
> /The other states listed below are not ok./The other states listed
> below are not supported./

Modified

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 2.
> +# ------------------------------------------------------
> +# Check that pg_upgrade refuses to run in:
> +# a) if there's a subscription with tables in a state different than
> +#    'r' (ready) or 'i' (init) state and/or
> +# b) if the subscription has no replication origin.
> +# ------------------------------------------------------
>
> /if there's a subscription with tables in a state different than 'r'
> (ready) or 'i' (init) state and/if there's a subscription with tables
> in a state other than 'r' (ready) or 'i' (init) and/

Modified

The attached v19 patch has these changes.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 20 Nov 2023 at 05:27, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Nov 19, 2023 at 06:56:05AM +0530, vignesh C wrote:
> > On Sun, 19 Nov 2023 at 06:52, vignesh C <vignesh21@gmail.com> wrote:
> >> On Fri, 10 Nov 2023 at 19:26, vignesh C <vignesh21@gmail.com> wrote:
> >>> I will analyze more on this and post the analysis in the subsequent mail.
> >>
> >> I analyzed further and felt that retaining subscription oid would be
> >> cleaner as subscription/subscription_rel/replication_origin/replication_origin_status
> >> all of these will be using the same oid as earlier and also probably
> >> help in supporting upgrade of subscription in more scenarios later.
> >> Here is a patch to handle the same.
> >
> > Sorry I had attached the older patch, here is the correct updated one.
>
> Thanks for digging into that.  I think that we should consider that
> once the main patch is merged and stable in the tree for v17, to get a
> more consistent experience.

Yes, that approach makes sense.

> Shouldn't this include a test in the new
> TAP test for the upgrade of subscriptions?  It should be as simple as
> cross-checking the OIDs of the subscriptions before and after the
> upgrade.

Added a test for the same.

These changes are present in the v19-0002 patch.
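
The essence of such a test is to run the same query on the old cluster
before pg_upgrade and on the new cluster after it, and compare the two
result sets (a sketch; the actual TAP test may differ):

-- Identical output on both clusters means the subscription OIDs
-- were preserved across the upgrade.
SELECT oid, subname FROM pg_catalog.pg_subscription ORDER BY subname;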

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
>

Few comments on v19:
==================
1.
+    <para>
+     The subscriptions will be migrated to the new cluster in a disabled state.
+     After migration, do this:
+    </para>
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       Enable the subscriptions by executing
+       <link linkend="sql-altersubscription"><command>ALTER
SUBSCRIPTION ... ENABLE</command></link>.

The reason for this restriction is not very clear to me. Is it because
we are using pg_dump for subscription and the existing functionality
is doing it? If so, I think currently even connect is false.

2.
+ * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
+ * would retain the replication origin in certain cases.

I think this is vague. Can we briefly describe cases where the origins
would be retained?

3. I think that in cases where the publisher is also upgraded, restoring
the origin's LSN is of no use. Currently, I can't see a problem with
restoring stale originLSN in such cases as we won't be able to
distinguish during the upgrade but I think we should document it in
the comments somewhere in the patch.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch set v19*

//////

v19-0001.

No comments

///////

v19-0002.

(I saw that both changes below seemed cut/paste from similar
functions, but I will ask the questions anyway).

======
src/backend/commands/subscriptioncmds.c

1.
+/* Potentially set by pg_upgrade_support functions */
+Oid binary_upgrade_next_pg_subscription_oid = InvalidOid;
+

The comment "by pg_upgrade_support functions" seemed a bit vague. IMO
you might as well tell the name of the function that sets this.

SUGGESTION
Potentially set by the pg_upgrade_support function --
binary_upgrade_set_next_pg_subscription_oid().

~~~

2. CreateSubscription

+ if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("pg_subscription OID value not set when in binary upgrade mode")));

Doesn't this condition mean some kind of impossible internal error
occurred -- i.e. should this be elog instead of ereport?

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Few comments on v19:
> ==================
> 1.
> +    <para>
> +     The subscriptions will be migrated to the new cluster in a disabled state.
> +     After migration, do this:
> +    </para>
> +
> +    <itemizedlist>
> +     <listitem>
> +      <para>
> +       Enable the subscriptions by executing
> +       <link linkend="sql-altersubscription"><command>ALTER
> SUBSCRIPTION ... ENABLE</command></link>.
>
> The reason for this restriction is not very clear to me. Is it because
> we are using pg_dump for subscription and the existing functionality
> is doing it? If so, I think currently even connect is false.

This was done this way so that the apply worker doesn't get started
while the upgrade is happening. Now that we have set
max_logical_replication_workers to 0, the apply workers will not get
started during the upgrade process. I think now we can create the
subscriptions with the same options as the old cluster in case of
upgrade.

> 2.
> + * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
> + * would retain the replication origin in certain cases.
>
> I think this is vague. Can we briefly describe cases where the origins
> would be retained?

I will modify this in the next version

> 3. I think that in cases where the publisher is also upgraded, restoring
> the origin's LSN is of no use. Currently, I can't see a problem with
> restoring stale originLSN in such cases as we won't be able to
> distinguish during the upgrade but I think we should document it in
> the comments somewhere in the patch.

I will add a comment for this in the next version

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Nov 27, 2023 at 3:18 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > Few comments on v19:
> > ==================
> > 1.
> > +    <para>
> > +     The subscriptions will be migrated to the new cluster in a disabled state.
> > +     After migration, do this:
> > +    </para>
> > +
> > +    <itemizedlist>
> > +     <listitem>
> > +      <para>
> > +       Enable the subscriptions by executing
> > +       <link linkend="sql-altersubscription"><command>ALTER
> > SUBSCRIPTION ... ENABLE</command></link>.
> >
> > The reason for this restriction is not very clear to me. Is it because
> > we are using pg_dump for subscription and the existing functionality
> > is doing it? If so, I think currently even connect is false.
>
> This was done this way so that the apply worker doesn't get started
> while the upgrade is happening. Now that we have set
> max_logical_replication_workers to 0, the apply workers will not get
> started during the upgrade process. I think now we can create the
> subscriptions with the same options as the old cluster in case of
> upgrade.
>

Okay, but what is your plan to change it? Currently, we are relying on
the existing pg_dump code to dump subscription data; do you want to
change that? There is a reason for the current behavior of pg_dump,
which, as mentioned in the docs, is: "When dumping logical replication
subscriptions, pg_dump will generate CREATE SUBSCRIPTION commands that
use the connect = false option, so that restoring the subscription
does not make remote connections for creating a replication slot or
for initial table copy. That way, the dump can be restored without
requiring network access to the remote servers. It is then up to the
user to reactivate the subscriptions in a suitable way. If the
involved hosts have changed, the connection information might have to
be changed. It might also be appropriate to truncate the target tables
before initiating a new full table copy."

I guess one reason to not enable subscriptions after restore was that
they can't work without origins, and also one can restore the dump in a
totally different environment, and one may choose not to dump all the
corresponding tables, which I don't think is true for an upgrade. So,
that could be one reason to behave differently for upgrades. Do we see
reasons similar to pg_dump/restore due to which, after an upgrade,
subscriptions may not work?
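
For reference, the dump output for a subscription therefore looks
roughly like this (a sketch; connection details and non-default options
elided):

CREATE SUBSCRIPTION regress_sub
    CONNECTION 'host=... port=... dbname=...'
    PUBLICATION regress_pub
    WITH (connect = false, slot_name = 'regress_sub');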

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 27 Nov 2023 at 17:12, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 27, 2023 at 3:18 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Nov 25, 2023 at 7:21 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > Few comments on v19:
> > > ==================
> > > 1.
> > > +    <para>
> > > +     The subscriptions will be migrated to the new cluster in a disabled state.
> > > +     After migration, do this:
> > > +    </para>
> > > +
> > > +    <itemizedlist>
> > > +     <listitem>
> > > +      <para>
> > > +       Enable the subscriptions by executing
> > > +       <link linkend="sql-altersubscription"><command>ALTER
> > > SUBSCRIPTION ... ENABLE</command></link>.
> > >
> > > The reason for this restriction is not very clear to me. Is it because
> > > we are using pg_dump for subscription and the existing functionality
> > > is doing it? If so, I think currently even connect is false.
> >
> > This was done this way so that the apply worker doesn't get started
> > while the upgrade is happening. Now that we have set
> > max_logical_replication_workers to 0, the apply workers will not get
> > started during the upgrade process. I think now we can create the
> > subscriptions with the same options as the old cluster in case of
> > upgrade.
> >
>
> Okay, but what is your plan to change it? Currently, we are relying on
> the existing pg_dump code to dump subscription data; do you want to
> change that? There is a reason for the current behavior of pg_dump,
> which, as mentioned in the docs, is: "When dumping logical replication
> subscriptions, pg_dump will generate CREATE SUBSCRIPTION commands that
> use the connect = false option, so that restoring the subscription
> does not make remote connections for creating a replication slot or
> for initial table copy. That way, the dump can be restored without
> requiring network access to the remote servers. It is then up to the
> user to reactivate the subscriptions in a suitable way. If the
> involved hosts have changed, the connection information might have to
> be changed. It might also be appropriate to truncate the target tables
> before initiating a new full table copy."
>
> I guess one reason to not enable subscriptions after restore was that
> they can't work without origins, and also one can restore the dump in a
> totally different environment, and one may choose not to dump all the
> corresponding tables, which I don't think is true for an upgrade. So,
> that could be one reason to behave differently for upgrades. Do we see
> reasons similar to pg_dump/restore due to which, after an upgrade,
> subscriptions may not work?

I felt that the behavior for upgrade can be slightly different from
that of dump, as the subscription relations and the replication origin
will be updated when the subscriber is upgraded. And as the logical
replication workers will not be started during the upgrade, we can
preserve the subscription's enabled status too. I felt that just adding
an "ALTER SUBSCRIPTION sub-name ENABLE" for the subscriptions that were
enabled in the old cluster, as in the attached patch, should be fine in
the upgrade case. The behavior of dump is not changed; it is retained
as is.
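
In other words, for a subscription that was enabled on the old cluster,
the binary-upgrade dump output would simply gain one extra statement
after the usual CREATE SUBSCRIPTION ... WITH (connect = false) command
(a sketch; the comment line mirrors what the patch emits):

-- For binary upgrade, must preserve the subscriber's running state.
ALTER SUBSCRIPTION regress_sub ENABLE;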

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Sat, 25 Nov 2023 at 17:50, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 2.
> + * b) SUBREL_STATE_SYNCDONE: A relation upgraded while in this state
> + * would retain the replication origin in certain cases.
>
> I think this is vague. Can we briefly describe cases where the origins
> would be retained?

Modified

> 3. I think that in cases where the publisher is also upgraded, restoring
> the origin's LSN is of no use. Currently, I can't see a problem with
> restoring stale originLSN in such cases as we won't be able to
> distinguish during the upgrade but I think we should document it in
> the comments somewhere in the patch.

Added comments

These are handled in the v20 patch attached at:
https://www.postgresql.org/message-id/CALDaNm0ST1iSrJLD_CV6hQs%3Dw4GZRCRdftQvQA3cO8Hq3QUvYw%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 27 Nov 2023 at 06:53, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch set v19*
>
> //////
>
> v19-0001.
>
> No comments
>
> ///////
>
> v19-0002.
>
> (I saw that both changes below seemed cut/paste from similar
> functions, but I will ask the questions anyway).
>
> ======
> src/backend/commands/subscriptioncmds.c
>
> 1.
> +/* Potentially set by pg_upgrade_support functions */
> +Oid binary_upgrade_next_pg_subscription_oid = InvalidOid;
> +
>
> The comment "by pg_upgrade_support functions" seemed a bit vague. IMO
> you might as well tell the name of the function that sets this.
>
> SUGGESTION
> Potentially set by the pg_upgrade_support function --
> binary_upgrade_set_next_pg_subscription_oid().

Modified

> ~~~
>
> 2. CreateSubscription
>
> + if (!OidIsValid(binary_upgrade_next_pg_subscription_oid))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("pg_subscription OID value not set when in binary upgrade mode")));
>
> Doesn't this condition mean some kind of impossible internal error
> occurred -- i.e. should this be elog instead of ereport?

This is a sanity check to prevent setting the subscription OID to an
invalid value. This can happen if the server is started in binary
upgrade mode and CREATE SUBSCRIPTION is called without first calling
binary_upgrade_set_next_pg_subscription_oid().
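
For illustration, in binary-upgrade mode the dump would emit something
along these lines before recreating the subscription (a sketch: the OID
is a made-up example value, and I am assuming the function takes the
OID as its single argument, per the pattern of the other
binary_upgrade_set_next_*_oid support functions; the call only works
while the server runs in binary-upgrade mode):

-- Must be called before CREATE SUBSCRIPTION, otherwise the sanity
-- check above reports "pg_subscription OID value not set ...".
SELECT pg_catalog.binary_upgrade_set_next_pg_subscription_oid('16394'::pg_catalog.oid);
CREATE SUBSCRIPTION regress_sub CONNECTION 'host=...' PUBLICATION regress_pub
    WITH (connect = false);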

The comment is handled in the v20 patch attached at:
https://www.postgresql.org/message-id/CALDaNm0ST1iSrJLD_CV6hQs%3Dw4GZRCRdftQvQA3cO8Hq3QUvYw%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Nov 28, 2023 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote:
>

Few comments on the latest patch:
===========================
1.
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n");
+ else
+ appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n");
+
+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");
+
+ appendPQExpBufferStr(query,
+ "FROM pg_subscription s\n");
+
+ if (fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query,
+ "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
+ "    ON o.external_id = 'pg_' || s.oid::text \n");

Why does 'subenabled' have a check for binary_upgrade but
'suboriginremotelsn' doesn't?

2.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
+{
+ Relation rel;
+ HeapTuple tup;
+ Oid subid;
+ Form_pg_subscription form;
+ char    *subname;
+ Oid relid;
+ char relstate;
+ XLogRecPtr sublsn;
+
+ CHECK_IS_BINARY_UPGRADE;
+
+ /* We must check these things before dereferencing the arguments */
+ if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
+ elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is
not allowed");
+
+ subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ relid = PG_GETARG_OID(1);
+ relstate = PG_GETARG_CHAR(2);
+ sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
+
+ tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+ if (!HeapTupleIsValid(tup))
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("relation %u does not exist", relid));
+ ReleaseSysCache(tup);
+
+ rel = table_open(SubscriptionRelationId, RowExclusiveLock);

Why is there no locking for the relation? I see that during subscription
operation, we do acquire AccessShareLock on the relation before adding
a corresponding entry in pg_subscription_rel. See the following code:

CreateSubscription()
{
...
foreach(lc, tables)
{
RangeVar   *rv = (RangeVar *) lfirst(lc);
Oid relid;

relid = RangeVarGetRelid(rv, AccessShareLock, false);

/* Check for supported relkind. */
CheckSubscriptionRelkind(get_rel_relkind(relid),
rv->schemaname, rv->relname);

AddSubscriptionRelState(subid, relid, table_state,
InvalidXLogRecPtr);
...
}

3.
+Datum
+binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
{
...
...
+ AddSubscriptionRelState(subid, relid, relstate, sublsn);
...
}

I see a problem with directly using this function: it doesn't release
locks, which means it expects either the caller to release those locks
or to postpone releasing them until the transaction ends. However, all
the other binary_upgrade support functions don't postpone releasing
locks till the transaction ends. I think we should add an additional
parameter to indicate whether we want to release locks and then pass it
as true from the binary upgrade support function.

4.
extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
  int numTables);
 extern void getSubscriptions(Archive *fout);
+extern void getSubscriptionTables(Archive *fout);

getSubscriptions() and getSubscriptionTables() are defined in the
opposite order in the .c file. I think it is better to change the order in
.c file unless there is a reason for not doing so.

5. At this stage, no need to update/send the 0002 patch, we can look
at it after the main patch is committed. That is anyway not directly
related to the main patch.

Apart from the above, I have modified a few comments and messages in
the attached. Kindly review and include the changes if you are fine
with those.

--
With Regards,
Amit Kapila.

Attachment

Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are some review comments for patch v20-0001

======

1. getSubscriptions

+ if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
+ appendPQExpBufferStr(query, " s.subenabled\n");
+ else
+ appendPQExpBufferStr(query, " false AS subenabled\n");

Probably I misunderstood this logic... AFAIK CREATE SUBSCRIPTION
is *enabled* by default, so why does this code set the default to
'false'? OTOH, if this is some special-case default needed because
subscription upgrade is not supported before PG17, then maybe it needs
a comment to explain.

~~~

2. dumpSubscription

+ if (strcmp(subinfo->subenabled, "t") == 0)
+ {
+ appendPQExpBufferStr(query,
+ "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
+ appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
+ }

(this is a bit similar to previous comment)

Probably I misunderstood this logic... but AFAIK CREATE
SUBSCRIPTION is *enabled* by default. In the CREATE SUBSCRIPTION code at
the top of this function I did not see any "enabled=xxx" option, so won't
this just default to enabled=true as normal? In other words, what
happens if the subscription being upgraded was already DISABLED -- how
does it remain disabled after the upgrade?

But I saw there is a test case for this so perhaps the code is fine?
Maybe it just needs more explanatory comments for this area?

======
src/bin/pg_upgrade/t/004_subscription.pl

3.
+# The subscription's running status should be preserved
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

BEFORE
check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber

SUGGESTION
check that a subscriber that was disabled on the old subscriber is
disabled on the new subscriber

~

BEFORE
check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber

SUGGESTION
check that a subscriber that was enabled on the old subscriber is
enabled on the new subscriber

~~~

4.
+is($result, qq($remote_lsn), "remote_lsn should have been preserved");
+
+
+# Check the number of rows for each table on each server


Double blank lines.

~~~

5.
+$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub1 DISABLE");
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none)");
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+

Probably it would be tidier to combine all of those.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
On Thu, Nov 30, 2023 at 12:06 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v20-0001
>
> 3.
> +# The subscription's running status should be preserved
> +my $result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
> +is($result, qq(f),
> + "check that the subscriber that was disable on the old subscriber
> should be disabled in the new subscriber"
> +);
> +$result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
> +is($result, qq(t),
> + "check that the subscriber that was enabled on the old subscriber
> should be enabled in the new subscriber"
> +);
> +$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
> +
>
> BEFORE
> check that the subscriber that was disable on the old subscriber
> should be disabled in the new subscriber
>
> SUGGESTION
> check that a subscriber that was disabled on the old subscriber is
> disabled on the new subscriber
>
> ~
>
> BEFORE
> check that the subscriber that was enabled on the old subscriber
> should be enabled in the new subscriber
>
> SUGGESTION
> check that a subscriber that was enabled on the old subscriber is
> enabled on the new subscriber
>

Oops. I think that should have been "subscription", not "subscriber". i.e.

SUGGESTION
check that a subscription that was disabled on the old subscriber is
disabled on the new subscriber

and

SUGGESTION
check that a subscription that was enabled on the old subscriber is
enabled on the new subscriber

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Nov 30, 2023 at 6:37 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v20-0001
>
> ======
>
> 1. getSubscriptions
>
> + if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query, " s.subenabled\n");
> + else
> + appendPQExpBufferStr(query, " false AS subenabled\n");
>
> Probably I misunderstood this logic... AFAIK CREATE SUBSCRIPTION
> is *enabled* by default, so why does this code set the default to
> 'false'? OTOH, if this is some special-case default needed because
> subscription upgrade is not supported before PG17, then maybe it needs
> a comment to explain.
>

Yes, it is for prior versions. By default, subscriptions are restored
disabled even if they were enabled before the dump. See the docs [1] for
the reasons ("When dumping logical replication subscriptions, ..."). I
don't think we need a comment here, as that is the norm we use at other
similar places where we do version checking. One could argue that there
should be more comments as to why 'connect' is false, but if those are
really required, we should do that as a separate patch.

[1] - https://www.postgresql.org/docs/devel/app-pgdump.html

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Nov 29, 2023 at 3:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

In general, the test cases are a bit complex to understand, so it
will be difficult to enhance them later. The complexity comes from
the fact that one upgrade test is trying to test multiple things: (a)
enabled/disabled subscriptions; (b) relation states 'i' and 'r' being
preserved after the upgrade; (c) rows from non-refreshed tables not
being copied; etc. I understand that you may want to cover as many
things as possible in one test to have fewer upgrade tests, which could
save some time, but I think it makes the test somewhat difficult to
understand and enhance. Can we try to split it such that (a) and (b)
are tested in one test and the others are separated out?

Few other comments:
===================
1.
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION
regress_pub"
+);
+
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
+
+# After the above wait_for_subscription_sync call the table can be either in
+# 'syncdone' or in 'ready' state. Now wait till the table reaches
'ready' state.
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";

Can the table be in the 'i' state after the above test? If not, then
the above comment is misleading.

2.
+# ------------------------------------------------------
+# Check that pg_upgrade is successful when all tables are in ready or in
+# init state.
+# ------------------------------------------------------
+$publisher->safe_psql('postgres',
+ "INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before
initial sync')"
+);
+$publisher->wait_for_catchup('regress_sub');

The previous comment applies to this one as well.

3.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
+
+# Change configuration to prepare a subscription table in init state
+$old_sub->append_conf('postgresql.conf',
+ "max_logical_replication_workers = 0");
+$old_sub->restart;
+
+# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
+# and tab_upgraded2 tables.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");

These two cases for Create and Alter look confusing. I think it would
be better if Alter's case is moved before the comment: "Check that
pg_upgrade is successful when all tables are in ready or in init
state.".

4.
+# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
+# it's down.
+insert_line_at_pub('while old_sub is down');

Doesn't the subroutine insert_line_at_pub() insert into all three
tables? If so, then the above comment seems to be wrong, and I think it
is better to explain the intention of this insert.

5.
+my $result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
+is($result, qq(f),
+ "check that the subscriber that was disable on the old subscriber
should be disabled in the new subscriber"
+);
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
+is($result, qq(t),
+ "check that the subscriber that was enabled on the old subscriber
should be enabled in the new subscriber"
+);

Can't the above be tested with a single query? (See the sketch after
these comments.)

6.
+$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
+
+# Subscription relations should be preserved. The upgraded subscriber
won't know
+# about 'tab_not_upgraded1' because the subscription is not yet refreshed.
+$result =
+  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
+is($result, qq(2),
+ "there should be 2 rows in pg_subscription_rel(representing
tab_upgraded1 and tab_upgraded2)"
+);

Here the DROP SUBSCRIPTION looks confusing. Let's try to move it after
the verification of objects after the upgrade.

7.
+sub insert_line_at_pub
+{
+ my $payload = shift;
+
+ foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+ {
+ $publisher->safe_psql('postgres',
+ "INSERT INTO " . $_ . " (val) VALUES('$payload')");
+ }
+}
+
+# Initial setup
+foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
+{
+ $publisher->safe_psql('postgres',
+ "CREATE TABLE " . $_ . " (id serial, val text)");
+ $old_sub->safe_psql('postgres',
+ "CREATE TABLE " . $_ . " (id serial, val text)");
+}
+insert_line_at_pub('before initial sync');

This makes the test slightly difficult to understand, and we don't seem
to achieve much by using subroutines.
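
As the sketch promised in comment 5, both running states can be checked
with one query (expected output shown as comments; names are from the
test):

-- Expected: regress_sub  | f
--           regress_sub1 | t
SELECT subname, subenabled
FROM pg_catalog.pg_subscription
WHERE subname IN ('regress_sub', 'regress_sub1')
ORDER BY subname;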

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 29 Nov 2023 at 15:02, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 28, 2023 at 4:12 PM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> Few comments on the latest patch:
> ===========================
> 1.
> + if (fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n");
> + else
> + appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n");
> +
> + if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query, " s.subenabled\n");
> + else
> + appendPQExpBufferStr(query, " false AS subenabled\n");
> +
> + appendPQExpBufferStr(query,
> + "FROM pg_subscription s\n");
> +
> + if (fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query,
> + "LEFT JOIN pg_catalog.pg_replication_origin_status o \n"
> + "    ON o.external_id = 'pg_' || s.oid::text \n");
>
> Why does 'subenabled' have a check for binary_upgrade but
> 'suboriginremotelsn' doesn't?

Combined these two now.

> 2.
> +Datum
> +binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
> +{
> + Relation rel;
> + HeapTuple tup;
> + Oid subid;
> + Form_pg_subscription form;
> + char    *subname;
> + Oid relid;
> + char relstate;
> + XLogRecPtr sublsn;
> +
> + CHECK_IS_BINARY_UPGRADE;
> +
> + /* We must check these things before dereferencing the arguments */
> + if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(2))
> + elog(ERROR, "null argument to binary_upgrade_add_sub_rel_state is
> not allowed");
> +
> + subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
> + relid = PG_GETARG_OID(1);
> + relstate = PG_GETARG_CHAR(2);
> + sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);
> +
> + tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
> + if (!HeapTupleIsValid(tup))
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("relation %u does not exist", relid));
> + ReleaseSysCache(tup);
> +
> + rel = table_open(SubscriptionRelationId, RowExclusiveLock);
>
> Why is there no locking for the relation? I see that during subscription
> operation, we do acquire AccessShareLock on the relation before adding
> a corresponding entry in pg_subscription_rel. See the following code:
>
> CreateSubscription()
> {
> ...
> foreach(lc, tables)
> {
> RangeVar   *rv = (RangeVar *) lfirst(lc);
> Oid relid;
>
> relid = RangeVarGetRelid(rv, AccessShareLock, false);
>
> /* Check for supported relkind. */
> CheckSubscriptionRelkind(get_rel_relkind(relid),
> rv->schemaname, rv->relname);
>
> AddSubscriptionRelState(subid, relid, table_state,
> InvalidXLogRecPtr);
> ...
> }

Modified

> 3.
> +Datum
> +binary_upgrade_add_sub_rel_state(PG_FUNCTION_ARGS)
> {
> ...
> ...
> + AddSubscriptionRelState(subid, relid, relstate, sublsn);
> ...
> }
>
> I see a problem with directly using this function: it doesn't release
> locks, which means it expects either the caller to release those locks
> or to postpone releasing them until the transaction ends. However, all
> the other binary_upgrade support functions don't postpone releasing
> locks till the transaction ends. I think we should add an additional
> parameter to indicate whether we want to release locks and then pass it
> as true from the binary upgrade support function.

Modified

> 4.
> extern void getPublicationTables(Archive *fout, TableInfo tblinfo[],
>   int numTables);
>  extern void getSubscriptions(Archive *fout);
> +extern void getSubscriptionTables(Archive *fout);
>
> getSubscriptions() and getSubscriptionTables() are defined in the
> opposite order in .c file. I think it is better to change the order in
> .c file unless there is a reason for not doing so.

Modified

> 5. At this stage, no need to update/send the 0002 patch, we can look
> at it after the main patch is committed. That is anyway not directly
> related to the main patch.

Removed it from this version.

> Apart from the above, I have modified a few comments and messages in
> the attached. Kindly review and include the changes if you are fine
> with those.

Merged them.

The attached v21 version patch has the changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 30 Nov 2023 at 06:37, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v20-0001
>
> ======
>
> 1. getSubscriptions
>
> + if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
> + appendPQExpBufferStr(query, " s.subenabled\n");
> + else
> + appendPQExpBufferStr(query, " false AS subenabled\n");
>
> Probably I misunderstood this logic... AFAIK the CREATE SUBSCRIPTION
> is normally default *enabled*, so why does this code set default
> differently as 'false'. OTOH, if this is some special case default
> needed because the subscription upgrade is not supported before PG17
> then maybe it needs a comment to explain.

No changes are needed in this case; the explanation for the same is
given at [1].

> ~~~
>
> 2. dumpSubscription
>
> + if (strcmp(subinfo->subenabled, "t") == 0)
> + {
> + appendPQExpBufferStr(query,
> + "\n-- For binary upgrade, must preserve the subscriber's running state.\n");
> + appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s ENABLE;\n", qsubname);
> + }
>
> (this is a bit similar to previous comment)
>
> Probably I misunderstood this logic... but AFAIK the CREATE
> SUBSCRIPTION is normally default *enabled*. In the CREATE SUBSCRIPTION
> top of this function I did not see any "enabled=xxx" code, so won't
> this just default to enabled=true per normal. In other words, what
> happens if the subscription being upgraded was already DISABLED -- How
> does it remain disabled still after upgrade?
>
> But I saw there is a test case for this so perhaps the code is fine?
> Maybe it just needs more explanatory comments for this area?

No changes are needed in this case; the explanation for the same is
given at [1].
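
To illustrate the mechanism (a sketch of the binary-upgrade dump output,
with the connection string and other options elided): the subscription is
created without connecting, which also defaults it to disabled, and the
separate ALTER is emitted only when it was enabled on the old cluster:

    CREATE SUBSCRIPTION regress_sub CONNECTION '...' PUBLICATION regress_pub
        WITH (connect = false);
    -- For binary upgrade, must preserve the subscriber's running state.
    ALTER SUBSCRIPTION regress_sub ENABLE;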

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 3.
> +# The subscription's running status should be preserved
> +my $result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
> +is($result, qq(f),
> + "check that the subscriber that was disable on the old subscriber
> should be disabled in the new subscriber"
> +);
> +$result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
> +is($result, qq(t),
> + "check that the subscriber that was enabled on the old subscriber
> should be enabled in the new subscriber"
> +);
> +$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
> +
>
> BEFORE
> check that the subscriber that was disable on the old subscriber
> should be disabled in the new subscriber
>
> SUGGESTION
> check that a subscriber that was disabled on the old subscriber is
> disabled on the new subscriber
> ~
>
> BEFORE
> check that the subscriber that was enabled on the old subscriber
> should be enabled in the new subscriber
>
> SUGGESTION
> check that a subscriber that was enabled on the old subscriber is
> enabled on the new subscriber

These statements are combined now

> ~~~
>
> 4.
> +is($result, qq($remote_lsn), "remote_lsn should have been preserved");
> +
> +
> +# Check the number of rows for each table on each server
>
>
> Double blank lines.

Modified

> ~~~
>
> 5.
> +$old_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub1 DISABLE");
> +$old_sub->safe_psql('postgres',
> + "ALTER SUBSCRIPTION regress_sub1 SET (slot_name = none)");
> +$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
> +
>
> Probably it would be tidier to combine all of those.

Modified

The changes for the same are present in the v21 version patch attached at [2].

[1] -
https://www.postgresql.org/message-id/CAA4eK1JpWkRBFMDC3wOCK%3DHzCXg8XT1jH-tWb%3Db%2B%2B_8YS2%3DQSQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm37E4tmSZd%2Bk1ixtKevX3eucmhdOnw4pGmykZk4C1Nm4Q%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 30 Nov 2023 at 13:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 29, 2023 at 3:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> In general, the test cases are a bit complex to understand, so, it
> will be difficult to enhance these later. The complexity comes from
> the fact that one upgrade test is trying to test multiple things (a)
> Enabled/Disabled subscriptions; (b) relation states 'i' and 'r' are
> preserved after the upgrade. (c) rows from non-refreshed tables are
> not copied, etc. I understand that you may want to cover as many
> things possible in one test to have fewer upgrade tests which could
> save some time but I think it makes the test somewhat difficult to
> understand and enhance. Can we try to split it such that (a) and (b)
> are tested in one test and others could be separated out?

Yes, I had combined a few tests as the run was taking more time. I have
refactored the tests by removing the tab_not_upgraded1 related test
(which is more of a logical replication test), adding more comments, and
removing the intermediate select count checks. So now we have:
test1) Check that the upgrade succeeds when the subscriber has tables in
init/ready state.
test2) Check that the data inserted into the publisher while the
subscriber is down will be replicated to the new subscriber once the new
subscriber is started (done as a continuation of the previous test).
test3) Check that pg_upgrade fails when max_replication_slots configured
in the new cluster is less than the number of subscriptions in the old
cluster.
test4) Check that the upgrade fails with an old instance having a
relation in 'd' (datasync, invalid) state or a missing replication
origin.
In test4 I have combined both the datasync relation state and the missing
replication origin cases, as the validation for both is in the same file.
I feel the readability is better now; do let me know if any of the tests
are still difficult to understand.

> Few other comments:
> ===================
> 1.
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub CONNECTION '$connstr' PUBLICATION
> regress_pub"
> +);
> +
> +$old_sub->wait_for_subscription_sync($publisher, 'regress_sub');
> +
> +# After the above wait_for_subscription_sync call the table can be either in
> +# 'syncdone' or in 'ready' state. Now wait till the table reaches
> 'ready' state.
> +my $synced_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
> +$old_sub->poll_query_until('postgres', $synced_query)
> +  or die "Timed out while waiting for the table to reach ready state";
>
> Can the table be in 'i' state after above test? If not, then above
> comment is misleading.

This part of the test is to get the table into the ready state; I have
modified the comments appropriately.

> 2.
> +# ------------------------------------------------------
> +# Check that pg_upgrade is successful when all tables are in ready or in
> +# init state.
> +# ------------------------------------------------------
> +$publisher->safe_psql('postgres',
> + "INSERT INTO tab_upgraded1 VALUES (generate_series(2,50), 'before
> initial sync')"
> +);
> +$publisher->wait_for_catchup('regress_sub');
>
> The previous comment applies to this one as well.

I have removed this comment and moved it before the upgrade command as
it is more appropriate there.

> 3.
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> regress_pub1"
> +);
> +$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
> +
> +# Change configuration to prepare a subscription table in init state
> +$old_sub->append_conf('postgresql.conf',
> + "max_logical_replication_workers = 0");
> +$old_sub->restart;
> +
> +# Add tab_upgraded2 to the publication. Now publication has tab_upgraded1
> +# and tab_upgraded2 tables.
> +$publisher->safe_psql('postgres',
> + "ALTER PUBLICATION regress_pub ADD TABLE tab_upgraded2");
> +
> +$old_sub->safe_psql('postgres',
> + "ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION");
>
> These two cases for Create and Alter look confusing. I think it would
> be better if Alter's case is moved before the comment: "Check that
> pg_upgrade is successful when all tables are in ready or in init
> state.".

I have added more comments to make it clear now. I have moved the "check
that pg_upgrade is successful when all tables ..." comment before the
upgrade command to be clearer. Added the comments "Pre-setup for
preparing subscription table in init state. Add tab_upgraded2 to the
publication." and "# The table tab_upgraded2 will be in init state as
the subscriber configuration for max_logical_replication_workers is set
to 0."

> 4.
> +# Insert a row in tab_upgraded1 and tab_not_upgraded1 publisher table while
> +# it's down.
> +insert_line_at_pub('while old_sub is down');
>
> Doesn't the subroutine insert_line_at_pub() insert into all three tables? If
> so, then the above comment seems to be wrong and I think it is better
> to explain the intention of this insert.

Modified

> 5.
> +my $result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub'");
> +is($result, qq(f),
> + "check that the subscriber that was disable on the old subscriber
> should be disabled in the new subscriber"
> +);
> +$result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription WHERE subname = 'regress_sub1'");
> +is($result, qq(t),
> + "check that the subscriber that was enabled on the old subscriber
> should be enabled in the new subscriber"
> +);
>
> Can't the above be tested with a single query?

Modified
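
For reference, the combined check reduces to a single ordered query,
something like:

    SELECT subenabled FROM pg_subscription ORDER BY subname;
    -- one row per subscription in name order: 'f' for the disabled
    -- regress_sub, then 't' for the enabled regress_sub1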

> 6.
> +$new_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub1");
> +
> +# Subscription relations should be preserved. The upgraded subscriber won't
> +# know about 'tab_not_upgraded1' because the subscription is not yet refreshed.
> +$result =
> +  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
> +is($result, qq(2),
> + "there should be 2 rows in pg_subscription_rel(representing
> tab_upgraded1 and tab_upgraded2)"
> +);
>
> Here the DROP SUBSCRIPTION looks confusing. Let's try to move it after
> the verification of objects after the upgrade.

I have removed this now; there is no need to move it down, as we will be
stopping the newsub server at the end of this test and this newsub will
not be used later.

> 7.
> +sub insert_line_at_pub
> +{
> + my $payload = shift;
> +
> + foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
> + {
> + $publisher->safe_psql('postgres',
> + "INSERT INTO " . $_ . " (val) VALUES('$payload')");
> + }
> +}
> +
> +# Initial setup
> +foreach ("tab_upgraded1", "tab_upgraded2", "tab_not_upgraded1")
> +{
> + $publisher->safe_psql('postgres',
> + "CREATE TABLE " . $_ . " (id serial, val text)");
> + $old_sub->safe_psql('postgres',
> + "CREATE TABLE " . $_ . " (id serial, val text)");
> +}
> +insert_line_at_pub('before initial sync');
>
> This makes the test slightly difficult to understand and we don't seem
> to achieve much by writing sub routines.

Removed the subroutines.

The changes for the same are available at:
https://www.postgresql.org/message-id/CALDaNm37E4tmSZd%2Bk1ixtKevX3eucmhdOnw4pGmykZk4C1Nm4Q%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Peter Smith
Date:
Here are review comments for patch v21-0001

======
src/bin/pg_upgrade/check.c

1. check_old_cluster_subscription_state

+/*
+ * check_old_cluster_subscription_state()
+ *
+ * Verify that each of the subscriptions has all their corresponding tables in
+ * i (initialize) or r (ready).
+ */
+static void
+check_old_cluster_subscription_state(void)

The function comment should also mention that it validates the origin.
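
For context, the check essentially looks, per database, for any
pg_subscription_rel entry whose state is neither 'i' nor 'r'; a rough
SQL sketch of that lookup (not the patch's exact query):

    SELECT s.subname, n.nspname, c.relname, r.srsubstate
    FROM pg_catalog.pg_subscription_rel r
         JOIN pg_catalog.pg_subscription s ON s.oid = r.srsubid
         JOIN pg_catalog.pg_class c ON c.oid = r.srrelid
         JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
    WHERE r.srsubstate NOT IN ('i', 'r');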

~~~

2.
In this function there are a couple of errors written to the
"subs_invalid.txt" file:

+ fprintf(script, "replication origin is missing for database:\"%s\"
subscription:\"%s\"\n",
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1));

and

+ fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
relation:\"%s\" state:\"%s\" not in required state\n",
+ active_db->db_name,
+ PQgetvalue(res, i, 0),
+ PQgetvalue(res, i, 1),
+ PQgetvalue(res, i, 2),
+ PQgetvalue(res, i, 3));

The format of those messages is not consistent. It could be improved
in a number of ways to make them more similar. e.g. below.

SUGGESTION #1
the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
the table sync state \"%s\" is not allowed for database:\"%s\"
subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n

SUGGESTION #2
database:\"%s\" subscription:\"%s\" -- replication origin is missing\n
database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" --
upgrade when table sync state is \"%s\" is not supported\n

etc.

======
src/bin/pg_upgrade/t/004_subscription.pl

3.
+# Initial setup
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
+$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");

IMO it is tidier to combine multiple DDLS whenever you can.

~~~

4.
+# Create a subscription in enabled state before upgrade
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1"
+);
+$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');

That publication has an empty set of tables. Should there be some
comment to explain why it is OK like this?
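
For reference, a publication created without a FOR TABLE clause is simply
empty, so a subscription to it has nothing to sync:

    CREATE PUBLICATION regress_pub1;  -- no tables; nothing to copy or sync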

~~~

5.
+# Wait till the table tab_upgraded1 reaches 'ready' state
+my $synced_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
+$old_sub->poll_query_until('postgres', $synced_query)
+  or die "Timed out while waiting for the table to reach ready state";
+
+$publisher->safe_psql('postgres',
+ "INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))"
+);
+$publisher->wait_for_catchup('regress_sub2');

IMO better without the blank line, so then everything more clearly
belongs to this same comment.

~~~

6.
+# Pre-setup for preparing subscription table in init state. Add tab_upgraded2
+# to the publication.
+$publisher->safe_psql('postgres',
+ "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+
+$old_sub->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");

Ditto. IMO better without the blank line, so then everything more
clearly belongs to this same comment.

~~~

7.
+command_ok(
+ [
+ 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+ '-D', $new_sub->data_dir, '-b', $oldbindir,
+ '-B', $newbindir, '-s', $new_sub->host,
+ '-p', $old_sub->port, '-P', $new_sub->port,
+ $mode
+ ],
+ 'run of pg_upgrade for old instance when the subscription tables are
in init/ready state'
+);

Maybe those 'command_ok' args can be formatted neatly (like you've
done later for the 'command_checks_all').

~~~

8.
+# ------------------------------------------------------
+# Check that the data inserted to the publisher when the subscriber is down
+# will be replicated to the new subscriber once the new subscriber is started.
+# ------------------------------------------------------

8a.
SUGGESTION
...when the new subscriber is down will be replicated once it is started.

~

8b.
I thought this main comment should also say something like "Also check
that the old subscription states and relations origins are all
preserved."

~~~

9.
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded1 VALUES(51)");
+$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded2 VALUES(1)");

IMO it is tidier to combine multiple DDLS whenever you can.

~~~

10.
+# The subscription's running status should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+ "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is($result, qq(t
+f),
+ "check that the subscription's running status are preserved"
+);

I felt this was a bit too tricky. It might be more readable to do 2
separate SELECTs with explicit subnames. Alternatively, leave the code
as-is but improve the comment to explicitly say something like:

# old subscription regress_sub was enabled
# old subscription regress_sub1 was disabled

~~~

11.
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than number of subscriptions in the old cluster.
+# ------------------------------------------------------
+my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
+$new_sub1->init;
+$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+$old_sub->stop;

/than number/than the number/

Should that old_sub->stop have been part of the previous cleanup steps?

~~~

12.
+$old_sub->start;
+
+# Drop the subscription
+$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");

Maybe it is tidier putting that 'start' below the comment.

~~~

13.
+# ------------------------------------------------------
+# Check that pg_upgrade refuses to run in:
+# a) if there's a subscription with tables in a state other than 'r' (ready) or
+#    'i' (init) and/or
+# b) if the subscription has no replication origin.
+# ------------------------------------------------------

13a.
/refuses to run in:/refuses to run if:/

~

13b.
/a) if/a)/

~

13c.
/b) if/b)/

~~~

14.
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+ "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (enabled=false)"
+);
+
+my $subid = $old_sub->safe_psql('postgres',
+ "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
+my $reporigin = 'pg_' . qq($subid);
+
+# Drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+ "SELECT pg_replication_origin_drop('$reporigin')");
+
+$old_sub->stop;

14a.
IMO better to have all this without blank lines, because it all
belongs to the first comment.

~

14b.
That 2nd comment "# Drop the..." is not required because the first
comment already says the same.
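
As an aside, the origin name constructed here relies on the server's
naming convention for subscription origins ('pg_' followed by the
subscription OID, the same form used in the dump query earlier in the
thread), so the origin being dropped can be located with something like:

    SELECT roname
    FROM pg_catalog.pg_replication_origin
    WHERE roname = 'pg_' || (SELECT oid FROM pg_catalog.pg_subscription
                             WHERE subname = 'regress_sub4');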

======
src/include/catalog/pg_subscription_rel.h

15.
 extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
- XLogRecPtr sublsn);
+ XLogRecPtr sublsn, bool upgrade);

Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
matches the comment you added in that function?

If you agree, then change it here and also in the function definition.

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Dec 1, 2023 at 10:57 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are review comments for patch v21-0001
>
>
> 2.
> In this function there are a couple of errors written to the
> "subs_invalid.txt" file:
>
> + fprintf(script, "replication origin is missing for database:\"%s\"
> subscription:\"%s\"\n",
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1));
>
> and
>
> + fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
> relation:\"%s\" state:\"%s\" not in required state\n",
> + active_db->db_name,
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1),
> + PQgetvalue(res, i, 2),
> + PQgetvalue(res, i, 3));
>
> The format of those messages is not consistent. It could be improved
> in a number of ways to make them more similar. e.g. below.
>
> SUGGESTION #1
> the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
> the table sync state \"%s\" is not allowed for database:\"%s\"
> subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n
>

+1. Shall we keep 'the' as 'The' in the message? A few other messages in
the same file start with a capital letter.

>
> 4.
> +# Create a subscription in enabled state before upgrade
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> regress_pub1"
> +);
> +$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
>
> That publication has an empty set of tables. Should there be some
> comment to explain why it is OK like this?
>

I think we can add a comment to state the intention of the overall test
that this is part of.

>
> 10.
> +# The subscription's running status should be preserved
> +$result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription ORDER BY subname");
> +is($result, qq(t
> +f),
> + "check that the subscription's running status are preserved"
> +);
>
> I felt this was a bit too tricky. It might be more readable to do 2
> separate SELECTs with explicit subnames. Alternatively, leave the code
> as-is but improve the comment to explicitly say something like:
>
> # old subscription regress_sub was enabled
> # old subscription regress_sub1 was disabled
>

I don't see the need to have separate queries though adding comments
is a good idea.

>
> 15.
>  extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
> - XLogRecPtr sublsn);
> + XLogRecPtr sublsn, bool upgrade);
>
> Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
> matches the comment you added in that function?
>

It is better to name this parameter retain_lock and then explain it in
the function header. The bigger problem with the change is that we
should release the other lock
(LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);)
taken in the function as well.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 1 Dec 2023 at 10:57, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are review comments for patch v21-0001
>
> ======
> src/bin/pg_upgrade/check.c
>
> 1. check_old_cluster_subscription_state
>
> +/*
> + * check_old_cluster_subscription_state()
> + *
> + * Verify that each of the subscriptions has all their corresponding tables in
> + * i (initialize) or r (ready).
> + */
> +static void
> +check_old_cluster_subscription_state(void)
>
> The function comment should also mention that it validates the origin.

Modified

> ~~~
>
> 2.
> In this function there are a couple of errors written to the
> "subs_invalid.txt" file:
>
> + fprintf(script, "replication origin is missing for database:\"%s\"
> subscription:\"%s\"\n",
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1));
>
> and
>
> + fprintf(script, "database:\"%s\" subscription:\"%s\" schema:\"%s\"
> relation:\"%s\" state:\"%s\" not in required state\n",
> + active_db->db_name,
> + PQgetvalue(res, i, 0),
> + PQgetvalue(res, i, 1),
> + PQgetvalue(res, i, 2),
> + PQgetvalue(res, i, 3));
>
> The format of those messages is not consistent. It could be improved
> in a number of ways to make them more similar. e.g. below.
>
> SUGGESTION #1
> the replication origin is missing for database:\"%s\" subscription:\"%s\"\n
> the table sync state \"%s\" is not allowed for database:\"%s\"
> subscription:\"%s\" schema:\"%s\" relation:\"%s\"\n
>
> SUGGESTION #2
> database:\"%s\" subscription:\"%s\" -- replication origin is missing\n
> database:\"%s\" subscription:\"%s\" schema:\"%s\" relation:\"%s\" --
> upgrade when table sync state is \"%s\" is not supported\n
>
> etc.

Modified based on SUGGESTION#1

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 3.
> +# Initial setup
> +$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
> +$publisher->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
> +$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded1(id int)");
> +$old_sub->safe_psql('postgres', "CREATE TABLE tab_upgraded2(id int)");
>
> IMO it is tidier to combine multiple DDLS whenever you can.

Modified

> ~~~
>
> 4.
> +# Create a subscription in enabled state before upgrade
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> regress_pub1"
> +);
> +$old_sub->wait_for_subscription_sync($publisher, 'regress_sub1');
>
> That publication has an empty set of tables. Should there be some
> comment to explain why it is OK like this?

This test is just to verify that the enabled subscriptions will remain
enabled after the upgrade; we don't need data for this. Data validation
happens with a different subscription. Modified the comments.

> ~~~
>
> 5.
> +# Wait till the table tab_upgraded1 reaches 'ready' state
> +my $synced_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'r'";
> +$old_sub->poll_query_until('postgres', $synced_query)
> +  or die "Timed out while waiting for the table to reach ready state";
> +
> +$publisher->safe_psql('postgres',
> + "INSERT INTO tab_upgraded1 VALUES (generate_series(1,50))"
> +);
> +$publisher->wait_for_catchup('regress_sub2');
>
> IMO better without the blank line, so then everything more clearly
> belongs to this same comment.

Modified

> ~~~
>
> 6.
> +# Pre-setup for preparing subscription table in init state. Add tab_upgraded2
> +# to the publication.
> +$publisher->safe_psql('postgres',
> + "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
> +
> +$old_sub->safe_psql('postgres',
> + "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
>
> Ditto. IMO better without the blank line, so then everything more
> clearly belongs to this same comment.

Modified

> ~~~
>
> 7.
> +command_ok(
> + [
> + 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> + '-D', $new_sub->data_dir, '-b', $oldbindir,
> + '-B', $newbindir, '-s', $new_sub->host,
> + '-p', $old_sub->port, '-P', $new_sub->port,
> + $mode
> + ],
> + 'run of pg_upgrade for old instance when the subscription tables are
> in init/ready state'
> +);
>
> Maybe those 'command_ok' args can be formatted neatly (like you've
> done later for the 'command_checks_all').

This is based on the run from pgperltidy. Even if I format it,
pgperltidy reverts the formatting that I have done. I have seen the
same is the case with other upgrade commands in a few places, so I am
not making any changes for this.

> ~~~
>
> 8.
> +# ------------------------------------------------------
> +# Check that the data inserted to the publisher when the subscriber is down
> +# will be replicated to the new subscriber once the new subscriber is started.
> +# ------------------------------------------------------
>
> 8a.
> SUGGESTION
> ...when the new subscriber is down will be replicated once it is started.
>

Modified

> ~
>
> 8b.
> I thought this main comment should also say something like "Also check
> that the old subscription states and relations origins are all
> preserved."

Modified

> ~~~
>
> 9.
> +$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded1 VALUES(51)");
> +$publisher->safe_psql('postgres', "INSERT INTO tab_upgraded2 VALUES(1)");
>
> IMO it is tidier to combine multiple DDLS whenever you can.

Modified

> ~~~
>
> 10.
> +# The subscription's running status should be preserved
> +$result =
> +  $new_sub->safe_psql('postgres',
> + "SELECT subenabled FROM pg_subscription ORDER BY subname");
> +is($result, qq(t
> +f),
> + "check that the subscription's running status are preserved"
> +);
>
> I felt this was a bit too tricky. It might be more readable to do 2
> separate SELECTs with explicit subnames. Alternatively, leave the code
> as-is but improve the comment to explicitly say something like:
>
> # old subscription regress_sub was enabled
> # old subscription regress_sub1 was disabled

Modified to add comments.

> ~~~
>
> 11.
> +# ------------------------------------------------------
> +# Check that pg_upgrade fails when max_replication_slots configured in the new
> +# cluster is less than number of subscriptions in the old cluster.
> +# ------------------------------------------------------
> +my $new_sub1 = PostgreSQL::Test::Cluster->new('new_sub1');
> +$new_sub1->init;
> +$new_sub1->append_conf('postgresql.conf', "max_replication_slots = 0");
> +
> +$old_sub->stop;
>
> /than number/than the number/
>
> Should that old_sub->stop have been part of the previous cleanup steps?

Modified

> ~~~
>
> 12.
> +$old_sub->start;
> +
> +# Drop the subscription
> +$old_sub->safe_psql('postgres', "DROP SUBSCRIPTION regress_sub2");
>
> Maybe it is tidier putting that 'start' below the comment.

Modified

> ~~~
>
> 13.
> +# ------------------------------------------------------
> +# Check that pg_upgrade refuses to run in:
> +# a) if there's a subscription with tables in a state other than 'r' (ready) or
> +#    'i' (init) and/or
> +# b) if the subscription has no replication origin.
> +# ------------------------------------------------------
>
> 13a.
> /refuses to run in:/refuses to run if:/

Modified

> ~
>
> 13b.
> /a) if/a)/

Modified

> ~
>
> 13c.
> /b) if/b)/

Modified

> ~~~
>
> 14.
> +# Create another subscription and drop the subscription's replication origin
> +$old_sub->safe_psql('postgres',
> + "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
> regress_pub3 WITH (enabled=false)"
> +);
> +
> +my $subid = $old_sub->safe_psql('postgres',
> + "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
> +my $reporigin = 'pg_' . qq($subid);
> +
> +# Drop the subscription's replication origin
> +$old_sub->safe_psql('postgres',
> + "SELECT pg_replication_origin_drop('$reporigin')");
> +
> +$old_sub->stop;
>
> 14a.
> IMO better to have all this without blank lines, because it all
> belongs to the first comment.

Modified

>
> 14b.
> That 2nd comment "# Drop the..." is not required because the first
> comment already says the same.

Modified

> ======
> src/include/catalog/pg_subscription_rel.h
>
> 15.
>  extern void AddSubscriptionRelState(Oid subid, Oid relid, char state,
> - XLogRecPtr sublsn);
> + XLogRecPtr sublsn, bool upgrade);
>
> Shouldn't this 'upgrade' really be 'binary_upgrade' so it better
> matches the comment you added in that function?
>
> If you agree, then change it here and also in the function definition.

Modified it to retain_lock based on suggestions from [1]

The attached v22 version patch has the changes for the same.

[1] - https://www.postgresql.org/message-id/CAA4eK1KFEHhJEo43k_qUpC0Eod34zVq%3DKae34koEDrPFXzeeJg%40mail.gmail.com

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Dec 1, 2023 at 11:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
> The attached v22 version patch has the changes for the same.
>

I have made minor changes in the comments and code at various places.
See and let me know if you are not happy with the changes. I think
unless there are more suggestions or comments, we can proceed with
committing it.

--
With Regards,
Amit Kapila.

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> I have made minor changes in the comments and code at various places.
> See and let me know if you are not happy with the changes. I think
> unless there are more suggestions or comments, we can proceed with
> committing it.

Yeah.  I am planning to look more closely at what you have here, and
it is going to take me a bit more time though (some more stuff planned
for next CF, an upcoming conference and end/beginning-of-year
vacations), but I think that targeting the beginning of next CF in
January would be OK.

Overall, I have the impression that the patch looks pretty solid, with
a restriction in place for "init" and "ready" relations, while there
are tests to check all the states that we expect.  Seeing coverage
about all that makes me a happy hacker.

+ * If retain_lock is true, then don't release the locks taken in this function.
+ * We normally release the locks at the end of transaction but in binary-upgrade
+ * mode, we expect to release those immediately.

I think that this should be documented in pg_upgrade_support.c where
the caller expects the locks to be released, and why these should be
released.  There is a risk that this comment becomes obsolete if
AddSubscriptionRelState() with locks released is called in a different
code path.  Anyway, I am not sure I get why this is OK, or even
necessary.  It seems like a good practice to keep the locks on the
subscription until the transaction that updates its state.  If there's
a specific reason explaining why that's better, the patch should tell
why.

+     * However, this shouldn't be a problem as the upgrade ensures
+     * that all the transactions were replicated before upgrading the
+     * publisher.

This wording looks a bit confusing to me, as "the upgrade" could refer
to the upgrade of a subscriber, but what we want to tell is that the
replay of the transactions is enforced when doing a publisher upgrade.
I'd suggest something like "the upgrade of the publisher ensures that
all the transactions were replicated before upgrading it".

+my $result = $old_sub->safe_psql('postgres',
+   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
+is($result, qq(t), "Check that the table is in init state");

Hmm.  Not sure that this is safe.  Shouldn't this be a
poll_query_until(), polling that the state of the relation is what we
want it to be after requesting a refresh of the publication on the
subscriber?
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > I have made minor changes in the comments and code at various places.
> > See and let me know if you are not happy with the changes. I think
> > unless there are more suggestions or comments, we can proceed with
> > committing it.
>
> Yeah.  I am planning to look more closely at what you have here, and
> it is going to take me a bit more time though (some more stuff planned
> for next CF, an upcoming conference and end/beginning-of-year
> vacations), but I think that targeting the beginning of next CF in
> January would be OK.
>
> Overall, I have the impression that the patch looks pretty solid, with
> a restriction in place for "init" and "ready" relations, while there
> are tests to check all the states that we expect.  Seeing coverage
> about all that makes me a happy hacker.
>
> + * If retain_lock is true, then don't release the locks taken in this function.
> + * We normally release the locks at the end of transaction but in binary-upgrade
> + * mode, we expect to release those immediately.
>
> I think that this should be documented in pg_upgrade_support.c where
> the caller expects the locks to be released, and why these should be
> released.  There is a risk that this comment becomes obsolete if
> AddSubscriptionRelState() with locks released is called in a different
> code path.  Anyway, I am not sure I get why this is OK, or even
> necessary.  It seems like a good practice to keep the locks on the
> subscription until the transaction that updates its state.  If there's
> a specific reason explaining why that's better, the patch should tell
> why.
>

It is to be consistent with other code paths in the upgrade. We
followed existing coding rules like what we do in
binary_upgrade_set_missing_value->SetAttrMissing(). The probable
theory is that during the upgrade we are not worried about concurrent
operations being blocked till the transaction ends. As in this
particular case, we know that the apply worker won't try to sync any
of those relations or a concurrent DDL won't try to remove it from the
pg_subscription_rel. This point is not being explicitly commented
because of its similarity with the existing code.

>
> +my $result = $old_sub->safe_psql('postgres',
> +   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
> +is($result, qq(t), "Check that the table is in init state");
>
> Hmm.  Not sure that this is safe.  Shouldn't this be a
> poll_query_until(), polling that the state of the relation is what we
> want it to be after requesting a fresh of the publication on the
> subscriber?
>

This is safe because the init state should be marked by the "Alter
Subscription ... Refresh ..." command itself. What exactly makes you
think that such a poll would be required?
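
For illustration, a sketch of the sequence in question (assuming a
subscription named regress_sub and, as in the test,
max_logical_replication_workers = 0 on the subscriber):

    ALTER SUBSCRIPTION regress_sub REFRESH PUBLICATION;
    SELECT srrelid::regclass, srsubstate
    FROM pg_subscription_rel;
    -- the newly published table already shows up with state 'i' here,
    -- and stays there because no sync worker can start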

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Masahiko Sawada
Date:
On Mon, Dec 4, 2023 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 1, 2023 at 11:24 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > The attached v22 version patch has the changes for the same.
> >
>
> I have made minor changes in the comments and code at various places.
> See and let me know if you are not happy with the changes. I think
> unless there are more suggestions or comments, we can proceed with
> committing it.
>

It seems the patch is already close to ready-to-commit state but I've
had a look at the v23 patch with fresh eyes. It looks mostly good to
me and there are some minor comments:

---
+   tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+   if (!HeapTupleIsValid(tup))
+       ereport(ERROR,
+               errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+               errmsg("relation %u does not exist", relid));
+   ReleaseSysCache(tup);

Given what we want to do here is just an existence check, isn't it
clearer if we use SearchSysCacheExists1() instead?

---
+        query = createPQExpBuffer();
+        appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
+                          " FROM pg_catalog.pg_subscription_rel"
+                          " ORDER BY srsubid");
+        res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
+

Probably we don't need to use PQExpBuffer here since the query to
execute is a static string.

---
+# The subscription's running status should be preserved. Old subscription
+# regress_sub1 should be enabled and old subscription regress_sub2 should be
+# disabled.
+$result =
+  $new_sub->safe_psql('postgres',
+        "SELECT subenabled FROM pg_subscription ORDER BY subname");
+is( $result, qq(t
+f),
+        "check that the subscription's running status are preserved");
+

How about showing the subname along with the subenabled so that we can
check if each subscription is in the expected state in case something
goes wrong?
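
That is, something along these lines, so that a failure report would
name the offending subscription:

    SELECT subname, subenabled FROM pg_subscription ORDER BY subname;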

---
+# Subscription relations should be preserved
+$result =
+  $new_sub->safe_psql('postgres',
+        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
+is($result, qq(2),
+        "there should be 2 rows in pg_subscription_rel(representing
tab_upgraded1 and tab_upgraded2)"
+);

Is there any reason why we check only the number of rows in
pg_subscription_rel? I guess it might be a good idea to check if table
OIDs there are also preserved.
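
For example, by resolving the preserved relation OIDs back to names (a
sketch, with :sub_oid standing in for the test's $sub_oid):

    SELECT srrelid::regclass AS relname, srsubstate
    FROM pg_subscription_rel
    WHERE srsubid = :sub_oid
    ORDER BY srrelid;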

---
+# Enable the subscription
+$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
+$publisher->wait_for_catchup('regress_sub2');
+

IIUC after making the subscription regress_sub2 enabled, we will start
the initial table sync for the table tab_upgraded2. If so, shouldn't
we use wait_for_subscription_sync() instead?
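
For reference, wait_for_subscription_sync() waits for the table sync to
finish, i.e. for the relation to reach the 's' (syncdone) or 'r' (ready)
state, which is visible in the catalog:

    SELECT srsubstate
    FROM pg_subscription_rel
    WHERE srrelid = 'tab_upgraded2'::regclass;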

---
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+        "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr'
PUBLICATION regress_pub3 WITH (enabled=false)"

It's better to put spaces before and after '='.

---
+my $subid = $old_sub->safe_psql('postgres',
+        "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");

I think we can reuse $sub_oid.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade and logical replication

From
Masahiko Sawada
Date:
On Tue, Dec 5, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > > I have made minor changes in the comments and code at various places.
> > > See and let me know if you are not happy with the changes. I think
> > > unless there are more suggestions or comments, we can proceed with
> > > committing it.
> >
> > Yeah.  I am planning to look more closely at what you have here, and
> > it is going to take me a bit more time though (some more stuff planned
> > for next CF, an upcoming conference and end/beginning-of-year
> > vacations), but I think that targeting the beginning of next CF in
> > January would be OK.
> >
> > Overall, I have the impression that the patch looks pretty solid, with
> > a restriction in place for "init" and "ready" relations, while there
> > are tests to check all the states that we expect.  Seeing coverage
> > about all that makes me a happy hacker.
> >
> > + * If retain_lock is true, then don't release the locks taken in this function.
> > + * We normally release the locks at the end of transaction but in binary-upgrade
> > + * mode, we expect to release those immediately.
> >
> > I think that this should be documented in pg_upgrade_support.c where
> > the caller expects the locks to be released, and why these should be
> > released.  There is a risk that this comment becomes obsolete if
> > AddSubscriptionRelState() with locks released is called in a different
> > code path.  Anyway, I am not sure I get why this is OK, or even
> > necessary.  It seems like a good practice to keep the locks on the
> > subscription until the transaction that updates its state.  If there's
> > a specific reason explaining why that's better, the patch should tell
> > why.
> >
>
> It is to be consistent with other code paths in the upgrade. We
> followed existing coding rules like what we do in
> binary_upgrade_set_missing_value->SetAttrMissing(). The probable
> theory is that during the upgrade we are not worried about concurrent
> operations being blocked till the transaction ends. As in this
> particular case, we know that the apply worker won't try to sync any
> of those relations or a concurrent DDL won't try to remove it from the
> pg_subscription_rel. This point is not being explicitly commented
> because of its similarity with the existing code.

Releasing the locks early seems fine to me, though I'm not sure how much
it helps concurrency, as the function acquires lower-level locks such as
AccessShareLock and RowExclusiveLock (SetAttrMissing(), on the other
hand, acquires AccessExclusiveLock on the table).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Dec 7, 2023 at 7:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Dec 5, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > > > I have made minor changes in the comments and code at various places.
> > > > See and let me know if you are not happy with the changes. I think
> > > > unless there are more suggestions or comments, we can proceed with
> > > > committing it.
> > >
> > > Yeah.  I am planning to look more closely at what you have here, and
> > > it is going to take me a bit more time though (some more stuff planned
> > > for next CF, an upcoming conference and end/beginning-of-year
> > > vacations), but I think that targeting the beginning of next CF in
> > > January would be OK.
> > >
> > > Overall, I have the impression that the patch looks pretty solid, with
> > > a restriction in place for "init" and "ready" relations, while there
> > > are tests to check all the states that we expect.  Seeing coverage
> > > about all that makes me a happy hacker.
> > >
> > > + * If retain_lock is true, then don't release the locks taken in this function.
> > > + * We normally release the locks at the end of transaction but in binary-upgrade
> > > + * mode, we expect to release those immediately.
> > >
> > > I think that this should be documented in pg_upgrade_support.c where
> > > the caller expects the locks to be released, and why these should be
> > > released.  There is a risk that this comment becomes obsolete if
> > > AddSubscriptionRelState() with locks released is called in a different
> > > code path.  Anyway, I am not sure I get why this is OK, or even
> > > necessary.  It seems like a good practice to keep the locks on the
> > > subscription until the transaction that updates its state.  If there's
> > > a specific reason explaining why that's better, the patch should tell
> > > why.
> > >
> >
> > It is to be consistent with other code paths in the upgrade. We
> > followed existing coding rules like what we do in
> > binary_upgrade_set_missing_value->SetAttrMissing(). The probable
> > theory is that during the upgrade we are not worried about concurrent
> > operations being blocked till the transaction ends. As in this
> > particular case, we know that the apply worker won't try to sync any
> > of those relations or a concurrent DDL won't try to remove it from the
> > pg_subscription_rel. This point is not being explicitly commented
> > because of its similarity with the existing code.
>
> Releasing the locks early seems fine to me, though I'm not sure how much
> it helps concurrency, as the function acquires lower-level locks such as
> AccessShareLock and RowExclusiveLock (SetAttrMissing(), on the other
> hand, acquires AccessExclusiveLock on the table).
>

True, but we have kept it that way from the consistency point of view
as well. We can change it if you think otherwise.

--
With Regards,
Amit Kapila.



RE: pg_upgrade and logical replication

From
"Zhijie Hou (Fujitsu)"
Date:
On Thursday, December 7, 2023 10:23 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Thu, Dec 7, 2023 at 7:26 AM Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
> >
> > On Tue, Dec 5, 2023 at 6:37 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > >
> > > On Tue, Dec 5, 2023 at 10:56 AM Michael Paquier <michael@paquier.xyz>
> wrote:
> > > >
> > > > On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > > > > I have made minor changes in the comments and code at various
> places.
> > > > > See and let me know if you are not happy with the changes. I
> > > > > think unless there are more suggestions or comments, we can
> > > > > proceed with committing it.
> > > >
> > > > Yeah.  I am planning to look more closely at what you have here,
> > > > and it is going to take me a bit more time though (some more stuff
> > > > planned for next CF, an upcoming conference and
> > > > end/beginning-of-year vacations), but I think that targeting the
> > > > beginning of next CF in January would be OK.
> > > >
> > > > Overall, I have the impression that the patch looks pretty solid,
> > > > with a restriction in place for "init" and "ready" relations,
> > > > while there are tests to check all the states that we expect.
> > > > Seeing coverage about all that makes me a happy hacker.
> > > >
> > > > + * If retain_lock is true, then don't release the locks taken in this function.
> > > > + * We normally release the locks at the end of transaction but in
> > > > + binary-upgrade
> > > > + * mode, we expect to release those immediately.
> > > >
> > > > I think that this should be documented in pg_upgrade_support.c
> > > > where the caller expects the locks to be released, and why these
> > > > should be released.  There is a risk that this comment becomes
> > > > obsolete if
> > > > AddSubscriptionRelState() with locks released is called in a
> > > > different code path.  Anyway, I am not sure I get why this is OK,
> > > > or even necessary.  It seems like a good practice to keep the
> > > > locks on the subscription until the transaction that updates its
> > > > state.  If there's a specific reason explaining why that's better,
> > > > the patch should tell why.
> > > >
> > >
> > > It is to be consistent with other code paths in the upgrade. We
> > > followed existing coding rules like what we do in
> > > binary_upgrade_set_missing_value->SetAttrMissing(). The probable
> > > theory is that during the upgrade we are not worried about
> > > concurrent operations being blocked till the transaction ends. As in
> > > this particular case, we know that the apply worker won't try to
> > > sync any of those relations or a concurrent DDL won't try to remove
> > > it from the pg_subscription_rel. This point is not being explicitly
> > > commented because of its similarity with the existing code.
> >
> > Releasing the locks early seems fine to me, though I'm not sure how much
> > it helps concurrency, as the function acquires lower-level locks such as
> > AccessShareLock and RowExclusiveLock (SetAttrMissing(), on the other
> > hand, acquires AccessExclusiveLock on the table).
> >
> 
> True, but we have kept it that way from the consistency point of view as well.
> We can change it if you think otherwise.

I also looked into the patch and didn't find problems with the locking in
AddSubscriptionRelState.

As for the concurrency, the locks on the subscription object and
pg_subscription_rel only conflict with ALTER/DROP SUBSCRIPTION, which
holds an AccessExclusiveLock; but since there are no concurrent ALTER
SUBSCRIPTION commands during the upgrade, I think it's OK to release
them earlier.

I also thought about the cache invalidation, as we modified the catalog,
which will generate a catcache invalidation. But the apply worker, which
builds its cache based on pg_subscription_rel, is not running, and no
concurrent ALTER/DROP SUBSCRIPTION commands will be executed, so it
looks OK as well.

Best Regards,
Hou zj

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 5 Dec 2023 at 10:56, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > I have made minor changes in the comments and code at various places.
> > See and let me know if you are not happy with the changes. I think
> > unless there are more suggestions or comments, we can proceed with
> > committing it.
>
> Yeah.  I am planning to look more closely at what you have here, and
> it is going to take me a bit more time though (some more stuff planned
> for next CF, an upcoming conference and end/beginning-of-year
> vacations), but I think that targeting the beginning of next CF in
> January would be OK.
>
> Overall, I have the impression that the patch looks pretty solid, with
> a restriction in place for "init" and "ready" relations, while there
> are tests to check all the states that we expect.  Seeing coverage
> about all that makes me a happy hacker.
>
> + * If retain_lock is true, then don't release the locks taken in this function.
> + * We normally release the locks at the end of transaction but in binary-upgrade
> + * mode, we expect to release those immediately.
>
> I think that this should be documented in pg_upgrade_support.c where
> the caller expects the locks to be released, and why these should be
> released.  There is a risk that this comment becomes obsolete if
> AddSubscriptionRelState() with locks released is called in a different
> code path.  Anyway, I am not sure I get why this is OK, or even
> necessary.  It seems like a good practice to keep the locks on the
> subscription until the transaction that updates its state.  If there's
> a specific reason explaining why that's better, the patch should tell
> why.

Added comments for this.

> +     * However, this shouldn't be a problem as the upgrade ensures
> +     * that all the transactions were replicated before upgrading the
> +     * publisher.
> This wording looks a bit confusing to me, as "the upgrade" could refer
> to the upgrade of a subscriber, but what we want to tell is that the
> replay of the transactions is enforced when doing a publisher upgrade.
> I'd suggest something like "the upgrade of the publisher ensures that
> all the transactions were replicated before upgrading it".

Modified

> +my $result = $old_sub->safe_psql('postgres',
> +   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
> +is($result, qq(t), "Check that the table is in init state");
>
> Hmm.  Not sure that this is safe.  Shouldn't this be a
> poll_query_until(), polling that the state of the relation is what we
> want it to be after requesting a refresh of the publication on the
> subscriber?

This is not required, as the table will be added in the init state by the
"Alter Subscription ... Refresh ..." command itself.

Thanks for the comments; the attached v24 version patch has the
changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 7 Dec 2023 at 07:20, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Dec 4, 2023 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Dec 1, 2023 at 11:24 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > The attached v22 version patch has the changes for the same.
> > >
> >
> > I have made minor changes in the comments and code at various places.
> > See and let me know if you are not happy with the changes. I think
> > unless there are more suggestions or comments, we can proceed with
> > committing it.
> >
>
> It seems the patch is already close to ready-to-commit state but I've
> had a look at the v23 patch with fresh eyes. It looks mostly good to
> me and there are some minor comments:
>
> ---
> +   tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
> +   if (!HeapTupleIsValid(tup))
> +       ereport(ERROR,
> +               errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +               errmsg("relation %u does not exist", relid));
> +   ReleaseSysCache(tup);
>
> Given what we want to do here is just an existence check, isn't it
> clearer if we use SearchSysCacheExists1() instead?

Modified

> ---
> +        query = createPQExpBuffer();
> +        appendPQExpBuffer(query, "SELECT srsubid, srrelid, srsubstate, srsublsn"
> +                          " FROM pg_catalog.pg_subscription_rel"
> +                          " ORDER BY srsubid");
> +        res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
> +
>
> Probably we don't need to use PQExpBuffer here since the query to
> execute is a static string.

Modified

> ---
> +# The subscription's running status should be preserved. Old subscription
> +# regress_sub1 should be enabled and old subscription regress_sub2 should be
> +# disabled.
> +$result =
> +  $new_sub->safe_psql('postgres',
> +        "SELECT subenabled FROM pg_subscription ORDER BY subname");
> +is( $result, qq(t
> +f),
> +        "check that the subscription's running status are preserved");
> +
>
> How about showing the subname along with the subenabled so that we can
> check if each subscription is in the expected state in case something
> goes wrong?

Modified

> ---
> +# Subscription relations should be preserved
> +$result =
> +  $new_sub->safe_psql('postgres',
> +        "SELECT count(*) FROM pg_subscription_rel WHERE srsubid = $sub_oid");
> +is($result, qq(2),
> +        "there should be 2 rows in pg_subscription_rel(representing
> tab_upgraded1 and tab_upgraded2)"
> +);
>
> Is there any reason why we check only the number of rows in
> pg_subscription_rel? I guess it might be a good idea to check if table
> OIDs there are also preserved.

Modified

> ---
> +# Enable the subscription
> +$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
> +$publisher->wait_for_catchup('regress_sub2');
> +
>
> IIUC after making the subscription regress_sub2 enabled, we will start
> the initial table sync for the table tab_upgraded2. If so, shouldn't
> we use wait_for_subscription_sync() instead?

Modified
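
For reference, a minimal sketch of the suggested change.
wait_for_subscription_sync() waits until all tables of the subscription
reach the ready state, whereas wait_for_catchup() only waits for the
walsender to reach the current write position:

$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub2 ENABLE");
# Wait until the initial table sync triggered by enabling the
# subscription has finished, not merely until WAL has been streamed.
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub2');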

> ---
> +# Create another subscription and drop the subscription's replication origin
> +$old_sub->safe_psql('postgres',
> +        "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr'
> PUBLICATION regress_pub3 WITH (enabled=false)"
>
> It's better to put spaces before and after '='.

Modified

> ---
> +my $subid = $old_sub->safe_psql('postgres',
> +        "SELECT oid FROM pg_subscription WHERE subname = 'regress_sub4'");
>
> I think we can reuse $sub_oid.

Modified

Thanks for the comments, the v24 version patch attached at [1] has the
changes for the same.
[1] - https://www.postgresql.org/message-id/CALDaNm27%2BB6hiCS3g3nUDpfwmTaj6YopSY5ovo2%3D__iOSpkPbA%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Masahiko Sawada
Date:
On Thu, Dec 7, 2023 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 5 Dec 2023 at 10:56, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > > I have made minor changes in the comments and code at various places.
> > > See and let me know if you are not happy with the changes. I think
> > > unless there are more suggestions or comments, we can proceed with
> > > committing it.
> >
> > Yeah.  I am planning to look more closely at what you have here, and
> > it is going to take me a bit more time though (some more stuff planned
> > for next CF, an upcoming conference and end/beginning-of-year
> > vacations), but I think that targeting the beginning of next CF in
> > January would be OK.
> >
> > Overall, I have the impression that the patch looks pretty solid, with
> > a restriction in place for "init" and "ready" relations, while there
> > are tests to check all the states that we expect.  Seeing coverage
> > about all that makes me a happy hacker.
> >
> > + * If retain_lock is true, then don't release the locks taken in this function.
> > + * We normally release the locks at the end of transaction but in binary-upgrade
> > + * mode, we expect to release those immediately.
> >
> > I think that this should be documented in pg_upgrade_support.c where
> > the caller expects the locks to be released, and why these should be
> > released.  There is a risk that this comment becomes obsolete if
> > AddSubscriptionRelState() with locks released is called in a different
> > code path.  Anyway, I am not sure to get why this is OK, or even
> > necessary.  It seems like a good practice to keep the locks on the
> > subscription until the transaction that updates its state.  If there's
> > a specific reason explaining why that's better, the patch should tell
> > why.
>
> Added comments for this.
>
> > +     * However, this shouldn't be a problem as the upgrade ensures
> > +     * that all the transactions were replicated before upgrading the
> > +     * publisher.
> > This wording looks a bit confusing to me, as "the upgrade" could refer
> > to the upgrade of a subscriber, but what we want to tell is that the
> > replay of the transactions is enforced when doing a publisher upgrade.
> > I'd suggest something like "the upgrade of the publisher ensures that
> > all the transactions were replicated before upgrading it".
>
> Modified
>
> > +my $result = $old_sub->safe_psql('postgres',
> > +   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
> > +is($result, qq(t), "Check that the table is in init state");
> >
> > Hmm.  Not sure that this is safe.  Shouldn't this be a
> > poll_query_until(), polling that the state of the relation is what we
> > want it to be after requesting a refresh of the publication on the
> > subscriber?
>
> This is not required as the table will be added in init state after
> "Alter Subscription ... Refresh .." command itself.
>
> Thanks for the comments, the attached v24 version patch has the
> changes for the same.

Thank you for updating the patch.

Here are some minor comments:

+        if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+                ereport(ERROR,
+                                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("relation %u does not exist", relid));
+

I think the error code should be ERRCODE_UNDEFINED_TABLE, and the
error message should be something like "relation with OID %u does not
exist". Or we might not need such checks since an undefined-object
error is caught by relation_open()?

---
+        /* Fetch the existing tuple. */
+        tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
+                                                  CStringGetDatum(subname));
+        if (!HeapTupleIsValid(tup))
+                ereport(ERROR,
+                                errcode(ERRCODE_UNDEFINED_OBJECT),
+                                errmsg("subscription \"%s\" does not
exist", subname));
+
+        form = (Form_pg_subscription) GETSTRUCT(tup);
+        subid = form->oid;

The above code can be replaced with "get_subscription_oid(subname,
false)". binary_upgrade_replorigin_advance() has the same code.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 13 Dec 2023 at 01:56, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 7, 2023 at 8:15 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 5 Dec 2023 at 10:56, Michael Paquier <michael@paquier.xyz> wrote:
> > >
> > > On Mon, Dec 04, 2023 at 04:30:49PM +0530, Amit Kapila wrote:
> > > > I have made minor changes in the comments and code at various places.
> > > > See and let me know if you are not happy with the changes. I think
> > > > unless there are more suggestions or comments, we can proceed with
> > > > committing it.
> > >
> > > Yeah.  I am planning to look more closely at what you have here, and
> > > it is going to take me a bit more time though (some more stuff planned
> > > for next CF, an upcoming conference and end/beginning-of-year
> > > vacations), but I think that targeting the beginning of next CF in
> > > January would be OK.
> > >
> > > Overall, I have the impression that the patch looks pretty solid, with
> > > a restriction in place for "init" and "ready" relations, while there
> > > are tests to check all the states that we expect.  Seeing coverage
> > > about all that makes me a happy hacker.
> > >
> > > + * If retain_lock is true, then don't release the locks taken in this function.
> > > + * We normally release the locks at the end of transaction but in binary-upgrade
> > > + * mode, we expect to release those immediately.
> > >
> > > I think that this should be documented in pg_upgrade_support.c where
> > > the caller expects the locks to be released, and why these should be
> > > released.  There is a risk that this comment becomes obsolete if
> > > AddSubscriptionRelState() with locks released is called in a different
> > > code path.  Anyway, I am not sure to get why this is OK, or even
> > > necessary.  It seems like a good practice to keep the locks on the
> > > subscription until the transaction that updates its state.  If there's
> > > a specific reason explaining why that's better, the patch should tell
> > > why.
> >
> > Added comments for this.
> >
> > > +     * However, this shouldn't be a problem as the upgrade ensures
> > > +     * that all the transactions were replicated before upgrading the
> > > +     * publisher.
> > > This wording looks a bit confusing to me, as "the upgrade" could refer
> > > to the upgrade of a subscriber, but what we want to tell is that the
> > > replay of the transactions is enforced when doing a publisher upgrade.
> > > I'd suggest something like "the upgrade of the publisher ensures that
> > > all the transactions were replicated before upgrading it".
> >
> > Modified
> >
> > > +my $result = $old_sub->safe_psql('postgres',
> > > +   "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'i'");
> > > +is($result, qq(t), "Check that the table is in init state");
> > >
> > > Hmm.  Not sure that this is safe.  Shouldn't this be a
> > > poll_query_until(), polling that the state of the relation is what we
> > > want it to be after requesting a refresh of the publication on the
> > > subscriber?
> >
> > This is not required as the table will be added in init state after
> > "Alter Subscription ... Refresh .." command itself.
> >
> > Thanks for the comments, the attached v24 version patch has the
> > changes for the same.
>
> Thank you for updating the patch.
>
> Here are some minor comments:
>
> +        if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
> +                ereport(ERROR,
> +                                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                                errmsg("relation %u does not exist", relid));
> +
>
> I think the error code should be ERRCODE_UNDEFINED_TABLE, and the
> error message should be something like "relation with OID %u does not
> exist". Or we might not need such checks since an undefined-object
> error is caught by relation_open()?

I have removed this as it will be caught by relation_open.

> ---
> +        /* Fetch the existing tuple. */
> +        tup = SearchSysCache2(SUBSCRIPTIONNAME, MyDatabaseId,
> +                                                  CStringGetDatum(subname));
> +        if (!HeapTupleIsValid(tup))
> +                ereport(ERROR,
> +                                errcode(ERRCODE_UNDEFINED_OBJECT),
> +                                errmsg("subscription \"%s\" does not
> exist", subname));
> +
> +        form = (Form_pg_subscription) GETSTRUCT(tup);
> +        subid = form->oid;

Modified

> The above code can be replaced with "get_subscription_oid(subname,
> false)". binary_upgrade_replorigin_advance() has the same code.

Modified

Thanks for the comments, the attached v25 version patch has the
changes for the same.

Regards,
Vignesh

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the comments, the attached v25 version patch has the
> changes for the same.
>

I have looked at it again and made some cosmetic changes like changing
some comments and a minor change in one of the error messages. See, if
the changes look okay to you.

--
With Regards,
Amit Kapila.

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Thu, 28 Dec 2023 at 15:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the comments, the attached v25 version patch has the
> > changes for the same.
> >
>
> I have looked at it again and made some cosmetic changes like changing
> some comments and a minor change in one of the error messages. See, if
> the changes look okay to you.

Thanks, the changes look good.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 28 Dec 2023 at 15:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for the comments, the attached v25 version patch has the
> > > changes for the same.
> > >
> >
> > I have looked at it again and made some cosmetic changes like changing
> > some comments and a minor change in one of the error messages. See, if
> > the changes look okay to you.
>
> Thanks, the changes look good.
>

Pushed.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:
> On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:
>> Thanks, the changes look good.
>
> Pushed.

Yeah!  Thanks Amit and everybody involved here!  Thanks also to Julien
for raising the thread and the problem, to start with.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Jan 3, 2024 at 6:21 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:
> > On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:
> >> Thanks, the changes look good.
> >
> > Pushed.
>
> Yeah!  Thanks Amit and everybody involved here!  Thanks also to Julien
> for raising the thread and the problem, to start with.
>

I think the next possible step here is to document how to upgrade the
logical replication nodes as previously discussed in this thread [1].
IIRC, there were a few issues with the steps mentioned but if we want
to document those we can start a separate thread for it as that
involves both publishers and subscribers.

[1] - https://www.postgresql.org/message-id/CALDaNm2pe7SoOGtRkrTNsnZPnaaY%2B2iHC40HBYCSLYmyRg0wSw%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Jan 03, 2024 at 11:24:50AM +0530, Amit Kapila wrote:
> I think the next possible step here is to document how to upgrade the
> logical replication nodes as previously discussed in this thread [1].
> IIRC, there were a few issues with the steps mentioned but if we want
> to document those we can start a separate thread for it as that
> involves both publishers and subscribers.
>
> [1] - https://www.postgresql.org/message-id/CALDaNm2pe7SoOGtRkrTNsnZPnaaY%2B2iHC40HBYCSLYmyRg0wSw%40mail.gmail.com

Yep.  A second thing is whether it makes sense to have more automated
test coverage when it comes to the interactions between subscribers
and publishers with more complex node structures.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Jan 3, 2024 at 11:33 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jan 03, 2024 at 11:24:50AM +0530, Amit Kapila wrote:
> > I think the next possible step here is to document how to upgrade the
> > logical replication nodes as previously discussed in this thread [1].
> > IIRC, there were a few issues with the steps mentioned but if we want
> > to document those we can start a separate thread for it as that
> > involves both publishers and subscribers.
> >
> > [1] - https://www.postgresql.org/message-id/CALDaNm2pe7SoOGtRkrTNsnZPnaaY%2B2iHC40HBYCSLYmyRg0wSw%40mail.gmail.com
>
> Yep.  A second thing is whether it makes sense to have more automated
> test coverage when it comes to the interferences between subscribers
> and publishers with more complex node structures.
>

I think it would be good to finish the pending patch to improve the
IsBinaryUpgrade check [1] which we decided to do once this patch is
ready. Would you like to take that up or do you want me to finish it?

[1] - https://www.postgresql.org/message-id/ZU2TeVkUg5qEi7Oy%40paquier.xyz
[2] - https://www.postgresql.org/message-id/ZVQtUTdJACnsbbpd%40paquier.xyz

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Jan 03, 2024 at 03:18:50PM +0530, Amit Kapila wrote:
> I think it would be good to finish the pending patch to improve the
> IsBinaryUpgrade check [1] which we decided to do once this patch is
> ready. Would you like to take that up or do you want me to finish it?
>
> [1] - https://www.postgresql.org/message-id/ZU2TeVkUg5qEi7Oy%40paquier.xyz
> [2] - https://www.postgresql.org/message-id/ZVQtUTdJACnsbbpd%40paquier.xyz

Yep, that's on my TODO.  I can send a new version at the beginning of
next week.  No problem.
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 3 Jan 2024 at 11:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 3, 2024 at 6:21 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:
> > > On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:
> > >> Thanks, the changes look good.
> > >
> > > Pushed.
> >
> > Yeah!  Thanks Amit and everybody involved here!  Thanks also to Julien
> > for raising the thread and the problem, to start with.
> >
>
> I think the next possible step here is to document how to upgrade the
> logical replication nodes as previously discussed in this thread [1].
> IIRC, there were a few issues with the steps mentioned but if we want
> to document those we can start a separate thread for it as that
> involves both publishers and subscribers.

I have posted a patch for this at:
https://www.postgresql.org/message-id/CALDaNm1_iDO6srWzntqTr0ZDVkk2whVhNKEWAvtgZBfSmuBeZQ%40mail.gmail.com

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Tue, 2 Jan 2024 at 15:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 29, 2023 at 2:26 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, 28 Dec 2023 at 15:59, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 13, 2023 at 12:09 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Thanks for the comments, the attached v25 version patch has the
> > > > changes for the same.
> > > >
> > >
> > > I have looked at it again and made some cosmetic changes like changing
> > > some comments and a minor change in one of the error messages. See, if
> > > the changes look okay to you.
> >
> > Thanks, the changes look good.
> >
>
> Pushed.

Thanks for pushing this patch, I have updated the commitfest entry to
Committed for the same.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Jan 03, 2024 at 03:18:50PM +0530, Amit Kapila wrote:
> I think it would be good to finish the pending patch to improve the
> IsBinaryUpgrade check [1] which we decided to do once this patch is
> ready. Would you like to take that up or do you want me to finish it?
>
> [1] - https://www.postgresql.org/message-id/ZU2TeVkUg5qEi7Oy%40paquier.xyz
> [2] - https://www.postgresql.org/message-id/ZVQtUTdJACnsbbpd%40paquier.xyz

My apologies for the delay, again.  I have sent an update here:
https://www.postgresql.org/message-id/ZZ4f3zKu0YyFndHi@paquier.xyz
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Wed, 14 Feb 2024 at 09:07, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Justin,
>
> > pg_upgrade/t/004_subscription.pl says
> >
> > |my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
> >
> > ..but I think maybe it should not.
> >
> > When you try to use --link, it fails:
> > https://cirrus-ci.com/task/4669494061170688
> >
> > |Adding ".old" suffix to old global/pg_control                 ok
> > |
> > |If you want to start the old cluster, you will need to remove
> > |the ".old" suffix from
> > /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_su
> > bscription_old_sub_data/pgdata/global/pg_control.old.
> > |Because "link" mode was used, the old cluster cannot be safely
> > |started once the new cluster has been started.
> > |...
> > |
> > |postgres: could not find the database system
> > |Expected to find it in the directory
> > "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_s
> > ubscription_old_sub_data/pgdata",
> > |but could not open file
> > "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_s
> > ubscription_old_sub_data/pgdata/global/pg_control": No such file or directory
> > |# No postmaster PID for node "old_sub"
> > |[19:36:01.396](0.250s) Bail out!  pg_ctl start failed
> >
>
> Good catch! The primary cause of the failure is that the old cluster is reused, even
> after the successful upgrade. The documentation says [1]:
>
> >
> If you use link mode, the upgrade will be much faster (no file copying) and use less
> disk space, but you will not be able to access your old cluster once you start the new
> cluster after the upgrade.
> >
>
> > You could rename pg_control.old to avoid that immediate error, but that doesn't
> > address the essential issue that "the old cluster cannot be safely started once
> > the new cluster has been started."
>
> Yeah, I agree that accessing the old cluster after the upgrade should be avoided.
> IIUC, pg_upgrade is run three times in 004_subscription.
>
> 1. successful upgrade
> 2. failure due to the insufficient max_replication_slot
> 3. failure because the pg_subscription_rel has 'd' state
>
> And the old instance is reused in all of these runs. Therefore, the most reasonable
> fix is to change the ordering of the tests, i.e., the "successful upgrade" case should run last.
>
> The attached patch modifies the test accordingly. It also contains some optimizations.

Your proposal to change the tests to the following order looks good to
me: a) failure due to insufficient max_replication_slots, b) failure
because pg_subscription_rel has a 'd' state, c) successful upgrade.
I have also verified that your changes fix the issue, as the successful
upgrade is moved to the end and the old cluster is no longer used after
the upgrade.

One minor suggestion:
There is an extra line break here; it can be removed:
@@ -181,139 +310,5 @@ is($result, qq(1),
        "check the data is synced after enabling the subscription for
the table that was in init state"
 );

-# cleanup

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Justin Pryzby
Date:
On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:
> Pushed.

pg_upgrade/t/004_subscription.pl says

|my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';

..but I think maybe it should not.

When you try to use --link, it fails:
https://cirrus-ci.com/task/4669494061170688

|Adding ".old" suffix to old global/pg_control                 ok
|
|If you want to start the old cluster, you will need to remove
|the ".old" suffix from
/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control.old.
|Because "link" mode was used, the old cluster cannot be safely
|started once the new cluster has been started.
|...
|
|postgres: could not find the database system
|Expected to find it in the directory
"/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata",
|but could not open file
"/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_subscription_old_sub_data/pgdata/global/pg_control":
Nosuch file or directory
 
|# No postmaster PID for node "old_sub"
|[19:36:01.396](0.250s) Bail out!  pg_ctl start failed

You could rename pg_control.old to avoid that immediate error, but that doesn't
address the essential issue that "the old cluster cannot be safely started once
the new cluster has been started."

-- 
Justin



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Tue, Feb 13, 2024 at 03:05:14PM -0600, Justin Pryzby wrote:
> On Tue, Jan 02, 2024 at 03:58:25PM +0530, Amit Kapila wrote:
> > Pushed.
>
> pg_upgrade/t/004_subscription.pl says
>
> |my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
>
> ..but I think maybe it should not.
>
> When you try to use --link, it fails:
> https://cirrus-ci.com/task/4669494061170688

Thanks.  It is the kind of things we don't want to lose sight on, so I
have taken this occasion to create a wiki page for the open items of
17, and added this one to it:
https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items
--
Michael

Attachment

RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Justin,

> pg_upgrade/t/004_subscription.pl says
>
> |my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
>
> ..but I think maybe it should not.
>
> When you try to use --link, it fails:
> https://cirrus-ci.com/task/4669494061170688
>
> |Adding ".old" suffix to old global/pg_control                 ok
> |
> |If you want to start the old cluster, you will need to remove
> |the ".old" suffix from
> /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_su
> bscription_old_sub_data/pgdata/global/pg_control.old.
> |Because "link" mode was used, the old cluster cannot be safely
> |started once the new cluster has been started.
> |...
> |
> |postgres: could not find the database system
> |Expected to find it in the directory
> "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_s
> ubscription_old_sub_data/pgdata",
> |but could not open file
> "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_s
> ubscription_old_sub_data/pgdata/global/pg_control": No such file or directory
> |# No postmaster PID for node "old_sub"
> |[19:36:01.396](0.250s) Bail out!  pg_ctl start failed
>

Good catch! The primary cause of the failure is that the old cluster is reused, even
after the successful upgrade. The documentation says [1]:

>
If you use link mode, the upgrade will be much faster (no file copying) and use less
disk space, but you will not be able to access your old cluster once you start the new
cluster after the upgrade.
>

> You could rename pg_control.old to avoid that immediate error, but that doesn't
> address the essential issue that "the old cluster cannot be safely started once
> the new cluster has been started."

Yeah, I agree that accessing the old cluster after the upgrade should be avoided.
IIUC, pg_upgrade is run three times in 004_subscription.

1. successful upgrade
2. failure due to the insufficient max_replication_slot
3. failure because the pg_subscription_rel has 'd' state

And the old instance is reused in all of these runs. Therefore, the most reasonable
fix is to change the ordering of the tests, i.e., the "successful upgrade" case should run last.

The attached patch modifies the test accordingly. It also contains some optimizations.
It passes the test in my environment:

```
pg_upgrade]$ PG_TEST_PG_UPGRADE_MODE='--link' PG_TEST_TIMEOUT_DEFAULT=10 make check PROVE_TESTS='t/004_subscription.pl'
...
# +++ tap check in src/bin/pg_upgrade +++
t/004_subscription.pl .. ok
All tests successful.
Files=1, Tests=14,  9 wallclock secs ( 0.03 usr  0.00 sys +  0.55 cusr  1.08 csys =  1.66 CPU)
Result: PASS
```

What do you think?

[1]: https://www.postgresql.org/docs/devel/pgupgrade.html

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/



Attachment

RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vignesh,

Thanks for verifying the fix!

> Your proposal to change the tests to the following order looks good to
> me: a) failure due to insufficient max_replication_slots, b) failure
> because pg_subscription_rel has a 'd' state, c) successful upgrade.

Right.

> I have also verified that your changes fix the issue, as the successful
> upgrade is moved to the end and the old cluster is no longer used after
> the upgrade.

Yeah, that is the same as my expectation.

> One minor suggestion:
> There is an extra line break here; it can be removed:
> @@ -181,139 +310,5 @@ is($result, qq(1),
>         "check the data is synced after enabling the subscription for
> the table that was in init state"
>  );
> 
> -# cleanup
>

Removed.

PSA a new version patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Feb 14, 2024 at 9:07 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> > pg_upgrade/t/004_subscription.pl says
> >
> > |my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
> >
> > ..but I think maybe it should not.
> >
> > When you try to use --link, it fails:
> > https://cirrus-ci.com/task/4669494061170688
> >
> > |Adding ".old" suffix to old global/pg_control                 ok
> > |
> > |If you want to start the old cluster, you will need to remove
> > |the ".old" suffix from
> > /tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_su
> > bscription_old_sub_data/pgdata/global/pg_control.old.
> > |Because "link" mode was used, the old cluster cannot be safely
> > |started once the new cluster has been started.
> > |...
> > |
> > |postgres: could not find the database system
> > |Expected to find it in the directory
> > "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_s
> > ubscription_old_sub_data/pgdata",
> > |but could not open file
> > "/tmp/cirrus-ci-build/build/testrun/pg_upgrade/004_subscription/data/t_004_s
> > ubscription_old_sub_data/pgdata/global/pg_control": No such file or directory
> > |# No postmaster PID for node "old_sub"
> > |[19:36:01.396](0.250s) Bail out!  pg_ctl start failed
> >
>
> Good catch! The primary cause of the failure is that the old cluster is reused, even
> after the successful upgrade. The documentation says [1]:
>
> >
> If you use link mode, the upgrade will be much faster (no file copying) and use less
> disk space, but you will not be able to access your old cluster once you start the new
> cluster after the upgrade.
> >
>
> > You could rename pg_control.old to avoid that immediate error, but that doesn't
> > address the essential issue that "the old cluster cannot be safely started once
> > the new cluster has been started."
>
> Yeah, I agree that accessing the old cluster after the upgrade should be avoided.
> IIUC, pg_upgrade is run three times in 004_subscription.
>
> 1. successful upgrade
> 2. failure due to the insufficient max_replication_slot
> 3. failure because the pg_subscription_rel has 'd' state
>
> And the old instance is reused in all of these runs. Therefore, the most reasonable
> fix is to change the ordering of the tests, i.e., the "successful upgrade" case should run last.
>

This sounds like a reasonable way to address the reported problem.
Justin, do let me know if you think otherwise?

Comment:
===========
*
-# Setup an enabled subscription to verify that the running status and failover
-# option are retained after the upgrade.
+# Setup a subscription to verify that the failover option are retained after
+# the upgrade.
 $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
 $old_sub->safe_psql('postgres',
- "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1 WITH (failover = true)"
+ "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
regress_pub1 WITH (failover = true, enabled = false)"
 );

I think it is better not to create a subscription in the early stage
which we wanted to use for the success case. Let's have separate
subscriptions for failure and success cases. I think that will avoid
the newly added ALTER statements in the patch.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Justin Pryzby
Date:
On Wed, Feb 14, 2024 at 03:37:03AM +0000, Hayato Kuroda (Fujitsu) wrote:
> The attached patch modifies the test accordingly. It also contains some optimizations.
> It passes the test in my environment:

What optimizations?  I can't see them, and since the patch is described
as rearranging test cases (and therefore already difficult to read), I
guess they should be a separate patch, or the optimizations described.

-- 
Justin



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit,

> This sounds like a reasonable way to address the reported problem.

OK, thanks!

> Justin, do let me know if you think otherwise?
> 
> Comment:
> ===========
> *
> -# Setup an enabled subscription to verify that the running status and failover
> -# option are retained after the upgrade.
> +# Setup a subscription to verify that the failover option are retained after
> +# the upgrade.
>  $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
>  $old_sub->safe_psql('postgres',
> - "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> regress_pub1 WITH (failover = true)"
> + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> PUBLICATION
> regress_pub1 WITH (failover = true, enabled = false)"
>  );
> 
> I think it is better not to create a subscription in the early stage
> which we wanted to use for the success case. Let's have separate
> subscriptions for failure and success cases. I think that will avoid
> the newly added ALTER statements in the patch.

I had made the patch avoid creating objects as much as possible, but that
may lead to some confusion. I have recreated the patch so that publications
and subscriptions are created, and dropped at cleanup, for every test case.

PSA a new version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Justin,

Thanks for replying!

> What optimizations?  I can't see them, and since the patch is described
> as rearranging test cases (and therefore already difficult to read), I
> guess they should be a separate patch, or the optimizations described.

The basic idea was to reduce the number of CREATE/DROP statements, but that
has been changed for now: publications and subscriptions are created and
dropped per test case.

E.g., in the case of a successful upgrade, the following steps are performed
(a condensed sketch follows the list):

1. create two publications
2. create a subscription with failover = true
3. avoid further initial sync by setting max_logical_replication_workers = 0
4. create another subscription
5. confirm statuses of tables are either of 'i' or 'r'
6. run pg_upgrade
7. confirm table statuses are preserved
8. confirm replication origins are preserved.
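
A condensed sketch of those steps (node and object names follow the test
discussed in this thread; the exact committed script may differ):

# 1-2. Create the publication and a failover subscription on the old node.
$publisher->safe_psql('postgres',
    "CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
$old_sub->safe_psql('postgres',
    "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION regress_pub3 WITH (failover = true)");
# 3. Stop further initial syncs so relation states no longer change.
$old_sub->append_conf('postgresql.conf', "max_logical_replication_workers = 0");
$old_sub->restart;
# 4. Create another subscription; its table stays in the 'i' state.
$old_sub->safe_psql('postgres',
    "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr' PUBLICATION regress_pub4");
# 5-8. Check that every pg_subscription_rel row is in 'i' or 'r' state,
# run pg_upgrade, then verify the states and replication origins again.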

New patch is available in [1].

[1]:
https://www.postgresql.org/message-id/TYCPR01MB12077B16EEDA360BA645B96F8F54C2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/




Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 16 Feb 2024 at 08:22, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Amit,
>
> > This sounds like a reasonable way to address the reported problem.
>
> OK, thanks!
>
> > Justin, do let me know if you think otherwise?
> >
> > Comment:
> > ===========
> > *
> > -# Setup an enabled subscription to verify that the running status and failover
> > -# option are retained after the upgrade.
> > +# Setup a subscription to verify that the failover option are retained after
> > +# the upgrade.
> >  $publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> >  $old_sub->safe_psql('postgres',
> > - "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr' PUBLICATION
> > regress_pub1 WITH (failover = true)"
> > + "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> > PUBLICATION
> > regress_pub1 WITH (failover = true, enabled = false)"
> >  );
> >
> > I think it is better not to create a subscription in the early stage
> > which we wanted to use for the success case. Let's have separate
> > subscriptions for failure and success cases. I think that will avoid
> > the newly added ALTER statements in the patch.
>
> I made a patch to avoid creating objects as much as possible, but it
> may lead some confusion. I recreated a patch for creating pub/sub
> and dropping them at cleanup for every test cases.
>
> PSA a new version.

Thanks for the updated patch, few suggestions:
1) Can we use a new publication for this subscription too so that the
publication and subscription naming will become consistent throughout
the test case:
+# Table will be in 'd' (data is being copied) state as table sync will fail
+# because of primary key constraint error.
+my $started_query =
+  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
+$old_sub->poll_query_until('postgres', $started_query)
+  or die
+  "Timed out while waiting for the table state to become 'd' (datasync)";
+
+# Create another subscription and drop the subscription's replication origin
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr'
PUBLICATION regress_pub2 WITH (enabled = false)"
+);

So after the change it will become like subscription regress_sub3 for
publication regress_pub3, subscription regress_sub4 for publication
regress_pub4 and subscription regress_sub5 for publication
regress_pub5.

2) The tab_upgraded1 table can be created along with create
publication and create subscription itself:
$publisher->safe_psql('postgres',
"CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
$old_sub->safe_psql('postgres',
"CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
regress_pub3 WITH (failover = true)"
);

3) The tab_upgraded2 table can be created along with create
publication and create subscription itself to keep it consistent:
 $publisher->safe_psql('postgres',
-       "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
+       "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
 $old_sub->safe_psql('postgres',
-       "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
+       "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr'
PUBLICATION regress_pub4"
+);

With above fixes, the following can be removed:
# Initial setup
$publisher->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);
$old_sub->safe_psql(
'postgres', qq[
CREATE TABLE tab_upgraded1(id int);
CREATE TABLE tab_upgraded2(id int);
]);

Regards,
Vignesh



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vignesh,

Thanks for reviewing! PSA new version.

> 
> Thanks for the updated patch, few suggestions:
> 1) Can we use a new publication for this subscription too so that the
> publication and subscription naming will become consistent throughout
> the test case:
> +# Table will be in 'd' (data is being copied) state as table sync will fail
> +# because of primary key constraint error.
> +my $started_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
> +$old_sub->poll_query_until('postgres', $started_query)
> +  or die
> +  "Timed out while waiting for the table state to become 'd' (datasync)";
> +
> +# Create another subscription and drop the subscription's replication origin
> +$old_sub->safe_psql('postgres',
> +       "CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr'
> PUBLICATION regress_pub2 WITH (enabled = false)"
> +);
>
> So after the change it will become like subscription regress_sub3 for
> publication regress_pub3, subscription regress_sub4 for publication
> regress_pub4 and subscription regress_sub5 for publication
> regress_pub5.

A new publication was defined.

> 2) The tab_upgraded1 table can be created along with create
> publication and create subscription itself:
> $publisher->safe_psql('postgres',
> "CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
> $old_sub->safe_psql('postgres',
> "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
> regress_pub3 WITH (failover = true)"
> );

The definition of tab_upgraded1 was moved to the place you pointed out.

> 3) The tab_upgraded2 table can be created along with create
> publication and create subscription itself to keep it consistent:
>  $publisher->safe_psql('postgres',
> -       "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
> +       "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
>  $old_sub->safe_psql('postgres',
> -       "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
> +       "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr'
> PUBLICATION regress_pub4"
> +);

Ditto.

> With above fixes, the following can be removed:
> # Initial setup
> $publisher->safe_psql(
> 'postgres', qq[
> CREATE TABLE tab_upgraded1(id int);
> CREATE TABLE tab_upgraded2(id int);
> ]);
> $old_sub->safe_psql(
> 'postgres', qq[
> CREATE TABLE tab_upgraded1(id int);
> CREATE TABLE tab_upgraded2(id int);
> ]);

Yes, earlier definitions were removed instead.
Also, some comments were adjusted based on these fixes.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Fri, Feb 16, 2024 at 10:50 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Thanks for reviewing! PSA new version.
>

+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");

Why is pg_createsubscriber referred to here? I think it is a typo.

Other than that patch looks good to me.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Fri, 16 Feb 2024 at 10:50, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Vignesh,
>
> Thanks for reviewing! PSA new version.
>
> >
> > Thanks for the updated patch, few suggestions:
> > 1) Can we use a new publication for this subscription too so that the
> > publication and subscription naming will become consistent throughout
> > the test case:
> > +# Table will be in 'd' (data is being copied) state as table sync will fail
> > +# because of primary key constraint error.
> > +my $started_query =
> > +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd'";
> > +$old_sub->poll_query_until('postgres', $started_query)
> > +  or die
> > +  "Timed out while waiting for the table state to become 'd' (datasync)";
> > +
> > +# Create another subscription and drop the subscription's replication origin
> > +$old_sub->safe_psql('postgres',
> > +       "CREATE SUBSCRIPTION regress_sub3 CONNECTION '$connstr'
> > PUBLICATION regress_pub2 WITH (enabled = false)"
> > +);
> >
> > So after the change it will become like subscription regress_sub3 for
> > publication regress_pub3, subscription regress_sub4 for publication
> > regress_pub4 and subscription regress_sub5 for publication
> > regress_pub5.
>
> A new publication was defined.
>
> > 2) The tab_upgraded1 table can be created along with create
> > publication and create subscription itself:
> > $publisher->safe_psql('postgres',
> > "CREATE PUBLICATION regress_pub3 FOR TABLE tab_upgraded1");
> > $old_sub->safe_psql('postgres',
> > "CREATE SUBSCRIPTION regress_sub4 CONNECTION '$connstr' PUBLICATION
> > regress_pub3 WITH (failover = true)"
> > );
>
> The definition of tab_upgraded1 was moved to the place you pointed out.
>
> > 3) The tab_upgraded2 table can be created along with create
> > publication and create subscription itself to keep it consistent:
> >  $publisher->safe_psql('postgres',
> > -       "ALTER PUBLICATION regress_pub2 ADD TABLE tab_upgraded2");
> > +       "CREATE PUBLICATION regress_pub4 FOR TABLE tab_upgraded2");
> >  $old_sub->safe_psql('postgres',
> > -       "ALTER SUBSCRIPTION regress_sub2 REFRESH PUBLICATION");
> > +       "CREATE SUBSCRIPTION regress_sub5 CONNECTION '$connstr'
> > PUBLICATION regress_pub4"
> > +);
>
> Ditto.
>
> > With above fixes, the following can be removed:
> > # Initial setup
> > $publisher->safe_psql(
> > 'postgres', qq[
> > CREATE TABLE tab_upgraded1(id int);
> > CREATE TABLE tab_upgraded2(id int);
> > ]);
> > $old_sub->safe_psql(
> > 'postgres', qq[
> > CREATE TABLE tab_upgraded1(id int);
> > CREATE TABLE tab_upgraded2(id int);
> > ]);
>
> Yes, earlier definitions were removed instead.
> Also, some comments were adjusted based on these fixes.

Thanks for the updated patch, a few suggestions:
1) This can be moved to keep it similar to the other tests:
+# Setup a disabled subscription. The upcoming test will check the
+# pg_createsubscriber won't work, so it is sufficient.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],

like below, and the extra comment can be removed:
+# ------------------------------------------------------
+# Check that pg_upgrade fails when max_replication_slots configured in the new
+# cluster is less than the number of subscriptions in the old cluster.
+# ------------------------------------------------------
+# Create a disabled subscription.
+$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
+$old_sub->safe_psql('postgres',
+       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
PUBLICATION regress_pub1 WITH (enabled = false)"
+);
+
+$old_sub->stop;
+
+$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
+
+# pg_upgrade will fail because the new cluster has insufficient
+# max_replication_slots.
+command_checks_all(
+       [
+               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
+               '-D', $new_sub->data_dir, '-b', $oldbindir,
+               '-B', $newbindir, '-s', $new_sub->host,
+               '-p', $old_sub->port, '-P', $new_sub->port,
+               $mode, '--check',
+       ],

2) This comment can be slightly changed:
+# Change configuration as well not to start the initial sync automatically
+$new_sub->append_conf('postgresql.conf',
+       "max_logical_replication_workers = 0");

to:
Change configuration so that the initial table sync does not get
started automatically

3) The old comments were slightly better:
# Resume the initial sync and wait until all tables of subscription
# 'regress_sub5' are synchronized
$new_sub->append_conf('postgresql.conf',
"max_logical_replication_workers = 10");
$new_sub->restart;
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Like:
# Enable the subscription
$new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");

# Wait until all tables of subscription 'regress_sub5' are synchronized
$new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Sat, Feb 17, 2024 at 10:05 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 16 Feb 2024 at 10:50, Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
>
> Thanks for the updated patch, a few suggestions:
> 1) This can be moved to keep it similar to the other tests:
> +# Setup a disabled subscription. The upcoming test will check the
> +# pg_createsubscriber won't work, so it is sufficient.
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> +       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> PUBLICATION regress_pub1 WITH (enabled = false)"
> +);
> +
> +$old_sub->stop;
> +
> +# ------------------------------------------------------
> +# Check that pg_upgrade fails when max_replication_slots configured in the new
> +# cluster is less than the number of subscriptions in the old cluster.
> +# ------------------------------------------------------
> +$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
> +
> +# pg_upgrade will fail because the new cluster has insufficient
> +# max_replication_slots.
> +command_checks_all(
> +       [
> +               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> +               '-D', $new_sub->data_dir, '-b', $oldbindir,
> +               '-B', $newbindir, '-s', $new_sub->host,
> +               '-p', $old_sub->port, '-P', $new_sub->port,
> +               $mode, '--check',
> +       ],
>
> like below and the extra comment can be removed:
> +# ------------------------------------------------------
> +# Check that pg_upgrade fails when max_replication_slots configured in the new
> +# cluster is less than the number of subscriptions in the old cluster.
> +# ------------------------------------------------------
> +# Create a disabled subscription.
>

It is okay to adjust as you are suggesting, but I find Kuroda-San's
comment better than just saying "Create a disabled subscription.", as
it explicitly tells why it is okay to create a disabled
subscription.

>
> 3) The old comments were slightly better:
> # Resume the initial sync and wait until all tables of subscription
> # 'regress_sub5' are synchronized
> $new_sub->append_conf('postgresql.conf',
> "max_logical_replication_workers = 10");
> $new_sub->restart;
> $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
> $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
>
> Like:
> # Enable the subscription
> $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5 ENABLE");
>
> # Wait until all tables of subscription 'regress_sub5' are synchronized
> $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
>

I would prefer Kuroda-San's version, as his comment explains the
intent of the test better, whereas your suggestion just restates what
the next line of code is doing and is self-explanatory.

--
With Regards,
Amit Kapila.



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vignesh,

Thanks for reviewing! PSA new version.

> 
> Thanks for the updated patch, a few suggestions:
> 1) This can be moved to keep it similar to the other tests:
> +# Setup a disabled subscription. The upcoming test will check the
> +# pg_createsubscriber won't work, so it is sufficient.
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> +       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> PUBLICATION regress_pub1 WITH (enabled = false)"
> +);
> +
> +$old_sub->stop;
> +
> +# ------------------------------------------------------
> +# Check that pg_upgrade fails when max_replication_slots configured in the new
> +# cluster is less than the number of subscriptions in the old cluster.
> +# ------------------------------------------------------
> +$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
> +
> +# pg_upgrade will fail because the new cluster has insufficient
> +# max_replication_slots.
> +command_checks_all(
> +       [
> +               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> +               '-D', $new_sub->data_dir, '-b', $oldbindir,
> +               '-B', $newbindir, '-s', $new_sub->host,
> +               '-p', $old_sub->port, '-P', $new_sub->port,
> +               $mode, '--check',
> +       ],
> 
> like below and the extra comment can be removed:
> +# ------------------------------------------------------
> +# Check that pg_upgrade fails when max_replication_slots configured in the new
> +# cluster is less than the number of subscriptions in the old cluster.
> +# ------------------------------------------------------
> +# Create a disabled subscription.
> +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> +$old_sub->safe_psql('postgres',
> +       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> PUBLICATION regress_pub1 WITH (enabled = false)"
> +);
> +
> +$old_sub->stop;
> +
> +$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
> +
> +# pg_upgrade will fail because the new cluster has insufficient
> +# max_replication_slots.
> +command_checks_all(
> +       [
> +               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> +               '-D', $new_sub->data_dir, '-b', $oldbindir,
> +               '-B', $newbindir, '-s', $new_sub->host,
> +               '-p', $old_sub->port, '-P', $new_sub->port,
> +               $mode, '--check',
> +       ],

Partially fixed. I moved the creation part below, but the comments were kept.

> 2) This comment can be slightly changed:
> +# Change configuration as well not to start the initial sync automatically
> +$new_sub->append_conf('postgresql.conf',
> +       "max_logical_replication_workers = 0");
> 
> to:
> Change configuration so that the initial table sync does not get
> started automatically

Fixed.

> 3) The old comments were slightly better:
> # Resume the initial sync and wait until all tables of subscription
> # 'regress_sub5' are synchronized
> $new_sub->append_conf('postgresql.conf',
> "max_logical_replication_workers = 10");
> $new_sub->restart;
> $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
> ENABLE");
> $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
> 
> Like:
> # Enable the subscription
> $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
> ENABLE");
> 
> # Wait until all tables of subscription 'regress_sub5' are synchronized
> $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');

Per the comments from Amit [1], I did not change this.

[1]: https://www.postgresql.org/message-id/CAA4eK1Ls%2BRmJtTvOgaRXd%2BeHSY3x-KUE%3DsfEGQoU-JF_UzA62A%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Attachment

Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 19 Feb 2024 at 06:54, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Vignesh,
>
> Thanks for reviewing! PSA new version.
>
> >
> > Thanks for the updated patch, a few suggestions:
> > 1) This can be moved to keep it similar to the other tests:
> > +# Setup a disabled subscription. The upcoming test will check the
> > +# pg_createsubscriber won't work, so it is sufficient.
> > +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> > +$old_sub->safe_psql('postgres',
> > +       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> > PUBLICATION regress_pub1 WITH (enabled = false)"
> > +);
> > +
> > +$old_sub->stop;
> > +
> > +# ------------------------------------------------------
> > +# Check that pg_upgrade fails when max_replication_slots configured in the new
> > +# cluster is less than the number of subscriptions in the old cluster.
> > +# ------------------------------------------------------
> > +$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
> > +
> > +# pg_upgrade will fail because the new cluster has insufficient
> > +# max_replication_slots.
> > +command_checks_all(
> > +       [
> > +               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> > +               '-D', $new_sub->data_dir, '-b', $oldbindir,
> > +               '-B', $newbindir, '-s', $new_sub->host,
> > +               '-p', $old_sub->port, '-P', $new_sub->port,
> > +               $mode, '--check',
> > +       ],
> >
> > like below and the extra comment can be removed:
> > +# ------------------------------------------------------
> > +# Check that pg_upgrade fails when max_replication_slots configured in the new
> > +# cluster is less than the number of subscriptions in the old cluster.
> > +# ------------------------------------------------------
> > +# Create a disabled subscription.
> > +$publisher->safe_psql('postgres', "CREATE PUBLICATION regress_pub1");
> > +$old_sub->safe_psql('postgres',
> > +       "CREATE SUBSCRIPTION regress_sub1 CONNECTION '$connstr'
> > PUBLICATION regress_pub1 WITH (enabled = false)"
> > +);
> > +
> > +$old_sub->stop;
> > +
> > +$new_sub->append_conf('postgresql.conf', "max_replication_slots = 0");
> > +
> > +# pg_upgrade will fail because the new cluster has insufficient
> > +# max_replication_slots.
> > +command_checks_all(
> > +       [
> > +               'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir,
> > +               '-D', $new_sub->data_dir, '-b', $oldbindir,
> > +               '-B', $newbindir, '-s', $new_sub->host,
> > +               '-p', $old_sub->port, '-P', $new_sub->port,
> > +               $mode, '--check',
> > +       ],
>
> Partially fixed. I moved the creation part to below but comments were kept.
>
> > 2) This comment can be slightly changed:
> > +# Change configuration as well not to start the initial sync automatically
> > +$new_sub->append_conf('postgresql.conf',
> > +       "max_logical_replication_workers = 0");
> >
> > to:
> > Change configuration so that the initial table sync does not get
> > started automatically
>
> Fixed.
>
> > 3) The old comments were slightly better:
> > # Resume the initial sync and wait until all tables of subscription
> > # 'regress_sub5' are synchronized
> > $new_sub->append_conf('postgresql.conf',
> > "max_logical_replication_workers = 10");
> > $new_sub->restart;
> > $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
> > ENABLE");
> > $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
> >
> > Like:
> > # Enable the subscription
> > $new_sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub5
> > ENABLE");
> >
> > # Wait until all tables of subscription 'regress_sub5' are synchronized
> > $new_sub->wait_for_subscription_sync($publisher, 'regress_sub5');
>
> Per comments from Amit [1], I did not change this.

Thanks for the updated patch; I don't have any more comments.

Regards,
Vignesh



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Feb 19, 2024 at 6:54 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Thanks for reviewing! PSA new version.
>

Pushed this after making minor changes in the comments.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
vignesh C
Date:
On Mon, 19 Feb 2024 at 12:38, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Feb 19, 2024 at 6:54 AM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Thanks for reviewing! PSA new version.
> >
>
> Pushed this after making minor changes in the comments.

Recently there was a failure in the 004_subscription TAP test at [1].
In this failure, the tab_upgraded1 table was expected to have 51
records but had only 50. Before the upgrade, both the publisher and
the subscriber have 50 records.
After the upgrade, we insert one record on the publisher, so
tab_upgraded1 has 51 records there. We then start the subscriber after
changing max_logical_replication_workers, so that apply workers get
started and apply the changes received. After starting, we enable
regress_sub5, wait for the regress_sub5 subscription to sync, and
check the data of the tab_upgraded1 and tab_upgraded2 tables. In a few
random cases, the one record inserted into tab_upgraded1 does not get
replicated, because we have not waited for the regress_sub4
subscription to apply the changes from the publisher. The attached
patch adds a wait for the regress_sub4 subscription to apply the
changes from the publisher before verifying the data.
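
For clarity, the fix boils down to one extra wait before the data
check, something along these lines (a sketch only, assuming the
$publisher node object from 004_subscription.pl and the default
application_name, which matches the subscription name):

    # Make the publisher wait until the walsender serving regress_sub4
    # confirms that all changes sent so far have been applied, so the
    # row inserted after the upgrade is visible on the subscriber.
    $publisher->wait_for_catchup('regress_sub4');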

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-03-26%2004%3A23%3A13

Regards,
Vignesh

Attachment

RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Vignesh,

> 
> Recently there was a failure in the 004_subscription TAP test at [1].
> In this failure, the tab_upgraded1 table was expected to have 51
> records but had only 50. Before the upgrade, both the publisher and
> the subscriber have 50 records.

Good catch!

> After the upgrade, we insert one record on the publisher, so
> tab_upgraded1 has 51 records there. We then start the subscriber
> after changing max_logical_replication_workers, so that apply workers
> get started and apply the changes received. After starting, we enable
> regress_sub5, wait for the regress_sub5 subscription to sync, and
> check the data of the tab_upgraded1 and tab_upgraded2 tables. In a
> few random cases, the one record inserted into tab_upgraded1 does not
> get replicated, because we have not waited for the regress_sub4
> subscription to apply the changes from the publisher. The attached
> patch adds a wait for the regress_sub4 subscription to apply the
> changes from the publisher before verifying the data.
> 
> [1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-03-26%2004%3A23%3A13

Yeah, I think it is an oversight in f17529. Previously, subscriptions
that were receiving changes were confirmed to have caught up; I missed
adding that line while restructuring the script. +1 for your fix.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Mar 27, 2024 at 11:57 AM vignesh C <vignesh21@gmail.com> wrote:
>
> The attached patch adds a wait for the regress_sub4 subscription to
> apply the changes from the publisher before verifying the data.
>

Pushed, after changing the order of the waits, as it looks logical to
wait for regress_sub5 after enabling the subscription. Thanks.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Nathan Bossart
Date:
I've been looking into optimizing pg_upgrade's once-in-each-database steps
[0], and I noticed that we are opening a connection to every database in
the cluster and running a query like

    SELECT count(*) FROM pg_catalog.pg_subscription WHERE subdbid = %d;

Then, later on, we combine all of these values in
count_old_cluster_subscriptions() to verify that max_replication_slots is
set high enough.  AFAICT these per-database subscription counts aren't used
for anything else.

This is an extremely expensive way to perform that check, and so I'm
wondering why we don't just do

    SELECT count(*) FROM pg_catalog.pg_subscription;

once in count_old_cluster_subscriptions().

[0] https://commitfest.postgresql.org/48/4995/

-- 
nathan



Re: pg_upgrade and logical replication

From
Nathan Bossart
Date:
On Fri, Jul 19, 2024 at 03:44:22PM -0500, Nathan Bossart wrote:
> I've been looking into optimizing pg_upgrade's once-in-each-database steps
> [0], and I noticed that we are opening a connection to every database in
> the cluster and running a query like
> 
>     SELECT count(*) FROM pg_catalog.pg_subscription WHERE subdbid = %d;
> 
> Then, later on, we combine all of these values in
> count_old_cluster_subscriptions() to verify that max_replication_slots is
> set high enough.  AFAICT these per-database subscription counts aren't used
> for anything else.
> 
> This is an extremely expensive way to perform that check, and so I'm
> wondering why we don't just do
> 
>     SELECT count(*) FROM pg_catalog.pg_subscription;
> 
> once in count_old_cluster_subscriptions().

Like so...

-- 
nathan

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:
>> This is an extremely expensive way to perform that check, and so I'm
>> wondering why we don't just do
>>
>>     SELECT count(*) FROM pg_catalog.pg_subscription;
>>
>> once in count_old_cluster_subscriptions().
>
> Like so...

Ah, good catch.  That sounds like a good thing to do because we don't
care about the number of subscriptions for each database in the
current code.

This is something that qualifies as an open item, IMO, as this code
is new to PG17.

A comment in get_db_rel_and_slot_infos() becomes incorrect where
get_old_cluster_logical_slot_infos() is called; it is still referring
to the subscription count.

Actually, on the same grounds, couldn't we do the logical slot info
retrieval in get_old_cluster_logical_slot_infos() in a single pass as
well?  pg_replication_slots reports some information about all the
slots, and the current code has a qual on current_database().  It
looks to me that this could be replaced by a single query, ordering
the slots by database name and assigning the slot infos to each
database's DbInfo at the end.  That would be much more efficient when
dealing with a lot of databases.
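
As a rough illustration of the single-pass idea (hypothetical: the
column list is a plausible subset rather than pg_upgrade's actual
query, and $old_cluster stands in for a handle on the old cluster),
written in the style of the TAP tests:

    # One cluster-wide query instead of one query per database with a
    # current_database() qual; the caller then groups rows by database.
    my $slots = $old_cluster->safe_psql('postgres', qq(
        SELECT database, slot_name, plugin, two_phase
        FROM pg_catalog.pg_replication_slots
        WHERE slot_type = 'logical' AND temporary IS FALSE
        ORDER BY database));
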
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:
> >> This is an extremely expensive way to perform that check, and so I'm
> >> wondering why we don't just do
> >>
> >>      SELECT count(*) FROM pg_catalog.pg_subscription;
> >>
> >> once in count_old_cluster_subscriptions().
> >
> > Like so...

Isn't it better to directly invoke get_subscription_count() in
check_new_cluster_subscription_configuration() where it is required
rather than in a db-specific general function?

>
> Ah, good catch.  That sounds like a good thing to do because we don't
> care about the number of subscriptions for each database in the
> current code.
>
> This is something that qualifies as an open item, IMO, as this code
> is new to PG17.
>
> A comment in get_db_rel_and_slot_infos() becomes incorrect where
> get_old_cluster_logical_slot_infos() is called; it is still referring
> to the subscription count.
>
> Actually, on the same grounds, couldn't we do the logical slot info
> retrieval in get_old_cluster_logical_slot_infos() in a single pass as
> well?  pg_replication_slots reports some information about all the
> slots, and the current code has a qual on current_database().  It
> looks to me that this could be replaced by a single query, ordering
> the slots by database name and assigning the slot infos to each
> database's DbInfo at the end.
>

Unlike subscriptions, logical slots are database-specific objects. We
have some checks in the code, like the one in CreateDecodingContext()
for MyDatabaseId, which may or may not create a problem for this case,
as we don't consume changes when checking
LogicalReplicationSlotHasPendingWal via
binary_upgrade_logical_slot_has_caught_up(); but I think this needs
more analysis than what Nathan has proposed. So, I suggest taking up
this task for PG18 if we want to optimize this code path.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Nathan Bossart
Date:
On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:
> On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:
>> On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:
>> >> This is an extremely expensive way to perform that check, and so I'm
>> >> wondering why we don't just do
>> >>
>> >>      SELECT count(*) FROM pg_catalog.pg_subscription;
>> >>
>> >> once in count_old_cluster_subscriptions().
>> >
>> > Like so...
> 
> Isn't it better to directly invoke get_subscription_count() in
> check_new_cluster_subscription_configuration() where it is required
> rather than in a db-specific general function?

IIUC the old cluster won't be running at that point.

>> Ah, good catch.  That sounds like a good thing to do because we don't
>> care about the number of subscriptions for each database in the
>> current code.
>>
>> This is something that qualifies as an open item, IMO, as this code
>> is new to PG17.

+1

>> A comment in get_db_rel_and_slot_infos() becomes incorrect where
>> get_old_cluster_logical_slot_infos() is called; it is still referring
>> to the subscription count.

I removed this comment since IMHO it doesn't add much.

>> Actually, on the same grounds, couldn't we do the logical slot info
>> retrieval in get_old_cluster_logical_slot_infos() in a single pass as
>> well?  pg_replication_slots reports some information about all the
>> slots, and the current code has a qual on current_database().  It
>> looks to me that this could be replaced by a single query, ordering
>> the slots by database name and assigning the slot infos to each
>> database's DbInfo at the end.
> 
> Unlike subscriptions, logical slots are database-specific objects. We
> have some checks in the code, like the one in CreateDecodingContext()
> for MyDatabaseId, which may or may not create a problem for this case,
> as we don't consume changes when checking
> LogicalReplicationSlotHasPendingWal via
> binary_upgrade_logical_slot_has_caught_up(); but I think this needs
> more analysis than what Nathan has proposed. So, I suggest taking up
> this task for PG18 if we want to optimize this code path.

I see what you mean.

-- 
nathan

Attachment

Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Mon, Jul 22, 2024 at 09:46:29AM -0500, Nathan Bossart wrote:
> On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:
>> On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:
>>> A comment in get_db_rel_and_slot_infos() becomes incorrect where
>>> get_old_cluster_logical_slot_infos() is called; it is still referring
>>> to the subscription count.
>
> I removed this comment since IMHO it doesn't add much.

WFM.

>>> Actually, on the same grounds, couldn't we do the logical slot info
>>> retrieval in get_old_cluster_logical_slot_infos() in a single pass as
>>> well?  pg_replication_slots reports some information about all the
>>> slots, and the current code has a qual on current_database().  It
>>> looks to me that this could be replaced by a single query, ordering
>>> the slots by database name and assigning the slot infos to each
>>> database's DbInfo at the end.
>>
>> Unlike subscriptions, logical slots are database-specific objects. We
>> have some checks in the code, like the one in CreateDecodingContext()
>> for MyDatabaseId, which may or may not create a problem for this case,
>> as we don't consume changes when checking
>> LogicalReplicationSlotHasPendingWal via
>> binary_upgrade_logical_slot_has_caught_up(); but I think this needs
>> more analysis than what Nathan has proposed. So, I suggest taking up
>> this task for PG18 if we want to optimize this code path.
>
> I see what you mean.

I am not sure I get the reason why get_old_cluster_logical_slot_infos()
could not be optimized, TBH.  LogicalReplicationSlotHasPendingWal()
uses the fast-forward mode, where no changes are generated, hence
there should be no need to depend on a connection to a specific
database :)

Combined with a hash table keyed by database name and/or OID, to know
which dbinfo the information of a slot should be attached to, it
should be possible to use one query, making the slot info gathering
closer to O(N) rather than the current O(N^2).
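
To sketch that grouping (Perl used purely for illustration; the real
change would live in pg_upgrade's C code, and $slots is assumed to
hold the one-row-per-line, |-separated output of the single
cluster-wide query sketched earlier in the thread):

    # Bucket the slots by database so each database's DbInfo can pick
    # up its slots in one lookup instead of re-querying per database.
    my %slots_by_db;
    foreach my $row (split /\n/, $slots)
    {
        my ($db, $slot_name, $plugin, $two_phase) = split /\|/, $row;
        push @{ $slots_by_db{$db} }, [ $slot_name, $plugin, $two_phase ];
    }
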
--
Michael

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Tue, Jul 23, 2024 at 4:33 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Jul 22, 2024 at 09:46:29AM -0500, Nathan Bossart wrote:
> > On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:
> >>
> >> Unlike subscriptions, logical slots are database-specific objects. We
> >> have some checks in the code, like the one in CreateDecodingContext()
> >> for MyDatabaseId, which may or may not create a problem for this case,
> >> as we don't consume changes when checking
> >> LogicalReplicationSlotHasPendingWal via
> >> binary_upgrade_logical_slot_has_caught_up(); but I think this needs
> >> more analysis than what Nathan has proposed. So, I suggest taking up
> >> this task for PG18 if we want to optimize this code path.
> >
> > I see what you mean.
>
> I am not sure I get the reason why get_old_cluster_logical_slot_infos()
> could not be optimized, TBH.  LogicalReplicationSlotHasPendingWal()
> uses the fast-forward mode, where no changes are generated, hence
> there should be no need to depend on a connection to a specific
> database :)
>
> Combined with a hash table keyed by database name and/or OID, to know
> which dbinfo the information of a slot should be attached to, it
> should be possible to use one query, making the slot info gathering
> closer to O(N) rather than the current O(N^2).
>

The point is that, unlike subscriptions, logical slots are not
cluster-level objects. So, this needs more careful design decisions
rather than a fix-up patch for PG17. One more thing: after collecting
the slot-level information, we also want to consider the creation of
slots, which again happens at the per-database level.

--
With Regards,
Amit Kapila.



RE: pg_upgrade and logical replication

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit, Michael,

> > I am not sure I get the reason why get_old_cluster_logical_slot_infos()
> > could not be optimized, TBH.  LogicalReplicationSlotHasPendingWal()
> > uses the fast-forward mode, where no changes are generated, hence
> > there should be no need to depend on a connection to a specific
> > database :)
> >
> > Combined with a hash table keyed by database name and/or OID, to know
> > which dbinfo the information of a slot should be attached to, it
> > should be possible to use one query, making the slot info gathering
> > closer to O(N) rather than the current O(N^2).
> >
> 
> The point is that, unlike subscriptions, logical slots are not
> cluster-level objects. So, this needs more careful design decisions
> rather than a fix-up patch for PG17. One more thing: after collecting
> the slot-level information, we also want to consider the creation of
> slots, which again happens at the per-database level.

I also considered how this combines with the optimization
(parallelization) of pg_upgrade [1]. IIUC, that patch connects to
multiple databases in parallel and runs commands. The current style of
create_logical_replication_slots() can be adapted easily because the
tasks are divided per database.

However, if we change get_old_cluster_logical_slot_infos() to work in
a single pass, we may have to turn LogicalSlotInfoArr into
cluster-wide data and store the database name in LogicalSlotInfo.
Also, in create_logical_replication_slots(), we may have to check each
slot's database and connect to the appropriate one. These changes
would make it difficult to parallelize the operation.

[1]: https://www.postgresql.org/message-id/flat/20240516211638.GA1688936@nathanxps13

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Mon, Jul 22, 2024 at 8:16 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 03:45:19PM +0530, Amit Kapila wrote:
> > On Mon, Jul 22, 2024 at 7:35 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> On Sat, Jul 20, 2024 at 09:03:07PM -0500, Nathan Bossart wrote:
> >> >> This is an extremely expensive way to perform that check, and so I'm
> >> >> wondering why we don't just do
> >> >>
> >> >>      SELECT count(*) FROM pg_catalog.pg_subscription;
> >> >>
> >> >> once in count_old_cluster_subscriptions().
> >> >
> >> > Like so...
> >
> > Isn't it better to directly invoke get_subscription_count() in
> > check_new_cluster_subscription_configuration() where it is required
> > rather than in a db-specific general function?
>
> IIUC the old cluster won't be running at that point.
>

Right, the other option would be to move it to the place where we call
check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
the specific function (get_db_rel_and_slot_infos) as we were
maintaining the count at the per-database level, but now that we are
changing that, I am not sure if calling it from the same place is a
good idea. But OTOH, it is okay to keep it at the place where we
retrieve the required information from the old cluster.

One minor point: the comment atop get_subscription_count() still
refers to the function by its old name, get_db_subscription_count().

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Nathan Bossart
Date:
On Tue, Jul 23, 2024 at 09:05:05AM +0530, Amit Kapila wrote:
> Right, the other option would be to move it to the place where we call
> check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
> the specific function (get_db_rel_and_slot_infos) as we were
> maintaining the count at the per-database level, but now that we are
> changing that, I am not sure if calling it from the same place is a
> good idea. But OTOH, it is okay to keep it at the place where we
> retrieve the required information from the old cluster.

I moved it to where you suggested.

> One minor point: the comment atop get_subscription_count() still
> refers to the function by its old name, get_db_subscription_count().

Oops, fixed.

-- 
nathan

Attachment

Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Jul 24, 2024 at 1:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Tue, Jul 23, 2024 at 09:05:05AM +0530, Amit Kapila wrote:
> > Right, the other option would be to move it to the place where we call
> > check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
> > the specific function (get_db_rel_and_slot_infos) as we were
> > maintaining the count at the per-database level, but now that we are
> > changing that, I am not sure if calling it from the same place is a
> > good idea. But OTOH, it is okay to keep it at the place where we
> > retrieve the required information from the old cluster.
>
> I moved it to where you suggested.
>
> > One minor point: the comment atop get_subscription_count() still
> > refers to the function by its old name, get_db_subscription_count().
>
> Oops, fixed.
>

LGTM.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Nathan Bossart
Date:
On Wed, Jul 24, 2024 at 11:32:47AM +0530, Amit Kapila wrote:
> LGTM.

Thanks for reviewing.  Committed and back-patched to v17.

-- 
nathan



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Wed, Jul 24, 2024 at 10:03 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Wed, Jul 24, 2024 at 11:32:47AM +0530, Amit Kapila wrote:
> > LGTM.
>
> Thanks for reviewing.  Committed and back-patched to v17.
>

Shall we close the open items? I think even if we want to improve the
slot fetching/creation mechanism, it should be part of PG18.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Amit Kapila
Date:
On Thu, Jul 25, 2024 at 8:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 24, 2024 at 10:03 PM Nathan Bossart
> <nathandbossart@gmail.com> wrote:
> >
> > On Wed, Jul 24, 2024 at 11:32:47AM +0530, Amit Kapila wrote:
> > > LGTM.
> >
> > Thanks for reviewing.  Committed and back-patched to v17.
> >
>
> Shall we close the open items?
>

Sorry for the typo. There is only one open item corresponding to this:
"Subscription and slot information retrieval inefficiency in
pg_upgrade", which in my view should be closed after your commit.

--
With Regards,
Amit Kapila.



Re: pg_upgrade and logical replication

From
Nathan Bossart
Date:
On Thu, Jul 25, 2024 at 08:43:03AM +0530, Amit Kapila wrote:
>> Shall we close the open items?
> 
> Sorry for the typo. There is only one open item corresponding to this:
> "Subscription and slot information retrieval inefficiency in
> pg_upgrade", which in my view should be closed after your commit.

Oops, I forgot to do that.  I've moved it to the "resolved before 17beta3"
section.

-- 
nathan



Re: pg_upgrade and logical replication

From
Michael Paquier
Date:
On Wed, Jul 24, 2024 at 10:16:51PM -0500, Nathan Bossart wrote:
> On Thu, Jul 25, 2024 at 08:43:03AM +0530, Amit Kapila wrote:
>>> Shall we close the open items?
>>
>> Sorry for the typo. There is only one open item corresponding to this:
>> "Subscription and slot information retrieval inefficiency in
>> pg_upgrade", which in my view should be closed after your commit.
>
> Oops, I forgot to do that.  I've moved it to the "resolved before 17beta3"
> section.

Removing the item sounds good to me.  Thanks.
--
Michael

Attachment