RE: pg_upgrade and logical replication - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: pg_upgrade and logical replication
Date
Msg-id TYAPR01MB5866401A4B923A07C535D59EF5999@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: pg_upgrade and logical replication  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: pg_upgrade and logical replication  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
Dear Julien,

> I didn't really look into it, mostly because I don't think it's a sensible
> use case.  Logical sync of a relation is a heavy and time consuming operation
> that requires to retain the xmin for quite some time.  This can already lead to
> some bad effect on the publisher, so adding a pg_upgrade in the middle of that
> would just make things worse.  Upgrading a subscriber is a rare event that has
> to be well planned (you need to test your application with the new version and
> so on), initial sync of relation shouldn't happen continually, so having to
> wait for the sync to be finished doesn't seem like a source of problem but
> might instead avoid some for users who may not fully realize the implications.
>
> If someone has a scenario where running pg_upgrade in the middle of a logical
> sync is mandatory I can try to look at it, but for now I just don't see a good
> reason to add even more complexity to this part of the code, especially since
> adding regression tests seems a bit troublesome.

I do not have any scenarios which run pg_upgrade while synchronization because I
agree that upgrading can be well planned. So it may be OK not to add it in order
to keep the patch simpler.

> > Here the oid of the table is directly specified, but is it really kept between
> > old and new node?
>
> Yes, pg_upgrade does need to preserve relation's oid.

I confirmed and agreed. dumpTableSchema() dumps an additional function
pg_catalog.binary_upgrade_set_next_heap_pg_class_oid() before each CREATE TABLE
statements. The function force the table to have the specified OID.

> > Similar command ALTER PUBLICATION requires the name of table,
> > not the oid.
>
> Yes, but those are user facing commands, while ALTER SUBSCRIPTION name
> ADD
> TABLE is only used internally for pg_upgrade.  My goal is to make this command
> a bit faster by avoiding an extra cache lookup each time, relying on pg_upgrade
> existing requirements.  If that's really a problem I can use the name instead
> but I didn't hear any argument against it for now.

OK, make sense.

>
> > 3.
> > Currently getSubscriptionRels() is called from the getSubscriptions(), but I
> could
> > not find the reason why we must do like that. Other functions like
> > getPublicationTables() is directly called from getSchemaData(), so they should
> > be followed.
>
> I think you're right, doing a single getSubscriptionRels() rather than once
> per subscription should be more efficient.

Yes, we do not have to divide reading pg_subscription_rel per subscriptions.

> > Additionaly, I found two problems.
> >
> > * Only tables that to be dumped should be included. See getPublicationTables().
>
> This is only done during pg_upgrade where all tables are dumped, so there
> shouldn't be any need to filter the list.
>
> > * dropStmt for subscription relations seems not to be needed.
>
> I'm not sure I understand this one.  I agree that a dropStmt isn't needed, and
> there's no such thing in the patch.  Are you saying that you agree with it?

Sorry for unclear suggestion. I meant to say that we could keep current style even
if getSubscriptionRels() is called separately. Your understanding which it is not
needed is right.

> > * Maybe security label and comments should be also dumped.
>
> Subscription's security labels and comments are already dumped (well should be
> dumped, AFAICS pg_dump was never taught to look at shared security label on
> objects other than databases but still try to emit them, pg_dumpall instead
> handles pg_authid and pg_tablespace), and we can't add security label or
> comment on subscription's relations so I don't think this patch is missing
> something?
>
> So unless I'm missing something it looks like shared security label handling is
> partly broken, but that's orthogonal to this patch.
>
> > Followings are minor comments.
> >
> >
> > 4. parse_subscription_options
> >
> > ```
> > +                       opts->state = defGetString(defel)[0];
> > ```
> >
> > [0] is not needed.
>
> It still needs to be dereferenced, I personally find [0] a bit clearer in that
> situation but I'm not opposed to a plain *.

Sorry, I was confused. You are right.

> > 5. AlterSubscription
> >
> > ```
> > +                               supported_opts = SUBOPT_RELID |
> SUBOPT_STATE | SUBOPT_LSN;
> > +                               parse_subscription_options(pstate,
> stmt->options,
> > +
> supported_opts, &opts);
> > +
> > +                               /* relid and state should always be
> provided. */
> > +                               Assert(IsSet(opts.specified_opts,
> SUBOPT_RELID));
> > +                               Assert(IsSet(opts.specified_opts,
> SUBOPT_STATE));
> > +
> > ```
> >
> > SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
> > reject it?
>
> If you mean have an Assert for that I agree.  It's not supposed to be used by
> users so I don't think having non debug check is sensible, as any user provided
> value has no reason to be correct anyway.

Yes, I meant to request to add an Assert. Maybe you can add:
Assert(IsSet(opts.specified_opts, SUBOPT_LSN) && !XLogRecPtrIsInvalid(opts.lsn));

>
After looking at the code I remember that I kept the lsn optional in ALTER
SUBSCRIPTION name ADD TABLE command processing.  For now pg_upgrade checks that
all subscriptions have a valid remote_lsn so there should indeed always be a
value different from InvalidLSN/none specified, but it's still unclear to me
whether this check will eventually be weakened or not, so for now I think it's
better to keep AlterSubscription accept this case, here and in all other code
paths.

If there's a hard objection I will just make the lsn mandatory.
>

I have tested, but srsublsn became NULL if copy_data was specified as off.
This is because when copy_data is false, all tuples in pg_subscription_rels are filled
as state = 'r' and srsublsn = NULL, and tablesync workers will never boot.
See CreateSubscription().
Doesn't it mean that there is a possibility that LSN option is not specified while
ALTER SUBSCRIPTION ADD TABLE?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: segfault tied to "IS JSON predicate" commit
Next
From: David Rowley
Date:
Subject: Re: Unexpected (wrong?) result querying boolean partitioned table with NULL partition