Re: pg_upgrade and logical replication - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: pg_upgrade and logical replication
Date
Msg-id 20230413025110.gt6nahvrerngoz2m@jrouhaud
Whole thread Raw
In response to RE: pg_upgrade and logical replication  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: pg_upgrade and logical replication
RE: pg_upgrade and logical replication
List pgsql-hackers
Hi,

On Wed, Apr 12, 2023 at 09:48:15AM +0000, Hayato Kuroda (Fujitsu) wrote:
>
> Thank you for updating the patch. I checked yours.
> Followings are general or non-minor questions:

Thanks!

> 1.
> Feature freeze for PG16 has already come. So I think there is no reason to rush
> making the patch. Based on above, could you allow to upgrade while synchronizing
> data? Personally it can be added as 0002 patch which extends the feature. Or
> have you already found any problem?

I didn't really look into it, mostly because I don't think it's a sensible
use case.  Logical sync of a relation is a heavy and time consuming operation
that requires to retain the xmin for quite some time.  This can already lead to
some bad effect on the publisher, so adding a pg_upgrade in the middle of that
would just make things worse.  Upgrading a subscriber is a rare event that has
to be well planned (you need to test your application with the new version and
so on), initial sync of relation shouldn't happen continually, so having to
wait for the sync to be finished doesn't seem like a source of problem but
might instead avoid some for users who may not fully realize the implications.

If someone has a scenario where running pg_upgrade in the middle of a logical
sync is mandatory I can try to look at it, but for now I just don't see a good
reason to add even more complexity to this part of the code, especially since
adding regression tests seems a bit troublesome.

> 2.
> I have a questions about the SQL interface:
>
> ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y'])
>
> Here the oid of the table is directly specified, but is it really kept between
> old and new node?

Yes, pg_upgrade does need to preserve relation's oid.

> Similar command ALTER PUBLICATION requires the name of table,
> not the oid.

Yes, but those are user facing commands, while ALTER SUBSCRIPTION name ADD
TABLE is only used internally for pg_upgrade.  My goal is to make this command
a bit faster by avoiding an extra cache lookup each time, relying on pg_upgrade
existing requirements.  If that's really a problem I can use the name instead
but I didn't hear any argument against it for now.

> 3.
> Currently getSubscriptionRels() is called from the getSubscriptions(), but I could
> not find the reason why we must do like that. Other functions like
> getPublicationTables() is directly called from getSchemaData(), so they should
> be followed.

I think you're right, doing a single getSubscriptionRels() rather than once
per subscription should be more efficient.

> Additionaly, I found two problems.
>
> * Only tables that to be dumped should be included. See getPublicationTables().

This is only done during pg_upgrade where all tables are dumped, so there
shouldn't be any need to filter the list.

> * dropStmt for subscription relations seems not to be needed.

I'm not sure I understand this one.  I agree that a dropStmt isn't needed, and
there's no such thing in the patch.  Are you saying that you agree with it?

> * Maybe security label and comments should be also dumped.

Subscription's security labels and comments are already dumped (well should be
dumped, AFAICS pg_dump was never taught to look at shared security label on
objects other than databases but still try to emit them, pg_dumpall instead
handles pg_authid and pg_tablespace), and we can't add security label or
comment on subscription's relations so I don't think this patch is missing
something?

So unless I'm missing something it looks like shared security label handling is
partly broken, but that's orthogonal to this patch.

> Followings are minor comments.
>
>
> 4. parse_subscription_options
>
> ```
> +                       opts->state = defGetString(defel)[0];
> ```
>
> [0] is not needed.

It still needs to be dereferenced, I personally find [0] a bit clearer in that
situation but I'm not opposed to a plain *.

> 5. AlterSubscription
>
> ```
> +                               supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN;
> +                               parse_subscription_options(pstate, stmt->options,
> +                                                                                  supported_opts, &opts);
> +
> +                               /* relid and state should always be provided. */
> +                               Assert(IsSet(opts.specified_opts, SUBOPT_RELID));
> +                               Assert(IsSet(opts.specified_opts, SUBOPT_STATE));
> +
> ```
>
> SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to
> reject it?

If you mean have an Assert for that I agree.  It's not supposed to be used by
users so I don't think having non debug check is sensible, as any user provided
value has no reason to be correct anyway.

> 6. dumpSubscription()
>
> ```
> +       if (dopt->binary_upgrade && dopt->preserve_subscriptions &&
> +               subinfo->suboriginremotelsn)
> +       {
> +               appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn);
> +       }
> ```
>
> {} is not needed.

Yes, but the condition being on two lines it makes it more readable.  I think a
lot of code uses curly braces in similar case already.

> 7. pg_dump.h
>
> ```
> +/*
> + * The SubRelInfo struct is used to represent subscription relation.
> + */
> +typedef struct _SubRelInfo
> +{
> +       Oid             srrelid;
> +       char    srsubstate;
> +       char   *srsublsn;
> +} SubRelInfo;
> ```
>
> This typedef must be added to typedefs.list.

Right!

> 8. check_for_subscription_state
>
> ```
>             nb = atooid(PQgetvalue(res, 0, 0));
>             if (nb != 0)
>             {
>                 is_error = true;
>                 pg_log(PG_WARNING,
>                        "\nWARNING:  %d subscription have invalid remote_lsn",
>                        nb);
>             }
> ```
>
> I think no need to use atooid. Additionaly, isn't it better to show the name of
> subscriptions which have invalid remote_lsn?

Agreed.

> ```
>         nb = atooid(PQgetvalue(res, 0, 0));
>         if (nb != 0)
>         {
>             is_error = true;
>             pg_log(PG_WARNING,
>                    "\nWARNING: database \"%s\" has %d subscription "
>                    "relations(s) in non-ready state", active_db->db_name, nb);
>         }
> ```
>
> Same as above.

Agreed.

> 9. parseCommandLine
>
> ```
> +       user_opts.preserve_subscriptions = false;
> ```
>
> I think this initialization is not needed because it is default.

It's not strictly needed because of C rules but I think it doesn't really hurt
to make it explicit and not have to remember what the standard says.

> And maybe you missed to run pgindent.

I indeed haven't.  There will probably be a global pgindent done soon so I will
do one for this patch afterwards.



pgsql-hackers by date:

Previous
From: Yurii Rashkovskii
Date:
Subject: Re: [PATCH] Allow Postgres to pick an unused port to listen
Next
From: Richard Guo
Date:
Subject: Re: Unexpected (wrong?) result querying boolean partitioned table with NULL partition