Re: pg_upgrade and logical replication - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: pg_upgrade and logical replication |
Date | |
Msg-id | 20230413025110.gt6nahvrerngoz2m@jrouhaud Whole thread Raw |
In response to | RE: pg_upgrade and logical replication ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses |
Re: pg_upgrade and logical replication
RE: pg_upgrade and logical replication |
List | pgsql-hackers |
Hi, On Wed, Apr 12, 2023 at 09:48:15AM +0000, Hayato Kuroda (Fujitsu) wrote: > > Thank you for updating the patch. I checked yours. > Followings are general or non-minor questions: Thanks! > 1. > Feature freeze for PG16 has already come. So I think there is no reason to rush > making the patch. Based on above, could you allow to upgrade while synchronizing > data? Personally it can be added as 0002 patch which extends the feature. Or > have you already found any problem? I didn't really look into it, mostly because I don't think it's a sensible use case. Logical sync of a relation is a heavy and time consuming operation that requires to retain the xmin for quite some time. This can already lead to some bad effect on the publisher, so adding a pg_upgrade in the middle of that would just make things worse. Upgrading a subscriber is a rare event that has to be well planned (you need to test your application with the new version and so on), initial sync of relation shouldn't happen continually, so having to wait for the sync to be finished doesn't seem like a source of problem but might instead avoid some for users who may not fully realize the implications. If someone has a scenario where running pg_upgrade in the middle of a logical sync is mandatory I can try to look at it, but for now I just don't see a good reason to add even more complexity to this part of the code, especially since adding regression tests seems a bit troublesome. > 2. > I have a questions about the SQL interface: > > ALTER SUBSCRIPTION name ADD TABLE (relid = XYZ, state = 'x' [, lsn = 'X/Y']) > > Here the oid of the table is directly specified, but is it really kept between > old and new node? Yes, pg_upgrade does need to preserve relation's oid. > Similar command ALTER PUBLICATION requires the name of table, > not the oid. Yes, but those are user facing commands, while ALTER SUBSCRIPTION name ADD TABLE is only used internally for pg_upgrade. My goal is to make this command a bit faster by avoiding an extra cache lookup each time, relying on pg_upgrade existing requirements. If that's really a problem I can use the name instead but I didn't hear any argument against it for now. > 3. > Currently getSubscriptionRels() is called from the getSubscriptions(), but I could > not find the reason why we must do like that. Other functions like > getPublicationTables() is directly called from getSchemaData(), so they should > be followed. I think you're right, doing a single getSubscriptionRels() rather than once per subscription should be more efficient. > Additionaly, I found two problems. > > * Only tables that to be dumped should be included. See getPublicationTables(). This is only done during pg_upgrade where all tables are dumped, so there shouldn't be any need to filter the list. > * dropStmt for subscription relations seems not to be needed. I'm not sure I understand this one. I agree that a dropStmt isn't needed, and there's no such thing in the patch. Are you saying that you agree with it? > * Maybe security label and comments should be also dumped. Subscription's security labels and comments are already dumped (well should be dumped, AFAICS pg_dump was never taught to look at shared security label on objects other than databases but still try to emit them, pg_dumpall instead handles pg_authid and pg_tablespace), and we can't add security label or comment on subscription's relations so I don't think this patch is missing something? So unless I'm missing something it looks like shared security label handling is partly broken, but that's orthogonal to this patch. > Followings are minor comments. > > > 4. parse_subscription_options > > ``` > + opts->state = defGetString(defel)[0]; > ``` > > [0] is not needed. It still needs to be dereferenced, I personally find [0] a bit clearer in that situation but I'm not opposed to a plain *. > 5. AlterSubscription > > ``` > + supported_opts = SUBOPT_RELID | SUBOPT_STATE | SUBOPT_LSN; > + parse_subscription_options(pstate, stmt->options, > + supported_opts, &opts); > + > + /* relid and state should always be provided. */ > + Assert(IsSet(opts.specified_opts, SUBOPT_RELID)); > + Assert(IsSet(opts.specified_opts, SUBOPT_STATE)); > + > ``` > > SUBOPT_LSN accepts "none" string, which means InvalidLSN. Isn't it better to > reject it? If you mean have an Assert for that I agree. It's not supposed to be used by users so I don't think having non debug check is sensible, as any user provided value has no reason to be correct anyway. > 6. dumpSubscription() > > ``` > + if (dopt->binary_upgrade && dopt->preserve_subscriptions && > + subinfo->suboriginremotelsn) > + { > + appendPQExpBuffer(query, ", lsn = '%s'", subinfo->suboriginremotelsn); > + } > ``` > > {} is not needed. Yes, but the condition being on two lines it makes it more readable. I think a lot of code uses curly braces in similar case already. > 7. pg_dump.h > > ``` > +/* > + * The SubRelInfo struct is used to represent subscription relation. > + */ > +typedef struct _SubRelInfo > +{ > + Oid srrelid; > + char srsubstate; > + char *srsublsn; > +} SubRelInfo; > ``` > > This typedef must be added to typedefs.list. Right! > 8. check_for_subscription_state > > ``` > nb = atooid(PQgetvalue(res, 0, 0)); > if (nb != 0) > { > is_error = true; > pg_log(PG_WARNING, > "\nWARNING: %d subscription have invalid remote_lsn", > nb); > } > ``` > > I think no need to use atooid. Additionaly, isn't it better to show the name of > subscriptions which have invalid remote_lsn? Agreed. > ``` > nb = atooid(PQgetvalue(res, 0, 0)); > if (nb != 0) > { > is_error = true; > pg_log(PG_WARNING, > "\nWARNING: database \"%s\" has %d subscription " > "relations(s) in non-ready state", active_db->db_name, nb); > } > ``` > > Same as above. Agreed. > 9. parseCommandLine > > ``` > + user_opts.preserve_subscriptions = false; > ``` > > I think this initialization is not needed because it is default. It's not strictly needed because of C rules but I think it doesn't really hurt to make it explicit and not have to remember what the standard says. > And maybe you missed to run pgindent. I indeed haven't. There will probably be a global pgindent done soon so I will do one for this patch afterwards.
pgsql-hackers by date: