Re: Logical Replication of sequences - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Logical Replication of sequences |
Date | |
Msg-id | CAHut+PvEgiPyV4sKbDaokVYV7K0wJXkP33W6-9P6Q1=edwQB1A@mail.gmail.com Whole thread Raw |
In response to | Re: Logical Replication of sequences (vignesh C <vignesh21@gmail.com>) |
Responses |
Re: Logical Replication of sequences
|
List | pgsql-hackers |
Hi Vignesh, Here are my review comments for your latest 0730_2* patches. Patch v20240730_2-0001 looks good to me. Patch v20240730_2-0002 looks good to me. My comments for the v20240730_2-0003 patch are below: ////////// GENERAL 1. Inconsistent terms I've noticed there are many variations of how the sequence sync worker is known: - "sequencesync worker" - "sequence sync worker" - "sequence-sync worker" - "sequence synchronization worker" - more? We must settle on some standardized name. AFAICT we generally use "table synchronization worker" in the docs, and "tablesync worker" in the code and comments. IMO, we should do same as that for sequences -- e.g. "sequence synchronization worker" in the docs, and "sequencesync worker" in the code and comments. ====== doc/src/sgml/catalogs.sgml nitpick - the links should jump directly to REFRESH PUBLICATION or REFRESH PUBLICATION SEQUENCES. Currently they go to the top of the ALTER SUBSCRIPTION page which is not as useful. ====== src/backend/commands/sequence.c do_setval: nitpick - minor wording in the function header nitpick - change some param names to more closely resemble the fields they get assigned to (/logcnt/log_cnt/, /iscalled/is_called/) ~ 2. seq->is_called = iscalled; - seq->log_cnt = 0; + seq->log_cnt = (logcnt == SEQ_LOG_CNT_INVALID) ? 0: logcnt; The logic here for SEQ_LOG_CNT_INVALID seemed strange. Why not just #define SEQ_LOG_CNT_INVALID as 0 in the first place if that is what you will assign for invalid? Then you won't need to do anything here except seq->log_cnt = log_cnt; ====== src/backend/catalog/pg_subscription.c HasSubscriptionRelations: nitpick - I think the comment "If even a single tuple exists..." is not quite accurate. e.g. It also has to be the right kind of tuple. ~~ GetSubscriptionRelations: nitpick - Give more description in the function header about the other parameters. nitpick - I felt that a better name for 'all_relations' is all_states. Because in my mind *all relations* sounds more like when both 'all_tables' and 'all_sequences' are true. nitpick - IMO add an Assert to be sure something is being fetched. Assert(get_tables || get_sequences); nitpick - Rephrase the "skip the tables" and "skip the sequences" comments to be more aligned with the code condition. ~ 3. - if (not_ready) + /* Get the relations that are not in ready state */ + if (get_tables && !all_relations) ScanKeyInit(&skey[nkeys++], Anum_pg_subscription_rel_srsubstate, BTEqualStrategyNumber, F_CHARNE, CharGetDatum(SUBREL_STATE_READY)); + /* Get the sequences that are in init state */ + else if (get_sequences && !all_relations) + ScanKeyInit(&skey[nkeys++], + Anum_pg_subscription_rel_srsubstate, + BTEqualStrategyNumber, F_CHAREQ, + CharGetDatum(SUBREL_STATE_INIT)); This is quite tricky, using multiple flags (get_tables and get_sequences) in such a way. It might even be a bug -- e.g. Is the 'else' keyword correct? Otherwise, when both get_tables and get_sequences are true, and all_relations is false, then the sequence part wouldn't even get executed (???). ====== src/backend/commands/subscriptioncmds.c CreateSubscription: nitpick - let's move the 'tables' declaration to be beside the 'sequences' var for consistency. (in passing move other vars too) nitpick - it's not strictly required for the patch, but let's change the 'tables' loop to be consistent with the new sequences loop. ~~~ 4. AlterSubscription_refresh My first impression (from the function comment) is that these function parameters are a bit awkward. For example, - It says: If 'copy_data' parameter is true, the function will set the state to "init"; otherwise, it will set the state to "ready". - It also says: "If 'all_relations' is true, mark all objects with "init" state..." Those statements seem to clash. e.g. if copy_data is false but all_relations is true, then what (???) ~ nitpick - tweak function comment wording. nitpick - introduce a 'relkind' variable to avoid multiple calls of get_rel_relkind(relid) nitpick - use an existing 'relkind' variable instead of calling get_rel_relkind(relid); nitpick - add another comment about skipping (for dropping tablesync slots) ~ 5. + /* + * If all the relations should be re-synchronized, then set the + * state to init for re-synchronization. This is currently + * supported only for sequences. + */ + else if (all_relations) + { + ereport(DEBUG1, + (errmsg_internal("sequence \"%s.%s\" of subscription \"%s\" set to INIT state", get_namespace_name(get_rel_namespace(relid)), get_rel_name(relid), sub->name))); + UpdateSubscriptionRelState(sub->oid, relid, SUBREL_STATE_INIT, + InvalidXLogRecPtr); (This is a continuation of my doubts regarding 'all_relations' in the previous review comment #4 above) Here are some more questions about it: ~ 5a. Why is this an 'else' of the !bsearch? It needs more explanation what this case means. ~ 5b. Along with more description, it might be better to reverse the !bsearch condition, so this ('else') code is not so distantly separated from the condition. ~ 5c. Saying "only supported for sequences" seems strange: e.g. what would it even mean to "re-synchronize" tables? They would all have to be truncated first -- so if re-sync for tables has no meaning maybe the parameter is misnamed and should just be 'resync_all_sequences' or similar? In any case, an Assert here might be good. ====== src/backend/replication/logical/launcher.c logicalrep_worker_find: nitpick - I feel the function comment "We are only interested in..." is now redundant since you are passing the exact worker type you want. nitpick - I added an Assert for the types you are expecting to look for nitpick - The comment "Search for attached worker..." is stale now because there are more search criteria nitpick - IMO the "Skip parallel apply workers." code is no longer needed now that you are matching the worker type. ~~~ 6. logicalrep_worker_launch * - must be valid worker type * - tablesync workers are only ones to have relid * - parallel apply worker is the only kind of subworker + * - sequencesync workers will not have relid */ Assert(wtype != WORKERTYPE_UNKNOWN); Assert(is_tablesync_worker == OidIsValid(relid)); Assert(is_parallel_apply_worker == (subworker_dsm != DSM_HANDLE_INVALID)); + Assert(!is_sequencesync_worker || !OidIsValid(relid)); On further reflection, is that added comment and added Assert even needed? I think they can be removed because saying "tablesync workers are only ones to have relid" seems to already cover what we needed to say/assert. ~~~ logicalrep_worker_stop: nitpick - /type/wtype/ for readability ~~~ 7. /* * Count the number of registered (not necessarily running) sync workers * for a subscription. */ int logicalrep_sync_worker_count(Oid subid) ~ I thought this function should count the sequencesync worker as well. ====== .../replication/logical/sequencesync.c fetch_remote_sequence_data: nitpick - tweaked function comment nitpick - /value/last_value/ for readability ~ 8. + *lsn = DatumGetInt64(slot_getattr(slot, 4, &isnull)); + Assert(!isnull); Should that be DatumGetUInt64? ~~~ copy_sequence: nitpick - tweak function header. nitpick - renamed the sequence vars for consistency, and declared them all together. ====== src/backend/replication/logical/tablesync.c 9. void invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue) { - table_states_validity = SYNC_TABLE_STATE_NEEDS_REBUILD; + relation_states_validity = SYNC_TABLE_STATE_NEEDS_REBUILD; } I assume you changed the 'table_states_validity' name because this is no longer exclusively for tables. So, should the function name also be similarly changed? ~~~ process_syncing_sequences_for_apply: nitpick - tweaked the function comment nitpick - cannot just say "if there is not one already." a sequence syn worker might not even be needed. nitpick - added blank line for readability ~ 10. + if (syncworker) + { + /* Now safe to release the LWLock */ + LWLockRelease(LogicalRepWorkerLock); + break; + } + else + { This 'else' can be removed if you wish to pull back all the indentation. ~~~ 11. process_syncing_tables(XLogRecPtr current_lsn) Is the function name still OK given that is is now also syncing for sequences? ~~~ FetchTableStates: nitpick - Reworded some of the function comment nitpick - Function comment is stale because it is still referring to the function parameter which this patch removed. nitpick - tweak a comment ====== src/include/commands/sequence.h 12. +#define SEQ_LOG_CNT_INVALID (-1) See a previous review comment (#2 above) where I wondered why not use value 0 for this. ~~~ 13. extern void SequenceChangePersistence(Oid relid, char newrelpersistence); extern void DeleteSequenceTuple(Oid relid); extern void ResetSequence(Oid seq_relid); +extern void do_setval(Oid relid, int64 next, bool iscalled, int64 logcnt); extern void ResetSequenceCaches(void); do_setval() was an OK function name when it was static, but as an exposed API it seems like a terrible name. IMO rename it to something like 'SetSequence' to match the other API functions nearby. ~ nitpick - same change to the parameter names as suggested for the implementation. ====== Kind Regards, Peter Smith. Fujitsu Australia
Attachment
pgsql-hackers by date: