Re: Handle infinite recursion in logical replication setup - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Handle infinite recursion in logical replication setup |
Date | |
Msg-id | CALDaNm0XtQVX3taeLKWE-gPQyppqs34ipXawAPOyO=he37MQSg@mail.gmail.com |
In response to | Re: Handle infinite recursion in logical replication setup (Peter Smith <smithpb2250@gmail.com>) |
Responses | Re: Handle infinite recursion in logical replication setup |
List | pgsql-hackers |
On Fri, Jun 10, 2022 at 10:23 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Below are some review comments for the patch v18-0003
>
> 1. Commit message
>
> This patch does a couple of things:
> change 1) Checks and throws an error if the publication tables were also
> subscribing data in the publisher from other publishers when copy_data
> parameter is specified as 'ON' and origin parameter is specified as 'local'.
> change 2) Adds force value for copy_data parameter.
>
> SUGGESTION
> This patch does a couple of things:
> change 1) Checks and throws an error if 'copy_data = on' and 'origin = local'
> but the publication tables were also subscribing data in the publisher from
> other publishers.
> change 2) Adds 'force' value for copy_data parameter.

Modified

> ~~~
>
> 2. Commit message - about the example
>
> All my following review comments for the commit message are assuming that
> the example steps are as they are written in the patch, but actually I felt
> that the example might be more complex than it needed to be: e.g
> - You don't really need the node2 to have data
> - Therefore you don't need all the added TRUNCATE complications
>
> E.g. I think you only need node1 (with data) and node2 (no data).
>
> Then node1 subscribes node2 with (origin=local, copy_data=off).
> Then node2 subscribes node1 with (origin=local, copy_data=on).
> - Demonstrates exception happens because node1 already had a subscription
> - Demonstrates need for the copy_data=force to override that exception
>
> So please consider using a simpler example for this commit message

Modified
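To make that simpler example concrete, a rough sketch of it could look like the
following (node, table, publication, and subscription names are only
illustrative, and the syntax assumes the origin and copy_data = force options
proposed by this patch series):

-- node1 already has data; node2 starts empty.
node1=# CREATE TABLE t1 (c1 int);
node1=# INSERT INTO t1 VALUES (11), (12), (13), (14);
node1=# CREATE PUBLICATION pub_node1 FOR TABLE t1;

node2=# CREATE TABLE t1 (c1 int);
node2=# CREATE PUBLICATION pub_node2 FOR TABLE t1;

-- node1 subscribes to node2; node2 has no data, so copy_data = off:
node1=# CREATE SUBSCRIPTION sub_node1_node2 CONNECTION '<node2 details>'
node1-# PUBLICATION pub_node2 WITH (origin = local, copy_data = off);

-- node2 subscribes to node1; copy_data = on would hit the new error because
-- node1's publication tables already have a subscription, so force is needed
-- to copy node1's pre-existing rows:
node2=# CREATE SUBSCRIPTION sub_node2_node1 CONNECTION '<node1 details>'
node2-# PUBLICATION pub_node1 WITH (origin = local, copy_data = force);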
> ~~~
>
> 3. Commit message
>
> The below help us understand how the first change will be useful:
>
> If copy_data parameter was used with 'on' in step 5, then an error will be
> thrown to alert the user to prevent inconsistent data being populated:
> CREATE SUBSCRIPTION sub_node2_node1 CONNECTION '<node1 details>'
> PUBLICATION pub_node1 WITH (copy_data = on, origin = local);
> ERROR: CREATE/ALTER SUBSCRIPTION with origin and copy_data as true is not
> allowed when the publisher might have replicated data
>
> SUGGESTION
> The steps below help to demonstrate how the new exception is useful:
>
> The initial copy phase has no way to know the origin of the row data, so if
> 'copy_data = on' in the step 5 below, then an error will be thrown to
> prevent any potentially non-local data from being copied:
> <blank line>
> e.g
> CREATE SUBSCRIPTION ...

Modified

> ~~~
>
> 4. Commit message
>
> The following will help us understand how the second change will be useful:
> Let's take a simple case where user is trying to setup bidirectional logical
> replication between node1 and node2 where the two nodes have some
> pre-existing data like below:
> node1: Table t1 (c1 int) has data 11, 12, 13, 14
> node2: Table t1 (c1 int) has data 21, 22, 23, 24
>
> SUGGESTION
> The following steps help to demonstrate how the 'copy_data = force' change
> will be useful:
> <blank line>
> Let's take a scenario where the user wants to set up bidirectional logical
> replication between node1 and node2 where the same table on each node has
> pre-existing data. e.g.
> node1: Table t1 (c1 int) has data 11, 12, 13, 14
> node2: Table t1 (c1 int) has data 21, 22, 23, 24

Modified

> ~~~
>
> 5. Commit message
>
> step 4:
> node2=# CREATE SUBSCRIPTION sub_node2_node1 Connection '<node1 details>'
> node2-# PUBLICATION pub_node1;
> CREATE SUBSCRIPTION
>
> "Connection" => "CONNECTION"

Modified

> ~~~
>
> 6. Commit message
>
> If table t1 has a unique key, it will cause a unique key violation and
> replication won't proceed.
>
> SUGGESTION
> In case, table t1 has a unique key, it will lead to a unique key violation
> and replication won't proceed.

Modified

> ~~~
>
> 7. Commit message
>
> step 3: Create a subscription in node1 to subscribe to node2. Use copy_data
> specified as on so that the existing table data is copied during initial
> sync:
>
> SUGGESTION
> step 3: Create a subscription in node1 to subscribe to node2. Use
> 'copy_data = on' so that the existing table data is copied during initial
> sync:

Modified

> ~~~
>
> 8. Commit message
>
> step 4: Adjust the publication publish settings so that truncate is not
> published to the subscribers and truncate the table data in node2:
>
> SUGGESTION (only added a comma)
> step 4: Adjust the publication publish settings so that truncate is not
> published to the subscribers, and truncate the table data in node2:

This content is not required any more, I have removed it.

> ~~~
>
> 9. Commit message
>
> step 5: Create a subscription in node2 to subscribe to node1. Use copy_data
> specified as force when creating a subscription to node1 so that the
> existing table data is copied during initial sync:
>
> SUGGESTION
> step 5: Create a subscription in node2 to subscribe to node1. Use
> 'copy_data = force' when creating a subscription to node1 so that the
> existing table data is copied during initial sync:

Modified

> ======
>
> 10. doc/src/sgml/ref/create_subscription.sgml
>
> @@ -383,6 +398,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
>      can have non-existent publications.
>     </para>
>
> +   <para>
> +    If subscription is created with <literal>origin = local</literal> and
> +    <literal>copy_data = on</literal>, it will check if the publisher tables are
> +    being subscribed to any other publisher and throw an error to prevent
> +    inconsistent data in the subscription. The user can continue with the copy
> +    operation without throwing any error in this case by specifying
> +    <literal>copy_data = force</literal>.
> +   </para>
>
> SUGGESTION (minor rewording)
> If the subscription is created with <literal>origin = local</literal> and
> <literal>copy_data = on</literal>, it will check if the publisher tables are
> being subscribed to any other publisher and, if so, then throw an error to
> prevent possible non-local data from being copied. The user can override
> this check and continue with the copy operation by specifying
> <literal>copy_data = force</literal>.

Modified
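As a hypothetical illustration of the documented behavior (object names and
the connection string are placeholders, the error line is based on the
patch's message format with the DETAIL and HINT lines omitted), the check and
the override would look something like:

node2=# CREATE SUBSCRIPTION sub_node2_node1 CONNECTION '<node1 details>'
node2-# PUBLICATION pub_node1 WITH (origin = local, copy_data = on);
ERROR:  table:public.t1 might have replicated data in the publisher

-- The user can override the check once satisfied that the copy is safe:
node2=# CREATE SUBSCRIPTION sub_node2_node1 CONNECTION '<node1 details>'
node2-# PUBLICATION pub_node1 WITH (origin = local, copy_data = force);
CREATE SUBSCRIPTION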
> ======
>
> 11. src/backend/commands/subscriptioncmds.c - parse_subscription_options
>
> From [1]:
> >> What about also allowing copy_data = 2, and making it equivalent to "force"?
> > Vignesh: I felt the existing looks ok, no need to support 2. It might confuse the user.
>
> I don't think it would be confusing, but I also don't feel strongly enough
> to debate it. Anyway, I could not find a similar precedent, so your decision
> is fine.

Ok

> ~~~
>
> 12. src/backend/commands/subscriptioncmds.c - parse_subscription_options
>
> @@ -339,17 +406,16 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
>        errmsg("%s and %s are mutually exclusive options",
>           "connect = false", "create_slot = true")));
>
> -  if (opts->copy_data &&
> -    IsSet(opts->specified_opts, SUBOPT_COPY_DATA))
> +  if (opts->copy_data && IsSet(opts->specified_opts, SUBOPT_COPY_DATA))
>     ereport(ERROR,
>
> This is just a formatting change. Is it needed for this patch?

patch. Modified

> ~~~
>
> 13. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh
>
> @@ -730,7 +798,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
>      * PENDING, to allow ALTER SUBSCRIPTION ... REFRESH
>      * PUBLICATION to work.
>      */
> -   if (opts.twophase && !opts.copy_data && tables != NIL)
> +   if (opts.twophase && opts.copy_data == COPY_DATA_OFF &&
> +     tables != NIL)
>       twophase_enabled = true;
>
> Why is this change needed? I thought the original code is OK now since
> COPY_DATA_OFF = 0

Modified

> ~~~
>
> 14. src/backend/commands/subscriptioncmds.c - AlterSubscription
>
> @@ -1265,7 +1337,8 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
>      *
>      * For more details see comments atop worker.c.
>      */
> -  if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && opts.copy_data)
> +  if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> +    opts.copy_data)
>
> This is just a formatting change. Is it needed for this patch?

Modified

> ~~~
>
> 15. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
>
> +
> +  if (!origin || (strcmp(origin, "local") != 0) || copydata != COPY_DATA_ON)
> +    return;
> +
>
> This condition could be rearranged to put the strcmp last so it is not
> called unless absolutely necessary.

Modified

> ~~~
>
> 16. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
>
> +  appendStringInfoString(&cmd,
> +      "SELECT DISTINCT N.nspname AS schemaname, C.relname AS tablename, PS.srrelid as replicated\n"
> +      "FROM pg_publication P,\n"
>
> The line is too long; needs wrapping.

Modified

> ~~~
>
> 17. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed
>
> +  if (!slot_attisnull(slot, 3))
> +    ereport(ERROR,
> +        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +        errmsg("table:%s.%s might have replicated data in the publisher",
> +             nspname, relname),
> +        errdetail("CREATE/ALTER SUBSCRIPTION with origin as local and copy_data as true is not allowed when the publisher might have replicated data."),
> +        errhint("Use CREATE/ALTER SUBSCRIPTION with copy_data = off or force."));
>
> I felt the errmsg may be easier to read using "=" instead of "as". Anyway,
> it would be more consistent with the errhint. Also, change the "true" to
> "on" to be consistent with the errhint.
>
> SUGGESTION
> errdetail("CREATE/ALTER SUBSCRIPTION with origin = local and copy_data = on
> is not allowed when the publisher might have replicated data."),

Modified

Thanks for the comments, the attached v20 patch has the changes for the same.

Regards,
Vignesh