Re: Optionally automatically disable logical replication subscriptions on error - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Optionally automatically disable logical replication subscriptions on error |
Date | |
Msg-id | CAHut+Pvfmav1MaGKNZR+PZqCT6wZv1ekTi5E+LuWYFH++ek7rA@mail.gmail.com Whole thread Raw |
In response to | RE: Optionally automatically disable logical replication subscriptions on error ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>) |
Responses |
RE: Optionally automatically disable logical replication subscriptions on error
|
List | pgsql-hackers |
Please see below my review comments for v22. ====== 1. Commit message "table sync worker" -> "tablesync worker" ~~~ 2. doc/src/sgml/catalogs.sgml + <para> + If true, the subscription will be disabled when subscription + workers detect any errors + </para></entry> It felt a bit strange to say "subscription" 2x in the sentence, but I am not sure how to improve it. Maybe like below? BEFORE If true, the subscription will be disabled when subscription workers detect any errors SUGGESTED If true, the subscription will be disabled if one of its workers detects an error ~~~ 3. src/backend/replication/logical/worker.c - DisableSubscriptionOnError @@ -2802,6 +2803,69 @@ LogicalRepApplyLoop(XLogRecPtr last_received) } /* + * Disable the current subscription, after error recovery processing. + */ +static void +DisableSubscriptionOnError(void) I thought the "after error recovery processing" part was a bit generic and did not really say what it was doing. BEFORE Disable the current subscription, after error recovery processing. SUGGESTED Disable the current subscription, after logging the error that caused this function to be called. ~~~ 4. src/backend/replication/logical/worker.c - start_apply + if (MySubscription->disableonerr) + { + DisableSubscriptionOnError(); + return; + } + + MemoryContextSwitchTo(ecxt); + PG_RE_THROW(); + } + PG_END_TRY(); The current code looks correct, but I felt it is a bit tricky to easily see the execution path after the return. Since it will effectively just exit anyhow I think it will be simpler just to do that explicitly right here instead of the 'return'. This will also make the code consistent with the same 'disableonerr' logic of the start_start_sync. SUGGESTION if (MySubscription->disableonerr) { DisableSubscriptionOnError(); proc_exit(0); } ~~~ 5. src/bin/pg_dump/pg_dump.c @@ -4463,6 +4473,9 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo) if (strcmp(subinfo->subtwophasestate, two_phase_disabled) != 0) appendPQExpBufferStr(query, ", two_phase = on"); + if (strcmp(subinfo->subdisableonerr, "f") != 0) + appendPQExpBufferStr(query, ", disable_on_error = true"); + Although the code is correct, I think it would be more natural to set this option as true when the user wants it true. e.g. check for "t" same as 'subbinary' is doing. This way, even if there was some unknown/corrupted value the code would do nothing, which is the behaviour you want... SUGGESTION if (strcmp(subinfo->subdisableonerr, "t") == 0) ~~~ 6. src/include/catalog/pg_subscription.h @@ -67,6 +67,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW char subtwophasestate; /* Stream two-phase transactions */ + bool subdisableonerr; /* True if occurrence of apply errors + * should disable the subscription */ The comment seems not quite right because it's not just about apply errors. E.g. I think any error in tablesync will cause disablement too. BEFORE True if occurrence of apply errors should disable the subscription SUGGESTED True if a worker error should cause the subscription to be disabled ~~~ 7. src/test/regress/sql/subscription.sql - whitespace +-- now it works +CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, disable_on_error = false); + +\dRs+ + +ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true); + +\dRs+ +ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE); +DROP SUBSCRIPTION regress_testsub; + I think should be a blank line after that last \dRs+ just like the other one, because it belongs logically with the code above it, not with the ALTER slot_name. ~~~ 8. src/test/subscription/t/028_disable_on_error.pl - filename The 028 number needs to be bumped because there is already a TAP test called 028 now ~~~ 9. src/test/subscription/t/028_disable_on_error.pl - missing test There was no test case for the last combination where the user correct the apply worker problem: E.g. After a previous error/disable of the subscriber, remove the index, publish the inserts again, and check they get applied properly. ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: