Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Greg Nancarrow |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAJcOf-fBLiHZdUKpOQ=gN1BZrt4imMwAVUFGK=TyWyVfdf0VHg@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
|
List | pgsql-hackers |
On Fri, Dec 10, 2021 at 4:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I've attached an updated patch. The new syntax is like "ALTER > SUBSCRIPTION testsub SKIP (xid = '123')". > I have some review comments: (1) Patch comment - some suggested wording improvements BEFORE: If incoming change violates any constraint, logical replication stops AFTER: If an incoming change violates any constraint, logical replication stops BEFORE: The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX), updating pg_subscription.subskipxid field, telling the apply worker to skip the transaction. AFTER: The user can specify the XID of the transaction to skip using ALTER SUBSCRIPTION ... SKIP (xid = XXX), updating the pg_subscription.subskipxid field, telling the apply worker to skip the transaction. src/sgml/logical-replication.sgml (2) Some suggested wording improvements (i) Missing "the" BEFORE: + the existing data. When a conflict produce an error, it is shown in AFTER: + the existing data. When a conflict produce an error, it is shown in the (ii) Suggest starting a new sentence BEFORE: + and it is also shown in subscriber's server log as follows: AFTER: + The error is also shown in the subscriber's server log as follows: (iii) Context message should say "at ..." instead of "with commit timestamp ...", to match the actual output from the current code BEFORE: +CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00 AFTER: +CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00 (iv) The following paragraph seems out of place, with the information presented in the wrong order: + <para> + In this case, you need to consider changing the data on the subscriber so that it + doesn't conflict with incoming changes, or dropping the conflicting constraint or + unique index, or writing a trigger on the subscriber to suppress or redirect + conflicting incoming changes, or as a last resort, by skipping the whole transaction. + They skip the whole transaction, including changes that may not violate any + constraint. They may easily make the subscriber inconsistent, especially if + a user specifies the wrong transaction ID or the position of origin. + </para> How about rearranging it as follows: + <para> + These methods skip the whole transaction, including changes that may not violate + any constraint. They may easily make the subscriber inconsistent, especially if + a user specifies the wrong transaction ID or the position of origin, and should + be used as a last resort. + Alternatively, you might consider changing the data on the subscriber so that it + doesn't conflict with incoming changes, or dropping the conflicting constraint or + unique index, or writing a trigger on the subscriber to suppress or redirect + conflicting incoming changes. + </para> doc/src/sgml/ref/alter_subscription.sgml (3) (i) Doc needs clarification BEFORE: + the whole transaction. The logical replication worker skips all data AFTER: + the whole transaction. For the latter case, the logical replication worker skips all data (ii) "Setting -1 means to reset the transaction ID" Shouldn't it be explained what resetting actually does and when it can be, or is needed to be, done? Isn't it automatically reset? I notice that negative values (other than -1) seem to be regarded as valid - is that right? Also, what happens if this option is set multiple times? Does it just override and use the latest setting? (other option handling errors out with errorConflictingDefElem()). e.g. alter subscription sub skip (xid = 721, xid = 722); src/backend/replication/logical/worker.c (4) Shouldn't the "done skipping logical replication transaction" message also include the skipped XID value at the end? src/test/subscription/t/027_skip_xact.pl (5) Some suggested wording improvements (i) BEFORE: +# Test skipping the transaction. This function must be called after the caller +# inserting data that conflict with the subscriber. After waiting for the +# subscription worker stats are updated, we skip the transaction in question +# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue +# working by inserting $nonconflict_data on the publisher. AFTER: +# Test skipping the transaction. This function must be called after the caller +# inserts data that conflicts with the subscriber. After waiting for the +# subscription worker stats to be updated, we skip the transaction in question +# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue +# working by inserting $nonconflict_data on the publisher. (ii) BEFORE: +# will conflict with the data replicated from publisher later. AFTER: +# will conflict with the data replicated later from the publisher. Regards, Greg Nancarrow Fujitsu Australia
pgsql-hackers by date: