Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAD21AoC1w-BMnqRaSO6PZrUz=Ez-prVO_6JiOer9brrGaPnAsg@mail.gmail.com Whole thread Raw |
In response to | RE: Skipping logical replication transactions on subscriber side ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
|
List | pgsql-hackers |
On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I've attached a rebased patch. > Thank you for your rebase ! > > Several review comments on v8. Thank you for the comments! > > (1) doc/src/sgml/logical-replication.sgml > > + > + <para> > + To resolve conflicts, you need to consider changing the data on the subscriber so > + that it doesn't conflict with incoming changes, or dropping the conflicting constraint > + or unique index, or writing a trigger on the subscriber to suppress or redirect > + conflicting incoming changes, or as a last resort, by skipping the whole transaction. > + Skipping the whole transaction includes skipping changes that may not violate > + any constraint. This can easily make the subscriber inconsistent, especially if > + a user specifies the wrong transaction ID or the position of origin. > + </para> > > The first sentence is too long and lack of readability slightly. > One idea to sort out listing items is to utilize "itemizedlist". > For instance, I imagined something like below. > > <para> > To resolve conflicts, you need to consider following actions: > <itemizedlist> > <listitem> > <para> > Change the data on the subscriber so that it doesn't conflict with incoming changes > </para> > </listitem> > ... > <listitem> > <para> > As a last resort, skip the whole transaction > </para> > </listitem> > </itemizedlist> > .... > </para> > > What did you think ? > > By the way, in case only when you want to keep the current sentence style, > I have one more question. Do we need "by" in the part > "by skipping the whole transaction" ? If we focus on only this action, > I think the sentence becomes "you need to consider skipping the whole transaction". > If this is true, we don't need "by" in the part. I personally prefer to keep the current sentence since listing them seems not suitable in this case. But I agree that "by" is not necessary here. > > (2) > > Also, in the same paragraph, we write > > + ... This can easily make the subscriber inconsistent, especially if > + a user specifies the wrong transaction ID or the position of origin. > > The subject of this sentence should be "Those" or "Some of those" ? > because we want to mention either "new skip xid feature" or > "pg_replication_origin_advance". I think "This" in the sentence refers to "Skipping the whole transaction". In the previous paragraph, we describe that there are two methods for skipping the whole transaction: this new feature and pg_replication_origin_advance(). And in this paragraph, we don't mention any specific methods for skipping the whole transaction but describe that skipping the whole transaction per se can easily make the subscriber inconsistent. The current structure is fine with me. > > (3) doc/src/sgml/ref/alter_subscription.sgml > > Below change contains unnecessary spaces. > + the whole transaction. Using <command> ALTER SUBSCRIPTION ... SKIP </command> > > Need to change > From: > <command> ALTER SUBSCRIPTION ... SKIP </command> > To: > <command>ALTER SUBSCRIPTION ... SKIP</command> Will remove. > > (4) comment in clear_subscription_skip_xid > > + * the flush position the transaction will be sent again and the user > + * needs to be set subskipxid again. We can reduce the possibility by > > Shoud change > From: > the user needs to be set... > To: > the user needs to set... Will remove. > > (5) clear_subscription_skip_xid > > + if (!HeapTupleIsValid(tup)) > + elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name); > > Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT ? > This suggestion has another aspect that in within one patch, we don't mix > both ereport and elog at the same time. I don’t think we need to set errcode since this error is a should-not-happen error. > > (6) apply_handle_stream_abort > > @@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s) > > logicalrep_read_stream_abort(s, &xid, &subxid); > > + /* > + * We don't expect the user to set the XID of the transaction that is > + * rolled back but if the skip XID is set, clear it. > + */ > + if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid) > + clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0); > + > > In my humble opinion, this still cares about subtransaction xid still. > If we want to be consistent with top level transactions only, > I felt checking MySubscription->skipxid == xid should be sufficient. I thought if we can clear subskipxid whose value has already been processed on the subscriber with a reasonable cost it makes sense to do that because it can reduce the possibility of the issue that XID is wraparound while leaving the wrong in subskipxid. But as you pointed out, the current behavior doesn’t match the description in the doc: After the logical replication successfully skips the transaction, the transaction ID (stored in pg_subscription.subskipxid) is cleared. and We don't support skipping individual subtransactions. I'll remove it in the next version patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-hackers by date: