Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoC1w-BMnqRaSO6PZrUz=Ez-prVO_6JiOer9brrGaPnAsg@mail.gmail.com
Whole thread Raw
In response to RE: Skipping logical replication transactions on subscriber side  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached a rebased patch.
> Thank you for your rebase !
>
> Several review comments on v8.

Thank you for the comments!

>
> (1) doc/src/sgml/logical-replication.sgml
>
> +
> +  <para>
> +   To resolve conflicts, you need to consider changing the data on the subscriber so
> +   that it doesn't conflict with incoming changes, or dropping the conflicting constraint
> +   or unique index, or writing a trigger on the subscriber to suppress or redirect
> +   conflicting incoming changes, or as a last resort, by skipping the whole transaction.
> +   Skipping the whole transaction includes skipping changes that may not violate
> +   any constraint.  This can easily make the subscriber inconsistent, especially if
> +   a user specifies the wrong transaction ID or the position of origin.
> +  </para>
>
> The first sentence is too long and lack of readability slightly.
> One idea to sort out listing items is to utilize "itemizedlist".
> For instance, I imagined something like below.
>
>   <para>
>     To resolve conflicts, you need to consider following actions:
>     <itemizedlist>
>       <listitem>
>         <para>
>           Change the data on the subscriber so that it doesn't conflict with incoming changes
>         </para>
>       </listitem>
>       ...
>       <listitem>
>         <para>
>           As a last resort, skip the whole transaction
>         </para>
>       </listitem>
>     </itemizedlist>
>     ....
>   </para>
>
> What did you think ?
>
> By the way, in case only when you want to keep the current sentence style,
> I have one more question. Do we need "by" in the part
> "by skipping the whole transaction" ? If we focus on only this action,
> I think the sentence becomes "you need to consider skipping the whole transaction".
> If this is true, we don't need "by" in the part.

I personally prefer to keep the current sentence since listing them
seems not suitable in this case. But I agree that "by" is not
necessary here.

>
> (2)
>
> Also, in the same paragraph, we write
>
> + ... This can easily make the subscriber inconsistent, especially if
> +   a user specifies the wrong transaction ID or the position of origin.
>
> The subject of this sentence should be "Those" or "Some of those" ?
> because we want to mention either "new skip xid feature" or
> "pg_replication_origin_advance".

I think "This" in the sentence refers to "Skipping the whole
transaction". In the previous paragraph, we describe that there are
two methods for skipping the whole transaction: this new feature and
pg_replication_origin_advance(). And in this paragraph, we don't
mention any specific methods for skipping the whole transaction but
describe that skipping the whole transaction per se can easily make
the subscriber inconsistent. The current structure is fine with me.

>
> (3) doc/src/sgml/ref/alter_subscription.sgml
>
> Below change contains unnecessary spaces.
> +      the whole transaction.  Using <command> ALTER SUBSCRIPTION ... SKIP </command>
>
> Need to change
> From:
> <command> ALTER SUBSCRIPTION ... SKIP </command>
> To:
> <command>ALTER SUBSCRIPTION ... SKIP</command>

Will remove.

>
> (4) comment in clear_subscription_skip_xid
>
> +        * the flush position the transaction will be sent again and the user
> +        * needs to be set subskipxid again.  We can reduce the possibility by
>
> Shoud change
> From:
> the user needs to be set...
> To:
> the user needs to set...

Will remove.

>
> (5) clear_subscription_skip_xid
>
> +       if (!HeapTupleIsValid(tup))
> +               elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
>
> Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT ?
> This suggestion has another aspect that in within one patch, we don't mix
> both ereport and elog at the same time.

I don’t think we need to set errcode since this error is a
should-not-happen error.

>
> (6) apply_handle_stream_abort
>
> @@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
>
>         logicalrep_read_stream_abort(s, &xid, &subxid);
>
> +       /*
> +        * We don't expect the user to set the XID of the transaction that is
> +        * rolled back but if the skip XID is set, clear it.
> +        */
> +       if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
> +               clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
> +
>
> In my humble opinion, this still cares about subtransaction xid still.
> If we want to be consistent with top level transactions only,
> I felt checking MySubscription->skipxid == xid should be sufficient.

I thought if we can clear subskipxid whose value has already been
processed on the subscriber with a reasonable cost it makes sense to
do that because it can reduce the possibility of the issue that XID is
wraparound while leaving the wrong in subskipxid. But as you pointed
out, the current behavior doesn’t match the description in the doc:

After the logical replication successfully skips the transaction, the
transaction ID (stored in pg_subscription.subskipxid) is cleared.

and

We don't support skipping individual subtransactions.

I'll remove it in the next version patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Greg Nancarrow
Date:
Subject: Re: Skipping logical replication transactions on subscriber side
Next
From: Amit Langote
Date:
Subject: Re: a misbehavior of partition row movement (?)