Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoBWNfZbfR4TD54_obYWWajrDL3oMJ-phTpnwaOLVR5PrA@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Thu, Sep 2, 2021 at 9:03 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>

Thank you for the comments!

> Some initial comments for the v12-0003 patch:
>
> (1) Patch comment
> "This commit introduces another way to skip the transaction in question."
>
> I think it should further explain: "This commit introduces another way
> to skip the transaction in question, other than manually updating the
> subscriber's database or using pg_replication_origin_advance()."

Updated.

>
>
> doc/src/sgml/logical-replication.sgml
> (2)
>
> Suggested minor update:
>
> BEFORE:
> +   transaction that conflicts with the existing data.  When a conflict produce
> +   an error, it is shown in
> <structname>pg_stat_subscription_errors</structname>
> +   view as follows:
> AFTER:
> +   transaction that conflicts with the existing data.  When a conflict produces
> +   an error, it is recorded in the
> <structname>pg_stat_subscription_errors</structname>
> +   view as follows:

Fixed.

>
> (3)
> +   found from those outputs (transaction ID 740 in the above case).
> The transaction
>
> Shouldn't it be transaction ID 716?

Right, fixed.

>
> (4)
> +   can be skipped by setting <replaceable>skip_xid</replaceable> to
> the subscription
>
> Is it better to say here ... "on the subscription" ?

Okay, fixed.

>
> (5)
> Just skipping a transaction could make a subscriber inconsistent, right?
>
> Would it be better as follows?
>
> BEFORE:
> +   In either way, those should be used as a last resort. They skip the whole
> +   transaction including changes that may not violate any constraint and easily
> +   make subscriber inconsistent if a user specifies the wrong transaction ID or
> +   the position of origin.
>
> AFTER:
> +   Either way, those transaction skipping methods should be used as a
> last resort.
> +   They skip the whole transaction, including changes that may not violate any
> +   constraint.  They may easily make the subscriber inconsistent,
> especially if a
> +   user specifies the wrong transaction ID or the position of origin.

Agreed, fixed.

>
> (6)
> The grammar is not great in the following description, so here's a
> suggested improvement:
>
> BEFORE:
> +          incoming change or by skipping the whole transaction.  This option
> +          specifies transaction ID that logical replication worker skips to
> +          apply.  The logical replication worker skips all data modification
>
> AFTER:
> +          incoming changes or by skipping the whole transaction.  This option
> +          specifies the ID of the transaction whose application is to
> be skipped
> +          by the logical replication worker.  The logical replication worker
> +          skips all data modification

Fixed.

>
>
> src/backend/postmaster/pgstat.c
> (7)
> BEFORE:
> + * Tell the collector about clear the error of subscription.
> AFTER:
> + * Tell the collector to clear the subscription error.

Fixed.

>
>
> src/backend/replication/logical/worker.c
> (8)
> + * subscription is invalidated and* MySubscription->skipxid gets
> changed or reset.
>
> There is a "*" after "and".

Fixed.

>
> (9)
> Do these lines really need to be moved up?
>
> + /* extract XID of the top-level transaction */
> + stream_xid = logicalrep_read_stream_start(s, &first_segment);
> +

I had missed to revert this change, fixed.

>
> src/backend/postmaster/pgstat.c
> (10)
>
> + bool m_clear; /* clear all fields except for last_failure and
> + * last_errmsg */
>
> I think it should say: clear all fields except for last_failure,
> last_errmsg and stat_reset_timestamp.

Fixed.

Those comments including your comments on the v12-0001 and v12-0002
are incorporated into local branch. I'll submit the updated patches
after incorporating all other comments.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side
Next
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side