Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAA4eK1LoUULzh-WreQe3TcL5s5oUG3c1bibC8rSThqknvvKvPQ@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Mon, Jan 17, 2022 at 9:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 6.
> > +static void
> > +maybe_start_skipping_changes(TransactionId xid)
> > +{
> > + Assert(!is_skipping_changes());
> > + Assert(!in_remote_transaction);
> > + Assert(!in_streamed_transaction);
> > +
> > + /* Make sure subscription cache is up-to-date */
> > + maybe_reread_subscription();
> >
> > Why do we need to update the cache here by calling
> > maybe_reread_subscription() and at other places in the patch? It is
> > sufficient to get the skip_xid value at the start of the worker via
> > GetSubscription().
>
> MySubscription could be out-of-date after a user changes the catalog.
> In non-skipping change cases, we check it when starting the
> transaction in begin_replication_step() which is called, e.g., when
> applying an insert change. But I think we need to make sure it’s
> up-to-date at the beginning of applying changes, that is, before
> starting a transaction. Otherwise, we may end up skipping the
> transaction based on out-of-dated subscription cache.
>

I thought the user would normally set skip_xid only after an error
which means that the value should be as new as the time of the start
of the worker. I am slightly worried about the cost we might need to
pay for this additional look-up in case skip_xid is not changed. Do
you see any valid user scenario where we might not see the required
skip_xid? I am okay with calling this if we really need it.

> >
> > 7. In maybe_reread_subscription(), isn't there a need to check whether
> > skip_xid is changed where we exit and launch the worker and compare
> > other subscription parameters?
>
> IIUC we relaunch the worker here when subscription parameters such as
> slot_name was changed. In the current implementation, I think that
> relaunching the worker is not necessarily necessary when skip_xid is
> changed. For instance, when skipping the prepared transaction, we
> deliberately don’t clear subskipxid of pg_subscription and do that at
> commit-prepared or rollback-prepared case. There are chances that the
> user changes skip_xid before commit-prepared or rollback-prepared. But
> we tolerate this case.
>

I think between prepare and commit prepared, the user only needs to
change it if there is another error in which case we will anyway
restart and load the new value of same. But, I understand that we
don't need to restart if skip_xid is changed as it might not impact
remote connection in any way, so I am fine for not doing anything for
this.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Greg Nancarrow
Date:
Subject: Re: row filtering for logical replication
Next
From: Noah Misch
Date:
Subject: Re: A test for replay of regression tests