Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoD_Fvj-rbEUNGmOJ=Usg0Q669md=bFXTznm1jivXXfvJQ@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > I've attached updated patches.
> > > >
> > > > Thank you for the comments!
> > > >
> > > > >
> > > > > Few comments:
> > > > > ==============
> > > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > > is successful (say when we reset skip_xid or when tablesync is
> > > > > finished) or when we drop subscription? I think the same applies to
> > > > > apply worker. I think we may want to track it in some way whether an
> > > > > error has occurred before sending the message but relying completely
> > > > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > > > drop subscription we can simply send the message as that is not a
> > > > > frequent operation. I might be missing something here because in the
> > > > > tests after drop subscription you are expecting the entries from the
> > > > > view to get cleared
> > > >
> > > > Yes, I think we can have tablesync worker send a message to drop stats
> > > > once tablesync is successful. But if we do that also when dropping a
> > > > subscription, I think we need to do that only the transaction is
> > > > committed since we can drop a subscription that doesn't have a
> > > > replication slot and rollback the transaction. Probably we can send
> > > > the message only when the subscritpion does have a replication slot.
> > > >
> > >
> > > Right. And probably for apply worker after updating skip xid.
> >
> > I'm not sure it's better to drop apply worker stats after resetting
> > skip xid (i.g., after skipping the transaction). Since the view is a
> > cumulative view and has last_error_time, I thought we can have the
> > apply worker stats until the subscription gets dropped.
> >
>
> Fair enough. So statistics can be removed either by vacuum or drop
> subscription. Also, if we go by this logic then there is no harm in
> retaining the stat entries for tablesync errors. Why have different
> behavior for apply and tablesync workers?

My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped. Also, I’m concerned about a situation like where a lot of
table sync failed. In which case, if we don’t drop table sync worker
statistics after completing its job, we end up having a lot of entries
in the view unless the subscription is dropped.

>
> I have another question in this regard. Currently, the reset function
> seems to be resetting only the first stat entry for a subscription.
> But can't we have multiple stat entries for a subscription considering
> the view's cumulative nature?

I might be missing your points but I think that with the current
patch, the view has multiple entries for a subscription. That is,
there is one apply worker stats and multiple table sync worker stats
per subscription. And pg_stat_reset_subscription() function can reset
any stats by specifying subscription OID and relation OID.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: parallel vacuum comments
Next
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side