Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAA4eK1JxY6xM=xpBp6vKLeULs+49ZSU52AMOoO-4OhB4ygjGag@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Fri, Jul 16, 2021 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Sounds good. I'll incorporate this in the next version patch that I'm
> > planning to submit this week.
>
> Sorry, I could not make it this week. I'll submit them early next week.
>

No problem.

> While updating the patch I thought we need to have more design
> discussion on two points of clearing error details after the error is
> resolved:
>
> 1. How to clear apply worker errors. IIUC we've discussed that once
> the apply worker skipped the transaction we leave the error entry
> itself but clear its fields except for some fields such as failure
> counts. But given that the stats messages could be lost, how can we
> ensure to clear those error details? For table sync workers’ error, we
> can have autovacuum workers periodically check entires of
> pg_subscription_rel and clear the error entry if the table sync worker
> completes table sync (i.g., checking if srsubstate = ‘r’). But there
> is no such information for the apply workers and subscriptions.
>

But won't the corresponding subscription (pg_subscription) have the
XID as InvalidTransactionid once the xid is skipped or at least a
different XID then we would have in pg_stat view? Can we use that to
reset entry via vacuum?

> In
> addition to sending the message clearing the error details just after
> skipping the transaction, I thought that we can have apply workers
> periodically send the message clearing the error details but it seems
> not good.
>

Yeah, such things should be a last resort.

> 2. Do we really want to leave the table sync worker even after the
> error is resolved and the table sync completes? Unlike the apply
> worker error, the number of table sync worker errors could be very
> large, for example, if a subscriber subscribes to many tables. If we
> leave those errors in the stats view, it uses more memory space and
> could affect writing and reading stats file performance. If such left
> table sync error entries are not helpful in practice I think we can
> remove them rather than clear some fields. What do you think?
>

Sounds reasonable to me. One might think to update the subscription
error count by including table_sync errors but not sure if that is
helpful and even if that is helpful, we can extend it later.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: O_DIRECT on macOS
Next
From: Amit Khandekar
Date:
Subject: Re: speed up verifying UTF-8