On Mon, Jul 19, 2021 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 19, 2021 at 12:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Jul 17, 2021 at 12:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > 1. How to clear apply worker errors. IIUC we've discussed that once
> > > the apply worker skipped the transaction we leave the error entry
> > > itself but clear its fields except for some fields such as failure
> > > counts. But given that the stats messages could be lost, how can we
> > > ensure those error details are cleared? For table sync worker errors, we
> > > can have autovacuum workers periodically check entries of
> > > pg_subscription_rel and clear the error entry if the table sync worker
> > > has completed table sync (e.g., by checking if srsubstate = 'r'). But there
> > > is no such information for the apply workers and subscriptions. In
> > > addition to sending the message clearing the error details just after
> > > skipping the transaction, I thought that we could have apply workers
> > > periodically send the message clearing the error details, but that
> > > does not seem like a good approach.
> >
> > I think that the motivation behind the idea of leaving error entries
> > and clearing only some of their fields is that users can check if the
> > error is successfully resolved and the worker is working fine. But we
> > can also check that in another way, for example, by checking the
> > pg_stat_subscription view. So is it worth considering leaving the
> > apply worker errors as they are?
> >
>
> I think so. Basically, we will send the clear message after skipping
> the transaction, but I think it is fine if that message is lost. At
> worst, it will be displayed as the last error details. If there is
> another error, it will be overwritten; alternatively, we should probably
> have a function *_reset() which allows the user to reset a particular
> subscription's error info.

That makes sense. I'll incorporate this idea in the next version of the patch.

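To make the behavior discussed above concrete, here is a rough sketch of how I understand it. It is only a toy model in Python, not code from the patch: the names ErrorEntry, SubscriptionErrorStats, report_error, clear_details, and reset are all hypothetical, and the "delivered" flag simply simulates a stats message that may be lost.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ErrorEntry:
    """Hypothetical per-subscription error entry (not the patch's struct)."""
    failure_count: int = 0
    last_error: Optional[str] = None  # details of the most recent failure
    last_xid: Optional[int] = None    # transaction that failed


class SubscriptionErrorStats:
    """Toy stand-in for the collector's per-subscription error entries."""

    def __init__(self) -> None:
        self.entries: Dict[str, ErrorEntry] = {}

    def report_error(self, subname: str, xid: int, message: str) -> None:
        # A new error overwrites the previous details and bumps the counter.
        entry = self.entries.setdefault(subname, ErrorEntry())
        entry.failure_count += 1
        entry.last_error = message
        entry.last_xid = xid

    def clear_details(self, subname: str, delivered: bool = True) -> None:
        # Sent after the transaction is skipped. If the message is lost,
        # the old details simply remain visible as the last error.
        if not delivered:
            return
        entry = self.entries.get(subname)
        if entry is not None:
            entry.last_error = None
            entry.last_xid = None  # failure_count is intentionally kept

    def reset(self, subname: str) -> None:
        # Assumed shape of a *_reset() function: drop the whole entry.
        self.entries.pop(subname, None)


stats = SubscriptionErrorStats()
stats.report_error("sub1", 100, "duplicate key")
stats.clear_details("sub1", delivered=False)  # clear message lost
stats.report_error("sub1", 200, "fk violation")  # overwrites stale details
stats.reset("sub1")  # user-initiated reset
```

The point is that a lost clear message only leaves stale details behind; they are replaced by the next error or removed by an explicit reset.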
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/