On Thu, Feb 24, 2022 at 7:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > ~~~
> >
> > 6. doc/src/sgml/monitoring.sgml - why two counters?
> >
> > Please forgive this noob question...
> >
> > I see there are 2 error_count columns (one for each kind of worker)
> > but I was wondering why it is useful for users to be able to
> > distinguish if the error came from the tablesync workers or from the
> > apply workers? Do you have any example?
> >
> > Also, IIRC sometimes the tablesync might actually do a few "apply"
> > changes itself... so the distinction may become a bit fuzzy...
>
> I think that the tablesync phase and the apply phase can fail for
> different reasons. So these values would be a good indicator for users
> to check if each phase works fine.
>
> After more thought, I think it's better to increment sync_error_count
> also when a tablesync worker fails while applying the changes.
>
This sounds reasonable to me because even if we are applying the
changes in the tablesync worker, it is only for that particular table.
So, it seems okay to count such a failure under the category with the
description: "Number of times the error occurred during the initial
table synchronization".
--
With Regards,
Amit Kapila.