Re: Design of pg_stat_subscription_workers vs pgstats - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Design of pg_stat_subscription_workers vs pgstats
Date
Msg-id CAD21AoAqEt=TbUDY96Uhf9pNuh-NDR_3-CByd+xR=tFEq==1NA@mail.gmail.com
Whole thread Raw
In response to Re: Design of pg_stat_subscription_workers vs pgstats  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Design of pg_stat_subscription_workers vs pgstats
List pgsql-hackers
On Wed, Feb 16, 2022 at 5:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Feb 15, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2022-02-03 13:33:08 +0900, Masahiko Sawada wrote:
> > > I see that important information such as error-XID that can be used
> > > for ALTER SUBSCRIPTION SKIP needs to be stored in a reliable way, and
> > > using system catalogs is a reasonable way for this purpose. But it's
> > > still unclear to me why all error information that is currently shown
> > > in pg_stat_subscription_workers view, including error-XID and the
> > > error message, relation OID, action, etc., need to be stored in the
> > > catalog. The information other than error-XID doesn't necessarily need
> > > to be reliable compared to error-XID. I think we can have
> > > error-XID/LSN in the pg_subscription catalog and have other error
> > > information in pg_stat_subscription_workers view. After the user
> > > checks the current status of logical replication by checking
> > > error-XID/LSN, they can check pg_stat_subscription_workers for
> > > details.
> >
> > The stats system isn't geared towards storing error messages and
> > such. Generally it is about storing counts of events etc, not about
> > information about a single event. Obviously there are a few cases where that
> > boundary isn't that clear...
> >
>
> True, in the beginning, we discussed this information to be stored in
> system catalog [1] (See .... Also, I am thinking that instead of a
> stat view, do we need to consider having a system table .. ) but later
> discussion led us to store this as stats.
>
> > IOW, storing information like:
> > - subscription oid
> > - retryable error count
> > - hard error count
> > - #replicated inserts
> > - #replicated updates
> > - #replicated deletes
> >
> > is what pgstats is for.
> >
>
> Some of this and similar ((like error count, last_error_time)) is
> present in stats and something on the lines of the other information
> is proposed in [2] (xacts successfully replicated (committed),
> aborted, etc) to be stored along with it.
>
> > But not
> > - subscription oid
> > - error message
> > - xid of error
> > - ...
> >
>
> I think from the current set of things we are capturing, the other
> thing in this list will be error_command (insert/update/delete..) and
> or probably error_code. So, we can keep this information in a system
> table.

Agreed. Also, we can have also commit-LSN or replace error-XID with error-LSN?

>
> Based on this discussion, it appears to me what we want here is to
> store the error info in some persistent way (system table) and the
> counters (both error and success) of logical replication in stats. If
> we can't achieve this work (separation) in the next few weeks (before
> the feature freeze) then I'll revert the work related to stats.

There was an idea to use shmem to store error info but it seems to be
better to use a system catalog to persist them.

I'll summarize changes we discussed and make a plan of changes and
feature designs toward the feature freeze (and v16). I think that once
we get a consensus on them we can start implementation and move it
forward.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Fix a comment in worker.c
Next
From: Amul Sul
Date:
Subject: Re: Proposal for internal Numeric to Uint64 conversion function.