RE: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Failed transaction statistics to measure the logical replication progress
Msg-id OSBPR01MB4888C88E1163DE2CEC5FED84EDEA9@OSBPR01MB4888.jpnprd01.prod.outlook.com
In response to Re: Failed transaction statistics to measure the logical replication progress  (Ajin Cherian <itsajin@gmail.com>)
List pgsql-hackers
On Tuesday, July 27, 2021 3:59 PM Ajin Cherian <itsajin@gmail.com> wrote:
> On Thu, Jul 8, 2021 at 4:55 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> 
> > Attached file is the POC patch for this.
> > Current design is to save failed stats data in the ReplicationSlot struct.
> > This is because after the error, I'm not able to access the ReorderBuffer
> > object.
> > Thus, I chose the object where I can interact with at the
> > ReplicationSlotRelease timing.
> 
> I think this is a good idea to capture the failed replication stats.
> But I'm wondering how you are deciding if the replication failed or not? Not all
> cases of ReplicationSlotRelease are due to a failure. It could also be due to a
> planned dropping of a subscription or disabling of a subscription. I have not tested
> this but won't the failed stats be updated in this case as well? Is that correct?
Yes, what you said is true. Currently, when I run DROP SUBSCRIPTION or
ALTER SUBSCRIPTION ... DISABLE, the failed stats values are unintentionally
added to pg_stat_replication_slots if any values remain in the slot.
This is because all of those commands, just like a subscriber apply failure
caused by a duplication error, make the publisher receive an 'X' message in
ProcessRepliesIfAny() and take the path that calls ReplicationSlotRelease().
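
As a rough illustration of the problem (a toy model only, not code from
the POC patch), the sketch below shows why the release path cannot tell a
failure from a planned command. ToySlot, ToyReleaseReason and
toy_slot_release() are hypothetical names made up for this example, and
the "leftover" counter is an assumption about what remains in the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-slot counters; not the real ReplicationSlot. */
typedef struct ToySlot
{
    int64_t leftover_txns;  /* assumed: stats still held when the slot is released */
    int64_t failed_txns;    /* what would be reported as failed */
} ToySlot;

/* Hypothetical reasons for which the walsender ends up releasing the slot. */
typedef enum ToyReleaseReason
{
    TOY_RELEASE_APPLY_ERROR,          /* e.g. duplication error on the subscriber */
    TOY_RELEASE_DROP_SUBSCRIPTION,
    TOY_RELEASE_DISABLE_SUBSCRIPTION,
    TOY_RELEASE_SERVER_STOP
} ToyReleaseReason;

/*
 * Models the behavior described above: the reason is not consulted at
 * release time, so leftover values are always folded into the failed stats.
 */
static void
toy_slot_release(ToySlot *slot, ToyReleaseReason reason)
{
    assert(slot != NULL);
    (void) reason;              /* deliberately ignored in this model */

    slot->failed_txns += slot->leftover_txns;
    slot->leftover_txns = 0;
}
```

In this model, calling toy_slot_release() with
TOY_RELEASE_DROP_SUBSCRIPTION bumps failed_txns exactly as
TOY_RELEASE_APPLY_ERROR does, which is the unintended behavior.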

Also, other events such as a server stop end up calling the same function,
so that after the server restarts, the failed stats values catch up with
the existing (successful) stats values.
Accordingly, I need to change the patch to handle those situations.
Thank you.


Best Regards,
    Takamichi Osumi

