RE: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Failed transaction statistics to measure the logical replication progress
Msg-id OSBPR01MB48881C48F510320E4B5F7DBDED149@OSBPR01MB4888.jpnprd01.prod.outlook.com
In response to Re: Failed transaction statistics to measure the logical replication progress  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
On Tuesday, July 13, 2021 2:50 PM vignesh C <vignesh21@gmail.com> wrote:
> > When the current HEAD fails during logical decoding, the failure
> > increments txns count in pg_stat_replication_slots - [1] and adds the
> > transaction size to the sum of bytes in the same repeatedly on the
> > publisher, until the problem is solved.
> > One of the good examples is duplication error on the subscriber side
> > and this applies to both streaming and spill cases as well.
> >
> > This update prevents users from grasping the exact number and size of
> > successful and unsuccessful transactions. Accordingly, we need to have
> > new columns of failed transactions that will work to differentiate
> > both of them for all types, which means spill, streaming and normal
> > transactions. This will help users to measure the exact status of
> > logical replication.
> >
> > Attached file is the POC patch for this.
> > Current design is to save failed stats data in the ReplicationSlot struct.
> > This is because after the error, I'm not able to access the ReorderBuffer
> > object. Thus, I chose the object where I can interact with at the
> > ReplicationSlotRelease timing.
> > Any ideas and comments are welcome.
...
> +1 for having logical replication failed statistics. Currently if
> there is any transaction failure in the subscriber after sending the decoded
> data to the subscriber like constraint violation, object not exist, the statistics
> will include the failed decoded transaction info and there is no way to identify
> the actual successful transaction data. This patch will help in measuring the
> actual decoded transaction data.
Yes, this improvement also applies to other error cases, such as the
constraint violations and missing objects you mention.
Thank you for sharing ideas that strengthen the case for this enhancement.

Best Regards,
    Takamichi Osumi

