Re: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Failed transaction statistics to measure the logical replication progress |
Date | |
Msg-id | CALDaNm0=J+xhZ=qJ+Otc36O1E-EfO_9_--1Ed__zYCv+a58Uig@mail.gmail.com Whole thread Raw |
In response to | RE: Failed transaction statistics to measure the logical replication progress ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>) |
Responses |
RE: Failed transaction statistics to measure the logical replication progress
|
List | pgsql-hackers |
On Tue, Dec 7, 2021 at 3:12 PM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Monday, December 6, 2021 11:27 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the updated patch, few comments: > Thank you for your review ! > > > 1) We can keep the documentation similar to mention the count includes both > > table sync worker / main apply worker in case of commit_count/error_count > > and abort_count to keep it consistent. > > + <structfield>commit_count</structfield> <type>bigint</type> > > + </para> > > + <para> > > + Number of transactions successfully applied in this subscription. > > + COMMIT and COMMIT PREPARED increments this counter. > > + </para></entry> > > + </row> > > + > > + <row> > > + <entry role="catalog_table_entry"><para role="column_definition"> > > + <structfield>error_count</structfield> <type>bigint</type> > > + </para> > > + <para> > > + Number of transactions that failed to be applied by the table > > + sync worker or main apply worker in this subscription. > > + </para></entry> > > + </row> > > + > > + <row> > > + <entry role="catalog_table_entry"><para role="column_definition"> > > + <structfield>abort_count</structfield> <type>bigint</type> > > + </para> > > + <para> > > + Number of transactions aborted in this subscription. > > + ROLLBACK PREPARED increments this counter. > > + </para></entry> > > + </row> > Yeah, you are right. Fixed. > Note that abort_count is not used by table sync worker. > > > > 2) Can this be changed: > > + /* > > + * If this is a new error reported by table sync worker, > > consolidate this > > + * error count into the entry of apply worker. > > + */ > > + if (OidIsValid(msg->m_subrelid)) > > + { > > + /* Gain the apply worker stats */ > > + subwentry = pgstat_get_subworker_entry(dbentry, > > + msg->m_subid, > > + > > InvalidOid, true); > > + subwentry->error_count++; > > + } > > + else > > + subwentry->error_count++; /* increment the apply > > worker's counter. */ > > To: > > + /* > > + * If this is a new error reported by table sync worker, > > consolidate this > > + * error count into the entry of apply worker. > > + */ > > + if (OidIsValid(msg->m_subrelid)) > > + /* Gain the apply worker stats */ > > + subwentry = pgstat_get_subworker_entry(dbentry, > > + msg->m_subid, > > + > > InvalidOid, true); > > + > > + subwentry->error_count++; /* increment the apply > > worker's counter. */ > Your suggestion looks better. > Also, I fixed some comments of this part > so that we don't need to add a separate comment at the bottom > for the increment of the apply worker. > > > > 3) Since both 026_worker_stats and 027_worker_xact_stats.pl are testing > > pg_stat_subscription_workers, can we move the tests to 026_worker_stats.pl. > > If possible the error_count validation can be combined with the existing tests. > > diff --git a/src/test/subscription/t/027_worker_xact_stats.pl > > b/src/test/subscription/t/027_worker_xact_stats.pl > > new file mode 100644 > > index 0000000..31dbea1 > > --- /dev/null > > +++ b/src/test/subscription/t/027_worker_xact_stats.pl > > @@ -0,0 +1,162 @@ > > + > > +# Copyright (c) 2021, PostgreSQL Global Development Group > > + > > +# Tests for subscription worker statistics during apply. > > +use strict; > > +use warnings; > > +use PostgreSQL::Test::Cluster; > > +use PostgreSQL::Test::Utils; > > +use Test::More tests => 1; > > + > > +# Create publisher node > Right. I've integrated my tests with 026_worker_stats.pl. > I think error_count validations are combined as you suggested. > Another change I did is to introduce one function > to contribute to better readability of the stats tests. > > Here, the 026_worker_stats.pl didn't look aligned by > pgperltidy. This is not a serious issue at all. > Yet, when I ran pgperltidy, the existing codes > that required adjustments came into my patch. > Therefore, I made a separate part for this. Thanks for the updated patch, few comments: 1) Can we change this: /* + * Report the success of table sync as one commit to consolidate all + * transaction stats into one record. + */ + pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid, + LOGICAL_REP_MSG_COMMIT); + To: /* Report the success of table sync */ + pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid, + LOGICAL_REP_MSG_COMMIT); + 2) Typo: ealier should be earlier + /* + * Report ealier than the call of process_syncing_tables() not to miss an + * increment of commit_count in case it leads to the process exit. See + * process_syncing_tables_for_apply(). + */ + pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid, + LOGICAL_REP_MSG_COMMIT); + 3) Should we add an Assert for subwentry: + /* + * If this is a new error reported by table sync worker, consolidate this + * error count into the entry of apply worker, by swapping the stats + * entries. + */ + if (OidIsValid(msg->m_subrelid)) + subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid, + InvalidOid, true); + subwentry->error_count++; 4) Can we slightly change it to :We can change it: +# Check the update of stats counters. +confirm_transaction_stats_update( + $node_subscriber, + 'commit_count = 1', + 'the commit_count increment by table sync'); + +confirm_transaction_stats_update( + $node_subscriber, + 'error_count = 1', + 'the error_count increment by table sync'); to: +# Check updation of subscription worker transaction count statistics. +confirm_transaction_stats_update( + $node_subscriber, + 'commit_count = 1', + 'check table sync worker commit count is updated'); + +confirm_transaction_stats_update( + $node_subscriber, + 'error_count = 1', + 'check table sync worker error count is updated'); Regards, Vignesh
pgsql-hackers by date: