Re: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers

From vignesh C
Subject Re: Failed transaction statistics to measure the logical replication progress
Date
Msg-id CALDaNm0=J+xhZ=qJ+Otc36O1E-EfO_9_--1Ed__zYCv+a58Uig@mail.gmail.com
Whole thread Raw
In response to RE: Failed transaction statistics to measure the logical replication progress  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Failed transaction statistics to measure the logical replication progress
List pgsql-hackers
On Tue, Dec 7, 2021 at 3:12 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Monday, December 6, 2021 11:27 PM vignesh C <vignesh21@gmail.com> wrote:
> > Thanks for the updated patch, few comments:
> Thank you for your review !
>
> > 1) We can keep the documentation similar to mention the count includes both
> > table sync worker / main apply worker in case of commit_count/error_count
> > and abort_count to keep it consistent.
> > +       <structfield>commit_count</structfield> <type>bigint</type>
> > +      </para>
> > +      <para>
> > +       Number of transactions successfully applied in this subscription.
> > +       COMMIT and COMMIT PREPARED increments this counter.
> > +      </para></entry>
> > +     </row>
> > +
> > +     <row>
> > +      <entry role="catalog_table_entry"><para role="column_definition">
> > +       <structfield>error_count</structfield> <type>bigint</type>
> > +      </para>
> > +      <para>
> > +       Number of transactions that failed to be applied by the table
> > +       sync worker or main apply worker in this subscription.
> > +       </para></entry>
> > +     </row>
> > +
> > +     <row>
> > +      <entry role="catalog_table_entry"><para role="column_definition">
> > +       <structfield>abort_count</structfield> <type>bigint</type>
> > +      </para>
> > +      <para>
> > +       Number of transactions aborted in this subscription.
> > +       ROLLBACK PREPARED increments this counter.
> > +      </para></entry>
> > +     </row>
> Yeah, you are right. Fixed.
> Note that abort_count is not used by table sync worker.
>
>
> > 2) Can this be changed:
> > +       /*
> > +        * If this is a new error reported by table sync worker,
> > consolidate this
> > +        * error count into the entry of apply worker.
> > +        */
> > +       if (OidIsValid(msg->m_subrelid))
> > +       {
> > +               /* Gain the apply worker stats */
> > +               subwentry = pgstat_get_subworker_entry(dbentry,
> > + msg->m_subid,
> > +
> >                     InvalidOid, true);
> > +               subwentry->error_count++;
> > +       }
> > +       else
> > +               subwentry->error_count++;       /* increment the apply
> > worker's counter. */
> > To:
> > +       /*
> > +        * If this is a new error reported by table sync worker,
> > consolidate this
> > +        * error count into the entry of apply worker.
> > +        */
> > +       if (OidIsValid(msg->m_subrelid))
> > +               /* Gain the apply worker stats */
> > +               subwentry = pgstat_get_subworker_entry(dbentry,
> > + msg->m_subid,
> > +
> >                     InvalidOid, true);
> > +
> > +       subwentry->error_count++;       /* increment the apply
> > worker's counter. */
> Your suggestion looks better.
> Also, I fixed some comments of this part
> so that we don't need to add a separate comment at the bottom
> for the increment of the apply worker.
>
>
> > 3) Since both 026_worker_stats and 027_worker_xact_stats.pl are testing
> > pg_stat_subscription_workers, can we move the tests to 026_worker_stats.pl.
> > If possible the error_count validation can be combined with the existing tests.
> > diff --git a/src/test/subscription/t/027_worker_xact_stats.pl
> > b/src/test/subscription/t/027_worker_xact_stats.pl
> > new file mode 100644
> > index 0000000..31dbea1
> > --- /dev/null
> > +++ b/src/test/subscription/t/027_worker_xact_stats.pl
> > @@ -0,0 +1,162 @@
> > +
> > +# Copyright (c) 2021, PostgreSQL Global Development Group
> > +
> > +# Tests for subscription worker statistics during apply.
> > +use strict;
> > +use warnings;
> > +use PostgreSQL::Test::Cluster;
> > +use PostgreSQL::Test::Utils;
> > +use Test::More tests => 1;
> > +
> > +# Create publisher node
> Right. I've integrated my tests with 026_worker_stats.pl.
> I think error_count validations are combined as you suggested.
> Another change I did is to introduce one function
> to contribute to better readability of the stats tests.
>
> Here, the 026_worker_stats.pl didn't look aligned by
> pgperltidy. This is not a serious issue at all.
> Yet, when I ran pgperltidy, the existing codes
> that required adjustments came into my patch.
> Therefore, I made a separate part for this.

Thanks for the updated patch, few comments:
1) Can we change this:
        /*
+        * Report the success of table sync as one commit to consolidate all
+        * transaction stats into one record.
+        */
+       pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
+
  LOGICAL_REP_MSG_COMMIT);
+
To:
        /* Report the success of table sync */
+       pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
+
  LOGICAL_REP_MSG_COMMIT);
+

2) Typo: ealier should be earlier
+       /*
+        * Report ealier than the call of process_syncing_tables() not
to miss an
+        * increment of commit_count in case it leads to the process exit. See
+        * process_syncing_tables_for_apply().
+        */
+       pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
+
  LOGICAL_REP_MSG_COMMIT);
+

3) Should we add an Assert for subwentry:
+       /*
+        * If this is a new error reported by table sync worker,
consolidate this
+        * error count into the entry of apply worker, by swapping the stats
+        * entries.
+        */
+       if (OidIsValid(msg->m_subrelid))
+               subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+
                    InvalidOid, true);
+       subwentry->error_count++;

4) Can we slightly change it to :We can change it:
+# Check the update of stats counters.
+confirm_transaction_stats_update(
+       $node_subscriber,
+       'commit_count = 1',
+       'the commit_count increment by table sync');
+
+confirm_transaction_stats_update(
+       $node_subscriber,
+       'error_count = 1',
+       'the error_count increment by table sync');
to:
+# Check updation of subscription worker transaction count statistics.
+confirm_transaction_stats_update(
+       $node_subscriber,
+       'commit_count = 1',
+       'check table sync worker commit count is updated');
+
+confirm_transaction_stats_update(
+       $node_subscriber,
+       'error_count = 1',
+       'check table sync worker error count is updated');

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: speed up verifying UTF-8
Next
From: Tom Lane
Date:
Subject: Re: Remove pg_strtouint64(), use strtoull() directly