Re: Failed transaction statistics to measure the logical replication progress - Mailing list pgsql-hackers

From	vignesh C
Subject	Re: Failed transaction statistics to measure the logical replication progress
Date	December 3, 2021 06:11:32
Msg-id	CALDaNm32HHjdwmuoF+Nw5CU70r819Kq+Tmat3bzkkvSv_2u=gA@mail.gmail.com
In response to	RE: Failed transaction statistics to measure the logical replication progress ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses	RE: Failed transaction statistics to measure the logical replication progress
List	pgsql-hackers

On Wed, Dec 1, 2021 at 3:04 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Friday, November 19, 2021 11:11 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Besides that, I’m not sure how useful commit_bytes, abort_bytes, and
> > error_bytes are. I originally thought these statistics track the size of received
> > data, i.e., how much data is transferred from the publisher and processed on
> > the subscriber. But what the view currently has is how much memory is used in
> > the subscription worker. The subscription worker emulates
> > ReorderBufferChangeSize() on the subscriber side but, as the comment of
> > update_apply_change_size() mentions, the size in the view is not accurate:
> ...
> > I guess that the purpose of these values is to compare them to total_bytes,
> > stream_byte, and spill_bytes but if the calculation is not accurate, does it mean
> > that the more stats are updated, the more the stats will be getting inaccurate?
> Thanks for your comment !
>
> I tried to solve your concerns about byte columns but there are really difficult issues to solve.
> For example, to begin with the messages of apply worker are different from those of
> reorder buffer.
>
> Therefore, I decided to split the previous patch and make counter columns go first.
> v14 was checked by pgperltidy and pgindent.
>
> This patch can be applied to the PG whose commit id is after 8d74fc9 (introduction of
> pg_stat_subscription_workers).

Thanks for the updated patch.
Currently we store commit_count, error_count and abort_count
separately for each table of the table sync operation. If we have
thousands of tables, we will be storing this information for each of
those tables. Shouldn't we store the consolidated information in this
case?
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index f07983a..02e9486 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1149,6 +1149,11 @@ copy_table_done:
        MyLogicalRepWorker->relstate_lsn = *origin_startpos;
        SpinLockRelease(&MyLogicalRepWorker->relmutex);

+       /* Report the success of table sync. */
+       pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
+                                        MyLogicalRepWorker->relid,
+                                        0 /* no logical message type */ );

postgres=# select * from pg_stat_subscription_workers ;
 subid | subname | subrelid | commit_count | error_count | abort_count | last_error_relid | last_error_command | last_error_xid | last_error_count | last_error_message | last_error_time
-------+---------+----------+--------------+-------------+-------------+------------------+--------------------+----------------+------------------+--------------------+-----------------
 16411 | sub1    |    16387 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16396 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16390 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16393 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16402 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16408 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16384 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16399 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
 16411 | sub1    |    16405 |            1 |           0 |           0 |                  |                    |                |                0 |                    |
(9 rows)
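
To make the consolidation idea concrete, here is a minimal standalone sketch
of folding the per-table entries into one entry per subscription. It is not
taken from the patch: the struct names (TableSyncStats, SubscriptionStats)
and the function consolidate_tablesync_stats() are hypothetical and only
mirror the counter columns shown in the view output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-table entry: one row of the view output, counters only. */
typedef struct TableSyncStats
{
    uint32_t subid;         /* subscription OID */
    uint32_t subrelid;      /* OID of the table being synchronized */
    int64_t  commit_count;
    int64_t  error_count;
    int64_t  abort_count;
} TableSyncStats;

/* Hypothetical consolidated entry: one per subscription. */
typedef struct SubscriptionStats
{
    uint32_t subid;
    int64_t  commit_count;
    int64_t  error_count;
    int64_t  abort_count;
} SubscriptionStats;

/*
 * Sum the counters of every per-table entry that belongs to the given
 * subscription, so only one entry has to be kept per subscription.
 */
SubscriptionStats
consolidate_tablesync_stats(const TableSyncStats *rows, size_t nrows,
                            uint32_t subid)
{
    SubscriptionStats total = {subid, 0, 0, 0};

    assert(rows != NULL || nrows == 0);

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].subid != subid)
            continue;           /* entry belongs to another subscription */

        total.commit_count += rows[i].commit_count;
        total.error_count += rows[i].error_count;
        total.abort_count += rows[i].abort_count;
    }

    return total;
}
```

Applied to the nine rows above, this would leave a single entry for subid
16411 with commit_count = 9, error_count = 0 and abort_count = 0.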

Regards,
Vignesh
