Inconsistent increment of pg_stat_database.xact_rollback with logical replication - Mailing list pgsql-bugs

From Rafael Thofehrn Castro
Subject Inconsistent increment of pg_stat_database.xact_rollback with logical replication
Date
Msg-id CAG0ozMo_xWQn+Avv8jzbbhePGp5OnhdO+YWTkdg4faWSXz0Jzg@mail.gmail.com
List pgsql-bugs
The xact_rollback column of pg_stat_database is incremented inconsistently on the publisher side when logical replication is in use.

This can be easily reproduced with the latest code from the master branch:

- Publisher

postgres=# select xact_commit, xact_rollback from pg_stat_database where datname = 'postgres';
-[ RECORD 1 ]-+---
xact_commit   | 20
xact_rollback | 0

postgres=# insert into t1 values (1);
INSERT 0 1
postgres=# insert into t1 values (2);
INSERT 0 1
postgres=# insert into t1 values (3);
INSERT 0 1
postgres=# insert into t1 values (4);
INSERT 0 1
postgres=# insert into t1 values (5);
INSERT 0 1
postgres=# insert into t1 values (6);
INSERT 0 1
postgres=# insert into t1 values (7);
INSERT 0 1
postgres=# insert into t1 values (8);
INSERT 0 1
postgres=# insert into t1 values (9);
INSERT 0 1
postgres=# insert into t1 values (10);
INSERT 0 1

postgres=# select xact_commit, xact_rollback from pg_stat_database where datname = 'postgres';
-[ RECORD 1 ]-+---
xact_commit   | 33
xact_rollback | 0


- Subscriber


postgres=# alter subscription sub disable;
ALTER SUBSCRIPTION


- Publisher


postgres=# select xact_commit, xact_rollback from pg_stat_database where datname = 'postgres';
-[ RECORD 1 ]-+---
xact_commit   | 36
xact_rollback | 10


What seems to be happening is that the number of transactions decoded by the walsender is being added to pg_stat_database.xact_rollback. However, these counts are only flushed to the global stats when the walsender terminates, which is why xact_rollback jumped from 0 to 10 only after the subscription was disabled.


From a quick look at the source, I suspect that the issue starts here: https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/reorderbuffer.c#L2545

All decoded transactions are aborted for cleanup purposes. Following the source code flow after the call to AbortCurrentTransaction(), we eventually reach the code that increments the rollback stats here: https://github.com/postgres/postgres/blob/master/src/backend/utils/activity/pgstat_database.c#L249
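
To make the suspected flow easier to follow, here is a toy model in Python. It is not PostgreSQL code: the class and method names (SharedDbStats, ToyWalsender, decode_transaction, terminate) are hypothetical, and the model simply encodes the two assumptions made in this report, namely that each decoded transaction is counted as a rollback and that the count only becomes visible when the walsender exits.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SharedDbStats:
    """Stand-in for the shared counters exposed through pg_stat_database."""
    xact_commit: int = 0
    xact_rollback: int = 0


class ToyWalsender:
    """Hypothetical model of a walsender that decodes transactions."""

    def __init__(self, shared: SharedDbStats) -> None:
        self.shared = shared
        self.pending_rollback = 0  # backend-local, not yet visible to readers

    def decode_transaction(self) -> None:
        # Assumption from the report: every decoded transaction is aborted
        # for cleanup, and that abort is counted as a rollback.
        self.pending_rollback += 1

    def terminate(self) -> None:
        # Assumption from the report: pending counts reach the shared stats
        # only when the walsender exits.
        self.shared.xact_rollback += self.pending_rollback
        self.pending_rollback = 0


def simulate(decoded: int) -> Tuple[int, int]:
    """Return xact_rollback as seen before and after the walsender exits."""
    shared = SharedDbStats()
    sender = ToyWalsender(shared)
    for _ in range(decoded):
        sender.decode_transaction()
    before_exit = shared.xact_rollback
    sender.terminate()
    return before_exit, shared.xact_rollback
```

With 10 decoded transactions, the model shows 0 rollbacks while the walsender is running and 10 once it has exited, which matches the counters observed on the publisher above.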

This causes inconsistencies when monitoring the TPS metric of a database: we occasionally see sudden TPS spikes on the order of millions.
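
As an illustration of the effect on monitoring, the sketch below computes per-interval rates from successive samples of the two counters, the way a monitoring tool might. The function name interval_rates and the 10-second sampling interval are assumptions made for this example; only the counter values (20/0, 33/0, 36/10) come from the session above.

```python
from typing import List, Tuple

# (timestamp in seconds, xact_commit, xact_rollback)
Sample = Tuple[float, int, int]


def interval_rates(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Return (commits/s, rollbacks/s) for each pair of consecutive samples."""
    rates = []
    for (t0, c0, r0), (t1, c1, r1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append(((c1 - c0) / elapsed, (r1 - r0) / elapsed))
    return rates


# Counter values from the report; the timestamps are made up.
samples = [(0.0, 20, 0), (10.0, 33, 0), (20.0, 36, 10)]
rates = interval_rates(samples)
```

In this example all 10 rollbacks are attributed to the second interval, where the walsender exited, although the transactions were decoded during the first one. A tool that derives TPS from xact_commit + xact_rollback would therefore report the whole backlog as a burst at the moment of the flush, and the longer the walsender has been running, the larger that burst can be.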

Regards,

Rafael Castro.
