Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAJcOf-fg3Bwahf6pW+t32Ts1FMq3xgMw8r2nTP8hGBTsZmLdWg@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

I have some initial feedback on the v12-0001 patch.
Most of these are suggested improvements to wording, and some typo fixes.


(0) Patch comment

Suggestion to improve the patch comment:

BEFORE:
Add pg_stat_subscription_errors statistics view.

This commits adds new system view pg_stat_logical_replication_error,
showing errors happening during applying logical replication changes
as well as during performing initial table synchronization.

The subscription error entries are removed by autovacuum workers when
the table synchronization competed in table sync worker cases and when
dropping the subscription in apply worker cases.

It also adds SQL function pg_stat_reset_subscription_error() to
reset the single subscription error.

AFTER:
Add a subscription errors statistics view "pg_stat_subscription_errors".

This commits adds a new system view pg_stat_logical_replication_errors,
that records information about any errors which occur during application
of logical replication changes as well as during performing initial table
synchronization.

The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.

It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.



doc/src/sgml/monitoring.sgml:

(1)
BEFORE:
+      <entry>One row per error that happened on subscription, showing
information about
+       the subscription errors.
AFTER:
+      <entry>One row per error that occurred on subscription,
providing information about
+       each subscription error.

(2)
BEFORE:
+   The <structname>pg_stat_subscription_errors</structname> view will
contain one
AFTER:
+   The <structname>pg_stat_subscription_errors</structname> view contains one


(3)
BEFORE:
+        Name of the database in which the subscription is created.
AFTER:
+        Name of the database in which the subscription was created.


(4)
BEFORE:
+       OID of the relation that the worker is processing when the
+       error happened.
AFTER:
+       OID of the relation that the worker was processing when the
+       error occurred.


(5)
BEFORE:
+        Name of command being applied when the error happened.  This
+        field is always NULL if the error is reported by
+        <literal>tablesync</literal> worker.
AFTER:
+        Name of command being applied when the error occurred.  This
+        field is always NULL if the error is reported by a
+        <literal>tablesync</literal> worker.

(6)
BEFORE:
+        Transaction ID of publisher node being applied when the error
+        happened.  This field is always NULL if the error is reported
+        by <literal>tablesync</literal> worker.
AFTER:
+        Transaction ID of the publisher node being applied when the error
+        happened.  This field is always NULL if the error is reported
+        by a <literal>tablesync</literal> worker.

(7)
BEFORE:
+        Type of worker reported the error: <literal>apply</literal> or
+        <literal>tablesync</literal>.
AFTER:
+        Type of worker reporting the error: <literal>apply</literal> or
+        <literal>tablesync</literal>.


(8)
BEFORE:
+       Number of times error happened on the worker.
AFTER:
+       Number of times the error occurred in the worker.

[or "Number of times the worker reported the error" ?]


(9)
BEFORE:
+       Time at which the last error happened.
AFTER:
+       Time at which the last error occurred.

(10)
BEFORE:
+       Error message which is reported last failure time.
AFTER:
+       Error message which was reported at the last failure time.

Maybe this should just say "Last reported error message" ?


(11)
You shouldn't call hash_get_num_entries() on a NULL pointer.

Suggest swappling lines, as shown below:

BEFORE:
+ int32 nerrors = hash_get_num_entries(subent->suberrors);
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+    continue;
AFTER:
+ int32 nerrors;
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+    continue;
+ nerrors = hash_get_num_entries(subent->suberrors);


(12)
Typo:  legnth -> length

+ * contains the fixed-legnth error message string which is



src/backend/postmaster/pgstat.c

(13)
"Subscription stat entries" hashtable is created in two different
places, one with HASH_CONTEXT and the other without. Is this
intentional?
Shouldn't there be a single function for creating this?


(14)
+ * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.

Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscriptions." ?

(15)
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
+ * errors.

Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscription errors." ?


Regards,
Greg Nancarrow
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Next
From: Fujii Masao
Date:
Subject: Re: corruption of WAL page header is never reported