Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoCu+VoSLKqKmVhZToXn5+84k_skSZRAQRXRAN+ObuZfaA@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Thu, Sep 2, 2021 at 12:06 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>
> I have some initial feedback on the v12-0001 patch.
> Most of these are suggested improvements to wording, and some typo fixes.

Thank you for the comments!

>
>
> (0) Patch comment
>
> Suggestion to improve the patch comment:
>
> BEFORE:
> Add pg_stat_subscription_errors statistics view.
>
> This commits adds new system view pg_stat_logical_replication_error,

Oops, I realized that it should be pg_stat_subscription_errors.

> showing errors happening during applying logical replication changes
> as well as during performing initial table synchronization.
>
> The subscription error entries are removed by autovacuum workers when
> the table synchronization competed in table sync worker cases and when
> dropping the subscription in apply worker cases.
>
> It also adds SQL function pg_stat_reset_subscription_error() to
> reset the single subscription error.
>
> AFTER:
> Add a subscription errors statistics view "pg_stat_subscription_errors".
>
> This commits adds a new system view pg_stat_logical_replication_errors,
> that records information about any errors which occur during application
> of logical replication changes as well as during performing initial table
> synchronization.

I think that views don't have any data so "show information" seems
appropriate to me here. Thoughts?

>
> The subscription error entries are removed by autovacuum workers after
> table synchronization completes in table sync worker cases and after
> dropping the subscription in apply worker cases.
>
> It also adds an SQL function pg_stat_reset_subscription_error() to
> reset a single subscription error.
>
>
>
> doc/src/sgml/monitoring.sgml:
>
> (1)
> BEFORE:
> +      <entry>One row per error that happened on subscription, showing
> information about
> +       the subscription errors.
> AFTER:
> +      <entry>One row per error that occurred on subscription,
> providing information about
> +       each subscription error.

Fixed.

>
> (2)
> BEFORE:
> +   The <structname>pg_stat_subscription_errors</structname> view will
> contain one
> AFTER:
> +   The <structname>pg_stat_subscription_errors</structname> view contains one
>

I think that descriptions of other statistics view also say "XXX view
will contain ...".

>
> (3)
> BEFORE:
> +        Name of the database in which the subscription is created.
> AFTER:
> +        Name of the database in which the subscription was created.

Fixed.

>
> (4)
> BEFORE:
> +       OID of the relation that the worker is processing when the
> +       error happened.
> AFTER:
> +       OID of the relation that the worker was processing when the
> +       error occurred.
>

Fixed.

>
> (5)
> BEFORE:
> +        Name of command being applied when the error happened.  This
> +        field is always NULL if the error is reported by
> +        <literal>tablesync</literal> worker.
> AFTER:
> +        Name of command being applied when the error occurred.  This
> +        field is always NULL if the error is reported by a
> +        <literal>tablesync</literal> worker.

Fixed.

> (6)
> BEFORE:
> +        Transaction ID of publisher node being applied when the error
> +        happened.  This field is always NULL if the error is reported
> +        by <literal>tablesync</literal> worker.
> AFTER:
> +        Transaction ID of the publisher node being applied when the error
> +        happened.  This field is always NULL if the error is reported
> +        by a <literal>tablesync</literal> worker.

Fixed.

> (7)
> BEFORE:
> +        Type of worker reported the error: <literal>apply</literal> or
> +        <literal>tablesync</literal>.
> AFTER:
> +        Type of worker reporting the error: <literal>apply</literal> or
> +        <literal>tablesync</literal>.

Fixed.

>
> (8)
> BEFORE:
> +       Number of times error happened on the worker.
> AFTER:
> +       Number of times the error occurred in the worker.
>
> [or "Number of times the worker reported the error" ?]

I prefer "Number of times the error occurred in the worker."

>
> (9)
> BEFORE:
> +       Time at which the last error happened.
> AFTER:
> +       Time at which the last error occurred.

Fixed.

>
> (10)
> BEFORE:
> +       Error message which is reported last failure time.
> AFTER:
> +       Error message which was reported at the last failure time.
>
> Maybe this should just say "Last reported error message" ?

Fixed.

>
>
> (11)
> You shouldn't call hash_get_num_entries() on a NULL pointer.
>
> Suggest swappling lines, as shown below:
>
> BEFORE:
> + int32 nerrors = hash_get_num_entries(subent->suberrors);
> +
> + /* Skip this subscription if not have any errors */
> + if (subent->suberrors == NULL)
> +    continue;
> AFTER:
> + int32 nerrors;
> +
> + /* Skip this subscription if not have any errors */
> + if (subent->suberrors == NULL)
> +    continue;
> + nerrors = hash_get_num_entries(subent->suberrors);

Right. Fixed.

>
>
> (12)
> Typo:  legnth -> length
>
> + * contains the fixed-legnth error message string which is

Fixed.

>
>
> src/backend/postmaster/pgstat.c
>
> (13)
> "Subscription stat entries" hashtable is created in two different
> places, one with HASH_CONTEXT and the other without. Is this
> intentional?
> Shouldn't there be a single function for creating this?

Yes, it's intentional. It's consistent with hash tables for other statistics.

>
>
> (14)
> + * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.
>
> Seems to be missing a word, is it meant to say "Sent by the autovacuum
> to purge the subscriptions." ?

Yes, fixed.

>
> (15)
> + * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
> + * errors.
>
> Seems to be missing a word, is it meant to say "Sent by the autovacuum
> to purge the subscription errors." ?

Thanks, fixed.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: stat() vs ERROR_DELETE_PENDING, round N + 1
Next
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side