Re: Design of pg_stat_subscription_workers vs pgstats - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Design of pg_stat_subscription_workers vs pgstats
Date
Msg-id CAD21AoB_yY6veu5h4mgdCzH64KY4Y9Ln2L2r7wmurV1+y+ig4w@mail.gmail.com
Whole thread Raw
In response to Re: Design of pg_stat_subscription_workers vs pgstats  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: Design of pg_stat_subscription_workers vs pgstats  (Peter Smith <smithpb2250@gmail.com>)
Re: Design of pg_stat_subscription_workers vs pgstats  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Design of pg_stat_subscription_workers vs pgstats  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

Thank you for the comments!

On Thu, Feb 24, 2022 at 4:20 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Thu, Feb 24, 2022 9:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for the comments! I've attached the latest version patch
> > that incorporated all comments I got so far. The primary change from
> > the previous version is that the subscription statistics live globally
> > rather than per-database.
> >
>
> Thanks for updating the patch.
>
> Few comments:
>
> 1.
> I think we should add some doc for column stats_reset in pg_stat_subscription_activity view.

Added.

>
> 2.
> +CREATE VIEW pg_stat_subscription_activity AS
>      SELECT
> -        w.subid,
> +        a.subid,
>          s.subname,
> ...
> +        a.apply_error_count,
> +        a.sync_error_count,
> +       a.stats_reset
> +    FROM pg_subscription as s,
> +        pg_stat_get_subscription_activity(oid) as a;
>
> The line "a.stats_reset" uses a Tab, and we'd better use spaces here.

Fixed.

On Thu, Feb 24, 2022 at 5:54 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi. Below are my review comments for the v2 patch.
>
> ======
>
> 1. Commit message
>
> This patch changes the pg_stat_subscription_workers view (introduced
> by commit 8d74fc9) so that it stores only statistics counters:
> apply_error_count and sync_error_count, and has one entry for
> subscription.
>
> SUGGESTION
> "and has one entry for subscription." --> "and has one entry for each
> subscription."

Fixed.

>
> ~~~
>
> 2. Commit message
>
> After removing these error details, there are no longer relation
> information, so the subscription statistics are now a cluster-wide
> statistics.
>
> SUGGESTION
> "there are no longer relation information," --> "there is no longer
> any relation information,"

Fixed.

>
> ~~~
>
> 3. doc/src/sgml/monitoring.sgml
>
> -      <para>
> -       The error message
> +       Number of times the error occurred during the application of changes
>        </para></entry>
>       </row>
>
>       <row>
>        <entry role="catalog_table_entry"><para role="column_definition">
> -       <structfield>last_error_time</structfield> <type>timestamp
> with time zone</type>
> +       <structfield>sync_error_count</structfield> <type>bigint</type>
>        </para>
>        <para>
> -       Last time at which this error occurred
> +       Number of times the error occurred during the initial table
> +       synchronization
>        </para></entry>
>
> SUGGESTION (both places)
> "Number of times the error occurred ..." --> "Number of times an error
> occurred ..."

Fixed.

>
> ~~~
>
> 4. doc/src/sgml/monitoring.sgml - missing column
>
> (duplicate - also reported by [Tang-v2])
>
> The PG docs for the new "stats_reset" column are missing.
>
> ~~~
>
> 5. src/backend/catalog/system_views.sql - whitespace
>
> (duplicate - also reported by [Tang-v2])
>
> -          JOIN pg_subscription s ON (w.subid = s.oid);
> +        a.apply_error_count,
> +        a.sync_error_count,
> + a.stats_reset
> +    FROM pg_subscription as s,
> +        pg_stat_get_subscription_activity(oid) as a;
>
> inconsistent tab/space indenting for 'a.stats_reset'.
>
> ~~~
>
> 6. src/backend/postmaster/pgstat.c - function name
>
> +/* ----------
> + * pgstat_reset_subscription_counter() -
> + *
> + * Tell the statistics collector to reset a single subscription
> + * counter, or all subscription counters (when subid is InvalidOid).
> + *
> + * Permission checking for this function is managed through the normal
> + * GRANT system.
> + * ----------
> + */
> +void
> +pgstat_reset_subscription_counter(Oid subid)
>
> SUGGESTION (function name)
> "pgstat_reset_subscription_counter" -->
> "pgstat_reset_subscription_counters" (plural)

Fixed.

>
> ~~
>
> 7. src/backend/postmaster/pgstat.c - pgstat_recv_resetsubcounter
>
> @@ -5645,6 +5598,51 @@
> pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
>   }
>  }
>
> +/* ----------
> + * pgstat_recv_resetsubcounter() -
> + *
> + * Reset some subscription statistics of the cluster.
> + * ----------
> + */
> +static void
> +pgstat_recv_resetsubcounter(PgStat_MsgResetsubcounter *msg, int len)
>
>
> "Reset some" seems a bit vague. Why not describe that it is all or
> none according to the msg->m_subid?

I think it reset none, one, or all statistics, actually. Given other
pgstat_recv_reset* functions also have similar comments, I think we
can use it rather than elaborating.

>
> ~~~
>
> 8. src/backend/postmaster/pgstat.c - pgstat_recv_resetsubcounter
>
> + if (!OidIsValid(msg->m_subid))
> + {
> + HASH_SEQ_STATUS sstat;
> +
> + /* Clear all subscription counters */
> + hash_seq_init(&sstat, subscriptionStatHash);
> + while ((subentry = (PgStat_StatSubEntry *) hash_seq_search(&sstat)) != NULL)
> + pgstat_reset_subscription(subentry, ts);
> + }
> + else
> + {
> + /* Get the subscription statistics to reset */
> + subentry = pgstat_get_subscription_entry(msg->m_subid, false);
> +
> + /*
> + * Nothing to do if the given subscription entry is not found.  This
> + * could happen when the subscription with the subid is removed and
> + * the corresponding statistics entry is also removed before receiving
> + * the reset message.
> + */
> + if (!subentry)
> + return;
> +
> + /* Reset the stats for the requested replication slot */
> + pgstat_reset_subscription(subentry, ts);
> + }
> +}
>
> Why not reverse the if/else?
>
> Checking OidIsValid(...) seems more natural than checking !OidIsValid(...)

Yes, but it's because we use the same pattern in the near function
(see pgstat_recv_resetreplslotcounter()).

>
> ~~~
>
> 9. src/backend/postmaster/pgstat.c - pgstat_recv_subscription_purge
>
> static void
> pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> {
> /* Return if we don't have replication subscription statistics */
> if (subscriptionStatHash == NULL)
> return;
>
> /* Remove from hashtable if present; we don't care if it's not */
> (void) hash_search(subscriptionStatHash, (void *) &(msg->m_subid),
>    HASH_REMOVE, NULL);
> }
>
> SUGGESTION
> Wouldn't the above code be simpler written like:
>
> if (subscriptionStatHash)
> {
> /* Remove from hashtable if present; we don't care if it's not */
> (void) hash_search(subscriptionStatHash, (void *) &(msg->m_subid),
>    HASH_REMOVE, NULL);
> }

Similarly, as Amit also mentioned, there is a similar pattern in the
near function. So keep it as it is

> ~~~
>
> 10. src/backend/replication/logical/worker.c
>
> (from my previous [Peter-v1] #9)
>
> >> + /* Report the worker failed during table synchronization */
> >> + pgstat_report_subscription_error(MyLogicalRepWorker->subid, false);
> >>
> >> and
> >>
> >> + /* Report the worker failed during the application of the change */
> >> + pgstat_report_subscription_error(MyLogicalRepWorker->subid, true);
> >>
> >>
> >> Why don't these use MySubscription->oid instead of MyLogicalRepWorker->subid?
>
> > It's just because we used to use MyLogicalRepWorker->subid, is there
> > any particular reason why we should use MySubscription->oid here?
>
> I felt MySubscription->oid is a more natural and more direct way of
> expressing the same thing.
>
> Consider:  "the oid of the current subscription" versus "the oid of
> the subscription of the current worker". IMO the first one is simpler.
> YMMV.
>
> Also, it is shorter :)

Changed.

>
> ~~~
>
> 11. src/include/pgstat.h  - enum order
>
> (follow-on from my previous v1 review comment #10)
>
> >I assume you're concerned about binary compatibility or something. I
> >think it should not be a problem since both
> >PGSTAT_MTYPE_SUBWORKERERROR and PGSTAT_MTYPE_SUBSCRIPTIONPURGE are
> >introduced to PG15.
>
> Yes, maybe it is OK for those ones. But now in v2 there is a new
> PGSTAT_MTYPE_RESETSUBCOUNTER.
>
> Shouldn't at least that one be put at the end for the same reason?

I think it's better to put it near similar RESET enums.

>
> ~~~
>
> 12. src/include/pgstat.h  - PgStat_MsgResetsubcounter
>
> Maybe a better name for this is "PgStat_MsgResetsubcounters" (plural)?

I think it's better to be consistent with other similar message
structs (e.g., msg_resetsharedcounter and msg_resetslrucounter).

>
> ~~~
>
> 14. src/tools/pgindent/typedefs.list
>
> PgStat_MsgResetsubcounter (from pgstat.h) is missing?
>

Added.

> ------
>
> 15. pg_stat_subscription_activity view name?
>
> Has the view name already been decided or still under discussion - I
> was not sure?
>
> If is it already decided then fine, but if not then my vote would be
> for something different like:
> e.g.1 - pg_stat_subscription_errors
> e.g.2 - pg_stat_subscription_counters
> e.g.3 - pg_stat_subscription_metrics
>
> Maybe "activity" was chosen to be deliberately vague in case some
> future unknown stats columns get added? But it means now there is a
> corresponding function "pg_stat_reset_subscription_activity", when in
> practice you don't really reset activity - what you want to do is
> reset some statistics *about* the activity... so it all seems a bit
> odd to me.

Yes, it still needs discussion.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: create_index test fails when synchronous_commit = off @ master
Next
From: Brar Piening
Date:
Subject: Re: Add id's to various elements in protocol.sgml