Re: shared-memory based stats collector - v70 - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: shared-memory based stats collector - v70
Msg-id 20220408.134443.298969491538816073.horikyota.ntt@gmail.com
In response to Re: shared-memory based stats collector - v70  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
At Thu, 7 Apr 2022 20:59:21 -0700, Andres Freund <andres@anarazel.de> wrote in 
> Hi,
> 
> On 2022-04-08 11:10:14 +0900, Kyotaro Horiguchi wrote:
> > I can read it. But I'm not sure that the difference is obvious for
> > average users between "starting a standby from a basebackup" and
> > "starting a standby after a normal shutdown"..
> 
> Yea, that's what I was concerned about. How about:
> 
>   <para>
>    Cumulative statistics are collected in shared memory. Every
>    <productname>PostgreSQL</productname> process collects statistics locally
>    then updates the shared data at appropriate intervals.  When a server,
>    including a physical replica, shuts down cleanly, a permanent copy of the
>    statistics data is stored in the <filename>pg_stat</filename> subdirectory,
>    so that statistics can be retained across server restarts.  In contrast,
>    when starting from an unclean shutdown (e.g., after an immediate shutdown,
>    a server crash, starting from a base backup, and point-in-time recovery),
>    all statistics counters are reset.
>   </para>

Looks perfect overall, and especially with regard to that concern.

> I think I like my version above a bit better?

Quite a bit better.  My version didn't address the concern.

> > > 2)
> > > The edit is not a problem, but it's hard to understand what the existing
> > > paragraph actually means?
> > > 
> > > diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
> > > index 3247e056663..8bfb584b752 100644
> > > --- a/doc/src/sgml/high-availability.sgml
> > > +++ b/doc/src/sgml/high-availability.sgml
> > > @@ -2222,17 +2222,17 @@ HINT:  You can then restart the server after making the necessary configuration
> > > ...
> > >     <para>
> > > -    The statistics collector is active during recovery. All scans, reads, blocks,
> > > +    The cumulative statistics system is active during recovery. All scans, reads, blocks,
> > >      index usage, etc., will be recorded normally on the standby. Replayed
> > >      actions will not duplicate their effects on primary, so replaying an
> > >      insert will not increment the Inserts column of pg_stat_user_tables.
> > >      The stats file is deleted at the start of recovery, so stats from primary
> > >      and standby will differ; this is considered a feature, not a bug.
> > >     </para>
> > > 
> > >     <para>
> > 
> > Agreed partially. It's too detailed.  It might not need to mention WAL
> > replay.
> 
> My concern is more that it seems halfway nonsensical. "Replayed actions will
> not duplicate their effects on primary" - I can guess what that means, but not
> more. There's no "Inserts" column of pg_stat_user_tables.
> 
> 
>    <para>
>     The cumulative statistics system is active during recovery. All scans,
>     reads, blocks, index usage, etc., will be recorded normally on the
>     standby. However, WAL replay will not increment relation and database
>     specific counters. I.e. replay will not increment pg_stat_all_tables
>     columns (like n_tup_ins), nor will reads or writes performed by the
>     startup process be tracked in the pg_statio views, nor will associated
>     pg_stat_database columns be incremented.
>    </para>

Looks clearer, since it mentions user-facing interfaces with concrete
example columns.
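
For illustration only, the behavior described by the two proposed
paragraphs can be sketched as a toy model. This is not PostgreSQL code:
the class ToyStats, its methods, and the from_wal_replay flag are all
hypothetical names made up for this sketch. It assumes only what the
quoted wording says: a clean shutdown keeps a permanent copy, an unclean
one resets all counters, and WAL replay does not increment relation
counters such as n_tup_ins.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyStats:
    """Toy model of the documented behavior; not how PostgreSQL implements it."""

    # Stands in for the counters kept in shared memory.
    shared: Dict[str, int] = field(default_factory=dict)
    # Stands in for the permanent copy in the pg_stat subdirectory.
    on_disk: Optional[Dict[str, int]] = None

    def count(self, counter: str, n: int = 1, from_wal_replay: bool = False) -> None:
        # Per the proposed wording, WAL replay does not increment
        # relation and database specific counters.
        if from_wal_replay:
            return
        self.shared[counter] = self.shared.get(counter, 0) + n

    def shutdown(self, clean: bool) -> None:
        # Only a clean shutdown stores a permanent copy of the statistics.
        self.on_disk = dict(self.shared) if clean else None
        self.shared = {}

    def start(self) -> None:
        # Restore the permanent copy if there is one; otherwise all
        # counters start from zero (the "reset" case).
        self.shared = dict(self.on_disk) if self.on_disk is not None else {}
        self.on_disk = None


stats = ToyStats()
stats.count("n_tup_ins", 3)
stats.count("n_tup_ins", 5, from_wal_replay=True)  # ignored in this model

stats.shutdown(clean=True)
stats.start()
after_clean = stats.shared.get("n_tup_ins", 0)

stats.shutdown(clean=False)  # e.g. immediate shutdown or crash
stats.start()
after_unclean = stats.shared.get("n_tup_ins", 0)
```

In this model the counter survives the clean restart and is gone after
the unclean one, which is the distinction the first paragraph tries to
make visible to users.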

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


