Re: add checkpoint stats of snapshot and mapping files of pg_logical dir - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Date
Msg-id CALj2ACUcSZnmJMAvpxgSvnzy4At0w4o+_i1pGdi4EDpbyw_Efg@mail.gmail.com
Whole thread Raw
In response to Re: add checkpoint stats of snapshot and mapping files of pg_logical dir  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
List pgsql-hackers
On Tue, Aug 23, 2022 at 11:19 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Wed, Aug 17, 2022 at 11:17:24AM +0530, Bharath Rupireddy wrote:
> > +                                             "logical decoding file(s) processing time=%ld.%03d s",
>
> I would suggest shortening this to something like "logical decoding
> processing" or "logical replication processing."

"logical decoding processing" seems fine to me, see [1]. I've found
other error messages using "logical decoding".

                 errmsg("logical decoding requires wal_level >= logical")));
                 errmsg("logical decoding requires a database connection")));
                 errmsg("logical decoding cannot be used while in recovery")));
                 errmsg("logical decoding cannot be used while in recovery")));
                (errmsg("logical decoding found consistent point at %X/%X",
                (errmsg("logical decoding found initial starting point
at %X/%X",
                (errmsg("logical decoding found initial consistent
point at %X/%X",

> >       CheckPointRelationMap();
> >       CheckPointReplicationSlots();
> > +
> > +     CheckpointStats.l_dec_ops_start_t = GetCurrentTimestamp();
> >       CheckPointSnapBuild();
> >       CheckPointLogicalRewriteHeap();
> > +     CheckpointStats.l_dec_ops_end_t = GetCurrentTimestamp();
> > +
> >       CheckPointReplicationOrigin();
>
> Shouldn't we include CheckPointReplicationSlots() and
> CheckPointReplicationOrigin() in this new stat?

In my experience, I haven't seen those functions take a long time.
CheckPointReplicationSlots() runs a for loop for all slots and may not
take a huge amount of time and it deals with all the slots, not just
the logical slots. Whereas CheckPointReplicationOrigin() does deal
with logical replication and has some heavyweight operation such as
XLogFlush() for all logical slots, hence, counting it is reasonable.

On Wed, Aug 24, 2022 at 10:04 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Aug 24, 2022 at 10:48:01AM +0900, Kyotaro Horiguchi wrote:
> > By the way, I think we use INSTR_TIME_* macros to do masure internal
> > durations (mainly for the monotonic clock characteristics, and to
> > reduce performance degradation on Windows?). I'm not sure that's
> > crutial here but I don't think there's any reason to use
> > GetCurrentTimestamp() instead.
>
> This implies two calls of gettimeofday(), but that does not worry me
> much in this code path.  There is some consistency with
> CheckpointGuts() where we take timestamps for the sync requests.

+1 to use GetCurrentTimestamp() for consistency.

I've also improved the commit message a bit.

Please review the v12 patch attached.

[1] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s)
added, 0 removed, 0 recycled; write=0.003 s, sync=0.002 s, total=0.011
s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB,
estimate=0 kB; lsn=0/1496528, redo lsn=0/14964F0; logical decoding
processing=0.005 s

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/

Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: static libpq (and other libraries) overwritten on aix
Next
From: Nathan Bossart
Date:
Subject: Re: use ARM intrinsics in pg_lfind32() where available