Re: Tracking last scan time - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Tracking last scan time
Date
Msg-id 20220901183540.cyfje2lsiouj4qi4@awork3.anarazel.de
Whole thread Raw
In response to Re: Tracking last scan time  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Tracking last scan time
List pgsql-hackers
Hi,

On 2022-09-01 14:18:42 +0200, Matthias van de Meent wrote:
> On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres@anarazel.de> wrote:
> > But given this is done when stats are flushed, which only happens after the
> > transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> > we got to flushing the transaction stats we'll already have computed that.
> 
> I'm not entirely happy with that, as that would still add function
> call overhead, and potentially still call GetCurrentTimestamp() in
> this somewhat hot loop.

We already used GetCurrentTransactionStopTimestamp() (as you reference below)
before we get to this point, so I doubt that we'll ever call
GetCurrentTimestamp(). And it's hard to imagine that the function call
overhead of GetCurrentTransactionStopTimestamp() matters compared to acquiring
locks etc.


> As an alternative, we could wire the `now` variable in
> pgstat_report_stat (generated from
> GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
> and then into flush_pending_cb (or, store this in a static variable)
> so that we can reuse that value, saving any potential function call
> overhead.

Passing it in doesn't clearly seem an improvement, but I also don't have a
strong opinion on it. I am strongly against the static variable approach.


> > >       tabentry->numscans += lstats->t_counts.t_numscans;
> > > +     if (pgstat_track_scans && lstats->t_counts.t_numscans)
> > > +             tabentry->lastscan = GetCurrentTimestamp();
> >
> > Besides replacing GetCurrentTimestamp() with
> > GetCurrentTransactionStopTimestamp(), this should then also check if
> > tabentry->lastscan is already newer than the new timestamp.
> 
> I wonder how important that is. This value only gets set in a stats
> flush, which may skew the stat update by several seconds (up to
> PGSTAT_MAX_INTERVAL). I don't expect concurrent flushes to take so
> long that it will set the values to It is possible, but I think it is
> extremely unlikely that this is going to be important when you
> consider that these stat flushes are not expected to run for more than
> 1 second.

I think it'll be confusing if you have values going back and forth, even if
just by a little. And it's cheap to defend against, so why not just do that?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15
Next
From: vignesh C
Date:
Subject: Re: Handle infinite recursion in logical replication setup