Re: Higher level questions around shared memory stats - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Higher level questions around shared memory stats
Msg-id CA+TgmoZ9so7jiq0LwFqnuCU=hkNg62f7gQa3MxgD6Xxpo27wFg@mail.gmail.com
In response to Re: Higher level questions around shared memory stats  (Andres Freund <andres@anarazel.de>)
Responses Re: Higher level questions around shared memory stats
List pgsql-hackers
On Tue, Mar 29, 2022 at 5:01 PM Andres Freund <andres@anarazel.de> wrote:
> I think it's reasonably rare because in cases there'd be corruption, we'd
> typically not even have written them out / throw them away explicitly - we
> only read stats when starting without crash recovery.
>
> So the "expected" case of corruption afaicts solely is a OS crash just after
> the shutdown checkpoint completed?

Can we prevent that case from occurring, so that there are no expected cases?

> > And maybe we should have, inside the stats system, something that
> > keeps track of when the stats file was last recreated from scratch because
> > of a corruption event, separately from when it was last intentionally reset.
>
> That would be easy to add. Don't think we have a good place to show the
> information right now - perhaps just new functions not part of any view?

I defer to you on where to put it.

> I can think of these different times:
>
> - Last time stats were removed due to starting up in crash recovery
> - Last time stats were created from scratch, because no stats data file was
>   present at startup
> - Last time stats were thrown away due to corruption
> - Last time a subset of stats were reset using one of the pg_reset* functions
>
> Makes sense?

Yes. Possibly that last one could be broken into two: when all stats
were last reset, and when some stats were last reset.
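
To make the list concrete, here is a minimal sketch of how those five
timestamps might be tracked. Every name in it (StatsEventKind,
StatsEventLog, stats_event_record, stats_event_last_full_discard) is
hypothetical and not taken from the shared memory stats patch, and the
timestamps are plain int64_t microsecond counts rather than a real
PostgreSQL type.

```c
#include <stdint.h>

/*
 * Hypothetical event kinds, one per timestamp discussed above.
 * These names are illustrative only; they are not PostgreSQL identifiers.
 */
typedef enum StatsEventKind
{
    STATS_EVENT_CRASH_RECOVERY = 0, /* stats removed: startup entered crash recovery */
    STATS_EVENT_NO_FILE,            /* stats created from scratch: no file at startup */
    STATS_EVENT_CORRUPTION,         /* stats thrown away: stats file was corrupt */
    STATS_EVENT_RESET_ALL,          /* all stats intentionally reset */
    STATS_EVENT_RESET_SUBSET,       /* some stats intentionally reset */
    STATS_EVENT_NKINDS
} StatsEventKind;

/* Last time each kind of event happened; 0 means "never recorded". */
typedef struct StatsEventLog
{
    int64_t last[STATS_EVENT_NKINDS];
} StatsEventLog;

/* Remember that an event of the given kind happened at now_us. */
static inline void
stats_event_record(StatsEventLog *events, StatsEventKind kind, int64_t now_us)
{
    events->last[kind] = now_us;
}

/*
 * Most recent time at which *all* stats were discarded, for any reason.
 * Subset resets are deliberately excluded, since the remaining stats
 * kept accumulating across them.
 */
static inline int64_t
stats_event_last_full_discard(const StatsEventLog *events)
{
    int64_t latest = 0;

    for (int k = STATS_EVENT_CRASH_RECOVERY; k <= STATS_EVENT_RESET_ALL; k++)
    {
        if (events->last[k] > latest)
            latest = events->last[k];
    }
    return latest;
}
```

Keeping the kinds separate is the point: a user could then tell a
corruption event apart from an intentional reset, while still being able
to ask for the single "stats are valid since" time.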

> > Does redo update the stats?
>
> With "update" do you mean generate new stats? In the shared memory stats patch
> it triggers stats to be dropped, on HEAD it just resets all stats at startup.
>
> Redo itself doesn't generate stats, but bgwriter, checkpointer, backends do.

Well, I guess what I'm trying to figure out is what happens if we run
in recovery for a long time -- say, a year -- and then get promoted.
Do we have reasons to expect that the stats will be accurate enough to
use at that point, or will they be way off?

I don't have a great understanding of how this all works, but if
running recovery for a long time is going to lead to a situation where
the stats progressively diverge from reality, then preserving them
doesn't seem as valuable as if they're going to be more or less
accurate.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


