Hi,
On 2023-04-25 16:00:24 -0400, Robert Haas wrote:
> On Tue, Apr 25, 2023 at 2:39 PM Andres Freund <andres@anarazel.de> wrote:
> > I'm mildly inclined to not consider it a bug, given that this looks to have
> > been true for other stats for quite a while? But it does still seem worth
> > improving upon - I'd make the consideration when to apply the relevant patches
> > depend on the complexity. I'm worried we'd need to introduce sufficiently new
> > infrastructure that 16 doesn't seem like a good idea. Let's come up with a
> > patch and judge it after?
>
> ISTM that it's pretty desirable to do something about this. If the
> process isn't going to report statistics properly, at least remove it
> from the view.
It's populated after crash recovery, when shutting down, and at the time of
promotion; that isn't *completely* crazy.
> If it can be made to report properly, that would be even better. But
> shipping a new view with information that will nearly always be zeroes
> instead of real data seems like a bad call, even if there are existing cases
> that have the same problem.
I refreshed my memory: The startup process has indeed behaved that way for
much longer than pg_stat_io has existed - but it's harder to spot, because the
stats are more coarsely aggregated :/. And it's very oddly inconsistent:
The startup process doesn't report per-relation read/hit (it might when we
create a fake relcache entry, too lazy to see what happens exactly), because we
key those stats by oid. However, it *does* report the read/write time. But
only at process exit, of course. The weird part is that the startup process
does *NOT* increase pg_stat_database.blks_read/blks_hit, because instead of
basing those on pgBufferUsage.shared_blks_read etc, we compute them based on
the relation-level stats. pgBufferUsage is just used for EXPLAIN. This isn't
recent, afaict.
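To make the inconsistency concrete, here is a toy Python model of the accounting described above. It is only a sketch, not PostgreSQL code: the names DatabaseStats and record_read are hypothetical, and the sole assumption it encodes is the one stated in the text, namely that the database-level block counts are derived from relation-level counters keyed by OID, while the timing counters are accumulated directly.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseStats:
    """Toy stand-in for one pg_stat_database row (hypothetical, not PostgreSQL's structs)."""
    blk_read_time: float = 0.0  # accumulated directly, in milliseconds
    # Relation-level counters keyed by relation OID.
    rel_blks_read: dict = field(default_factory=dict)
    rel_blks_hit: dict = field(default_factory=dict)

    @property
    def blks_read(self) -> int:
        # Database-level count is derived from the relation-level stats.
        return sum(self.rel_blks_read.values())

    @property
    def blks_hit(self) -> int:
        return sum(self.rel_blks_hit.values())


def record_read(stats: DatabaseStats, read_time_ms: float, rel_oid=None) -> None:
    """Account for one block read; rel_oid is None when there is no relation entry to key on."""
    stats.blk_read_time += read_time_ms
    if rel_oid is not None:
        stats.rel_blks_read[rel_oid] = stats.rel_blks_read.get(rel_oid, 0) + 1


if __name__ == "__main__":
    stats = DatabaseStats()
    record_read(stats, 0.5, rel_oid=16384)  # regular backend: count and time both move
    record_read(stats, 0.7)                 # startup-process-like read: only time moves
    print(stats.blks_read, stats.blks_hit, stats.blk_read_time)
```

In this model a read without a relation OID advances blk_read_time but leaves blks_read untouched, which is the mismatch summarized below.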
TL;DR: Currently the startup process maintains blk_read_time, blk_write_time,
but doesn't maintain blks_read, blks_hit - which doesn't make sense.
Yikes.
Greetings,
Andres Freund