Re: pg_stat_reset_single_*_counters vs pg_stat_database.stats_reset - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pg_stat_reset_single_*_counters vs pg_stat_database.stats_reset
Msg-id 20220330192308.ltges32v75cyaiop@alap3.anarazel.de
In response to Re: pg_stat_reset_single_*_counters vs pg_stat_database.stats_reset  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_stat_reset_single_*_counters vs pg_stat_database.stats_reset  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
Hi,

On 2022-03-30 14:57:25 -0400, Robert Haas wrote:
> On Wed, Mar 23, 2022 at 8:55 PM Andres Freund <andres@anarazel.de> wrote:
> > This behaviour can be trivially (and is) implemented for the shared memory
> > stats patch. But every time I read over that part of the code it feels just
> > profoundly wrong to me.  Way worse than *not* resetting
> > pg_stat_database.stats_reset.
> >
> > Anybody that uses the time since the stats reset as part of a calculation of
> > transactions / sec, reads / sec or such will get completely bogus results
> > after a call to pg_stat_reset_single_table_counters().
> 
> Sure, but that's unavoidable anyway. If some stats have been reset and
> other stats have not, you can't calculate a meaningful average over
> time unless you have a specific reset time for each statistic.

Individual pg_stat_database columns can't be reset independently. Other views
summarizing large parts of the system, like pg_stat_bgwriter, pg_stat_wal, etc.,
have a stats_reset column that is only reset if their counters are also
reset. So the only reason we can't compute such an average for pg_stat_database
is that we don't know since when the pg_stat_database counters have been counting.
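
To put rough numbers on that, here is a small sketch of the arithmetic. The
figures are made up, and tx_per_sec() is a hypothetical helper standing in for
whatever a monitoring query computes from xact_commit and stats_reset; it is
not code from the patch.

```python
from datetime import datetime, timedelta


def tx_per_sec(xact_commit, stats_reset, now):
    """Average commits per second since stats_reset (hypothetical helper)."""
    elapsed = (now - stats_reset).total_seconds()
    return xact_commit / elapsed


# Made-up numbers: the database has really been counting for 10 days
# at a steady 100 commits per second.
now = datetime(2022, 3, 30, 12, 0, 0)
counting_since = now - timedelta(days=10)
xact_commit = 100 * 10 * 24 * 60 * 60

# Rate computed against the time the counters actually started counting.
true_rate = tx_per_sec(xact_commit, counting_since, now)

# Current behaviour: a single-table reset 60 seconds ago moved stats_reset
# forward, but xact_commit was not zeroed.
bogus_rate = tx_per_sec(xact_commit, now - timedelta(seconds=60), now)

print(f"true rate:  {true_rate:.1f} tx/s")
print(f"bogus rate: {bogus_rate:.1f} tx/s")
```

With these assumed inputs the second figure is off by four orders of
magnitude, purely because stats_reset no longer describes the counter it is
divided into.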


> To me, the current behavior feels more correct than what you propose.
> Imagine for example that you are looking for tables/indexes where the
> counters are 0 as a way of finding unused objects. If you know that no
> counters have been zeroed in a long time, you know that this is
> reliable. But under your proposal, there's no way to know this. All
> you know is that the entire system wasn't reset, and therefore some of
> the 0s that you are seeing might be for individual objects that were
> reset.

My current proposal is to just have two reset times. One for the contents of
pg_stat_database (i.e. not affected by pg_stat_reset_single_*_counters()), and
one for stats within the entire database.
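
As a minimal sketch of what I mean, assuming two timestamps per database: the
class, field, and method names below are hypothetical and only model the
semantics; they are not the names used in the shared memory stats patch.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DatabaseStatsResetTimes:
    # Hypothetical: when the pg_stat_database counters themselves were zeroed.
    db_counters_reset: datetime
    # Hypothetical: when any stats within the database were last reset.
    any_stats_reset: datetime

    def reset_whole_database(self, now):
        """Models a database-wide reset: both timestamps move."""
        self.db_counters_reset = now
        self.any_stats_reset = now

    def reset_single_object(self, now):
        """Models pg_stat_reset_single_*_counters(): the pg_stat_database
        counters are untouched, so only the second timestamp moves."""
        self.any_stats_reset = now


start = datetime(2022, 3, 1)
times = DatabaseStatsResetTimes(db_counters_reset=start, any_stats_reset=start)

# A single-table reset four weeks later.
times.reset_single_object(datetime(2022, 3, 29))
```

After the single-object reset, db_counters_reset still says 2022-03-01, so
rates derived from pg_stat_database stay meaningful, while any_stats_reset
answers the "has anybody tinkered with this recently?" question.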


> I think of this mechanism as answering the question "When's the
> last time anybody tinkered with this thing by hand?". If it's recent,
> the tinkering has a good chance of being related to whatever problem
> I'm trying to solve. If it's not, it's probably unrelated.

When I look at a database with a problem, I'll often look at pg_stat_database
to get a first impression of the type of workload running. The fact that
stats_reset doesn't reflect the age of other pg_stat_database columns makes
that much harder.

Greetings,

Andres Freund


