Re: [PROPOSAL] timestamp informations to pg_stat_statements - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PROPOSAL] timestamp informations to pg_stat_statements
Date
Msg-id 0fe51355-40c3-9d0d-0fc3-4404e0cd70e2@2ndquadrant.com
Whole thread Raw
In response to Re: [PROPOSAL] timestamp informations to pg_stat_statements  (legrand legrand <legrand_legrand@hotmail.com>)
Responses Re: [PROPOSAL] timestamp informations to pg_stat_statements  (legrand legrand <legrand_legrand@hotmail.com>)
List pgsql-hackers

On 03/10/2018 04:43 PM, legrand legrand wrote:
> +1
> Having the time of first occurence of a statement is very usefull
> for trouble shouting, it permits for exemple to retrieve the order of
> operations in some complex cases (and thoses informations aren't
> taken by any third party collecting tool, that will only be able to
> provide a time range of occurence).
> 

I really don't see how this would be useful in reconstructing order of
operations, particularly in complex cases where I'd expect the queries
to be executed repeatedly / interleaved in different ways. Furthermore,
it would only work for the very first execution of all statements, so
you would probably have to reset the stats over and over - which seems
to directly contradict the purpose of pg_stat_statements (aggregation of
data over longer periods of time).

So unfortunately this seems rather useless, and log_statements=all seems
like a much better / reliable approach to achieve that.

I also doubt it really allows computation of averages, e.g. queries per
second, because practical workloads usually have daily/weekly patterns
(and different queries may follow different patterns).

So I agree with Peter Geoghegan that tools regularly snapshotting
pg_stat_statements and processing that are a better approach. I've seen
a bunch of simple scripts doing just that, actually.

> I thougth that pgss rows where removed randomly when max rows was
> reached, wouldn't having last_executed information permit a kind of
> LRU removal ?
> 

No, the entries are not removed randomly. We track "usage" for each
entry (essentially +1 for each time the query got executed, with a decay
factor applied on each eviction (and we evict 5% at a time).

It's not immediately obvious why something based on time of the
first/last execution would be better than the current algorithm.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Henry
Date:
Subject: Re: [RFC] What would be difficult to make data models pluggable formaking PostgreSQL a multi-model database?
Next
From: Tom Lane
Date:
Subject: Re: Bogus use of canonicalize_qual