On 03/10/2018 04:43 PM, legrand legrand wrote:
> +1
> Having the time of first occurence of a statement is very usefull
> for trouble shouting, it permits for exemple to retrieve the order of
> operations in some complex cases (and thoses informations aren't
> taken by any third party collecting tool, that will only be able to
> provide a time range of occurence).
>
I really don't see how this would be useful in reconstructing order of
operations, particularly in complex cases where I'd expect the queries
to be executed repeatedly / interleaved in different ways. Furthermore,
it would only work for the very first execution of all statements, so
you would probably have to reset the stats over and over - which seems
to directly contradict the purpose of pg_stat_statements (aggregation of
data over longer periods of time).
So unfortunately this seems rather useless, and log_statements=all seems
like a much better / reliable approach to achieve that.
I also doubt it really allows computation of averages, e.g. queries per
second, because practical workloads usually have daily/weekly patterns
(and different queries may follow different patterns).
So I agree with Peter Geoghegan that tools regularly snapshotting
pg_stat_statements and processing that are a better approach. I've seen
a bunch of simple scripts doing just that, actually.
> I thougth that pgss rows where removed randomly when max rows was
> reached, wouldn't having last_executed information permit a kind of
> LRU removal ?
>
No, the entries are not removed randomly. We track "usage" for each
entry (essentially +1 for each time the query got executed, with a decay
factor applied on each eviction (and we evict 5% at a time).
It's not immediately obvious why something based on time of the
first/last execution would be better than the current algorithm.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services