Hi,
Someone raised an interested point recently on pg_stat_kcache extension for
handling nested statements, which also applies to pg_stat_statements.
The root issue is that when pg_stat_statements tracks nested statements,
there's no way to really make sense of the counters, as top level statements
will also account for underlying statements. Using a naive example:
=# CREATE FUNCTION f1() RETURNS VOID AS $$BEGIN PERFORM pg_sleep(5); END; $$ LANGUAGE plpgsql;
CREATE FUNCTION
=# SELECT pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------
(1 row)
=# SELECT f1();
f1
----
(1 row)
=# select sum(total_exec_time) from pg_stat_statements;
sum
--------------
10004.403601
(1 row)
It looks like there was 10s total execution time overall all statements, which
doesn't really make sense.
It's of course possible to avoid that using track = top, but tracking all
nested statements is usually quite useful so it could be better to find a way
to better address that problem.
The only idea I have for that is to add a new field to entry key, for instance
is_toplevel. The immediate cons is obviously that it could amplify quite a lot
the number of entries tracked, so people may need to increase
pg_stat_statements.max to avoid slowdown if that makes them reach frequent
entry eviction.
Should we address the problem, and in that case do you see a better way for
that, or should we just document this behavior?