Thread: pg_stat_statements HLD for futur developments

pg_stat_statements HLD for futur developments

From
legrand legrand
Date:
Hello Kackers,

As a new user of PostgreSQL, I have started using pg_stat_statements, and
was pleased but a little surprised: 

First of all, the normalized form of the query string makes it impossible to
be used in EXPLAIN commands.
Second, normalized constants and parameters values where missing to be able
to test optimizations results manually with EXPLAIN.
Third, execution plan was not available (making usage of AUTO_EXPLAIN
mandatory)

I searched a pg_stat_statements.normalized = false GUC 
that isn't available yet 
nb: that would be easy to implement by changing two lines in
pg_stat_statements.c
line 820    // JumbleQuery(&jstate, query);
line 1156    // if (queryId == UINT64CONST(0))

I have also looked at other developments turning arround this subject (see
the following list), 
and found (so) many things, that wounder if a High Level Design was
available somewhere 
that would permit to make all those branches mergeable ...

Would'nt a view like:

PK
    dbid, 
    userid,
    queryid, 
    +planid 

Added columns 
    first created
    last_updated
    planing_time
    first plan
    first query parameters
...

permits to answer all current needs ( and still be compatible with an
aggregating tool) ?

additional GUCs are to be defined :
- normalized query (true,false),
- plan (none, text, json, ...),
- query parameters (none, text, json, ...),
- query_max_size
- plan_max_size
- parameter_max_size 
...

I can try to maintain this design if desired.

Regards
PAscal



Discussions/custom extensions regarding p_stat_statements:

"[FEATURE PATCH] pg_stat_statements with plans"
http://www.postgresql-archive.org/FEATURE-PATCH-pg-stat-statements-with-plans-td5940964.html

"Sample values for pg_stat_statements"
http://www.postgresql-archive.org/Sample-values-for-pg-stat-statements-tc5998728.html
--> will not work for good and bad plans found before


There was an idea about collecting created and last updated times
"[PROPOSAL] timestamp informations to pg_stat_statements"
http://www.postgresql-archive.org/PROPOSAL-timestamp-informations-to-pg-stat-statements-td5912306.html
--> this is partly done here (but only for good or bad plan, one time)

"Planning counters in pg_stat_statements"
http://www.postgresql-archive.org/Planning-counters-in-pg-stat-statements-td5990933.html
--> this could be interesting for all kinds of plans 


custom extension pg_stat_plan
https://github.com/2ndQuadrant/pg_stat_plans

custom extension pg_store_plans 
https://github.com/ossc-db/pg_store_plans

...



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


Re: pg_stat_statements HLD for futur developments

From
Fabien COELHO
Date:
Hello,

> As a new user of PostgreSQL, I have started using pg_stat_statements, and
> was pleased but a little surprised:
>
> First of all, the normalized form of the query string makes it impossible to
> be used in EXPLAIN commands.

Yes, because of the normalization.

> Second, normalized constants and parameters values where missing to be able
> to test optimizations results manually with EXPLAIN.

The normalization is entirely voluntary. Otherwise, probably every query 
would generate a distinct entry, which for a high load system would be a 
very bad idea, and there would be no point to the extension.

Note that with recent pg you can reuse the query with with a PREPARE, and 
then EXPLAIN on the EXECUTE.

   PREPARE foo(INT) AS <the-query-with-its-dollar-vars>;
   EXPLAIN EXECUTE foo(5432);
   EXPLAIN EXECUTE foo(2345);
   DEALLOCATE foo;

> Third, execution plan was not available (making usage of AUTO_EXPLAIN

Yep, but ISTM that the plan might differ depending on the actual values, 
so it would not make much sense to keep it anyway.

-- 
Fabien.


RE: pg_stat_statements HLD for futur developments

From
legrand legrand
Date:

Thank you Fabien.


My question is more as there are so many developments arround pg_stat_statements (see the list at the end of the initial post):


What is the roadmap for its design ? 


Could a PLANID column be added to help new developments working on plans and parameters ?

Could all the new columns be added to the actual view, or should they be added to others

like pg_stat_statements_plans or pg_stat_statements_parameters reusing pgss's pk (userid, dbid, queryid, planid) ?


Regards

PAscal






De : Fabien COELHO <coelho@cri.ensmp.fr>
Envoyé : jeudi 22 mars 2018 09:32:13
À : legrand legrand
Cc : pgsql-hackers@postgresql.org
Objet : Re: pg_stat_statements HLD for futur developments
 

Hello,

> As a new user of PostgreSQL, I have started using pg_stat_statements, and
> was pleased but a little surprised:
>
> First of all, the normalized form of the query string makes it impossible to
> be used in EXPLAIN commands.

Yes, because of the normalization.

> Second, normalized constants and parameters values where missing to be able
> to test optimizations results manually with EXPLAIN.

The normalization is entirely voluntary. Otherwise, probably every query
would generate a distinct entry, which for a high load system would be a
very bad idea, and there would be no point to the extension.

Note that with recent pg you can reuse the query with with a PREPARE, and
then EXPLAIN on the EXECUTE.

   PREPARE foo(INT) AS <the-query-with-its-dollar-vars>;
   EXPLAIN EXECUTE foo(5432);
   EXPLAIN EXECUTE foo(2345);
   DEALLOCATE foo;

> Third, execution plan was not available (making usage of AUTO_EXPLAIN

Yep, but ISTM that the plan might differ depending on the actual values,
so it would not make much sense to keep it anyway.

--
Fabien.

RE: pg_stat_statements HLD for futur developments

From
Fabien COELHO
Date:
Hello,

> My question is more as there are so many developments arround 
> pg_stat_statements (see the list at the end of the initial post):
>
> What is the roadmap for its design ?

None? As for any feature, some people may have some plans, that they may 
end up developing or not. If developed, it may end up committed or not. 
Kind of a Darwinian process which involves a degree of randomness.

> Could a PLANID column be added to help new developments working on plans and parameters ?

Dunno. I understand that the underlying suggestion is that selected plans 
may be kept as queries are kept, and that you are refering to 
"https://commitfest.postgresql.org/17/1470/".

Maybe ask your question on the corresponding thread, and contribute to 
reviewing the patch?

As the same plan may be used for two distinct queries and one query may 
yield distinct plans, I'd guess that there is a potential n-n 
relationship, but maybe it is simpler to keep a list of plans attached to 
their queries somehow.

> Could all the new columns be added to the actual view, or should they be 
> added to others like pg_stat_statements_plans or 
> pg_stat_statements_parameters reusing pgss's pk (userid, dbid, queryid, 
> planid) ?

Out of the box I'd be fine with a pg_stat_plans, and some mapping between 
plans and statements.

-- 
Fabien.