Re: RFC: Logging plan of the running query - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: RFC: Logging plan of the running query |
Date | |
Msg-id | CAExHW5t+xh95_LZjkUZARDbV=gS5pBJghG8x=Xnjb2psWr0R6g@mail.gmail.com Whole thread Raw |
In response to | Re: RFC: Logging plan of the running query (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: RFC: Logging plan of the running query
|
List | pgsql-hackers |
On Fri, Nov 3, 2023 at 7:31 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > I have following questions related to the functionality. (Please point > me to the relevant discussion if this has been already discussed.) > > 1. When a backend is running nested queries, we will see the plan of > the innermost query. That query may not be the actual culprit if the > user query is running slowly. E.g a query being run as part of inner > side nested loop when the nested loop itself is the bottleneck. I > think it will be useful to print plans of all the whole query stack. To further explain this point, consider following scenario -- A completely useless function which executes two SQL statements which take small amount individually #create function multstmt() returns int language sql as $$ select count(*) from pg_class where reltype = 12345; select count(*) from pg_type limit 10; $$; -- force a suboptimal plan #set enable_hashjoin to false; #set enable_mergejoin to false; -- A completely useless but long running query #select c.oid, t.oid from pg_class c, pg_type t where multstmt(c.oid) = multstmt(t.oid); This take a few minutes on my laptop. In a separate session query pg_stat_activity. We will see the original query #select pid, query, backend_type from pg_stat_activity where pid = 349330; pid | query | backend_type --------+-----------------------------------------------------------------------------------------+---------------- 349330 | select c.oid, t.oid from pg_class c, pg_type t where multstmt(c.oid) = multstmt(t.oid); | client backend (1 row) Run the plan output function a few times #select pg_log_query_plan(349330); You will observe different plans based on which of the innermost query is runnning LOG: query plan running on backend with PID 349330 is: Query Text: select count(*) from pg_class where reltype = typeid; select count(*) from pg_type where oid = typeid; Query Parameters: $1 = '600' Aggregate (cost=18.18..18.19 rows=1 width=8) Output: count(*) -> Seq Scan on pg_catalog.pg_class (cost=0.00..18.18 rows=2 width=0) Output: oid, relname, relnamespace, reltype, reloftype, relowner, relam, relfilenode, reltablespace, relpages, reltuples, relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, relkind, relnatts, relchecks, relhasrules, relhastriggers, relhassubclass, relrowsecurity, relforcerowsecurity, relispopulated, relreplident, relispartition, relrewrite, relfrozenxid, relminmxid, relacl, reloptions, relpartbound Filter: (pg_class.reltype = $1) Settings: enable_hashjoin = 'off', enable_mergejoin = 'off' 2023-11-06 11:52:25.610 IST [349330] LOG: query plan running on backend with PID 349330 is: Query Text: select count(*) from pg_class where reltype = typeid; select count(*) from pg_type where oid = typeid; Query Parameters: $1 = '2203' Aggregate (cost=4.30..4.31 rows=1 width=4) Output: count(*) -> Index Only Scan using pg_type_oid_index on pg_catalog.pg_type (cost=0.28..4.29 rows=1 width=0) Output: oid Index Cond: (pg_type.oid = $1) Settings: enable_hashjoin = 'off', enable_mergejoin = 'off' Individual pieces are confusing here since the query run by the backend is not the one being EXPLAINed. A user may not know that the queries being EXPLAINed arise from the function call multstmt(). So they won't be able to stitch the pieces together unless they see plan of the outermost query with loops and costs. What might help if we explain each query in the hierarchy. I think we can start with what auto_explain is doing. Always print the plan of the outermost query; the query found in pg_stat_activity. In a later version we might find a way to print plans of all the queries in the stack and do so in a readable manner. This makes tracking activeQueryDesc a bit tricky. My guess is that the outermost query's descriptor will be available through ActivePortal most of the time. But there are cases when ExecutorRun is called by passing a queryDesc directly. So may be there are some cases where that's not true. -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: