Adding a plan_id to pg_stat_activity allows users
to determine if a plan for a particular statement
has changed and if the new plan is performing better
or worse for a particular statement.
There are several ways the plan_id in pg_stat_activity
can be used:
1. In extensions that expose the plan text.
This will allow users to map a plan_id
from pg_stat_activity to the plan text.
2. In EXPLAIN output, including auto_explain.
3. In statement logging.
Computing the plan_id can be done using the same
routines for query jumbling, except plan nodes
will be jumbled. This approach was inspired by
work done in the extension pg_stat_plans,
https://github.com/2ndQuadrant/pg_stat_plans/
Attached is a POC patch that computes the plan_id
and presents the top-level plan_id in pg_stat_activity.
The patch still has work left:
- Perhaps Moving the plan jumbler outside of queryjumble.c?
- In the POC, the compute_query_id GUC determines if a
plan_id is to be computed. Should this be a separate GUC?
-- Below is the output of sampling pg_stat_activity
-- with a pgbench workload running The patch
-- introduces the plan_id column.
select count(*),
query,
query_id,
plan_id
from pg_stat_activity
where state='active'
and plan_id is not null and query_id is not null
group by query, query_id, plan_id
order by 1 desc limit 1;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------
count | 1
query | INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (7, 8, 242150, -1471, CURRENT_TIMESTAMP);
query_id | 4535829358544711074
plan_id | -4913142083940981109
-- Also, a new view called pg_stat_statements_plan which
-- Includes all the same columns as pg_stat_statements, but
-- with statistics shown per plan.
postgres=# select substr(query, 1, 10) as query, queryid, planid, calls from pg_stat_statements_plan where queryid = 4535829358544711074;
-[ RECORD 1 ]-----------------
query | INSERT INT
queryid | 4535829358544711074
planid | -4913142083940981109
calls | 4274428
-- the existing pg_stat_statements table
-- shows stats aggregated on
-- the queryid level. This is current behavior.
postgres=# select substr(query, 1, 10) as query, queryid, calls from pg_stat_statements where queryid = 4535829358544711074;
-[ RECORD 1 ]----------------
query | INSERT INT
queryid | 4535829358544711074
calls | 4377142
-- The “%Q” log_line_prefix flag will also include the planid as part of the output
-- the format will be "query_id/plan_id"
-- An example of using auto_explain with the ‘%Q” flag in log_line_prefix.
2022-06-14 17:08:10.485 CDT [76955] [4912312221998332774/-2294484545013135901] LOG: duration: 0.144 ms plan:
Query Text: UPDATE pgbench_tellers SET tbalance = tbalance + -1952 WHERE tid = 32;
Update on public.pgbench_tellers (cost=0.27..8.29 rows=0 width=0)
-> Index Scan using pgbench_tellers_pkey on public.pgbench_tellers (cost=0.27..8.29 rows=1 width=10)
Output: (tbalance + '-1952'::integer), ctid
Index Cond: (pgbench_tellers.tid = 32)
-- the output for EXPLAIN VERBOSE also shows a plan id.
postgres=# explain verbose select 1;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4)
Output: 1
Query Identifier: -2698492627503961632
Plan Identifier: -7861780579971713347
(4 rows)
Thanks,
Sami Imseih
Amazon Web Services