Re: plan shape work - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: plan shape work
Date
Msg-id a5d1190d-42e5-4cb3-82dd-6eedeb5c7329@gmail.com
Whole thread Raw
List pgsql-hackers
On 19/5/2025 20:01, Robert Haas wrote:
> Hope you find this interesting. If you do, let me know what you think.Thanks for looking in this direction!

Since 2017, we designed features that should 'memorise' the experience 
of previous executions, checking the PlanState tree and instrumentation 
after execution.
The first dumb prototype you should know - AQO. The following, more 
robust code is 'replan' [1]. These two features utilise the core patch, 
and I hope this patch can be removed after a few adjustments to the core.
In fact, 'replan' is used to stop execution in the middle (if the 
time/memory usage limit is achieved), pass through the state, collect 
valuable data and execute again. That's why using data for the 
subsequent execution needs a matching query tree, and that data may be 
irrelevant for a different set of constants.

The lessons learned during design and after some usage in real life:
1. Subplans should be uniquely identified during the planning. Like you 
mentioned, two similar subplans on the same query level may be executed 
differently and provide different row numbers to the 'knowledge base'. I 
used the subquery_planner hack that generated a unique ID for each subplan.
2. Each node should be identified - I used a kind of signature at each 
RelOptInfo node - just a hash generated by restrictinfos and underlying 
signatures. This approach needs a create_plan hook to copy the signature 
to the Plan nodes.
3. A great source of fluctuations is 'never executed' nodes - because 
depending on the constant, the subtree may never be executed or produce 
tons of tuples - I resolved it by just ignoring 'never executed' 
results, no in-core changes needed.
4. A tree of paths may implement a single RelOptInfo. I have saved the 
signature at the top of the Plan node for the corresponding RelOptInfo 
and have never encountered any problems yet. It limits the use of 
gathered data on cardinality/group number/peak memory consumption 
somewhat, but not significantly.

I think the node extension fields and hooks, debated in the thread [2], 
may be enough to enable extensions to implement such a feature.

What's more, I personally prefer to have a hook that allows extension 
checking of a condition during execution - it may be done inside the 
ExecProcNode() by calling the node-specific hook, which an extension may 
initialise in the PlanState structure before the start of execution.
Additionally, it would be beneficial to have a hook for error processing 
at the top of the portal execution code - this is a key point for 
managing query resources. If we need to stop and re-execute the query, 
it is the only place where we can release all the resources assigned.
One more helpful thing - an option to postpone receiver sending data out 
of the instance-side, even result headings. We may want to decide on the 
query plan after some tuples are produced, and if something has already 
been sent to the user, we can't just stop and rebuild the plan.

[1] https://postgrespro.com/docs/enterprise/16/realtime-query-replanning
[2] 

https://www.postgresql.org/message-id/flat/CA%2BTgmoYxfg90rw13%2BJcYwn4dwSC%2Bagw7o8-A%2BfA3M0fh96pg8w%40mail.gmail.com

-- 
regards, Andrei Lepikhov



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Changing the state of data checksums in a running cluster
Next
From: Bertrand Drouvot
Date:
Subject: Re: Per backend relation statistics tracking