On 19/9/2024 13:12, Ashutosh Bapat wrote:
> On Thu, Sep 19, 2024 at 4:18 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> At the same time, this technique (while highly useful in general) adds
>> fragility and increases complexity: a developer needs to remember to
>> link the path using the pointer in different places of the code.
>> So, maybe go the alternative way? Invent a subquery memory context and
>> store all the path allocations there. It can be freed after setrefs
>> finishes this subquery planning without pulling up this subquery.
>>
>
> Using memory context for subquery won't help with partitioning right?
> If the outermost query has a partitioned table with thousands of
> partitions, it will still accumulate those paths till the very end of
> planning.
I see. I just hadn't considered huge partitioned tables in the outer query before.
> We could instead use memory context/s to store all the paths
> created, then copy the optimal paths into a new memory context at
> strategic points and blow up the old memory context. And repeat this
> till we choose the final path and create a plan out of it; at that
> point we could blow up the memory context containing remaining paths
> as well. That will free the paths as soon as they are rendered
> useless.
I think any scalable solution should be based on per-partition cleanup. For starters, why not adopt Tom's patch [1] for selectivity estimation? We would see the benefit in cases with long lists of clauses.
> I discussed this idea with Alvaro offline. We thought that
> this approach needs some code to copy paths and then copying paths
> recursively has some overhead of itself.
That would first require a path_tree_walker. We discussed one before, but the attempt failed. Maybe design it beforehand and also use it in the re-parameterisation code?
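For concreteness, such a walker could follow the expression_tree_walker convention: the walker knows each node's shape and invokes a callback on every sub-path, aborting when the callback returns true. The node layout below is deliberately simplified and hypothetical; core paths have many more link fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical path_tree_walker sketch, modeled on the
 * expression_tree_walker convention.  PathNode stands in for the real
 * Path hierarchy; only two child links are shown. */

typedef struct PathNode
{
    struct PathNode *outer;     /* e.g. outer side of a join path */
    struct PathNode *inner;     /* e.g. inner side of a join path */
} PathNode;

typedef bool (*path_walker_fn) (PathNode *node, void *context);

/* Visit node, then recurse into its children; a true return from the
 * callback stops the walk immediately. */
static bool
path_tree_walker(PathNode *node, path_walker_fn walker, void *context)
{
    if (node == NULL)
        return false;
    if (walker(node, context))
        return true;
    if (path_tree_walker(node->outer, walker, context))
        return true;
    return path_tree_walker(node->inner, walker, context);
}

/* Example callback: count every sub-path in the tree. */
static bool
count_nodes(PathNode *node, void *context)
{
    (*(int *) context)++;
    return false;               /* continue the walk */
}
```

With a walker like this, both the copy-out-of-context approach and the re-parameterisation code could share one traversal instead of hand-written recursion per path type.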
> The current work of adding a
> reference count, OTOH has potential to bring discipline into the way
> we handle paths. We need to avoid risks posed by dangling pointers.
Both hands up for reference counters: in extensions, it is a constant pain to invent an approach for safely removing a path you want to replace with a custom one. I just want to say that the current situation looks too dangerous compared to the value of a possible positive outcome.
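The reference-counting discipline could look roughly like the sketch below (purely illustrative, not the actual patch): a path is freed only when the last referencing structure lets go, so an extension that replaces a path in a pathlist cannot leave a dangling pointer behind.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative refcounting sketch.  RefPath, path_ref and path_unref
 * are hypothetical names; live_paths exists only so the demo can be
 * checked. */

typedef struct RefPath
{
    int    refcount;
    double cost;
} RefPath;

static int live_paths = 0;      /* demonstration counter only */

static RefPath *
path_create(double cost)
{
    RefPath *p = malloc(sizeof(RefPath));
    p->refcount = 1;
    p->cost = cost;
    live_paths++;
    return p;
}

/* Every structure that stores the pointer takes a reference. */
static RefPath *
path_ref(RefPath *p)
{
    p->refcount++;
    return p;
}

/* Freeing happens only when the last reference is dropped, so no
 * other holder is left with a dangling pointer. */
static void
path_unref(RefPath *p)
{
    if (--p->refcount == 0)
    {
        live_paths--;
        free(p);
    }
}
```

An extension would then call path_unref() on the path it displaces instead of guessing whether anything else still points at it.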
> For which Alvaro suggested looking at the way we manage snapshots. But
> I didn't get time to explore that idea yet.
Unfortunately, I can't evaluate this idea without more explanation.
[1] Optimize planner memory consumption for huge arrays
https://www.postgresql.org/message-id/1367418.1708816059@sss.pgh.pa.us
--
regards, Andrei Lepikhov