Re: Memory consumed by paths during partitionwise join planning - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Memory consumed by paths during partitionwise join planning
Date
Msg-id 203efc9b-a7f4-474f-9ac1-c5db57f727c7@gmail.com
Whole thread Raw
In response to Re: Memory consumed by paths during partitionwise join planning  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers
On 19/9/2024 13:12, Ashutosh Bapat wrote:
> On Thu, Sep 19, 2024 at 4:18 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> At the same time, this technique (while highly useful in general) adds
>> fragility and increases complexity: a developer needs to remember to
>> link the path using the pointer in different places of the code.
>> So, maybe go the alternative way? Invent a subquery memory context and
>> store all the path allocations there. It can be freed after setrefs
>> finishes this subquery planning without pulling up this subquery.
>>
> 
> Using memory context for subquery won't help with partitioning right?
> If the outermost query has a partitioned table with thousands of
> partitions, it will still accumulate those paths till the very end of
> planning.
I got it. Just haven't had huge tables in the outer before.

> We could instead use memory context/s to store all the paths
> created, then copy the optimal paths into a new memory context at
> strategic points and blow up the old memory context. And repeat this
> till we choose the final path and create a plan out of it; at that
> point we could blow up the memory context containing remaining paths
> as well. That will free the paths as soon as they are rendered
> useless.
I think any scalable solution should be based on a per-partition 
cleanup. For starters, why not adopt Tom's patch [1] for selectivity 
estimations? We will see the profit in the case of long lists of clauses.

> I discussed this idea with Alvaro offline. We thought that
> this approach needs some code to copy paths and then copying paths
> recursively has some overhead of itself.
It needs path_tree_walker at first. We discussed it before but failed. 
Maybe design it beforehand and use it in re-parameterising code?

> The current work of adding a
> reference count, OTOH has potential to bring discipline into the way
> we handle paths. We need to avoid risks posed by dangling pointers.
Both hands up for having pointer counters: It is painful all the time in 
extensions to invent an approach to safely removing a path you want to 
replace with a custom one. I just want to say it looks too dangerous 
compared to the value of a possible positive outcome.

> For which Alvaro suggested looking at the way we manage snapshots. But
> I didn't get time to explore that idea yet.
Unfortunately, I can't understand this idea without an explanation.

[1] Optimize planner memory consumption for huge arrays
https://www.postgresql.org/message-id/1367418.1708816059@sss.pgh.pa.us

-- 
regards, Andrei Lepikhov




pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Conflict Detection and Resolution
Next
From: Laurenz Albe
Date:
Subject: Re: Wrong results with equality search using trigram index and non-deterministic collation