Re: Asymmetric partition-wise JOIN - Mailing list pgsql-hackers

From	Andrei Lepikhov
Subject	Re: Asymmetric partition-wise JOIN
Date	August 19, 2024 11:43:35
Msg-id	944ed18c-3e7d-42ef-816e-0afc41610e93@postgrespro.ru
In response to	Re: Asymmetric partition-wise JOIN (Alexander Korotkov <aekorotkov@gmail.com>)
List	pgsql-hackers

On 1/8/2024 20:56, Alexander Korotkov wrote:
> On Tue, Apr 2, 2024 at 6:07 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
> Actually, the idea I tried to express is the combination of #1 and #2:
> to build individual plan for every partition, but consider the 'Common
> Resources'.  Let me explain this a bit more.
Thanks for keeping your eye on it!
> My idea is to introduce a new property for paths selection.
> 3) Usage of common resources.  The common resource can be: hash
> representation of relation, memoize over relation scan, etc.  We can
> exclude the cost of common resource generation from the path cost, but
> keep the reference for the common resource with its generation cost.
> If one path uses more common resources than another path, it could
> cost-dominate another one only if it's cheaper together with its extra
> common resources cost.  If one path uses less or equal common
> resources than another, it could normally cost-dominate another one.
The most challenging part for me is the cost calculation, which is 
tied to the estimates for other paths. To estimate the effect 
correctly, we need to remember at least the total number of paths 
sharing each resource.
Also, I wonder whether there are corner cases where a prediction error 
on a shared resource makes the situation upstream even worse.
I think we could start from an example and a counter-example, 
but I still can't find them.
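
To make the discussion a bit more concrete, here is a minimal sketch of the domination rule quoted above, written in Python rather than planner C code. It is only an illustration under stated assumptions: the names CommonResource, Path, cost_dominates and total_plan_cost are hypothetical and do not exist in PostgreSQL, the costs are made-up numbers, and "uses more common resources" is read as "uses resources the other path does not".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonResource:
    """Hypothetical shared structure, e.g. a hash table over the inner relation."""
    name: str
    build_cost: float


@dataclass
class Path:
    """Hypothetical per-partition path; cost excludes common resource build costs."""
    name: str
    cost: float
    resources: frozenset = frozenset()


def extra_resource_cost(a: Path, b: Path) -> float:
    # Build cost of the resources that 'a' needs and 'b' does not.
    return sum(r.build_cost for r in a.resources - b.resources)


def cost_dominates(a: Path, b: Path) -> bool:
    # If 'a' uses no resources beyond those of 'b', the extra cost is zero
    # and this is the ordinary cost comparison.
    return a.cost + extra_resource_cost(a, b) < b.cost


def total_plan_cost(paths) -> float:
    # Each common resource is paid for once, however many paths share it.
    shared = set()
    for p in paths:
        shared |= p.resources
    return sum(p.cost for p in paths) + sum(r.build_cost for r in shared)


# Assumed numbers, for illustration only.
hash_inner = CommonResource("hash(inner)", build_cost=100.0)
nestloop = Path("nestloop", cost=150.0)
hashjoin = Path("hashjoin", cost=80.0, resources=frozenset({hash_inner}))

print(cost_dominates(hashjoin, nestloop))   # 80 + 100 is not below 150
print(cost_dominates(nestloop, hashjoin))   # 150 is not below 80
print(total_plan_cost([hashjoin] * 4), total_plan_cost([nestloop] * 4))
```

With these numbers neither path dominates the other for a single partition, so both would have to be kept, yet over four partitions the shared hash table wins (420 against 600). That is the reason the number of paths sharing a resource has to be remembered somewhere.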

> However, I understand this is huge amount of work given we have to
> introduce new basic optimizer concepts.  I get that the main
> application of this patch is sharding.  If we have global tables
> residing on each shard, we can push down any joins with them.  Given this
> patch gives some optimization for non-sharded case, I think we
> *probably* can accept its concept even though this optimization is
> obviously not perfect.
Yes, right now sharding is the most profitable case. We can push down 
parts of the plan that reference only common resources: 
FunctionScan, ValuesScan, and tables that can be proved to exist 
everywhere and produce the same output. But for now that is too far from 
the core code, IMO. So I am looking for cases that can be helpful for a 
single instance.

-- 
regards,
Andrei Lepikhov
Postgres Professional
