Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table
Date
Msg-id 50b7ecd9-7512-479b-8096-2bda4e0e7ac1@gmail.com
Whole thread Raw
In response to Re: Do not scan index in right table if condition for left join evaluates to false using columns in left table  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 8/12/2024 23:37, Andres Freund wrote:
> On 2024-12-08 15:44:23 +0700, Andrei Lepikhov wrote:
>> On 8/12/2024 09:52, Andres Freund wrote:
>>>> I think avoiding touching a hash table and an index under MergeJoin can also
>>>> be beneficial.
>>>
>>> How would you get significant wins for mergejoins? You need to go through both
>>> inner and outer anyway?
> 
>> In my mind, this trick can be designed for specific cases like sales tables,
>> as illustrated before and used by well-rounded developers.
> 
> I'm not saying that the optimization isn't useful, just that i don't think it
> makes much sense for mergeoins.
> 
> Besides, at least as far as I can see, fundamentally not optimizing mergejoins
> in any substantial manner, it also just doesn't seem likely that queries where
> this optimization would come up are likely to be planned as mergejoins. If you
> have a leftjoin to a category-style table, you're very rarely going to scan
> the "main" table with an ordered index scan on the category key, which would
> be required for a merge join. And sorting the main table once for each
> to-be-joined-category-table isn't a desirable path most of the time either.
> 
> I don't know what it means that it's to be "used by well-rounded
> developers". We have to write the implementation in way it works regardless of
> what kind of developer is using postgres.
I got it, thanks. I agree that the profit for MJ & HJ cases looks 
modest. When designing the prototype, I realised that it seemed more 
"symmetrical" and logical to separate such 'one-sided' clauses during 
the planning for all types of join.
But I don't insist on this statement. It would be OK even in the case of 
NestLoop (remember, we should change the NL cost model too).

>> I'm not sure that such optimisation would be profitable in general.
> 
> Are you suggesting it would only be enabled by a GUC?
No, such a feature probably won't generate much overhead; I don't see 
any reason to cover it under a GUC.

-- 
regards, Andrei Lepikhov



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [Bug] Heap Use After Free in Window Aggregate Execution
Next
From: Peter Smith
Date:
Subject: Re: Parallel heap vacuum