Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Removing unneeded self joins
Date
Msg-id 28ab16e7-6ace-40aa-a76d-9a110799944d@gmail.com
In response to Re: Removing unneeded self joins  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Removing unneeded self joins
List pgsql-hackers
On 4/4/25 04:53, Richard Guo wrote:
> On Fri, Apr 4, 2025 at 1:02 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> I've got an off-list bug report from Alexander Lakhin involving a
>> placeholder variable.  Alena and Andrei proposed a fix.  It is fairly
>> simple: we just shouldn't remove PHVs during self-join elimination, as
>> they might still be referenced from other parts of a query.  The patch
>> is attached.  I'm going to fix this if no objections.
> 
> Hmm, I'm not sure about the fix.  It seems to me that it simply
> prevents removing any PHVs in the self-join removal case.  My concern
> is that this might result in PHVs that could actually be removed not
> being removed in many cases.
Let's walk through the use cases:
If a PHV is needed only at the inner or only at the outer relation, it
means we have a clause in the baserestrictinfo that will be transferred
to the kept relation, so we shouldn't remove the PHV.
Another case is when the PHV is needed in a join clause of the
self-join. I can imagine such a case:

toKeep.x+toRemove.y=PHV

This clause will be transformed to "toKeep.x+toKeep.y=PHV", pushed to
the baserestrictinfo of the kept relation, and must be preserved.
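
To make the transformation above concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: the Var, PHV and Op classes and the replace_rel() and collect() helpers are hypothetical names invented for this illustration, and the PHV is treated as an opaque leaf (a real PlaceHolderVar carries its own relids, which are ignored here). It shows that after substitution the clause references only the kept relation, while the PHV is still present in it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    rel: str   # alias of the relation the column belongs to
    attr: str  # column name


@dataclass(frozen=True)
class PHV:
    phid: int  # opaque id; the wrapped expression is not modelled (assumption)


@dataclass(frozen=True)
class Op:
    name: str
    args: tuple


def replace_rel(node, removed, kept):
    """Return a copy of `node` with every Var on `removed` re-pointed at `kept`."""
    if isinstance(node, Var):
        return Var(kept, node.attr) if node.rel == removed else node
    if isinstance(node, Op):
        return Op(node.name, tuple(replace_rel(a, removed, kept) for a in node.args))
    return node  # a PHV is left in place: replaced relation, same entities


def collect(node, kind):
    """Return the set of leaf nodes of type `kind` found in `node`."""
    if isinstance(node, kind):
        return {node}
    if isinstance(node, Op):
        found = set()
        for arg in node.args:
            found |= collect(arg, kind)
        return found
    return set()


# toKeep.x + toRemove.y = PHV
clause = Op("=", (Op("+", (Var("toKeep", "x"), Var("toRemove", "y"))), PHV(1)))
rewritten = replace_rel(clause, "toRemove", "toKeep")

rels = {v.rel for v in collect(rewritten, Var)}  # relations still referenced
phvs = collect(rewritten, PHV)                   # PHVs still referenced
```

In this model the rewritten clause mentions a single relation, which is why it can be treated as a restriction on the kept relation, and the PHV it compares against still has a consumer.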
I think it is possible to construct a rather narrow case with a clause
like the following:

PHV_evaluated_at_inner = PHV_evaluated_at_outer

Whether it can actually be reproduced still needs to be shown. But even
if it can, it seems to pose no danger for later selectivity estimation
compared to the source clause, and it is too narrow a case, isn't it?
In all other cases, the PHV is needed somewhere else, and we can't
remove it.

Maybe I missed the case you have in mind? I would like to find it.

> 
> Besides, there's the specific comment above this code explaining the
> logic behind the removal of PHVs.  Shouldn't that comment be updated
> to reflect the changes?
It makes sense: for now, it seems that PHV removal should be used only
in the case of outer join removal. In the case of SJE, logically we
perform a replacement, not a removal, so we should not reduce the
number of entities involved.

-- 
regards, Andrei Lepikhov


