Re: Making Vars outer-join aware - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Making Vars outer-join aware
Date
Msg-id CAMbWs4_JDDdXvEBWtTao9ZTwOGdDCU7X-=bF7yi9G1u_-b0Dfw@mail.gmail.com
Whole thread Raw
In response to Re: Making Vars outer-join aware  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Making Vars outer-join aware  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On Tue, Dec 27, 2022 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The thing that I couldn't get around before is that if you have,
say, a mergejoinable equality clause in an outer join:

    select ... from a left join b on a.x = b.y;

that equality clause can only be associated with the join domain
for B, because it certainly can't be enforced against A.  However,
you'd still wish to be able to do a mergejoin using indexes on
a.x and b.y, and this means that we have to understand the ordering
induced by a PathKey based on this EC as applicable to A, even
though that relation is not in the same join domain.  So there are
situations where sort orderings apply across domain boundaries even
though equalities don't.  We might have to split the notion of
EquivalenceClass into two sorts of objects, and somewhere right
about here is where I realized that this wasn't getting finished
for v16 :-(.
 
I think I see where the problem is.  And I can see currently in
get_eclass_for_sort_expr we always use the top JoinDomain.  So although
the equality clause 'a.x = b.y' belongs to JoinDomain {B}, we set up ECs
for 'a.x' and 'b.y' that belong to the top JoinDomain {A, B, A/B}.

But doing so would lead to a situation where the "same" Vars from
different join domains might have the same varnullingrels and thus would
match by equal().  As an example, consider

    select ... from a left join b on a.x = b.y where a.x = 1;

As said we would set up EC for 'b.y' as belonging to the top JoinDomain.
Then when reconsider_outer_join_clause generates the equality clause
'b.y = 1', we figure out that the new clause belongs to JoinDomain {B}.
Note that the two 'b.y' here belong to different join domains but they
have the same varnullingrels (empty varnullingrels actually).  As a
result, the equality 'b.y = 1' would be merged into the existing EC for
'b.y', because the two 'b.y' matches by equal() and we do not check
JoinDomain for non-const EC members.  So we would end up with an EC
containing EC members of different join domains.

And it seems this would make the following statement in README not hold
any more.

    We don't have to worry about this for Vars (or expressions
    containing Vars), because references to the "same" column from
    different join domains will have different varnullingrels and thus
    won't be equal() anyway.

Thanks
Richard

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: SQL/JSON revisited
Next
From: Jelte Fennema
Date:
Subject: Re: [PATCH] Support using "all" for the db user in pg_ident.conf