Re: Optimization rules for semi and anti joins - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Optimization rules for semi and anti joins
Date
Msg-id 4993119C.EE98.0025.0@wicourts.gov
Whole thread Raw
In response to Optimization rules for semi and anti joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Optimization rules for semi and anti joins  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Optimization rules for semi and anti joins  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> I wrote (in response to Kevin Grittner's recent issues):
>> Reflecting on this further, I suspect there are also some bugs in
>> the planner's rules about when semi/antijoins can commute with
>> other joins;
> 
> After doing some math I've concluded this is in fact the case. 
> Anyone want to check my work?
> Hence semijoins can be rearranged just as freely as inner joins.
Made sense.  Agreed.
> A6.    (A antijoin B on (Pab)) leftjoin C on (Pbc)
>     = A antijoin (B leftjoin C on (Pbc)) on (Pab)
> 
> The second form is in fact equivalent to null-extending the A/B
> antijoin --- the actual contents of C cannot affect the result.  So
> we could just drop C altogether.  (I'm not going to do anything
> about that now, but it's something to consider for the planned
> join-elimination optimization.)  In the first form, if Pbc is strict
> on B then it must fail for all rows of the antijoin result so we get
> the null-extended A/B result.  If Pbc is not strict then the first
> form might duplicate some rows in the antijoin result, or produce
> non-null-extended rows.  So in this case the identity holds only if
> Pbc is strict, which is the same as for left joins.
How do you get the first form as a starting point?  Aren't we limited
to NOT IN or NOT EXISTS clauses for B?  Those can't really contribute
columns to the result can they?  (I'm probably missing something
here.)
The rest of it made sense, although two identities jumped to mind
which weren't listed.
(A semijoin B on (Pab)) antijoin C on (Pac)
= (A antijoin C on (Pac)) semijoin B on (Pab)
This one turned out, on closer inspection, to be a restatement of S4.
(A semijoin B on (Pab)) antijoin C on (Pbc)
= A semijoin (B antijoin C on (Pbc)) on (Pab)
I think this one is true, and it doesn't seem to be mentioned, unless
I'm missing something.  It seems potentially useful.
Hopefully I didn't miss your point entirely.
-Kevin


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: A deprecation policy
Next
From: Tom Lane
Date:
Subject: Re: Strange issue with CREATE OPERATOR CLASS