Re: Allowing NOT IN to use ANTI joins - Mailing list pgsql-hackers

From David Rowley
Subject Re: Allowing NOT IN to use ANTI joins
Date
Msg-id CAApHDvpKHhw6a5cCnWE0hG7WLuWVac7bCeXcD3F9ZYDv3NGwdQ@mail.gmail.com
Whole thread Raw
In response to Re: Allowing NOT IN to use ANTI joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Allowing NOT IN to use ANTI joins  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
List pgsql-hackers
On Fri, Jun 27, 2014 at 6:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> If there's no way to tell that the target entry comes from a left join,
> then would it be a bit too quirky to just do the NOT NULL checking when
> list_length(query->rtable) == 1 ? or maybe even loop over query->rtable and
> abort if we find an outer join type? it seems a bit hackish, but perhaps it
> would please more people than it would surprise.

Why do you think you can't tell if the column is coming from the wrong
side of a left join?

It was more of that I couldn't figure out how to do it, rather than thinking it was not possible.
 
I don't recall right now if there is already machinery in the planner for
this, but we could certainly deconstruct the from-clause enough to tell
that.

It's not hard to imagine a little function that recursively descends the
join tree, not bothering to descend into the nullable sides of outer
joins, and returns "true" if it finds a RangeTableRef for the desired RTE.
If it doesn't find the RTE in the not-nullable parts of the FROM clause,
then presumably it's nullable.


Ok, I've implemented that in the attached. Thanks for the tip. 
 
The only thing wrong with that approach is if you have to call the
function many times, in which case discovering the information from
scratch each time could get expensive.


I ended up just having the function gather up all RelIds that are on the on the inner side. So this will just need to be called once per NOT IN clause.
 
I believe deconstruct_jointree already keeps track of whether it's
underneath an outer join, so if we were doing this later than that,
I'd advocate making sure it saves the needed information.  But I suppose
you're trying to do this before that.  It might be that you could
easily save aside similar information during the earlier steps in
prepjointree.c.  (Sorry for not having examined the patch yet, else
I'd probably have a more concrete suggestion.)


This is being done from within pull_up_sublinks, so it's before deconstruct_jointree.

I've attached an updated version of the patch which seems to now work properly with outer joins.

I think I'm finally ready for a review again, so I'll update the commitfest app.

Regards

David Rowley


Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Cluster name in ps output
Next
From: "MauMau"
Date:
Subject: Re: [Fwd: Re: proposal: new long psql parameter --on-error-stop]