Re: IN vs EXISTS equivalence - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: IN vs EXISTS equivalence
Date
Msg-id 48A0496C.EE98.0025.0@wicourts.gov
Whole thread Raw
In response to IN vs EXISTS equivalence  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: IN vs EXISTS equivalence  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Our Internet connectivity failed as this was being sent.  It looks
like at least the list didn't get it, so here goes another try. 
Apologies for any duplication.
-Kevin
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: 
> I chewed on that for awhile.  We can certainly optimize EXISTS
that's
> appearing in the ON-condition of a regular JOIN, because that's not
> really semantically different from a WHERE-condition.
Good to hear.  I thought that might be doable.  :-)
> But I don't think
> it's going to be reasonable to improve EXISTS in outer-JOIN ON
> conditions.  There are a couple of problems.  Consider
The discussion did make the difficulties clear.
> So this is not something I'm going to tackle; at least not this
> devel cycle.
Fair enough.
> One small step we can take in this direction, though, is to improve
the
> planner's internal handling of the qual conditions for IN and
EXISTS.
> Right now the process is just to throw the sub-select into the main
> range table and put the IN join conditions into the same place in
WHERE
> that the IN-clause was to start with.  The trouble with this is that
the
> distribute_quals_to_rels processing has no idea that there's
anything
> special about the IN join conditions.  We got away with that for the
> limited case of IN clauses at the top level of WHERE, but it's
become
> clear to me over the weekend that this has no hope of working for
NOT
> EXISTS --- since that's effectively an outer join, it absolutely has
to
> have the same kinds of qual-scheduling constraints as ordinary outer
> joins do.  So we need a data structure that distribute_quals_to_rels
can
> work with.  What I think needs to happen is that the initial pass
that
> pulls up optimizable IN/EXISTS sub-selects should not merge the
> SubLink's replacement qual clauses seamlessly, but put them
underneath a
> new node type, say "FlattenedSubLink", that retains knowledge of the
> join it's representing.  The FlattenedSubLink would survive only as
far
> as distribute_quals_to_rels, which would distribute out the
contained
> qual conditions instead of the FlattenedSubLink itself --- but only
> after marking them properly for the outer-join restrictions.  This
> representation would make it feasible to work with IN/EXISTS that
are
> inside JOIN ON conditions, which the present representation using a
> single in_info_list really can't do.  The semantic issues are still
> there but at least the representation isn't getting in the way ...
Just curious, is that something for this cycle, or a TODO item?
Thanks for looking at this.  The one part I'm not sure about is where
the CASE/EXISTS in the SELECT value list fits into this discussion. 
It seems conceptually similar to the OUTER JOIN, but sort of a special
case, so I'm not sure what you had in mind there.
-Kevin



pgsql-hackers by date:

Previous
From: Decibel!
Date:
Subject: Re: Mini improvement: statement_cost_limit
Next
From: "Jaime Casanova"
Date:
Subject: Re: Column level privileges was:(Re: Extending grant insert on tables to sequences)