Thinking about IN/EXISTS optimization - Mailing list pgsql-hackers

From Tom Lane
Subject Thinking about IN/EXISTS optimization
Date
Msg-id 26280.1035328691@sss.pgh.pa.us
Whole thread Raw
Responses Re: Thinking about IN/EXISTS optimization  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
I've been thinking about how to convert "x IN (subselect)" and EXISTS
constructs into join-like processing, and I've run into a small problem
in getting the planner to do it nicely.  The issue is that I need to
take the subselect and push it into the jointree --- essentially, make
it look like a subselect-in-FROM --- so that the join planner can deal
with it.  Basically, I need to rearrange
SELECT ... FROM ... WHERE ... AND x IN (SELECT y FROM ...)

into
SELECT ... FROM ..., (SELECT y FROM ...) ss       WHERE ... AND x =* ss.y

where =* represents some specially-marked RestrictInfo node.  (NOT IN is the
same except that the RestrictInfo node will be marked differently.)

The difficulty is that there's no good place to do this in
subquery_planner().  We should push the subselect into FROM before we
run the pull_up_subqueries() and preprocess_jointree() operations;
if we don't pull up the subselect into the main query then we won't have
accomplished very much.  But the WHERE clause isn't simplified into a
form that makes it easy to spot top-level IN() expressions until after
that.  We can't simply switch the order of the subselect and
WHERE-clause processing, because pulling up subqueries typically adds
conditions to the WHERE clause.

I haven't been able to think of a solution to this that doesn't involve
wasting a lot of cycles by repeating some of these processing steps,
or missing some optimization possibilities.  (For example, if we pull up
a subquery that came from a view, it might contain an IN where-clause,
which ideally we'd want to be able to optimize.  It almost seems like
we need to be able to loop around the whole operation; but most of the
time this will just waste cycles.)

Anyone see a nice way to do this?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Re: Memory leaks
Next
From: Neil Conway
Date:
Subject: Re: Memory leaks