Re: Planner reluctant to start from subquery - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Planner reluctant to start from subquery
Date
Msg-id 43E0C4A7.EE98.0025.0@wicourts.gov
Whole thread Raw
In response to Re: Planner reluctant to start from subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Planner reluctant to start from subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
>>> On Wed, Feb 1, 2006 at  2:14 pm, in message
<4218.1138824885@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> ... expected an equivalent IN clause to work better.  In fact, I'm
not
>>> clear why the planner isn't finding the cheapest plan (which it
does
>>> estimate as cheapest) from the IN version you posted.
>
>> All I know is that trying various permutations, I saw it pick a
good
>> plan for the IN format when I eliminated the last outer join in the
FROM
>> clause.  I know it isn't conclusive, but it was a correlation which
>> suggested a possible causality to me.
>
> But there is still an outer join in your third example (the one with
the
> best plan), so that doesn't seem to hold water.

Right, if I moved the DocImageMetaData from a subquery in the WHERE
clause up to the FROM clause, or I eliminated all OUTER JOINs, it chose
a good plan.  Of course, this was just playing with a few dozen
permutations, so it proves nothing -- I'm just sayin'....

> In any case, the way
> that IN planning works these days it really should have considered
the
> plan equivalent to your JOIN- against- GROUP- BY variant.
>
> I'm interested to poke at this ... are you in a position to provide
a
> test case?

I can't supply the original data, since many of the tables have
millions of rows, with some of the data (related to juvenile, paternity,
sealed, and expunged cases) protected by law.  I could try to put
together a self-contained example, but I'm not sure the best way to do
that, since the table sizes and value distributions may be significant
here.  Any thoughts on that?

-Kevin



pgsql-performance by date:

Previous
From: "Jeffrey W. Baker"
Date:
Subject: Re: Index Usage using IN
Next
From: "Jeffrey W. Baker"
Date:
Subject: Re: Index Usage using IN