Re: performance drop on 8.2.4, reverting to 8.1.4 - Mailing list pgsql-performance

From Steven Flatt
Subject Re: performance drop on 8.2.4, reverting to 8.1.4
Date
Msg-id 357fa7590706051430h5db6760dx9d4a46c7a8a36205@mail.gmail.com
Whole thread Raw
In response to Re: performance drop on 8.2.4, reverting to 8.1.4  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: performance drop on 8.2.4, reverting to 8.1.4
Re: performance drop on 8.2.4, reverting to 8.1.4
List pgsql-performance
On 5/18/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah.  8.2 is estimating that the "nodeid IS NULL" condition will
discard all or nearly all the rows, presumably because there aren't any
null nodeid's in the underlying table --- it fails to consider that the
LEFT JOIN may inject some nulls.  8.1 was not any brighter; the reason
it gets a different estimate is that it doesn't distinguish left-join
and WHERE clauses at all, but assumes that the result of the left join
can't have fewer rows than its left input, even after applying the
filter condition.  In this particular scenario that happens to be a
better estimate.  So even though 8.2 is smarter, and there is no bug
here that wasn't in 8.1 too, it's getting a worse estimate leading to
a worse plan.

This is a sufficiently common idiom that I think it's a must-fix
problem.  Not sure about details yet, but it seems somehow the
selectivity estimator had better start accounting for
outer-join-injected NULLs.
This problem is causing us a bit of grief as we plan to move from 8.1.4 to 8.2.4.  We have many (on the order of a hundred) queries that are of the form:
 
(A) LEFT JOIN (B) ON col WHERE B.col IS NULL
 
These queries are much slower on 8.2 than on 8.1 for what looks like the reason outlined above.  I have rewritten a few key queries to be of the equivalent form:
 
(A) WHERE col NOT IN (SELECT col FROM (B))
 
which has resulted in a dramatic improvement.  I'm really hoping that I'm not going to need to re-write every single one of our queries that are of the first form above.  Is there any estimation as to if/when the fix will become available?  I'm hoping this isn't going to be a showstopper in us moving to 8.2.
 
Thanks,
Steve
 

pgsql-performance by date:

Previous
From: david@lang.hm
Date:
Subject: Re: Thousands of tables versus on table?
Next
From: "Steinar H. Gunderson"
Date:
Subject: Re: performance drop on 8.2.4, reverting to 8.1.4