Re: [SQL] New Optimizer Behaviour In 7.0b1 - Mailing list pgsql-sql

From Tom Lane
Subject Re: [SQL] New Optimizer Behaviour In 7.0b1
Date
Msg-id 26472.951552291@sss.pgh.pa.us
Whole thread Raw
In response to New Optimizer Behaviour In 7.0b1  (Mark Kirkwood <markir@ihug.co.nz>)
List pgsql-sql
Mark Kirkwood <markir@ihug.co.nz> writes:
> I have been using the query shown below to study optimizer changes.

> [ results snipped ]

> However it seems a bit on the brutal side to have to coerce the
> optimizer this way ( after all hash joins are generally good), is
> there any way to get a reasonably sensible use of indexes without such
> desperate measures ?

Obviously we'd like to get the optimizer to do the right thing without
being beaten over the head ;-).  As you see, we're not there yet.

I will be the first to say that the 7.0 optimizer is still in a pretty
crude state: I have made drastic changes to its model of plan costs,
and have not yet had much time to tune the results.  I do appreciate
reports about cases it gets wrong.

In this case I guess the first question to ask is whether its
selectivity estimates are any good.  It seems to be estimating that your
"d0.d0f1 between '1999-11-01' and '1999-12-01'" clause will select about
100 of the 900 rows in dim0; is that anywhere near right?  Also, in the
nested-loop plan we can see that it thinks about 1500 rows from fact1
will match on "d0.d0key = f.d0key" against any given selected row from
dim0; is that on the mark?  Finally, is the estimate that the total
number of joined rows (before GROUP BY) is about 33333 any good?

If you have not done VACUUM ANALYZE recently on these two tables,
it'd be worth trying that to see if it brings the estimates any
closer to reality.
        regards, tom lane


pgsql-sql by date:

Previous
From: "Vikrant Rathore"
Date:
Subject: SQL query problem
Next
From: Mark Kirkwood
Date:
Subject: Re: [SQL] New Optimizer Behaviour In 7.0b1