Re: query slows down with more accurate stats - Mailing list pgsql-performance

From Tom Lane
Subject Re: query slows down with more accurate stats
Date
Msg-id 25556.1081883922@sss.pgh.pa.us
Whole thread Raw
In response to query slows down with more accurate stats  (Robert Treat <xzilla@users.sourceforge.net>)
Responses Re: query slows down with more accurate stats  (Manfred Koizar <mkoi-pg@aon.at>)
Re: query slows down with more accurate stats  (Robert Treat <xzilla@users.sourceforge.net>)
List pgsql-performance
Robert Treat <xzilla@users.sourceforge.net> writes:
> live=# analyze cl;
> ANALYZE
> live=# select reltuples from pg_class where relname = 'cl';
>  reltuples
> -----------
>      53580
> (1 row)
> live=# vacuum cl;
> VACUUM
> live=# select reltuples from pg_class where relname = 'cl';
>   reltuples
> -------------
>  1.14017e+06
> (1 row)

Well, the first problem is why is ANALYZE's estimate of the total row
count so bad :-( ?  I suspect you are running into the situation where
the initial pages of the table are thinly populated and ANALYZE
mistakenly assumes the rest are too.  Manfred is working on a revised
sampling method for ANALYZE that should fix this problem in 7.5 and
beyond, but for now it seems like a VACUUM FULL might be in order.

> so i guess i am wondering if there is something I should be doing to
> help get the better plan at the more accurate stats levels and/or why it
> doesn't stick with the original plan (I noticed disabling merge joins
> does seem to push it back to the original plan).

With the larger number of estimated rows it's figuring the nestloop will
be too expensive.  The row estimate for the cl scan went up from 1248
to 10546, so the estimated cost for the nestloop plan would go to about
240000 units vs 80000 for the mergejoin plan.  This is obviously off
rather badly when the true runtimes are 1.7 vs 8.1 seconds :-(.

I think this is an example of a case where we really need better
estimation of nestloop costs --- it's drastically overestimating the
relative cost of the nestloop because it's not accounting for the cache
benefits of the repeated index searches.  You could probably force the
nestloop to be chosen by lowering random_page_cost, but that's just a
kluge solution ... the real problem is the model is wrong.

I have a to-do item to work on this, and will try to bump up its
priority a bit.

            regards, tom lane

pgsql-performance by date:

Previous
From: Qing Zhao
Date:
Subject: configure shmmax on MAC OS X
Next
From: Jeff Bohmer
Date:
Subject: Re: configure shmmax on MAC OS X