Re: benchmarking the query planner - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: benchmarking the query planner
Date
Msg-id 1229109556.8673.96.camel@ebony.2ndQuadrant
In response to Re: benchmarking the query planner  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, 2008-12-12 at 13:43 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Fri, 2008-12-12 at 13:18 -0500, Tom Lane wrote:
> >> Could we skip the hyperbole please?
> 
> > Some of the ndistinct values are very badly off, and in the common cases
> > I cited previously, consistently so.
> 
> > Once I'm certain the rescue helicopter has seen me, I'll stop waving my
> > arms. (But yes, OK).
> 
> Well, AFAICT we have agreed in this thread to kick up the default and
> maximum stats targets by a factor of 10 for 8.4.  If there's anything
> to your thesis that a bigger sample size will help, that should already
> make a noticeable difference.  

That only gives us a 10x larger sample. Since the sample we use is
already so small, it won't make much difference to ndistinct. It will be
great for histograms and MCVs though.

Please review the detailed test results I mentioned here:
http://archives.postgresql.org/pgsql-hackers/2006-01/msg00153.php

If you reproduce those results you'll see that the ndistinct machinery
is fundamentally broken for clustered data on large tables. In many
cases those are join keys and so joins are badly handled on the very
tables where good optimisation is most important.
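
To make the sensitivity concrete, here is a minimal sketch of the
Haas-Stokes "Duj1" estimator, which as far as I understand is the formula
ANALYZE applies to sampled rows. The function name and the input numbers
are hypothetical, chosen only for illustration; they are not taken from
the test results linked above. The point is that a small number of
duplicate values in a 3000-row sample is enough to pull the estimate
down by orders of magnitude, which is the failure one would expect if
clustering makes duplicates over-represented in the sample.

```python
def estimate_ndistinct(sample_rows, distinct_in_sample, singletons, total_rows):
    """Haas-Stokes Duj1 estimate of the number of distinct values.

    sample_rows        n  : rows in the sample
    distinct_in_sample d  : distinct values seen in the sample
    singletons         f1 : values seen exactly once in the sample
    total_rows         N  : rows in the whole table

    Formula: n * d / (n - f1 + f1 * n / N), clamped to [d, N].
    """
    n, d, f1, N = sample_rows, distinct_in_sample, singletons, total_rows
    if n <= 0 or N <= 0:
        raise ValueError("sample_rows and total_rows must be positive")
    if f1 == 0:
        # No singletons: the formula reduces to d exactly.
        return float(d)
    estimate = n * d / ((n - f1) + f1 * n / N)
    # The true value cannot be below what we saw or above the row count.
    return min(max(estimate, float(d)), float(N))


# Hypothetical table of 10 million rows, sampled at 3000 rows.
TOTAL_ROWS = 10_000_000
SAMPLE_ROWS = 3000

# Every sampled value is different: the estimate scales up to the table size.
all_unique = estimate_ndistinct(SAMPLE_ROWS, 3000, 3000, TOTAL_ROWS)

# Only 100 values show up twice: the estimate drops by orders of magnitude.
with_duplicates = estimate_ndistinct(SAMPLE_ROWS, 2900, 2800, TOTAL_ROWS)

print(f"all unique in sample:  {all_unique:,.0f}")
print(f"100 values seen twice: {with_duplicates:,.0f}")
```

Because the denominator is dominated by (n - f1), the result depends
almost entirely on how many sampled values repeat, so any sampling
artifact that inflates repeats feeds straight into the estimate.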

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


