"Nathan Boley" <npboley@gmail.com> writes:
> Isn't estimating the selectivity of x = v as (the number of values in
> v's histogram bucket) / (the number of distinct values in v's histogram
> bucket) pretty rational? That's currently what we do for non-MCV
> values, except that we look at ndistinct over the whole table instead
> of over individual histogram buckets.
But the histogram buckets are (meant to be) equal-population, so it
should come out the same either way. The removal of MCVs from the
population will knock any possible variance in ndistinct down to the
point where I seriously doubt that this could offer a win. An even
bigger problem is that this requires estimation of ndistinct among
fractions of the population, which will be proportionally less accurate
than the overall estimate. Accurate ndistinct estimation is *hard*.
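For concreteness, the current whole-table behavior looks roughly like
this (a minimal sketch with names of my own invention, not the actual
selfuncs.c code): the MCV frequencies are subtracted out first, and
the leftover mass is divided evenly among the remaining distinct
values.

    #include <stdio.h>

    /*
     * Sketch of the current non-MCV equality estimate: subtract the
     * frequency captured by the MCV list, then assume the leftover
     * mass is spread evenly across the remaining distinct values.
     * (Illustrative only; names and structure are not from selfuncs.c.)
     */
    static double
    non_mcv_eq_selectivity(double sum_mcv_freqs, /* total MCV frequency */
                           double ndistinct,     /* distinct values, whole table */
                           int    num_mcvs)      /* entries in the MCV list */
    {
        double otherdistinct = ndistinct - num_mcvs;

        if (otherdistinct < 1.0)
            otherdistinct = 1.0;
        return (1.0 - sum_mcv_freqs) / otherdistinct;
    }

    int
    main(void)
    {
        /* e.g. MCVs cover 40% of the rows, 1000 distinct values, 10 MCVs */
        printf("%f\n", non_mcv_eq_selectivity(0.40, 1000.0, 10));
        return 0;
    }

Doing the same thing per-bucket would mean estimating both the
numerator and ndistinct from a much smaller slice of the sample, which
is where the accuracy objection above bites.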
> Now, if there are 100 histogram buckets then any value that occupies
> more than 1% of the table will be an MCV regardless - why force a value
> to be an MCV if it only occupies 0.1% of the table?
Didn't you just contradict yourself? The cutoff would be 1%, not 0.1%.
In any case there's already a heuristic that cuts the MCV list off at
some shorter length (i.e., at more than 1% in this example) if it
doesn't seem worthwhile to keep the last entries. See lines 2132ff
(in CVS HEAD) in analyze.c.
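In rough outline that heuristic does something like the following (a
paraphrase, not a quote of that code; the 1.25x-average threshold in
particular is from memory and may not match exactly): trailing MCV
candidates are dropped unless their sample count is comfortably above
the average count per distinct value.

    #include <stdio.h>

    /*
     * Paraphrase of the kind of cutoff applied to the MCV list
     * (illustrative, not the actual analyze.c code): drop trailing
     * candidates whose sample count is not comfortably above the
     * average count per distinct value.
     */
    static int
    trim_mcv_list(const int *counts,      /* candidate counts, descending */
                  int num_candidates,
                  int samplerows,
                  double ndistinct)
    {
        double avgcount = (double) samplerows / ndistinct;
        double mincount = avgcount * 1.25; /* must beat average by 25% */
        int    num_mcv = num_candidates;

        if (mincount < 2.0)
            mincount = 2.0;
        while (num_mcv > 0 && counts[num_mcv - 1] < mincount)
            num_mcv--;
        return num_mcv;
    }

    int
    main(void)
    {
        int counts[] = {500, 300, 40, 3, 2};

        /* 30000 sample rows, ~1000 distinct values -> mincount ~ 37.5 */
        printf("keep %d MCVs\n",
               trim_mcv_list(counts, 5, 30000, 1000.0));
        return 0;
    }

So a value occupying only 0.1% of the table would already get pruned
from the tail of the list rather than forced in.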
regards, tom lane