Re: On Distributions In 7.2.1 - Mailing list pgsql-general

From Mark kirkwood
Subject Re: On Distributions In 7.2.1
Date
Msg-id 1020487765.1259.23.camel@spikey.slithery.org
Whole thread Raw
In response to Re: On Distributions In 7.2.1  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: On Distributions In 7.2.1
List pgsql-general
>On Fri, 2002-05-03 at 02:11, Tom Lane wrote:
> Mark kirkwood <markir@slingshot.co.nz> writes:
> > However Tom's observation is still valid (in spite of my math) - all the
> > frequencies are overestimated, rather than the expected "some bigger,
> > some smaller" sort of thing.
>
> No, that makes sense.  The values that get into the most-common-values
> list are only going to be ones that are significantly more common (in
> the sample) than the estimated average frequency.  So if the thing makes
> a good estimate of the average frequency, you'll only see upside
> outliers in the MCV list.  The relevant logic is in analyze.c:
>
>(snippage)

oh I see... I am thinking that a larger sample may reduce the likelihood
of sample values *not* being included in my MCV list. A quick ALTER ....
SET STATISTICS 500 + ANALYZE results in frequencies of around the 0.0004
area for each MCV - closer to the actual of 0.00033.

So the frequency estimation algorithm is behaving well in this case, and
appears to be coverging to the correct results (or within a sensible
neighbourhood of them...). So for this type of data (uniformly
distributed keys) one can ANALYZE with confidence...

I notice that for a resonably large number of keys + big table
conbination using 100 quantiles gives much more accurate frequencies -
I wonder if its worth a mention in the docs to the effect :

"ANALYZE with the default (10 or so) is ok for most cases, but for big
tables consider using ALTER ... SET STATISTICS 100 for commonly JOINed
or WHEREed columns"

(Next ...the logarithmic guy...)

regards

Mark


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Subject: bool / vacuum full bug followup part 2
Next
From: tony
Date:
Subject: Re: Foxpro