Re: Odd statistics behaviour in 7.2 - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Odd statistics behaviour in 7.2
Date	February 17, 2002 16:13:27
Msg-id	21967.1013968953@sss.pgh.pa.us
In response to	Odd statistics behaviour in 7.2 ("Gordon A. Runkle" <gar@integrated-dynamics.com>)
List	pgsql-hackers

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> It would seem that if you could determine if the number of distinct
> values is _increasing_ as you scan more rows, that an increase in table
> size would also cause an increase, e.g. if you have X distinct values
> looking at N rows, and 2X distinct values looking at 2N rows, that
> clearly would show a scale.

[ thinks for a while... ]  I don't think that'll help.  You could not
expect an exact 2:1 increase, except in the case of a simple unique
column, which isn't the problem anyway.  So the above would really
have to be coded as "count the number of distinct values in the sample
(d1) and the number in half of the sample (d2); then if d1/d2 >= X
assume the number of distinct values scales".  X is a constant somewhere
between 1 and 2, but where?  I think you've only managed to trade one
arbitrary threshold for another one.
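
The half-sample test described above might be sketched roughly as follows.
This is only an illustration under stated assumptions, not code from
ANALYZE: the function names, the synthetic columns, and the two trial
values of X (1.4 and 1.6) are all made up for the example.

```python
def distinct_ratio(sample):
    """Return (d1, d2, d1/d2) for a list of sampled column values.

    d1 = number of distinct values in the whole sample,
    d2 = number of distinct values in the first half of the sample.
    """
    if len(sample) < 2:
        raise ValueError("need at least two sampled values")
    half = sample[: len(sample) // 2]
    d1 = len(set(sample))
    d2 = len(set(half))
    return d1, d2, d1 / d2


def scales_with_table(sample, x):
    """Apply the proposed rule: assume ndistinct scales if d1/d2 >= x."""
    _, _, ratio = distinct_ratio(sample)
    return ratio >= x


# Hypothetical samples of 1000 rows each.
unique_col = list(range(1000))                # every value distinct
small_domain = [i % 5 for i in range(1000)]   # only 5 possible values
in_between = list(range(500)) + list(range(250, 750))  # partial overlap

print(distinct_ratio(unique_col))    # (1000, 500, 2.0)
print(distinct_ratio(small_domain))  # (5, 5, 1.0)
print(distinct_ratio(in_between))    # (750, 500, 1.5)

# The same column is classified differently depending on the choice of X.
print(scales_with_table(in_between, 1.4))  # True
print(scales_with_table(in_between, 1.6))  # False
```

The two easy cases land at the ends of the range (2.0 and 1.0); anything
in between is decided entirely by where X is put.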

A more serious problem is that the above could easily be fooled by a
distribution that contains a few very-popular values and a larger number
of seldom-seen ones.  Consider for example a column "number of children"
over a database of families.  In a sample of a thousand or so, you might
well see only values 0..4 (or so); if you double the size of the sample,
and find a few rows with 5 to 10 kids, are you then correct to label the
column as scaling with the size of the database?
        regards, tom lane
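
A small sketch of the "number of children" case, to make the failure mode
concrete. The sample below is constructed by hand and is purely
hypothetical (which rows hold the large families, and the choice X = 1.5,
are assumptions for illustration); it is not drawn from real data.

```python
def distinct_ratio(sample):
    """Return d1/d2: distinct values in the sample vs. in its first half."""
    half = sample[: len(sample) // 2]
    return len(set(sample)) / len(set(half))


# First 1000 sampled families: only the common values 0..4 show up.
first_half = [i % 5 for i in range(1000)]

# Next 1000 families: mostly the same values, plus three rare large
# families (hypothetical rows with 5, 7 and 10 children).
second_half = [i % 5 for i in range(1000)]
second_half[100] = 5
second_half[400] = 7
second_half[900] = 10

sample = first_half + second_half
ratio = distinct_ratio(sample)  # 8 distinct overall, 5 in the first half
print(ratio)  # 1.6

X = 1.5  # assumed threshold, somewhere between 1 and 2
print(ratio >= X)  # True: the rule would label the column as scaling
```

With these made-up numbers the rule reports that the distinct count scales
with table size, even though the column can only ever hold a handful of
distinct values however many families are added.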
