Re: Odd statistics behaviour in 7.2 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Odd statistics behaviour in 7.2
Date
Msg-id 20184.1013879853@sss.pgh.pa.us
In response to Re: Odd statistics behaviour in 7.2  ("Gordon A. Runkle" <gar@integrated-dynamics.com>)
List pgsql-hackers
"Gordon A. Runkle" <gar@integrated-dynamics.com> writes:
> Is "-0.503824" the same as "503824 with a predicted increase in the
> number of distinct values" (as opposed to using "-503824")?

No, it means "0.503824 times the number of rows in the table".
Your table has ~ 1 million rows, though, so reading it as 503824
distinct values is approximately right in your case.
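
To make the sign convention concrete, here is a minimal Python sketch.
The function name and the round 1,000,000 row count are illustrative
assumptions, not taken from the backend code.

```python
def estimated_distinct(stadistinct: float, row_count: float) -> float:
    """Turn a stadistinct value into an estimated number of distinct values.

    A positive value is an absolute count.  A negative value is a
    multiplier on the table's row count, so the estimate scales as the
    table grows.
    """
    if stadistinct < 0:
        return -stadistinct * row_count
    return stadistinct


# Assumed round figure for a table of roughly one million rows.
rows = 1_000_000
print(estimated_distinct(-0.503824, rows))  # fraction of the row count
print(estimated_distinct(503824, rows))     # absolute count, used as-is
```

With a row count near one million the two readings nearly coincide,
which is why the distinction is easy to miss on a table of this size.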

Given the stats you cited, the exactly correct stadistinct value would
be -0.9348085.  In testing I got -1, -0.808612, -0.678641, or once
-0.584611 from your data, depending on whether the sample chanced to
find none, one, two, or three repeated values.  Any of these strikes me
as plenty close enough for statistical purposes.  But the Chaudhuri
estimator was off by more than a factor of 10.
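
As a rough check on "close enough", the sketch below compares each
sampled value quoted above against the exact one and reports how far
off it is as a multiplicative factor.  The helper name is made up for
illustration; the numbers are the ones given in this message.

```python
exact = -0.9348085
sampled = [-1.0, -0.808612, -0.678641, -0.584611]


def error_factor(estimate: float, truth: float) -> float:
    """Return how far off an estimate is as a factor >= 1."""
    ratio = estimate / truth
    return max(ratio, 1 / ratio)


for s in sampled:
    print(f"{s:>10}: off by a factor of {error_factor(s, exact):.2f}")

worst = max(error_factor(s, exact) for s in sampled)
```

Every sampled value lands within a factor of 2 of the exact figure,
against the more-than-10x miss from the Chaudhuri estimator.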

> Are you planning to include this patch in v7.2.1, or would it require
> too much testing by others?

I'm going to put it in 7.2.1 unless there are objections.
        regards, tom lane

