Tom,
> Ray Ontko <rayo@ontko.com> writes:
> > Does this help?
>
> Well, it says that ANALYZE is not doing anything wrong ;-).
>
> The true null_frac given those numbers is 1.0 - 433809.0/433902.0
> or about 0.000214. In the default ANALYZE sample size (3000 rows)
> the statistical expectation would be 0.643 NULLs, so most of the
> time ANALYZE is going to find 1 NULL, sometimes no NULLs, and rarely
> more than 1 NULL. Most of the time you'll get null_frac = 1/3000 =
> 0.000333333, sometimes 0, etc.
>
> The stats themselves look fine to me; I think there must be a
> mathematical error in what selfuncs.c is doing with them.
> Still looking to find it...

Well, that explains the non-determinism, I guess. It still seems
odd that ANALYZE would give us different stats when we aren't
doing any updates to the data. How does ANALYZE pick the 3000
rows it samples, and should that sample vary from run to run if
we aren't changing the table?
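
For what it's worth, here is a rough Python sketch of the arithmetic
quoted above, just to check the numbers. It assumes each sampled row is
NULL independently with probability equal to the true null_frac (a
binomial approximation; I don't know that ANALYZE's sampling behaves
exactly this way). The constant and function names are made up for
illustration and have nothing to do with the backend code.

```python
from math import comb

# Figures taken from the message quoted above.
TOTAL_ROWS = 433902      # rows in the table
NON_NULL_ROWS = 433809   # rows holding a non-NULL value
SAMPLE_SIZE = 3000       # default ANALYZE sample size, as quoted


def true_null_frac(total=TOTAL_ROWS, non_null=NON_NULL_ROWS):
    """Fraction of rows that are NULL in the whole table."""
    return 1.0 - non_null / total


def prob_nulls_in_sample(k, n=SAMPLE_SIZE, p=None):
    """Binomial probability of seeing exactly k NULLs in n sampled rows.

    Assumption: rows are sampled independently, each NULL with
    probability p. This is an approximation, not ANALYZE's algorithm.
    """
    if p is None:
        p = true_null_frac()
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


if __name__ == "__main__":
    p = true_null_frac()
    print(f"true null_frac = {p:.6f}")
    print(f"expected NULLs in sample = {SAMPLE_SIZE * p:.3f}")
    for k in range(3):
        # The null_frac ANALYZE would report if it saw k NULLs.
        reported = k / SAMPLE_SIZE
        print(f"P({k} NULLs) = {prob_nulls_in_sample(k):.3f}"
              f"  -> null_frac = {reported:.9f}")
```

Under that assumption nearly every sample holds zero or one NULL, so
the reported null_frac would be expected to hop between 0 and
0.000333333 from one run to the next.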
Ray
----------------------------------------------------------------------
Ray Ontko rayo@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/