Re: AW: Call for alpha testing: planner statistics revision s - Mailing list pgsql-hackers

From Alex Pilosov
Subject Re: AW: Call for alpha testing: planner statistics revision s
Msg-id Pine.BSO.4.10.10106180911560.8898-100000@spider.pilosoft.com
In response to AW: Call for alpha testing: planner statistics revision s  (Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>)
List pgsql-hackers
On Mon, 18 Jun 2001, Zeugswetter Andreas SB wrote:

> First of all thanks for the great effort, it will surely be appreciated :-)
> 
> > * On large tables, ANALYZE uses a random sample of rows rather than
> > examining every row, so that it should take a reasonably short time
> > even on very large tables.  Possible downside: inaccurate stats.
> > We need to find out if the sample size is large enough.
> 
> Imho that is not optimal :-) ** ducks head, to evade flying hammer **
> 1. the random sample approach should be explicitly requested with some 
> syntax extension
> 2. the sample size should also be tuneable with some analyze syntax 
> extension (the dba chooses the tradeoff between accuracy and runtime)
> 3. if at all, an automatic analyze should do the samples on small tables,
> and accurate stats on large tables
> 
> The reasoning behind this is, that when the optimizer does a "mistake"
> on small tables the runtime penalty is small, and probably even beats
> the cost of accurate statistics lookup. (3 page table --> no stats 
> except table size needed)

I disagree.

As the Monte Carlo method shows, _as long as_ you sample rows at random,
your result will be sufficiently close to the real statistics. I'm not sure
I can find the math behind this, though...
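
A rough sketch of the point, not the actual ANALYZE code: for a simple
random sample of n rows, the standard error of an estimated fraction p is
roughly sqrt(p * (1 - p) / n), which depends on the sample size and not on
the table size. The "table" below is a hypothetical in-memory list of
integers, and the names sampled_fraction and is_low are made up for
illustration.

```python
import random


def sampled_fraction(table, predicate, sample_size, rng):
    """Estimate the fraction of rows matching predicate from a random sample."""
    # rng.sample draws rows without replacement, uniformly at random.
    sample = rng.sample(table, sample_size)
    return sum(1 for row in sample if predicate(row)) / sample_size


def is_low(value):
    """Hypothetical predicate: the kind of selectivity a planner would estimate."""
    return value < 10


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical "table": a single integer column with values 0..99.
table = [rng.randrange(100) for _ in range(200_000)]

# Exact statistic (full scan) versus the estimate from 3000 random rows.
exact = sum(1 for v in table if is_low(v)) / len(table)
estimate = sampled_fraction(table, is_low, 3000, rng)

print(f"exact={exact:.4f} sampled={estimate:.4f}")
```

Making the table ten times larger while keeping the sample at 3000 rows
should leave the expected error about the same, provided the rows really
are picked at random.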

-alex


