Tom Lane <tgl@sss.pgh.pa.us> writes:
> "Ed L." <pgsql@bluepolka.net> writes:
> > So, does this sound like we just happened to get repeatedly
> > horribly unrepresentative random samples with stats target at
> > 10? Are we at the mercy of randomness here? Or is there a
> > better preventive procedure we can follow to systematically
> > identify this kind of situation?
>
> I think the real issue is that stats target 10 is too small for large
> tables: the samples are just not large enough to support a decent
> numdistinct estimate, which is the critical stat for cases such as this
> (ie, estimating the number of hits on a value that's not in the
> most-common-values list).
There's been some discussion on -hackers about this area. Sadly, the idea of
using samples to calculate numdistinct estimates is on fundamentally shaky
ground.
Whereas a fixed sample size works fine for estimating the distribution of
values, getting consistent precision for numdistinct requires the sample to be
a constant fraction of the table -- and unfortunately a pretty large fraction
at that. The trouble is that a value seen once in the sample might be unique
in the table or might have thousands of duplicates you never looked at, and no
fixed-size sample can tell those two cases apart.
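Just to make that concrete, here's the sort of crude check you can run
yourself (table and column names are made up, and ORDER BY random() is only a
stand-in for the block sampling ANALYZE really does; 3000 rows is roughly what
a statistics target of 10 samples, if memory serves):

  SELECT count(DISTINCT some_col) AS distincts_in_sample
    FROM (SELECT some_col
            FROM big_table
           ORDER BY random()
           LIMIT 3000) AS s;

  SELECT count(DISTINCT some_col) AS distincts_in_table
    FROM big_table;

On a big table with lots of repeated values the first number tells you very
little about the second, and that gap is exactly what the estimator has to
extrapolate across.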
So sadly I think "at the mercy of randomness" is pretty accurate. You'll have
to keep raising the statistics target as the table grows, and I expect you'll
eventually run into the downsides of large stats targets (longer ANALYZE runs,
bigger pg_statistic entries, and more planning overhead).
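For reference, the knob is per-column, and you need a fresh ANALYZE for the
new sample size to take effect (names made up again, and 100 is just an
arbitrary bump):

  ALTER TABLE big_table ALTER COLUMN some_col SET STATISTICS 100;
  ANALYZE big_table;

Raising it per-column like this is usually kinder than cranking up
default_statistics_target for every column in the database.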
Some better numdistinct algorithms were posted in that discussion, but they
would require a full table scan during ANALYZE, not just a sample.
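In the meantime, if you want to see how far off the current estimate is for a
particular column, you can compare what ANALYZE stored against an exact count
-- bearing in mind that the count(DISTINCT) is itself a full table scan, i.e.
exactly the cost those better algorithms would pay (names made up once more):

  -- negative n_distinct values mean a fraction of the row count
  SELECT n_distinct
    FROM pg_stats
   WHERE schemaname = 'public'
     AND tablename  = 'big_table'
     AND attname    = 'some_col';

  SELECT count(DISTINCT some_col) FROM big_table;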
--
greg