Re: query slows down with more accurate stats - Mailing list pgsql-performance

From Manfred Koizar
Subject Re: query slows down with more accurate stats
Msg-id ri5u70du80gnnt326k2hhuei5nlnimonbs@email.aon.at
In response to Re: query slows down with more accurate stats  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: query slows down with more accurate stats
List pgsql-performance
[Just a quick note here;  a more thorough discussion of my test results
will be posted to -hackers]

On Tue, 13 Apr 2004 15:18:42 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Well, the first problem is why is ANALYZE's estimate of the total row
>count so bad :-( ?  I suspect you are running into the situation where
>the initial pages of the table are thinly populated and ANALYZE
>mistakenly assumes the rest are too.  Manfred is working on a revised
>sampling method for ANALYZE that should fix this problem

The new method looks very promising with respect to row count
estimation:  I got estimation errors of +/- 1% where the old method was
off by up to 60%.  (My test methods might be a bit biased though :-))
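
To make the effect Tom describes concrete, here is a toy model in Python.
It is not ANALYZE code: the table layout, the page counts and the helper
estimate_rows are all made up for illustration.  It only assumes that the
row count is extrapolated from the tuple density of the pages that were
actually looked at.

```python
import random

def estimate_rows(page_tuple_counts, sampled_pages):
    """Extrapolate the total row count from the mean tuple density
    of the sampled pages (hypothetical helper, not PostgreSQL code)."""
    sampled = list(sampled_pages)
    density = sum(page_tuple_counts[p] for p in sampled) / len(sampled)
    return density * len(page_tuple_counts)

# Hypothetical table: the first 100 pages are thinly populated
# (10 tuples each), the remaining 900 pages are full (100 tuples each).
pages = [10] * 100 + [100] * 900
true_rows = sum(pages)

# Estimate from the first 50 pages only.
est_first = estimate_rows(pages, range(50))

# Estimate from 50 pages spread randomly over the whole table.
rng = random.Random(42)
est_random = estimate_rows(pages, rng.sample(range(len(pages)), 50))

print(true_rows, est_first, est_random)
```

In this made-up layout the first-pages estimate sees only thin pages and
badly underestimates the table, while the estimate from pages spread over
the whole table lands much closer to the true count.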

My biggest concern at the moment is that the new sampling method
violates the contract of returning each possible sample with the same
probability:  getting several tuples from the same page is more likely
than with the old method.
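
A small worked example of this concern, again as a sketch under stated
assumptions: the post does not describe the new method in detail, so the
code below assumes a two-stage scheme (first select a subset of pages,
then draw tuples only from those pages) and pages of equal size.  The
function same_page_prob and all numbers are hypothetical.  It computes
the exact probability that a sample of two tuples comes from a single
page.

```python
from fractions import Fraction

def same_page_prob(pages_available, tuples_per_page):
    """Exact probability that 2 tuples drawn without replacement from
    pages_available equally sized pages lie on the same page."""
    total = pages_available * tuples_per_page
    # After the first tuple is drawn, tuples_per_page - 1 of the
    # remaining total - 1 tuples share its page.
    return Fraction(tuples_per_page - 1, total - 1)

P, T, B = 1000, 100, 10  # pages in table, tuples per page, pages selected

# Simple random sample over all tuples of the table.
uniform = same_page_prob(P, T)

# Assumed two-stage scheme: tuples are drawn only from B selected pages.
two_stage = same_page_prob(B, T)

print(uniform, two_stage, float(two_stage / uniform))
```

With equal-sized pages every single tuple is still equally likely to be
picked in this model, but samples whose tuples share a page become far
more likely than under simple random sampling, so not every possible
sample has the same probability.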

Servus
 Manfred
