Re: Query optimizer 8.0.1 (and 8.0) - Mailing list pgsql-hackers

From	pgsql@mohawksoft.com
Subject	Re: Query optimizer 8.0.1 (and 8.0)
Date	February 7, 2005 19:25:18
Msg-id	16623.24.91.171.78.1107793679.squirrel@mail.mohawksoft.com
In response to	Re: Query optimizer 8.0.1 (and 8.0) (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Query optimizer 8.0.1 (and 8.0)
List	pgsql-hackers

> pgsql@mohawksoft.com writes:
>> On a very basic level, why bother sampling the whole table at all? Why
>> not check one block and infer all information from that? Because we
>> know that isn't enough data. In a table of 4.6 million rows, can you
>> say with any mathematical certainty that a sample of 100 points can
>> be, in any way, representative?
>
> This is a statistical argument, not a rhetorical one, and I'm not going
> to bother answering handwaving.  Show me some mathematical arguments for
> a specific sampling rule and I'll listen.
>

Tom, I am floored by this response. I am shaking my head in disbelief.

It is inarguable that increasing the sample size increases the accuracy of
a study, especially when the diversity of the subject is unknown. It is
known that reducing a sample size increases the probability of error in any
poll or study. The required sample size depends on the variance of the
whole. It is mathematically unsound to ASSUME any sample size is valid
without understanding the standard deviation of the set.

http://geographyfieldwork.com/MinimumSampleSize.htm
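
To put a number on the dependence on variance, here is a rough sketch of
the textbook minimum-sample-size rule for estimating a mean,
n = (z * sigma / E)^2, with an optional finite-population correction. The
function name, the default z of 1.96 (about 95% confidence), and the
example figures are illustrative assumptions, not anything taken from the
planner or from ANALYZE.

```python
import math


def min_sample_size(std_dev, margin_of_error, z=1.96, population=None):
    """Smallest n that estimates a mean to within +/- margin_of_error.

    Uses n = (z * sigma / E)^2. If `population` is given, applies the
    usual finite-population correction n / (1 + (n - 1) / N).
    """
    if std_dev < 0 or margin_of_error <= 0:
        raise ValueError("std_dev must be >= 0 and margin_of_error > 0")
    n = (z * std_dev / margin_of_error) ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Same error target, different spread in the data.
    print(min_sample_size(std_dev=10, margin_of_error=1))  # 385
    print(min_sample_size(std_dev=30, margin_of_error=1))  # 3458
    # Hypothetical 4.6 million row table: the correction barely helps.
    print(min_sample_size(std_dev=30, margin_of_error=1, population=4_600_000))
```

Tripling the standard deviation multiplies the required sample by nine,
which is the point: a fixed sample size cannot be called sufficient
without knowing the spread of the data.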

Again, I understand why you used the Vitter algorithm, but it has been
proven insufficient (as used) with the US Census TIGER database. We
understand this because we have seen that the random sampling, as
implemented, captures too little information to properly characterize the
variance in the data.
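
As an illustration of the concern, here is a minimal sketch of basic
reservoir sampling (the simple method often called Algorithm R, not
Vitter's optimized variants and not the actual ANALYZE code) run against a
synthetic skewed column. The table shape, the helper names, and the sample
size of 100 are all made up for the example; this is not the TIGER data.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def make_skewed_table(rows=100_000, rare_every=100):
    """Hypothetical column: mostly 0, with one unique value every 100 rows."""
    return [i + 1 if i % rare_every == 0 else 0 for i in range(rows)]


if __name__ == "__main__":
    data = make_skewed_table()
    sample = reservoir_sample(data, 100, random.Random(42))
    print("distinct values in table: ", len(set(data)))
    print("distinct values in sample:", len(set(sample)))
```

A 100-row sample can hold at most 100 distinct values, while this
synthetic column has 1001, so whatever the sample reports about the
spread of the column is an extrapolation from very little evidence.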
