Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From	Josh Berkus
Subject	Re: Improving N-Distinct estimation by ANALYZE
Date	January 5, 2006 02:27:55
Msg-id	43BCBC87.3050108@agliodbs.com
In response to	Re: Improving N-Distinct estimation by ANALYZE (Greg Stark <gsstark@mit.edu>)
Responses	Re: Improving N-Distinct estimation by ANALYZE
List	pgsql-hackers

Greg,

> Only if your sample is random and independent. The existing mechanism tries
> fairly hard to ensure that every record has an equal chance of being selected.
> If you read the entire block and not appropriate samples then you'll introduce
> systematic sampling errors. For example, if you read an entire block you'll be
> biasing towards smaller records.

Did you read any of the papers on block-based sampling? These sorts of
issues are specifically addressed in the algorithms.

--Josh
