Re: [HACKERS] Bad n_distinct estimation; hacks suggested? - Mailing list pgsql-performance

From Josh Berkus
Subject Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
Msg-id 200504241208.15437.josh@agliodbs.com
In response to Bad n_distinct estimation; hacks suggested?  (Josh Berkus <josh@agliodbs.com>)
Responses Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
List pgsql-performance

Folks,

> I wonder if this paper has anything that might help:
> http://www.stat.washington.edu/www/research/reports/1999/tr355.ps - if I
> were more of a statistician I might be able to answer :-)

Actually, that paper looks *really* promising.  Does anyone here have enough
math to solve for D(sub)Md on page 6?  I'd like to test it on samples smaller
than 0.01%.

Tom, how does our heuristic sampling work?   Is it pure random sampling, or
page sampling?
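
To make the distinction concrete, here is a toy sketch of the two schemes.
It is not PostgreSQL's ANALYZE code: the table layout, the function names,
and the sample sizes are all made up for illustration. It only shows why the
answer matters for n_distinct: if values are clustered on disk, a page
sample of the same size can see far fewer distinct values than a row sample.

```python
import random

def row_sample(table, n_rows, rng):
    """Pure random sampling: draw n_rows uniformly from all rows."""
    all_rows = [row for page in table for row in page]
    return rng.sample(all_rows, n_rows)

def page_sample(table, n_pages, rng):
    """Page sampling: draw whole pages and keep every row on each one."""
    chosen = rng.sample(table, n_pages)
    return [row for page in chosen for row in page]

def distinct_in_sample(sample):
    """Number of distinct values actually seen in the sample."""
    return len(set(sample))

# Hypothetical, fully clustered table: 1000 pages of 100 rows each,
# and every row on a page carries the same value (the page number).
table = [[page_no] * 100 for page_no in range(1000)]

rng = random.Random(0)  # fixed seed so the run is repeatable
rows = row_sample(table, 1000, rng)   # 1000 rows spread over the table
pages = page_sample(table, 10, rng)   # 10 pages * 100 rows = 1000 rows

print("row sample distinct: ", distinct_in_sample(rows))
print("page sample distinct:", distinct_in_sample(pages))  # 10 by construction
```

Both samples hold 1000 rows, but the page sample can only ever see 10
distinct values on this layout, so any estimator fed from it starts from a
much lower observed count. Real tables are rarely this perfectly clustered;
the example is the extreme case.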

--
Josh Berkus
Aglio Database Solutions
San Francisco
