Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Improving N-Distinct estimation by ANALYZE
Date
Msg-id 6579.1136404156@sss.pgh.pa.us
In response to Improving N-Distinct estimation by ANALYZE  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Improving N-Distinct estimation by ANALYZE  (Josh Berkus <josh@agliodbs.com>)
Re: Improving N-Distinct estimation by ANALYZE  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers

Simon Riggs <simon@2ndquadrant.com> writes:
> [ ... a large amount of analysis based on exactly one test case ... ]

I think you are putting too much emphasis on fixing one case and not
enough on considering what may happen in other cases ...

In general, estimating n-distinct from a sample is just plain a hard
problem, and it's probably foolish to suppose we'll ever be able to
do it robustly.  What we need is to minimize the impact when we get
it wrong.  So I agree with the comment that we need to finish the
unfinished project of making HashAggregate tables expansible, but
I'm dubious about the rest of this.
        regards, tom lane
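
To illustrate why estimating n-distinct from a sample is fragile, here is a
minimal sketch. It is not taken from the message above. It assumes the
Haas-Stokes "Duj1" form n*d / (n - f1 + f1*n/N), where n is the number of
sampled rows, N the total rows, d the distinct values seen in the sample, and
f1 the number of values seen exactly once. The function name and the example
numbers are hypothetical.

```python
def duj1_estimate(sample_rows, total_rows, distinct_in_sample, seen_once):
    """Estimate the number of distinct values in a table from a row sample.

    Assumed formula (Haas-Stokes Duj1): n*d / (n - f1 + f1*n/N).
    """
    n, N, d, f1 = sample_rows, total_rows, distinct_in_sample, seen_once
    # Reject statistics that cannot come from a real sample.
    if n <= 0 or N < n or not (0 <= f1 <= d <= n):
        raise ValueError("inconsistent sample statistics")
    denominator = n - f1 + f1 * n / N
    return n * d / denominator


# Hypothetical table: 1,000,000 rows, 3,000 of them sampled.
N_ROWS, N_SAMPLE = 1_000_000, 3_000

# Every sampled value is unique: the estimate is the full row count.
all_unique = duj1_estimate(N_SAMPLE, N_ROWS, 3_000, 3_000)

# Ten values happen to appear twice in the sample (2,980 singletons plus
# 10 pairs): the estimate falls to less than a third of the previous one.
ten_pairs = duj1_estimate(N_SAMPLE, N_ROWS, 2_990, 2_980)

# No value appears only once: the estimate is just the sample's own count.
no_singletons = duj1_estimate(N_SAMPLE, N_ROWS, 1_500, 0)
```

Under these assumptions, twenty rows out of a 3,000-row sample are enough to
move the estimate by a large factor. That sensitivity is why limiting the
damage from a wrong estimate matters as much as improving the estimator.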

