Re: [HACKERS] Bad n_distinct estimation; hacks suggested? - Mailing list pgsql-performance

From Tom Lane
Subject Re: [HACKERS] Bad n_distinct estimation; hacks suggested?
Msg-id 19276.1114442580@sss.pgh.pa.us
In response to Re: Bad n_distinct estimation; hacks suggested?  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: [HACKERS] Bad n_distinct estimation; hacks suggested?  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-performance
Simon Riggs <simon@2ndquadrant.com> writes:
> My suggested hack for PostgreSQL is to have an option to *not* sample,
> just to scan the whole table and find n_distinct accurately.
> ...
> What price a single scan of a table, however large, when incorrect
> statistics could force scans and sorts to occur when they aren't
> actually needed ?

It's not just the scan --- you also have to sort, or something like
that, if you want to count distinct values.  I doubt anyone is really
going to consider this a feasible answer for large tables.
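
To illustrate the point above, here is a minimal sketch in Python (not PostgreSQL code, and not part of the original message) of the two usual ways to get an exact distinct count. The function names and the sample column are hypothetical. Either way, the work goes beyond a plain scan: one approach sorts every value, the other keeps every distinct value in memory.

```python
def count_distinct_by_sort(values):
    """Exact distinct count via sorting: O(n log n) time, and the
    whole input has to be materialized and ordered first."""
    ordered = sorted(values)
    if not ordered:
        return 0
    distinct = 1
    # After sorting, equal values are adjacent, so count the boundaries.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur != prev:
            distinct += 1
    return distinct


def count_distinct_by_hash(values):
    """Exact distinct count via hashing: a single pass, but memory
    grows with the number of distinct values seen."""
    seen = set()
    for v in values:
        seen.add(v)
    return len(seen)


# Hypothetical column contents, for illustration only.
column = [3, 1, 3, 2, 1, 3]
sort_result = count_distinct_by_sort(column)
hash_result = count_distinct_by_hash(column)
```

Both functions return the same answer; the difference is where the cost lands (sort time versus memory for the set), which is why a full-table exact count is more expensive than the scan alone suggests.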

            regards, tom lane
