Re: Maximum statistics target - Mailing list pgsql-hackers

From: Gregory Stark
Subject: Re: Maximum statistics target
Msg-id: 87ejaibspe.fsf@oxford.xeocode.com
In response to: Re: Maximum statistics target (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Maximum statistics target
           Re: Maximum statistics target
List: pgsql-hackers
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Peter Eisentraut <peter_e@gmx.net> writes:
>> Am Freitag, 7. März 2008 schrieb Tom Lane:
>>> IIRC, eqjoinsel is one of the weak spots, so tests involving planning of
>>> joins between two tables with large MCV lists would be a good place to
>>> start.
>
>> I have run tests with joining two and three tables with 10 million rows each,
>> and the planning times seem to be virtually unaffected by the statistics
>> target, for values between 10 and 800000.
>
> It's not possible to believe that you'd not notice O(N^2) behavior for N
> approaching 800000 ;-).  Perhaps your join columns were unique keys, and
> thus didn't have any most-common-values?

We could remove the hard limit on the statistics target and impose the limit
instead on the actual size of the arrays. I.e., allow people to specify larger
sample sizes and discard unreasonably large excess data (possibly warning them
when that happens).
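
To be concrete about the kind of thing I have in mind (just a sketch; the cap
and the names are arbitrary, and a real implementation would go through
ereport rather than stderr):

#include <stdio.h>

#define MAX_STAT_ENTRIES 10000  /* arbitrary cap; the right number is the question */

/*
 * Clamp the number of MCV/histogram entries actually stored, independently
 * of the (no longer limited) statistics target the user asked for.
 */
static int
clamp_stat_entries(int requested, const char *attname)
{
    if (requested > MAX_STAT_ENTRIES)
    {
        fprintf(stderr,
                "WARNING: statistics for \"%s\" truncated from %d to %d entries\n",
                attname, requested, MAX_STAT_ENTRIES);
        return MAX_STAT_ENTRIES;
    }
    return requested;
}

The sample size (and so how much of the table ANALYZE reads) could keep
scaling with the requested target, while the arrays the planner later has to
grovel through stay bounded.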

That would remove the screw case the original poster had, where he needed to
scan a large portion of the table to see at least one of every value even
though there were only 169 distinct values.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

