"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Am Freitag, 7. März 2008 schrieb Tom Lane:
>>> IIRC, egjoinsel is one of the weak spots, so tests involving planning of
>>> joins between two tables with large MCV lists would be a good place to
>>> start.
>
>> I have run tests with joining two and three tables with 10 million rows each,
>> and the planning times seem to be virtually unaffected by the statistics
>> target, for values between 10 and 800000.
>
> It's not possible to believe that you'd not notice O(N^2) behavior for N
> approaching 800000 ;-). Perhaps your join columns were unique keys, and
> thus didn't have any most-common-values?
We could remove the hard limit on statistics target and impose the limit
instead on the actual size of the arrays. Ie, allow people to specify larger
sample sizes and discard unreasonably large excess data (possibly warning them
when that happens).
That would remove the screw case the original poster had where he needed to
scan a large portion of the table to see at least one of every value even
though there were only 169 distinct values.
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!