Re: strange row count estimates with conditions on multiple column - Mailing list pgsql-general

From Tom Lane
Subject Re: strange row count estimates with conditions on multiple column
Date
Msg-id 8171.1292183450@sss.pgh.pa.us
Whole thread Raw
In response to Re: strange row count estimates with conditions on multiple column  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: strange row count estimates with conditions on multiple column  (tv@fuzzy.cz)
List pgsql-general
Tomas Vondra <tv@fuzzy.cz> writes:
> Dne 17.11.2010 06:58, Tom Lane napsal(a):
>>> BTW I think the default estimate used to be 1000, so it was changed in
>>> one of the 8.x releases? Can you point me to the docs? I've even tried
>>> to find that in the sources, but unsuccessfully.
>>
>> It's DEFAULT_RANGE_INEQ_SEL, and AFAIR it hasn't changed in quite a while.
>> But I wouldn't be surprised if the behavior of this example changed when
>> we boosted the default statistics target.

> I've been thinking about this and I think it might be improved. If I
> understand the logic corretly, it says 'use half of the histogram bin
> size'. But the value

> #define DEFAULT_RANGE_INEQ_SEL 0.005

> says it's always 0.5%, which is not not true if STATISTICS TARGET is not
> 100. This could actually yield 10x more precise estimates when the
> STATISTICS TARGET is set to 1000.

Huh?  The default estimates are completely unrelated to the size of the
histogram, and certainly unrelated to the default size of the
histogram.  We use those estimates when we don't have relevant stats.
It's pure wishful thinking to suppose that changing the statistics
target would have any impact on what the estimate ought to be in such
a case.

I believe the actual reasoning for setting the default estimates that
are under 1% is that we wanted to encourage indexscan choices in such
cases.  Once it's small enough for that, making it even smaller doesn't
really help --- and that does risk making bad join choices.  You don't
want the thing coming up with one-row estimates unless there's real
evidence for such an estimate.

            regards, tom lane

pgsql-general by date:

Previous
From: Andreas 'ads' Scherbaum
Date:
Subject: [ANNOUNCE] PostgreSQL@FOSDEM 2011 - Reminder: call for papers ending soon!
Next
From: tv@fuzzy.cz
Date:
Subject: Re: strange row count estimates with conditions on multiple column