Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >= - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >=
Date
Msg-id 4292.1499183421@sss.pgh.pa.us
In response to Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and >from >=  (Kuntal Ghosh <kuntalghosh.2007@gmail.com>)
Responses Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and > from >=
Re: [HACKERS] WIP patch: distinguish selectivity of < from <= and >from >=
List pgsql-hackers
Kuntal Ghosh <kuntalghosh.2007@gmail.com> writes:
> On Tue, Jul 4, 2017 at 9:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... I have to admit that I've failed to wrap my brain around exactly
>> why it's correct.  The arguments that I've constructed so far seem to
>> point in the direction of applying the opposite correction, which is
>> demonstrably wrong.  Perhaps someone whose college statistics class
>> wasn't quite so long ago can explain this satisfactorily?

> I guess that you're referring to the last case, i.e.
> explain analyze select * from tenk1 where thousand between 10 and 10;

No, the thing that is bothering me is why it seems to be correct to
apply a positive correction for ">=", a negative correction for "<",
and no correction for "<=" or ">".  That seems weird and I can't
construct a plausible explanation for it.  I think it might be a
result of the fact that, given a discrete distribution rather than
a continuous one, the histogram boundary values should be understood
as having some "width" rather than being zero-width points on the
distribution axis.  But the arguments I tried to fashion on that
basis led to other rules that didn't actually work.
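
To make the discrete-versus-continuous point concrete, here is a minimal sketch that counts rows exactly for each operator. It is not the patch's logic. It assumes a stand-in for tenk1.thousand with 10000 rows in which each value 0..999 appears 10 times, and the `selectivity` helper is hypothetical. It also assumes a range estimate is formed as sel(lower bound) + sel(upper bound) - 1.

```python
from fractions import Fraction

# Assumed stand-in for tenk1.thousand: 10000 rows, each value 0..999 ten times.
column = [i % 1000 for i in range(10000)]


def selectivity(op, const, values=column):
    """Exact fraction of rows satisfying `value op const` (hypothetical helper)."""
    tests = {
        "<": lambda v: v < const,
        "<=": lambda v: v <= const,
        ">": lambda v: v > const,
        ">=": lambda v: v >= const,
        "=": lambda v: v == const,
    }
    matched = sum(1 for v in values if tests[op](v))
    return Fraction(matched, len(values))


# On a discrete distribution, "<" and "<=" differ by exactly the "=" fraction.
eq = selectivity("=", 10)
assert selectivity("<=", 10) - selectivity("<", 10) == eq
assert selectivity(">=", 10) - selectivity(">", 10) == eq

# Range estimate for "between 10 and 10", i.e. (x >= 10) AND (x <= 10).
discrete = selectivity(">=", 10) + selectivity("<=", 10) - 1

# A continuous model has no mass at a single point, so "<=" collapses to "<".
continuous = selectivity(">=", 10) + selectivity("<", 10) - 1

print("discrete:", discrete, "continuous:", continuous)
```

Under these assumptions the zero-width treatment loses exactly the rows equal to the constant, which is why the "between 10 and 10" query is a sensitive test. The sketch does not settle which of the four operators ought to carry the correction.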

It's also possible that this logic is in fact wrong and it just happens
to give the right answer anyway for uniformly-distributed cases.
        regards, tom lane