Re: Setting Statistics on Functional Indexes - Mailing list pgsql-performance

From Tom Lane
Subject Re: Setting Statistics on Functional Indexes
Date
Msg-id 25730.1352926825@sss.pgh.pa.us
In response to Re: Setting Statistics on Functional Indexes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
Robert Haas <robertmhaas@gmail.com> writes:
> Shouldn't there be a separate estimator for scalarlesel?  Or should
> the existing estimator be adjusted to handle the two cases
> differently?

Well, it does handle it differently to some extent, in that the operator
itself is invoked when checking the MCV values, so we get the right
answer for those.

The fact that there are no separate estimators for < and <= is something
we inherited from Berkeley, so I can't give the original rationale for
certain, but I think the notion was that the difference is imperceptible
when dealing with a continuous distribution.  The question is whether
you think that the "=" case contributes any significant amount to the
probability given that the bound is not one of the MCV values.  (If it
is, the MCV check will have accounted for it, so adding anything would
be wrong.)  I guess we could add 1/ndistinct or something like that,
but I'm not convinced that will really make the estimates better, mainly
because ndistinct is none too reliable itself.

            regards, tom lane
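
To make the argument above concrete, here is a rough Python sketch of the
kind of estimate being discussed. It is not the PostgreSQL source: the
function names, the equi-depth histogram with linear interpolation, and the
add_eq_term flag are all assumptions made for illustration. The MCV part
applies the real operator, so < and <= differ there; the histogram part
treats the rest of the column as continuous, so they do not. The optional
1/ndistinct term is the adjustment floated above, applied only to <= and
only when the bound is not an MCV.

```python
import operator
from bisect import bisect_right


def histogram_fraction_below(bound, hist_bounds):
    """Fraction of the non-MCV rows lying below `bound`.

    Assumes an equi-depth histogram (every bucket holds the same share of
    rows) and interpolates linearly inside the bucket containing `bound`.
    """
    if bound < hist_bounds[0]:
        return 0.0
    if bound >= hist_bounds[-1]:
        return 1.0
    nbuckets = len(hist_bounds) - 1
    # i is the bucket with hist_bounds[i] <= bound < hist_bounds[i + 1]
    i = bisect_right(hist_bounds, bound) - 1
    lo, hi = hist_bounds[i], hist_bounds[i + 1]
    return (i + (bound - lo) / (hi - lo)) / nbuckets


def scalar_ineq_selectivity(bound, op, mcv_vals, mcv_freqs, hist_bounds,
                            ndistinct=None, add_eq_term=False):
    """Estimate the fraction of rows satisfying `col op bound`.

    `op` is operator.lt or operator.le. All parameter names here are
    hypothetical, chosen for this sketch only.
    """
    # MCV part: run the actual operator against each most-common value,
    # so "<" and "<=" give different answers when the bound is an MCV.
    mcv_sel = sum(f for v, f in zip(mcv_vals, mcv_freqs) if op(v, bound))
    mcv_total = sum(mcv_freqs)

    # Histogram part: continuous approximation, identical for "<" and "<=".
    sel = mcv_sel + (1.0 - mcv_total) * histogram_fraction_below(bound, hist_bounds)

    # Hypothetical adjustment: add the "=" contribution for "<=" when the
    # bound is not an MCV (an MCV bound was already counted above).
    if add_eq_term and op is operator.le and ndistinct and bound not in mcv_vals:
        sel += 1.0 / ndistinct

    return min(1.0, max(0.0, sel))


# Made-up statistics for a demonstration column.
mcv_vals, mcv_freqs = [5, 25], [0.2, 0.1]
hist_bounds = [0, 10, 20, 30, 40]

for bound in (25, 15):
    lt = scalar_ineq_selectivity(bound, operator.lt, mcv_vals, mcv_freqs, hist_bounds)
    le = scalar_ineq_selectivity(bound, operator.le, mcv_vals, mcv_freqs, hist_bounds)
    print(bound, lt, le)
```

With these made-up numbers, a bound of 25 (an MCV) gives estimates for < and
<= that differ by that value's MCV frequency, while a bound of 15 (not an
MCV) gives the same estimate for both unless add_eq_term is set. Whether the
extra term helps in practice depends on how trustworthy ndistinct is.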

