Re: Patch: improve selectivity estimation for IN/NOT IN - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Patch: improve selectivity estimation for IN/NOT IN
Date
Msg-id 407.1331179639@sss.pgh.pa.us
Whole thread Raw
In response to Re: Patch: improve selectivity estimation for IN/NOT IN  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Daniele Varrazzo <daniele.varrazzo@gmail.com> writes:
>> the attached patch improves the array selectivity estimation for = ANY
>> and <> ALL, hence for the IN/NOT IN operators, to avoid the
>> shortcoming described in
>> <http://archives.postgresql.org/pgsql-performance/2012-03/msg00006.php>.

I've committed a modified version of this patch.

> I'm not sure offhand which way is better.  It could be argued that yours
> is more appropriate because if the operator isn't btree equality, but acts
> enough like it to use eqsel() as estimator, then it's still appropriate
> for scalararraysel() to treat it as equality.  On the other side of the
> coin, an operator might be equality but have reason to use some
> operator-specific estimator rather than eqsel().

After some reflection I decided it was probably sane to use both
methods, that is apply the disjoint-probabilities calculation if the
operator appears to be equality/inequality according to either rule.
However, I also put in the safety valve I suggested in the -performance
thread, that the code fall back to the old calculation if the assumption
of disjoint probabilities yields an impossible result.  The patch as you
had it would have just clamped such a result to 1 or 0, which didn't
seem to me to be the best we could do.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: pg_stat_statements and planning time
Next
From: Tom Lane
Date:
Subject: Re: Custom Operators Cannot be Found for Composite Type Values