On Mon, Apr 20, 2020 at 09:27:34PM -0400, James Coleman wrote:
>Over in "execExprInterp() questions / How to improve scalar array op
>expr eval?" [1] I'd mused about how we might be able to optimized
>scalar array ops with OR'd semantics.
>
>This patch implements a binary search for such expressions when the
>array argument is a constant so that we can avoid needing to teach
>expression execution to cache stable values or know when a param has
>changed.
>
>The speed-up for the target case can pretty impressive: in my
>admittedly contrived and relatively unscientific test with a query in
>the form:
>
>select count(*) from generate_series(1,100000) n(i) where i in (<1000
>random integers in the series>)
>
>shows ~30ms for the patch versus ~640ms on master.
>
Nice improvement, although 1000 items is probably a bit unusual. The
threshold used in the patch (9 elements) seems a bit too low - what
results have you seen with smaller arrays?
Another idea - would a bloom filter be useful here, as a second
optimization? That is, for large arrays build s small bloom filter,
allowing us to skip even the binary search.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services