pgsql: Allow matchingsel() to be used with operators that might return - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Allow matchingsel() to be used with operators that might return
Msg-id E1jQwCk-00018M-37@gemulon.postgresql.org
List pgsql-committers
Allow matchingsel() to be used with operators that might return NULL.

Although selfuncs.c will never call a target operator with null inputs,
some functions might return null anyway.  The existing coding will fail
if that happens (since FunctionCall2Coll throws an error on a null
result), which seems undesirable given that matchingsel() has such a
broad range of potential
applicability --- in fact, we already have a problem because we apply it
to jsonb_path_exists_opr, which can return null.  Hence, rejigger the
underlying functions mcv_selectivity and histogram_selectivity to cope,
treating a null result as false.
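
As a rough illustration of the null-as-false rule, here is a standalone
sketch.  It is not the selfuncs.c code: OpResult, BinaryOp,
mcv_match_fraction and eq_or_null are hypothetical names invented for
this example, and real operators go through the fmgr calling convention
rather than a plain C function pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an operator result: a value plus a null flag. */
typedef struct {
    bool value;
    bool isnull;
} OpResult;

/* Hypothetical operator signature (assumption, not the fmgr interface). */
typedef OpResult (*BinaryOp)(int lhs, int rhs);

/*
 * Sum the frequencies of the most-common values for which
 * op(value, constant) is true.  A null result is counted as false
 * instead of being treated as an error.
 */
double mcv_match_fraction(BinaryOp op, const int *values,
                          const double *freqs, size_t nvalues,
                          int constant)
{
    double selec = 0.0;

    assert(op != NULL);
    for (size_t i = 0; i < nvalues; i++) {
        OpResult r = op(values[i], constant);

        if (!r.isnull && r.value)
            selec += freqs[i];
    }
    return selec;
}

/* Illustrative operator: equality that yields null for a negative left input. */
OpResult eq_or_null(int lhs, int rhs)
{
    OpResult r = { lhs == rhs, lhs < 0 };
    return r;
}
```

With eq_or_null, an entry whose left input is negative contributes
nothing to the sum even when the two inputs are equal, which is the
"null means no match" behavior described above.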

While we are at it, we can move the InitFunctionCallInfoData overhead
out of the inner loops, which isn't a huge number of cycles but might
save something considering we are likely calling functions as cheap
as int4eq().  Plus, the number of loop cycles to be expected is much
more than it was when this code was written, since typical settings
of default_statistics_target are higher.

In view of that consideration, let's apply the same change to
var_eq_const, eqjoinsel_inner, and eqjoinsel_semi.  We do not expect
equality functions to ever return null for non-null inputs (and
certainly that code has been that way a long time without complaints),
but the cycle savings seem attractive, especially in the eqjoinsel loops
where there's potentially an O(N^2) savings.
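
The effect of hoisting the call setup out of a nested loop can be
sketched as follows.  This is a simplified model, not the eqjoinsel
code: CallInfo, init_call_info, setup_count and the two
count_matches_* functions are hypothetical, and the real loops also
track which entries have already been matched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for a function-call-info struct. */
typedef struct {
    bool (*fn)(int, int);
    int  args[2];
} CallInfo;

/* Counts how often the setup step runs; for illustration only. */
static unsigned long setup_count = 0;

static void init_call_info(CallInfo *ci, bool (*fn)(int, int))
{
    ci->fn = fn;
    ci->args[0] = 0;
    ci->args[1] = 0;
    setup_count++;
}

static bool int_eq(int a, int b) { return a == b; }

/* Setup is repeated for every pair: na * nb initializations. */
size_t count_matches_naive(const int *a, size_t na,
                           const int *b, size_t nb)
{
    size_t matches = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            CallInfo ci;

            init_call_info(&ci, int_eq);
            ci.args[0] = a[i];
            ci.args[1] = b[j];
            if (ci.fn(ci.args[0], ci.args[1]))
                matches++;
        }
    }
    return matches;
}

/* Setup is done once; the loops only overwrite the argument slots. */
size_t count_matches_hoisted(const int *a, size_t na,
                             const int *b, size_t nb)
{
    CallInfo ci;
    size_t matches = 0;

    assert(a != NULL && b != NULL);
    init_call_info(&ci, int_eq);
    for (size_t i = 0; i < na; i++) {
        ci.args[0] = a[i];
        for (size_t j = 0; j < nb; j++) {
            ci.args[1] = b[j];
            if (ci.fn(ci.args[0], ci.args[1]))
                matches++;
        }
    }
    return matches;
}
```

Both functions return the same count; only the number of setup calls
differs, which is where the quadratic savings would come from when the
called function is itself very cheap.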

Similar code exists in ineq_histogram_selectivity and
get_variable_range, but I forbore from changing those for now.
The performance argument for changing ineq_histogram_selectivity
is really weak anyway, since that will only iterate log2(N) times.
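
To see why a binary search leaves little to save, here is a small
sketch that counts comparison calls while probing sorted histogram
bounds.  The names histogram_lower_bound, int_lt and cmp_calls are
hypothetical, and plain ints stand in for the datums and the operator
call used by the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Counts comparison calls; for illustration only. */
static unsigned long cmp_calls = 0;

static int int_lt(int a, int b)
{
    cmp_calls++;
    return a < b;
}

/*
 * Return the index of the first bound that is not less than the
 * constant, using binary search over the sorted bounds array.
 * Each probe costs one comparison, so the total is about log2(n).
 */
size_t histogram_lower_bound(const int *bounds, size_t n, int constant)
{
    size_t lo = 0;
    size_t hi = n;

    assert(bounds != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (int_lt(bounds[mid], constant))
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
```

For an array of 1024 bounds this makes at most 11 comparison calls, so
moving a per-call setup step out of the loop would save only a handful
of initializations.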

Nikita Glukhov and Tom Lane

Discussion: https://postgr.es/m/9d3b0959-95d6-c37e-2c0b-287bcfe5c705@postgrespro.ru

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1c455078b0950cb6bad83198d818a55f02649fd4

Modified Files
--------------
src/backend/utils/adt/selfuncs.c | 166 ++++++++++++++++++++++++++++++---------
1 file changed, 128 insertions(+), 38 deletions(-)

