Thread: matchingsel() and NULL-returning operators

matchingsel() and NULL-returning operators

From
Nikita Glukhov
Date:
Hi, hackers!


I found a problem with selectivity estimation for NULL-returning operators.
matchingsel() is not ready to use as a restriction selectivity estimator for
operators like our jsonpath operators @? and @@, because it calls operator
function on values obtained from pg_statistic through plain FunctionCall2Coll()
which does not accept NULL results (see mcv_selectivity() etc.).

=# CREATE TABLE test AS SELECT '{}'::jsonb js FROM generate_series(1, 1000);
=# ANALYZE test;
=# SELECT * FROM test WHERE js @@ '$ == 1';
ERROR:  function 4011 returned NULL


I'm not sure what we should to fix: operators or matchingsel().  So, attached
two possible independent fixes:

1. Return FALSE instead of NULL in jsonpath operators.  The corresponding 
functions jsonb_path_exists() and jsonb_path_match() still return NULL in 
error cases.

2. Fix NULL operator results in selectivity estimation functions.
Introduced BoolFunctionCall2Coll() for replacing NULL with FALSE, that is used
for calling non-comparison operators (I'm not sure that comparison can return 
NULLs).  Maybe it is worth add a whole set of functions to fmgr.c for replacing 
NULL results with the specified default Datum value.


If the selectivity estimation code will be left unchanged, then I think it
should be noted in documentation that matchingsel() is not applicable to
NULL-returning operators (there is already a similar note about hash-joinable
operators).


But if we will fix NULL handling, I think it would be worth to fix it everywhere
in the selectivity estimation code.  Without this, completely wrong results can
be get not only for NULL values, but also for NULL operator results:

=# EXPLAIN SELECT * FROM test WHERE NOT js @@ '$ == 1';  -- 0 rows returned                      QUERY PLAN                       
--------------------------------------------------------Seq Scan on test  (cost=0.00..17.50 rows=1000 width=5)  Filter: (NOT (js @@ '($ == 1)'::jsonpath))
(2 rows)


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: matchingsel() and NULL-returning operators

From
Tom Lane
Date:
Nikita Glukhov <n.gluhov@postgrespro.ru> writes:
> I found a problem with selectivity estimation for NULL-returning operators.
> matchingsel() is not ready to use as a restriction selectivity estimator for
> operators like our jsonpath operators @? and @@, because it calls operator
> function on values obtained from pg_statistic through plain FunctionCall2Coll()
> which does not accept NULL results (see mcv_selectivity() etc.).

Ah, good point.

> I'm not sure what we should to fix: operators or matchingsel().

Seems reasonable to let matchingsel support such cases.

> Introduced BoolFunctionCall2Coll() for replacing NULL with FALSE, that is used
> for calling non-comparison operators (I'm not sure that comparison can return
> NULLs).

Normally what we do is just invoke the function directly without going
through that layer.  If you need to cope with NULL then the simplicity
of notation of FunctionCallN is lost to you anyway.  I don't think we
particularly need an additional API that's intermediate between those.

> But if we will fix NULL handling, I think it would be worth to fix it
> everywhere in the selectivity estimation code.

I'm disinclined to move the goalposts so far for places where there have
been no complaints; especially not post-feature-freeze.

            regards, tom lane