On Fri, 17 Apr 2026 at 01:01, Chengpeng Yan <chengpeng_yan@outlook.com> wrote:
> The first attached patch fixes this by bypassing hash probing when the
> LHS is NULL and the comparator is non-strict, falling back to a linear
> evaluation consistent with ExecEvalScalarArrayOp(). For NOT IN, only
> non-NULL results are inverted.
Thanks for the bug report.
I don't think we need to fall back on a linear search. If the
non-strict function returns false for NULL = NULL, then as far as I
can see, we can still get the correct result by checking if the hash
table contains any other members. What I'm not certain of is whether a
non-strict function must return NULL for NULL = non-NULL. If so, then
we could just do it as in the attached patch. I made it check whether
the hash table has any non-NULL Datums hashed. This means something
like "WHERE NULL IN (NULL, 1)" for a non-strict function returning
false for NULL = NULL and NULL for NULL = 1 would evaluate the same as
"WHERE false OR NULL", which is NULL. Whereas "WHERE NULL IN (NULL)"
would be "false".
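
To make those two cases concrete, here is a rough standalone sketch of
the logic. It is not the patch and uses no PostgreSQL APIs: the Tri
type and the names tri_or() and null_lhs_in() are made up for
illustration, and it bakes in the unconfirmed assumption that the
non-strict function returns NULL for NULL = non-NULL.

```c
#include <stdbool.h>

/* SQL three-valued result (hypothetical type, not from PostgreSQL) */
typedef enum { TRI_FALSE, TRI_TRUE, TRI_NULL } Tri;

/* SQL OR: true dominates, then NULL, otherwise false */
Tri
tri_or(Tri a, Tri b)
{
    if (a == TRI_TRUE || b == TRI_TRUE)
        return TRI_TRUE;
    if (a == TRI_NULL || b == TRI_NULL)
        return TRI_NULL;
    return TRI_FALSE;
}

/*
 * Result of "NULL IN (array)" without probing the hash table.
 *
 * null_eq_null:      what the non-strict function returns for NULL = NULL
 * array_has_null:    the array contains at least one NULL element
 * table_has_nonnull: at least one non-NULL datum was hashed
 *
 * ASSUMPTION: NULL = non-NULL always yields NULL.
 */
Tri
null_lhs_in(Tri null_eq_null, bool array_has_null, bool table_has_nonnull)
{
    Tri result = TRI_FALSE;     /* empty array: IN is false */

    if (array_has_null)
        result = tri_or(result, null_eq_null);
    if (table_has_nonnull)
        result = tri_or(result, TRI_NULL);
    return result;
}
```

With null_eq_null = TRI_FALSE, null_lhs_in(TRI_FALSE, true, true)
models "NULL IN (NULL, 1)" and gives NULL, while
null_lhs_in(TRI_FALSE, true, false) models "NULL IN (NULL)" and gives
false.
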
If we need to assume the non-strict function could return false on
NULL = non-NULL, then we could test for that when inserting the first
datum into the hash table and store the behaviour in the expression.
It may also be worth doing that check for NULL = NULL so that we don't
need to call the equals function every time we see a NULL.
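
A minimal sketch of what that probing could look like, again as
standalone C rather than executor code. Everything here is
hypothetical: the EqFn signature (a NULL pointer stands in for SQL
NULL), the NullBehaviour struct, and the function names. It also
assumes the function treats every non-NULL value the same way it treats
the first datum, which is exactly the sort of rule I'm not sure we've
written down.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { TRI_FALSE, TRI_TRUE, TRI_NULL } Tri;

/* Hypothetical equality function; a NULL pointer represents SQL NULL */
typedef Tri (*EqFn)(const int *a, const int *b);

/* Behaviour cached alongside the expression (hypothetical struct) */
typedef struct
{
    bool probed;            /* have we called the function yet? */
    Tri  null_eq_null;      /* result of NULL = NULL */
    Tri  null_eq_nonnull;   /* result of NULL = first non-NULL datum */
} NullBehaviour;

/* Example non-strict function: any NULL input compares as false */
Tri
eq_null_is_false(const int *a, const int *b)
{
    if (a == NULL || b == NULL)
        return TRI_FALSE;
    return (*a == *b) ? TRI_TRUE : TRI_FALSE;
}

/* Call once, when the first non-NULL datum goes into the hash table */
void
probe_null_behaviour(NullBehaviour *nb, EqFn eq, const int *first_datum)
{
    if (nb->probed)
        return;
    nb->null_eq_null = eq(NULL, NULL);
    nb->null_eq_nonnull = eq(NULL, first_datum);
    nb->probed = true;
}

/* SQL OR: true dominates, then NULL, otherwise false */
Tri
tri_or(Tri a, Tri b)
{
    if (a == TRI_TRUE || b == TRI_TRUE)
        return TRI_TRUE;
    if (a == TRI_NULL || b == TRI_NULL)
        return TRI_NULL;
    return TRI_FALSE;
}

/* Evaluate "NULL IN (array)" from the cached results alone */
Tri
null_lhs_in_cached(const NullBehaviour *nb, bool array_has_null,
                   bool table_has_nonnull)
{
    Tri result = TRI_FALSE;

    if (array_has_null)
        result = tri_or(result, nb->null_eq_null);
    if (table_has_nonnull)
        result = tri_or(result, nb->null_eq_nonnull);
    return result;
}
```

After the one-time probe, a NULL on the left-hand side costs no further
calls to the equality function.
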
I'll need to dig a bit deeper to see if we've written down any rules
about non-strict equality functions anywhere...
David