Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's - Mailing list pgsql-hackers

From James Coleman
Subject Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's
Date
Msg-id CAAaqYe_YuJ++pnUUE6u1r2ak62tHHvQ0=Jg=L-A2U-kTjSOHZw@mail.gmail.com
Whole thread Raw
In response to Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Jan 13, 2019 at 3:06 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> There's still a logical problem here, which is that in order to
> prove that the ScalarArrayOpExpr yields NULL when its LHS does,
> you also have to prove that the RHS is not an empty array.
>
> Otherwise you're up against the fact that the OR of zero boolean
> expressions is false not null:
>
> regression=# select null::int = any(array[]::int[]);
>  ?column?
> ----------
>  f
> (1 row)

I've constructed a test case running the test_predtest function on the
expression
"select x is null, x = any(array[]::int[]) from integers" which I
think gets at the logic bug you're noting. In doing this I noticed
something I don't fully understand: on master this shows that "x =
any(array[]::int[])" proves "x is null" (while my patch shows the
opposite). Is this because the second expression being false means
there's can be no valid value for x? It also claims it refutes the
same, which is even more confusing.

> While this is doable when the RHS is an array constant or ArrayExpr, I'm
> afraid it's likely to be a bit tedious.  (Hm, actually predicate_classify
> already has logic to extract the number of elements in these cases ...
> I wonder if there's any use in trying to share code?)

It seems to me that this would mean that an IS NOT NULL index would
still not be proven to be usable for a non-constant array (e.g., from
a subquery calculated array), and in fact I've included a (failing)
test demonstrating that fact in the attached patch. This feeds into my
question above about there possibly being another corollary/proof that
could be added for this case.

Semi-related: I noticed that predicate_classify handles an ArrayExpr
by using list_length; this seems unnecessary if we know we can fail
fast. I realize it's not a common problem, but I have seen extremely
long arrays before, and maybe we should fall out of the count once we
hit max+1 items. My new code also has this "problem", but I figured it
was worth getting this logically correct before attempting to optimize
that especially since it already exists in one place.

> Minor stylistic quibble: I don't like where you inserted the new code in
> clause_is_strict_for, because it renders the comment a few lines above
> that unrelatable to the code.  I'd put it at the bottom, after the
> is_funcclause stanza.

Fixed in this revision.

James Coleman

Attachment

pgsql-hackers by date:

Previous
From: Amit Khandekar
Date:
Subject: Re: Pluggable Storage - Andres's take
Next
From: Andreas Karlsson
Date:
Subject: Re: insensitive collations