Re: IS NOT DISTINCT FROM + Indexing - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: IS NOT DISTINCT FROM + Indexing
Date
Msg-id CAM3SWZTSuzm4V5hJ-P9mGdTk5SS0XjEjCgTJMCP42tBuZhR3gQ@mail.gmail.com
Whole thread Raw
In response to IS NOT DISTINCT FROM + Indexing  ("Jonathan S. Katz" <jonathan.katz@excoventures.com>)
Responses Re: IS NOT DISTINCT FROM + Indexing  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Jul 21, 2014 at 4:16 PM, Jonathan S. Katz
<jonathan.katz@excoventures.com> wrote:
> With NULLs being indexable, I was wondering if there was some reason why IS NOT DISTINCT FROM could not use the
index?

FWIW this works:

postgres=# explain analyze select * from orders where orderid in (5, null);
       QUERY PLAN
 

----------------------------------------------------------------------------------------------------------------------Index
Scanusing orders_pkey on orders  (cost=0.29..12.60 rows=1
 
width=60) (actual time=0.019..0.021 rows=1 loops=1)  Index Cond: (orderid = ANY ('{5,NULL}'::integer[]))Planning time:
0.100msExecution time: 0.416 ms
 
(4 rows)

I think that it would almost be a Simple Matter of Programming to make
IS NOT DISTINCT FROM indexable. Under the hood, IS DISTINCT FROM isn't
very different to using the equality operator:

/** DistinctExpr - expression node for "x IS DISTINCT FROM y"** Except for the nodetag, this is represented identically
toan OpExpr* referencing the "=" operator for x and y.* We use "=", not the more obvious "<>", because more datatypes
have"="* than "<>".  This means the executor must invert the operator result.* Note that the operator function won't be
calledat all if either input* is NULL, since then the result can be determined directly.*/
 
typedef OpExpr DistinctExpr;

We're already inverting the equals operator. But that isn't
necessarily how a B-Tree index represents equality (that is, a
particular B-Tree operator class could have a non-'=' operator that it
thinks of as equality-ish - in general that could even be the default
B-Tree opclass and there may not be an equals operator). The fact that
most types think of the '=' equals operator as equality is just a
convention, and so technically IS DISTINCT FROM doesn't invert B-Tree
operation 3. See "31.14. Interfacing Extensions To Indexes" for
details. The equals operator '=' isn't really supposed to be magic, it
just is in some places.

Right now the executor is directly inverting the equality operator to
make this work (and has done so since long before NULLs were
indexable). This is a bit of a kludge. I guess it just works that way
because there is no convenient place to insert the special inversion
of the operator, and the special NULL handling that currently appears
within ExecEvalDistinct().

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: "Jonathan S. Katz"
Date:
Subject: IS NOT DISTINCT FROM + Indexing
Next
From: Andres Freund
Date:
Subject: Re: IS NOT DISTINCT FROM + Indexing