Odd behaviour with indexes for NULLs - Mailing list pgsql-admin

From Steve Hill
Subject Odd behaviour with indexes for NULLs
Date
Msg-id 5e4fa822-171a-5904-46e0-28a45ff66592@opendium.com
Whole thread Raw
Responses Re: Odd behaviour with indexes for NULLs  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
I have a couple of examples of unexpected behaviour when dealing with 
indexes containing NULL columns which I hope someone could explain.  I'm 
using PostgreSQL 10.19 under CentOS Stream 8:


First: a B-tree index created as:

CREATE INDEX sessions_find_linkable_test2 ON sessions (realm, eui64, 
update_time DESC)

This query behaves as I would expect:

EXPLAIN ANALYZE SELECT * FROM sessions WHERE realm = 'foo' AND eui64 = 
'9e:cc:b9:ff:fe:5d:28:0a' ORDER BY update_time DESC LIMIT 1;
 
QUERY PLAN 


------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.56..5.29 rows=1 width=159) (actual time=0.039..0.040 
rows=0 loops=1)
    ->  Index Scan using sessions_find_linkable_test2 on sessions 
(cost=0.56..19.48 rows=4 width=159) (actual time=0.038..0.039 rows=0 
loops=1)
          Index Cond: ((realm = 'foo'::text) AND (eui64 = 
'9e:cc:b9:ff:fe:5d:28:0a'::macaddr8))
  Planning time: 0.215 ms
  Execution time: 0.066 ms

i.e. the index is used to find the matching records, and also to order 
them by update_time.  However, this other query doesn't work how I 
expected it to:

EXPLAIN ANALYZE SELECT * FROM webfront_accounting.sessions WHERE realm 
IS NULL AND eui64 = '9e:cc:b9:ff:fe:5d:28:0a' ORDER BY update_time DESC 
LIMIT 1;
 
    QUERY PLAN 


---------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=7957.62..7957.62 rows=1 width=159) (actual 
time=13.940..13.941 rows=1 loops=1)
    ->  Sort  (cost=7957.62..7963.09 rows=2188 width=159) (actual 
time=13.939..13.939 rows=1 loops=1)
          Sort Key: update_time DESC
          Sort Method: top-N heapsort  Memory: 25kB
          ->  Index Scan using sessions_find_linkable_test2 on sessions 
  (cost=0.56..7946.68 rows=2188 width=159) (actual time=0.034..13.258 
rows=3821 loops=1)
                Index Cond: ((realm IS NULL) AND (eui64 = 
'9e:cc:b9:ff:fe:5d:28:0a'::macaddr8))
  Planning time: 0.372 ms
  Execution time: 13.964 ms

In this case, rather than using the index to order the records, it uses 
a separate sorting step, which is considerably slower.  The only 
difference between the two queries is that the one that doesn't use the 
index for sorting is looking for a NULL realm column, instead of a realm 
column which contains some text.

Quite a lot of Googling hasn't turned up any explanation - everything I 
have found essentially just says that B-tree indexes can be used for IS 
NULL checks (since PostgreSQL 8.something).  I've not found anything 
saying that making an IS NULL check against an index would prohibit 
using that index for ordering.


My second piece of oddness is for an index defined as:
CREATE INDEX sessions_find_linkable_test2 ON sessions ((realm IS NULL), 
eui64, update_time DESC);

This query works as expected:
EXPLAIN ANALYZE SELECT * FROM sessions WHERE realm IS NULL AND eui64 = 
'9e:cc:b9:ff:fe:5d:28:0a' ORDER BY update_time DESC LIMIT 1;
 
QUERY PLAN 


-----------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.56..2.63 rows=1 width=159) (actual time=0.034..0.034 
rows=1 loops=1)
    ->  Index Scan using sessions_find_linkable_test2 on sessions 
(cost=0.56..4527.04 rows=2188 width=159) (actual time=0.033..0.033 
rows=1 loops=1)
          Index Cond: (((realm IS NULL) = true) AND (eui64 = 
'9e:cc:b9:ff:fe:5d:28:0a'::macaddr8))
          Filter: (realm IS NULL)
  Planning time: 0.326 ms
  Execution time: 0.050 ms

i.e. the index is used for the whole WHERE clause, although I'm not sure 
why a separate Filter step is needed.  However, this other query does 
not work as I would expect:

EXPLAIN ANALYZE SELECT * FROM sessions WHERE realm IS NOT NULL AND eui64 
= '9e:cc:b9:ff:fe:5d:28:0a' ORDER BY update_time DESC LIMIT 1;
 
QUERY PLAN 


------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=8277.79..8277.79 rows=1 width=159) (actual 
time=15.552..15.554 rows=0 loops=1)
    ->  Sort  (cost=8277.79..8277.94 rows=59 width=159) (actual 
time=15.551..15.552 rows=0 loops=1)
          Sort Key: update_time DESC
          Sort Method: quicksort  Memory: 25kB
          ->  Bitmap Heap Scan on sessions  (cost=141.42..8277.49 
rows=59 width=159) (actual time=15.547..15.548 rows=0 loops=1)
                Recheck Cond: (eui64 = '9e:cc:b9:ff:fe:5d:28:0a'::macaddr8)
                Filter: (realm IS NOT NULL)
                Rows Removed by Filter: 3821
                Heap Blocks: exact=3048
                ->  Bitmap Index Scan on sessions_eui64_start_time 
(cost=0.00..141.41 rows=2247 width=0) (actual time=0.764..0.764 
rows=3832 loops=1)
                      Index Cond: (eui64 = 
'9e:cc:b9:ff:fe:5d:28:0a'::macaddr8)
  Planning time: 0.163 ms
  Execution time: 15.578 ms

Since the index contains the boolean result of an IS NULL check, I would 
have expected it to use that index for IS NOT NULL (i.e. where the IS 
NULL check returned false).  However, it isn't using an index at all for 
the "realm IS NOT NULL" part of the WHERE clause.

Creating two separate partial indexes, like this works:
CREATE INDEX sessions_find_linkable_test2a ON sessions (eui64, 
update_time DESC) WHERE realm IS NULL;
CREATE INDEX sessions_find_linkable_test2b ON sessions (eui64, 
update_time DESC) WHERE realm IS NOT NULL;

However, this seems like something that should have been achievable with 
a single index?

Many thanks.

-- 
- Steve Hill
    Technical Director | Cyfarwyddwr Technegol
    Opendium    Online Safety & Web Filtering     http://www.opendium.com
                Diogelwch Ar-Lein a Hidlo Gwefan

    Enquiries | Ymholiadau:   sales@opendium.com     +44-1792-824568
    Support   | Cefnogi:      support@opendium.com   +44-1792-825748

------------------------------------------------------------------------
Opendium Limited is a company registered in England and Wales.
Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.

Company No. | Rhif Cwmni:   5465437
Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
Attachment

pgsql-admin by date:

Previous
From: KK CHN
Date:
Subject: WAL replication from an Old Production server
Next
From: Tom Lane
Date:
Subject: Re: Odd behaviour with indexes for NULLs