Thread: BUG #6283: About the behavior of indexscan in case there are some NULL values.
BUG #6283: About the behavior of indexscan in case there are some NULL values.
From
"Naoya Anzai"
Date:
The following bug has been logged online: Bug reference: 6283 Logged by: Naoya Anzai Email address: anzai-naoya@mxu.nes.nec.co.jp PostgreSQL version: 9.1.1 Operating system: RHEL5.5 Description: About the behavior of indexscan in case there are some NULL values. Details: Hello, In the newest PostgreSQL(9.1.1) or lower, Index-Scan's actual time may increase unexpectedly. I think that this is for scanning All NULL VALUES when performing an indexscan even if they does not need to be scanned. I think that the cause is here. [src/backend/access/nbtree/nbtutils.c(L963-L996) in PostgreSQL9.1.1] -------- if (isNull) { if (key->sk_flags & SK_BT_NULLS_FIRST) { /* * Since NULLs are sorted before non-NULLs, we know we have * reached the lower limit of the range of values for this * index attr. On a backward scan, we can stop if this qual * is one of the "must match" subset. On a forward scan, * however, we should keep going. */ if ((key->sk_flags & SK_BT_REQBKWD) && ScanDirectionIsBackward(dir)) *continuescan = false; } else { /* * Since NULLs are sorted after non-NULLs, we know we have * reached the upper limit of the range of values for this * index attr. On a forward scan, we can stop if this qual is * one of the "must match" subset. On a backward scan, * however, we should keep going. */ if ((key->sk_flags & SK_BT_REQFWD) && ScanDirectionIsForward(dir)) *continuescan = false; } /* * In any case, this indextuple doesn't match the qual. */ return false; } --------- For example, with NULLS_LAST, GREATER THAN scan key('value > scankey' etc.), and FORWARD SCAN conditions, even if scan have reached a NULL value, continuescan is still true all the time. If it rewrites as follows, I think that this problem is solved, but how is it? -------- --- nbtutils.c 2011-11-02 14:10:55.000000000 +0900 +++ nbtutils.c.new 2011-11-02 14:11:38.000000000 +0900 @@ -971,8 +971,7 @@ * is one of the "must match" subset. On a forward scan, * however, we should keep going. */ - if ((key->sk_flags & SK_BT_REQBKWD) && - ScanDirectionIsBackward(dir)) + if (ScanDirectionIsBackward(dir)) *continuescan = false; } else @@ -984,8 +983,7 @@ * one of the "must match" subset. On a backward scan, * however, we should keep going. */ - if ((key->sk_flags & SK_BT_REQFWD) && - ScanDirectionIsForward(dir)) + if (ScanDirectionIsForward(dir)) *continuescan = false; } --------- Regards, Naoya Anzai
>> >> Hello, >> >> In the newest PostgreSQL(9.1.1) or lower, Index-Scan's actual time may >> increase unexpectedly. >> >> I think that this is for scanning All NULL VALUES when performing an >> indexscan >> even if they does not need to be scanned. > >I think this was just fixed. Please check latest source code. I have checked latest source code. But, backward scan doesn't work correctly... ========================================== [naoya@nesitcspg03 ~]$ psql psql (9.2devel) Type "help" for help. naoya=# create table hoge(id integer,id2 integer); CREATE TABLE naoya=# insert into hoge select generate_series(1,10); INSERT 0 10 naoya=# update hoge set id2=1 where id=5; UPDATE 1 naoya=# update hoge set id2=10 where id=7; UPDATE 1 naoya=# create index hoge_idx on hoge(id2); CREATE INDEX naoya=# analyze hoge; ANALYZE naoya=# set enable_bitmapscan to off; SET naoya=# set enable_seqscan to off; SET naoya=# select * from hoge;id | id2 ----+----- 1 | 2 | 3 | 4 | 6 | 8 | 9 |10 | 5 | 1 7 | 10 (10 rows) naoya=# explain analyze select * from hoge where id2>0; QUERY PLAN ---------------------------------------------------------------------------------------------------------------Index Scanusing hoge_idx on hoge (cost=0.00..8.29 rows=2 width=8) (actual time=0.010..0.012 rows=2 loops=1) Index Cond: (id2> 0)Total runtime: 0.065 ms (3 rows) naoya=# explain analyze select * from hoge where id2>0 order by id2 desc; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------Index ScanBackward using hoge_idx on hoge (cost=0.00..8.29 rows=2 width=8) (actual time=0.005..0.005 rows=0 loops=1) Index Cond:(id2 > 0)Total runtime: 0.035 ms (3 rows) naoya=# select * from hoge where id2>0;id | id2 ----+----- 5 | 1 7 | 10 (2 rows) naoya=# select * from hoge where id2>0 order by id2 desc;id | id2 ----+----- (0 rows) ========================================== Regards. --- Naoya Anzai ---
Re: BUG #6283: About the behavior of indexscan in case there are some NULL values.
From
Tom Lane
Date:
安西 直也 <anzai-naoya@mxu.nes.nec.co.jp> writes: > I have checked latest source code. > But, backward scan doesn't work correctly... [ pokes at that... ] Hmm, the patches I applied a couple days ago assumed that we are stepping forward or back from a place where the WHERE clauses are satisfied. But in this example, the system just applies _bt_endpoint to descend to the right-hand end of the index, since there is no upper-bound qual with which to do anything different. So we start from a place where the clauses aren't satisfied. That also means that we haven't really fixed the original performance complaint: there could be lots of nulls to be stepped over before we reach the first matching row. I think that the right fix for this is probably to make _bt_preprocess_keys explicitly generate the "id is not null" qual that's implied by "id > 0", so that it will have what amounts to a range condition on the index contents (since for NULLS LAST, "id is not null" amounts to "id is less than null", as it were). Then, instead of applying _bt_endpoint, it will use the less-than key to descend the btree to the last non-null entry, and we'll be good for both correctness and performance. I don't see any big problem in doing this in HEAD, but it's getting past what seems like a sane back-patch. So probably we should revert the back-branch versions of the prior patch, and just say that the performance problem is only going to be addressed in HEAD. regards, tom lane