[HACKERS] partial indexes and bitmap scans - Mailing list pgsql-hackers

From Stephen Frost
Subject [HACKERS] partial indexes and bitmap scans
Date
Msg-id 20170309160121.GH9812@tamriel.snowman.net
Whole thread Raw
Responses Re: [HACKERS] partial indexes and bitmap scans  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Greetings,

Consider this:

create table t10 (c1 int, c2 int);
create index on t10 (c1) where c2 > 5;

\d t10     Table "sfrost.t10"Column |  Type   | Modifiers
--------+---------+-----------c1     | integer | c2     | integer |
Indexes:   "t10_c1_idx" btree (c1) WHERE c2 > 5

insert into t10 select * from generate_series(1,10000) a, generate_series(1,10) b;
(repeat a bunch of times, if desired)

vacuum analyze t10;
set work_mem = '64kB';
set enable_indexscan = false;
set enable_seqscan = false;

=*> explain analyze select * from t10 where c2 > 6;                                                           QUERY
PLAN                                                            

----------------------------------------------------------------------------------------------------------------------------------Bitmap
HeapScan on t10  (cost=6496.49..15037.50 rows=318653 width=8) (actual time=34.682..116.236 rows=320000 loops=1)
RecheckCond: (c2 > 5)  Rows Removed by Index Recheck: 327502  Filter: (c2 > 6)  Rows Removed by Filter: 80000  Heap
Blocks:exact=642 lossy=2898  ->  Bitmap Index Scan on t10_c1_idx  (cost=0.00..6416.83 rows=400081 width=0) (actual
time=34.601..34.601rows=400000 loops=1)Planning time: 0.087 msExecution time: 124.229 ms 
(9 rows)

Perhaps I'm missing something obvious, but isn't it a bit redundant to
have both a Recheck condition (which is the predicate of the index) and
a Filter condition (which is the user's predicate) when we've already
decided that the user's predicate must result in a subset of the
index's, as, otherwise, we wouldn't be able to use the index in the
first place?

In other words, it seems like we shouldn't need a Filter in the above
Bitmap Heap Scan, instead we should just make the Recheck be (c2 > 6).

I've not looked into the code side of this at all and there may be
reasons why this is hard to do, but it seems like a worthwhile
improvement to consider doing, though perhaps I'm missing some reason
why we need both the Recheck and the Filter in such cases for
correctness.

Thanks!

Stephen

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] use SQL standard error code for nextval
Next
From: Aleksander Alekseev
Date:
Subject: Re: [HACKERS] Declarative partitioning optimization for large amountof partitions