Thread: Index INCLUDE vs. Bitmap Index Scan
Hi! I think Bitmap Index Scan should take advantage of B-tree INCLUDE columns, which it doesn’t at the moment (tested on masteras of yesterday). Consider this (find the setup at the bottom of this mail). CREATE INDEX idx ON tbl (a, b) INCLUDE (c); EXPLAIN (analyze, buffers) SELECT * FROM tbl WHERE a = 1 AND c = 1; The following plan could be produced: QUERY PLAN ------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=7616) (actual time=0.027..0.029 rows=1 loops=1) Recheck Cond: (a = 1) Filter: (c = 1) Rows Removed by Filter: 7 Heap Blocks: exact=2 Buffers: shared hit=2 read=1 -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.018..0.018 rows=8 loops=1) Index Cond: (a = 1) Buffers: shared read=1 Planning Time: 0.615 ms Execution Time: 0.060 ms The Bitmap Index Scan does not filter on column C, which is INCLUDEd in the index leaf nodes. Instead, the Bitmap Heap Scan fetches unnecessary blocks and then applies the filter. I would expect a similar execution as with this index: DROP INDEX idx; CREATE INDEX idx ON tbl (a, b, c); QUERY PLAN ------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=7616) (actual time=0.021..0.021 rows=1 loops=1) Recheck Cond: ((a = 1) AND (c = 1)) Heap Blocks: exact=1 Buffers: shared hit=1 read=1 -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=1) Index Cond: ((a = 1) AND (c = 1)) Buffers: shared read=1 Planning Time: 0.123 ms Execution Time: 0.037 ms (As a side node: I also dislike it how Bitmap Index Scan mixes search conditions and filters in “Index Cond”) Due to the not-filtered column B in the index, the use of column C is here pretty much the same as it is for the first index,which has C in INCLUDE. Am I missing anything or is it just an oversight because INCLUDE was primarily done for Index Only Scan? As a background, here is the use case I see for this scenario: Filters that can not be used for searching the tree can be put in INCLUDE instead of the key-part of the index even if thereis no intend to run the query as Index Only Scan (e.g. SELECT *). Columns in INCLUDE can be used for filtering (worksfine for Index Only Scan, btw). The most prominent example are post-fix or in-fix LIKE ‘%term%’ filters. The benefit of putting such columns in INCLUDE isthat it is clearly documented that the index column is neither used for searching in the tree nor for ordering. It is eitherused for filtering, or for fetching. Knowing this makes it easier to extend existing index. Imagine you have this query to optimize: SELECT * FROM … WHERE a = ? AND b = ? ORDER BY ts DESC LIMIT 10 And find this existing index on that table: CREATE INDEX … ON … (a, b) INCLUDE (c); In this case it is a rather safe option to replace the index with the following: CREATE INDEX … ON … (a, b, ts) INCLUDE (c); On the other hand, if the original index had all column in the key part, changing it to (A, B, TS, C) is a rather dangerousoption. -markus — Setup for the demo -- demo table: 4 rows per block -- the row if interest is the first one, the others spill to a second block so the problem can also seen in “Buffers" CREATE TABLE tbl (a INTEGER, b INTEGER, c INTEGER, junk CHAR(1900)); INSERT INTO tbl VALUES (1, 1, 1, 'junk'), (1, 1, 9, 'junk'), (1, 1, 9, 'junk'), (1, 1, 9, 'junk'), (1, 1, 9, 'junk'), (1, 1, 9, 'junk'), (1, 1, 9, 'junk'), (1, 1, 9, 'junk'); -- prevent unwanted plans for demo of Bitmap Index+Heap Scan SET ENABLE_INDEXSCAN = OFF; SET ENABLE_SEQSCAN = OFF;
Markus Winand <markus.winand@winand.at> writes: > I think Bitmap Index Scan should take advantage of B-tree INCLUDE columns, which it doesn’t at the moment (tested on masteras of yesterday). Regular index scans don't do what you're imagining either (i.e., check filter conditions in advance of visiting the heap). There's a roadblock to implementing such behavior, which is that we might end up applying filter expressions to dead rows. That could make users unhappy. For example, given a filter condition like "1.0/c > 0.1", people would complain if it still got zero-divide failures even after they'd deleted all rows with c=0 from their table. Generally speaking, we expect indexable operators not to throw errors on any input values, which is why this problem isn't serious for the index conditions proper. But we can't make that assumption for arbitrary filter conditions. > (As a side node: I also dislike it how Bitmap Index Scan mixes search conditions and filters in “Index Cond”) What do you think is being mixed exactly? regards, tom lane
Hi, On 2019-02-26 18:22:41 -0500, Tom Lane wrote: > Markus Winand <markus.winand@winand.at> writes: > > I think Bitmap Index Scan should take advantage of B-tree INCLUDE columns, which it doesn’t at the moment (tested onmaster as of yesterday). > > Regular index scans don't do what you're imagining either (i.e., check > filter conditions in advance of visiting the heap). There's a roadblock > to implementing such behavior, which is that we might end up applying > filter expressions to dead rows. That could make users unhappy. > For example, given a filter condition like "1.0/c > 0.1", people > would complain if it still got zero-divide failures even after they'd > deleted all rows with c=0 from their table. > > Generally speaking, we expect indexable operators not to throw > errors on any input values, which is why this problem isn't serious > for the index conditions proper. But we can't make that assumption > for arbitrary filter conditions. We could probably work around that, by only doing such pre-checks on index tuples only for tuples known to be visible according to the VM. Would quite possibly be too hard to understand for users though. Also not sure if it'd actually perform that well due to the added random IO, as we'd have to do such checks before entering a tuple into the bitmap (as obviously we don't have a tuple afterwards). Greetings, Andres Freund
On Tue, Feb 26, 2019 at 09:07:01PM +0100, Markus Winand wrote: > CREATE INDEX idx ON tbl (a, b, c); > Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=7616) (actual time=0.021..0.021 rows=1 loops=1) > Recheck Cond: ((a = 1) AND (c = 1)) > -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=1) > Index Cond: ((a = 1) AND (c = 1)) > > (As a side node: I also dislike it how Bitmap Index Scan mixes search conditions and filters in “Index Cond”) I don't think it's mixing them; it's using index scan on leading *and* nonleading column. That's possible even if frequently not efficient. Justin
> On 27.02.2019, at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Markus Winand <markus.winand@winand.at> writes: >> I think Bitmap Index Scan should take advantage of B-tree INCLUDE columns, which it doesn’t at the moment (tested on masteras of yesterday). > > Regular index scans don't do what you're imagining either (i.e., check > filter conditions in advance of visiting the heap). There's a roadblock > to implementing such behavior, which is that we might end up applying > filter expressions to dead rows. That could make users unhappy. > For example, given a filter condition like "1.0/c > 0.1", people > would complain if it still got zero-divide failures even after they'd > deleted all rows with c=0 from their table. Ok, but I don’t see how this case different for key columns vs. INCLUDE columns. When I test this with the (a, b, c) index (no INCLUDE), different plans are produced for "c=1" (my original example) vs."1.0/c > 0.1”. The second one postpones this condition to the Bitmap Heap Scan. QUERY PLAN ------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=4) (actual time=0.023..0.028 rows=8 loops=1) Recheck Cond: (a = 1) Filter: ((1.0 / (c)::numeric) > 0.1) Heap Blocks: exact=2 Buffers: shared hit=3 -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.007..0.007 rows=8 loops=1) Index Cond: (a = 1) Buffers: shared hit=1 Planning Time: 0.053 ms Execution Time: 0.044 ms I’ve never noticed that behaviour before, but if it is there to prevent the exception-on-dead-tuple problem, the same couldbe applied to INCLUDE columns? I realise that this will not cover all use cases I can imagine but it would be consistent for key and non-key columns. -markus
> On 27.02.2019, at 02:00, Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Tue, Feb 26, 2019 at 09:07:01PM +0100, Markus Winand wrote: >> CREATE INDEX idx ON tbl (a, b, c); >> Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=7616) (actual time=0.021..0.021 rows=1 loops=1) >> Recheck Cond: ((a = 1) AND (c = 1)) >> -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=1) >> Index Cond: ((a = 1) AND (c = 1)) >> >> (As a side node: I also dislike it how Bitmap Index Scan mixes search conditions and filters in “Index Cond”) > > I don't think it's mixing them; it's using index scan on leading *and* > nonleading column. That's possible even if frequently not efficient. The distinction leading / non-leading is very important for performance. Other database products use different names in theexecution plan so that it is immediately visible (without knowing the index definition). - Oracle: access vs. filter predicates - SQL Server: “seek predicates” vs. “predicates” - Db2: START/STOP vs. SARG - MySQL/MariaDB show how many leading columns of the index are used — the rest is just “filtering" PostgreSQL: no difference visible in the execution plan. CREATE INDEX idx ON tbl (a,b,c); EXPLAIN (analyze, buffers) SELECT * FROM tbl WHERE a = 1 AND c = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=7616) (actual time=0.017..0.018 rows=1 loops=1) Recheck Cond: ((a = 1) AND (c = 1)) Heap Blocks: exact=1 Buffers: shared hit=1 read=1 -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.014..0.014 rows=1 loops=1) Index Cond: ((a = 1) AND (c = 1)) Buffers: shared read=1 Planning Time: 0.149 ms Execution Time: 0.035 ms DROP INDEX idx; CREATE INDEX idx ON tbl (a, c, b); -- NOTE: column “c” is second QUERY PLAN ------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on tbl (cost=4.14..8.16 rows=1 width=7616) (actual time=0.013..0.013 rows=1 loops=1) Recheck Cond: ((a = 1) AND (c = 1)) Heap Blocks: exact=1 Buffers: shared hit=1 read=1 -> Bitmap Index Scan on idx (cost=0.00..4.14 rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1) Index Cond: ((a = 1) AND (c = 1)) Buffers: shared read=1 Planning Time: 0.262 ms Execution Time: 0.036 ms (9 rows) -markus
Markus Winand <markus.winand@winand.at> writes: >> On 27.02.2019, at 02:00, Justin Pryzby <pryzby@telsasoft.com> wrote: >> On Tue, Feb 26, 2019 at 09:07:01PM +0100, Markus Winand wrote: >>> (As a side node: I also dislike it how Bitmap Index Scan mixes search conditions and filters in “Index Cond”) >> I don't think it's mixing them; it's using index scan on leading *and* >> nonleading column. That's possible even if frequently not efficient. > The distinction leading / non-leading is very important for performance. Other database products use different names inthe execution plan so that it is immediately visible (without knowing the index definition). Other database products don't have the wide range of index types that we do. The concepts you propose using are pretty much meaningless for non-btree indexes. EXPLAIN doesn't really know which of the index conditions will usefully cut down the index search space for the particular type, so it just lists everything that has the right form to be passed to the index AM. Note that passing a condition to the AM, rather than executing it as a filter, is generally a win when possible even if it fails to cut the portion of the index searched at all. That's because it can save visits to the heap (tying back to the original point in this thread, that we test index conditions, then heap liveness check, then filter conditions). So the planner is aggressive about pushing things into that category when it can. It might help to point out that to be an index condition, a WHERE clause has to meet tighter conditions than just whether it mentions an index column. It generally has to be of the form "index_column indexable_operator pseudo_constant" (though some index types support some other cases like "index_column IS NULL" as index conditions too). Clauses mentioning INCLUDE columns fail this test a priori, because there are no indexable operators associated with an INCLUDE column. regards, tom lane
On 2/27/19 6:36 AM, Markus Winand wrote: > > >> On 27.02.2019, at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Markus Winand <markus.winand@winand.at> writes: >>> I think Bitmap Index Scan should take advantage of B-tree INCLUDE columns, which it doesn’t at the moment (tested onmaster as of yesterday). >> >> Regular index scans don't do what you're imagining either (i.e., check >> filter conditions in advance of visiting the heap). There's a roadblock >> to implementing such behavior, which is that we might end up applying >> filter expressions to dead rows. That could make users unhappy. >> For example, given a filter condition like "1.0/c > 0.1", people >> would complain if it still got zero-divide failures even after they'd >> deleted all rows with c=0 from their table. > > Ok, but I don’t see how this case different for key columns vs. INCLUDE columns. > Yeah, I'm a bit puzzled by this difference too - why would it be safe for keys and not the other included columns? > When I test this with the (a, b, c) index (no INCLUDE), different > plans are produced for "c=1" (my original example) vs. "1.0/c > 0.1”. I do recall a discussion about executing expressions on index tuples during IOS (before/without inspecting the heap tuple). I'm too lazy to search for the thread now, but I recall it was somehow related to leak-proof-ness. And AFAICS none of the "div" procs are marked as leak-proof, so perhaps that's one of the reasons? Of course, this does not explain why equality conditions and such (which are generally leak-proof) couldn't be moved to the bitmap index scan. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 2/27/19 6:36 AM, Markus Winand wrote: > On 27.02.2019, at 00:22, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> For example, given a filter condition like "1.0/c > 0.1", people >>> would complain if it still got zero-divide failures even after they'd >>> deleted all rows with c=0 from their table. >> Ok, but I don’t see how this case different for key columns vs. INCLUDE columns. > Yeah, I'm a bit puzzled by this difference too - why would it be safe > for keys and not the other included columns? It's not about the column, it's about the operator. We assume that operators appearing in opclasses are safe to execute even for index entries that correspond to dead rows. INCLUDE columns don't have any associated opclass, hence no assumed-usable operators. > I do recall a discussion about executing expressions on index tuples > during IOS (before/without inspecting the heap tuple). I'm too lazy to > search for the thread now, but I recall it was somehow related to > leak-proof-ness. And AFAICS none of the "div" procs are marked as > leak-proof, so perhaps that's one of the reasons? Leak-proof-ness is kind of related, perhaps, but it's not quite the property we're after here --- at least not to my mind. It might be an interesting discussion exactly what the relationship is. Meanwhile, we don't have any separate concept of functions that are safe to apply to any index entry; opclass membership is it. You could probably argue that any clause containing only indexed variables and operators that are members of *some* opclass could be used as a filter in advance of heap liveness checking. But we lack any infrastructure for that, in either the planner or executor. regards, tom lane
I wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> I do recall a discussion about executing expressions on index tuples >> during IOS (before/without inspecting the heap tuple). I'm too lazy to >> search for the thread now, but I recall it was somehow related to >> leak-proof-ness. And AFAICS none of the "div" procs are marked as >> leak-proof, so perhaps that's one of the reasons? > Leak-proof-ness is kind of related, perhaps, but it's not quite the property > we're after here --- at least not to my mind. It might be an interesting > discussion exactly what the relationship is. Meanwhile, we don't have any > separate concept of functions that are safe to apply to any index entry; > opclass membership is it. The other thread about RLS helped me to crystallize the vague feelings I had about this. I now think that this is what we're actually assuming: an indexable operator must be leakproof *with respect to its index-key input*. If it is not, it might throw errors or otherwise reveal the existence of index entries for dead rows, which would be a usability fail whether or not you are excited about security as such. On the other hand, it's okay to throw errors that only reveal information about the non-index input. For instance, it's not a problem for pg_trgm to treat regex-match operators as indexable, even though those will throw error for a malformed pattern input. So this is indeed related to leakproof-ness, but our current definition of "leakproof" is too simple to capture the property exactly. Getting back to the question of this thread, I think we'd have to restrict any filtering done in advance of heap liveness checks to fully leakproof functions, since we don't want the filter expression to possibly throw an error, regardless of which input(s) came from the index. regards, tom lane