Re: POC, WIP: OR-clause support for indexes - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: POC, WIP: OR-clause support for indexes |
Date | |
Msg-id | 1457615077.31876.51.camel@2ndquadrant.com Whole thread Raw |
In response to | Re: POC, WIP: OR-clause support for indexes (Teodor Sigaev <teodor@sigaev.ru>) |
Responses |
Re: POC, WIP: OR-clause support for indexes
Re: POC, WIP: OR-clause support for indexes |
List | pgsql-hackers |
Hi Teodor, I've looked into v2 of the patch you sent a few days ago. Firstly, I definitely agree that being able to use OR conditions with an index is definitely a cool idea. I do however agree with David that the patch would definitely benefit from comments documenting various bits that are less obvious to mere mortals like me, with limited knowledge of the index internals. I also wonder whether the patch should add explanation of OR-clauses handling into the READMEs in src/backend/access/* The patch would probably benefit from transforming it into a patch series - one patch for the infrastructure shared by all the indexes, then one patch per index type. That should make it easier to review, and I seriously doubt we'd want to commit this in one huge chunk anyway. Now, some review comments from eyeballing the patch. Some of those are nitpicking, but well ... 1) fields in BrinOpaque are not following the naming convention (all the existing fields start with bo_) 2) there's plenty of places violating the usual code style (e.g. for single-command if branches) - not a big deal for WIP patch, but needs to get fixed eventually 3) I wonder whether we really need both SK_OR and SK_AND, considering they are mutually exclusive. Why not to assume SK_AND by default, and only use SK_OR? If we really need them, perhaps an assert making sure they are not set at the same time would be appropriate. 4) scanGetItem is a prime example of the "badly needs comments" issue, particularly because the previous version of the function actually had quite a lot of them while the new function has none. 5) scanGetItem() may end up using uninitialized 'cmp' - it only gets initialized when (!leftFinished && !rightFinished), but then gets used when either part of the condition evaluates to true. Probably should be if (!leftFinished || !rightFinished) cmp = ... 6) the code in nodeIndexscan.c should not include call to abort() { abort(); elog(ERROR, "unsupported indexqual type: %d", (int) nodeTag(clause)); } 7) I find it rather ugly that the paths are built by converting BitmapOr paths. Firstly, it means indexes without amgetbitmap can't benefit from this change. Maybe that's reasonable limitation, though? But more importantly, this design already has a bunch of unintended consequences. For example, the current code completely ignores enable_indexscan setting, because it merely copies the costs from the bitmap path. SET enable_indexscan = off; EXPLAIN SELECT * FROM t WHERE (c && ARRAY[1] OR c && ARRAY[2]); QUERY PLAN -------------------------------------------------------------------Index Scan using t_c_idx on t (cost=0.00..4.29 rows=0width=33) Index Cond: ((c && '{1}'::integer[]) OR (c && '{2}'::integer[])) (2 rows) That's pretty dubious, I guess. So this code probably needs to be made aware of enable_indexscan - right now it entirely ignores startup_cost in convert_bitmap_path_to_index_clause(). But of course if there are multiple IndexPaths, the enable_indexscan=off will be included multiple times. 9) This already breaks estimation for some reason. Consider this example, using a table with int[] column, with gist index built using intarray: EXPLAIN SELECT * FROM t WHERE (c && ARRAY[1,2,3,4,5,6,7]); QUERY PLAN --------------------------------------------------------------------Index Scan using t_c_idx on t (cost=0.28..52.48 rows=12width=33) Index Cond: (c && '{1,2,3,4,5,6,7}'::integer[]) (2 rows) EXPLAIN SELECT * FROM t WHERE (c && ARRAY[8,9,10,11,12,13,14]); QUERY PLAN --------------------------------------------------------------------Index Scan using t_c_idx on t (cost=0.28..44.45 rows=10width=33) Index Cond: (c && '{8,9,10,11,12,13,14}'::integer[]) (2 rows) EXPLAIN SELECT * FROM t WHERE (c && ARRAY[1,2,3,4,5,6,7]) OR (c && ARRAY[8,9,10,11,12,13,14]); QUERY PLAN --------------------------------------------------------------------Index Scan using t_c_idx on t (cost=0.00..4.37 rows=0width=33) Index Cond: ((c && '{1,2,3,4,5,6,7}'::integer[]) OR (c && '{8,9,10,11,12,13,14}'::integer[])) (2 rows) So the OR-clause is estimated to match 0 rows, less than each of the clauses independently. Needless to say that without the patch this works just fine. 10) Also, this already breaks some regression tests, apparently because it changes how 'width' is computed. So I think this way of building the index path from a BitmapOr path is pretty much a dead-end. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date:
Previous
From: Amit KapilaDate:
Subject: Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission denied”