Thread: Bug: When user-defined AM is used, the index path cannot be selected correctly
Bug: When user-defined AM is used, the index path cannot be selected correctly
From
Quan Zongliang
Date:
Hi, 1. When using extended PGroonga CREATE EXTENSION pgroonga; CREATE TABLE memos ( id boolean, content varchar ); CREATE INDEX idxA ON memos USING pgroonga (id); 2. Disable bitmapscan and seqscan: SET enable_seqscan=off; SET enable_indexscan=on; SET enable_bitmapscan=off; 3. Neither ID = 'f' nor id= 't' can use the index correctly. postgres=# explain select * from memos where id='f'; QUERY PLAN -------------------------------------------------------------------------- Seq Scan on memos (cost=10000000000.00..10000000001.06 rows=3 width=33) Filter: (NOT id) (2 rows) postgres=# explain select * from memos where id='t'; QUERY PLAN -------------------------------------------------------------------------- Seq Scan on memos (cost=10000000000.00..10000000001.06 rows=3 width=33) Filter: id (2 rows) postgres=# explain select * from memos where id>='t'; QUERY PLAN ------------------------------------------------------------------- Index Scan using idxa on memos (cost=0.00..4.01 rows=2 width=33) Index Cond: (id >= true) (2 rows) The reason is that these expressions are converted to BoolExpr and Var. match_clause_to_indexcol does not use them to check boolean-index. patch attached. -- Quan Zongliang Beijing Vastdata
Attachment
Re: Bug: When user-defined AM is used, the index path cannot be selected correctly
From
Tom Lane
Date:
Quan Zongliang <quanzongliang@yeah.net> writes: > 1. When using extended PGroonga > ... > 3. Neither ID = 'f' nor id= 't' can use the index correctly. This works fine for btree indexes. I think the actual problem is that IsBooleanOpfamily only accepts the btree and hash opclasses, and that's what needs to be improved. Your proposed patch fails to do that, which makes it just a crude hack that solves some aspects of the issue (and probably breaks other things). It might work to change IsBooleanOpfamily so that it checks to see whether BooleanEqualOperator is a member of the opclass. That's basically what we need to know before we dare generate substitute index clauses. It's kind of an expensive test though, and the existing coding assumes that IsBooleanOpfamily is cheap ... regards, tom lane
Re: Bug: When user-defined AM is used, the index path cannot be selected correctly
From
Quan Zongliang
Date:
On 2022/8/17 10:03, Tom Lane wrote: > Quan Zongliang <quanzongliang@yeah.net> writes: >> 1. When using extended PGroonga >> ... >> 3. Neither ID = 'f' nor id= 't' can use the index correctly. > > This works fine for btree indexes. I think the actual problem > is that IsBooleanOpfamily only accepts the btree and hash > opclasses, and that's what needs to be improved. Your proposed > patch fails to do that, which makes it just a crude hack that > solves some aspects of the issue (and probably breaks other > things). > > It might work to change IsBooleanOpfamily so that it checks to > see whether BooleanEqualOperator is a member of the opclass. > That's basically what we need to know before we dare generate > substitute index clauses. It's kind of an expensive test > though, and the existing coding assumes that IsBooleanOpfamily > is cheap ... > > regards, tom lane New patch attached. It seems that partitions do not use AM other than btree and hash. Rewrite only indxpath.c and check if it is a custom AM.
Attachment
Re: Bug: When user-defined AM is used, the index path cannot be selected correctly
From
Tom Lane
Date:
Quan Zongliang <quanzongliang@yeah.net> writes: > New patch attached. > It seems that partitions do not use AM other than btree and hash. > Rewrite only indxpath.c and check if it is a custom AM. This seems drastically underdocumented, and the test you're using for extension opclasses is wrong. What we need to know before applying match_boolean_index_clause is that a clause using BooleanEqualOperator will be valid for the index. That's got next door to nothing to do with whether the opclass is default for the index AM. A non-default opclass might support that operator, and conversely a default one might not (although I concede it's not that easy to imagine what other set of operators-on-boolean an extension opclass might be interested in). I think we need something more like the attached. regards, tom lane diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index 045ff2e487..63a8eef45c 100644 --- a/src/backend/optimizer/path/indxpath.c +++ b/src/backend/optimizer/path/indxpath.c @@ -153,6 +153,7 @@ static IndexClause *match_clause_to_indexcol(PlannerInfo *root, RestrictInfo *rinfo, int indexcol, IndexOptInfo *index); +static bool IsBooleanOpfamily(Oid opfamily); static IndexClause *match_boolean_index_clause(PlannerInfo *root, RestrictInfo *rinfo, int indexcol, IndexOptInfo *index); @@ -2342,6 +2343,23 @@ match_clause_to_indexcol(PlannerInfo *root, return NULL; } +/* + * IsBooleanOpfamily + * Detect whether an opfamily supports boolean equality as an operator. + * + * If the opfamily OID is in the range of built-in objects, we can rely + * on hard-wired knowledge of which built-in opfamilies support this. + * For extension opfamilies, there's no choice but to do a catcache lookup. + */ +static bool +IsBooleanOpfamily(Oid opfamily) +{ + if (opfamily < FirstNormalObjectId) + return IsBuiltinBooleanOpfamily(opfamily); + else + return op_in_opfamily(BooleanEqualOperator, opfamily); +} + /* * match_boolean_index_clause * Recognize restriction clauses that can be matched to a boolean index. diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index 1fa7fc99b5..b9c1ce0a64 100644 --- a/src/backend/optimizer/path/pathkeys.c +++ b/src/backend/optimizer/path/pathkeys.c @@ -1191,8 +1191,13 @@ partkey_is_bool_constant_for_query(RelOptInfo *partrel, int partkeycol) PartitionScheme partscheme = partrel->part_scheme; ListCell *lc; - /* If the partkey isn't boolean, we can't possibly get a match */ - if (!IsBooleanOpfamily(partscheme->partopfamily[partkeycol])) + /* + * If the partkey isn't boolean, we can't possibly get a match. + * + * Partitioning currently can only use built-in AMs, so checking for + * built-in boolean opclasses is good enough. + */ + if (!IsBuiltinBooleanOpfamily(partscheme->partopfamily[partkeycol])) return false; /* Check each restriction clause for the partitioned rel */ diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c index bf9fe5b7aa..b5403149d7 100644 --- a/src/backend/partitioning/partprune.c +++ b/src/backend/partitioning/partprune.c @@ -3596,7 +3596,11 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey, *outconst = NULL; - if (!IsBooleanOpfamily(partopfamily)) + /* + * Partitioning currently can only use built-in AMs, so checking for + * built-in boolean opclasses is good enough. + */ + if (!IsBuiltinBooleanOpfamily(partopfamily)) return PARTCLAUSE_UNSUPPORTED; if (IsA(clause, BooleanTest)) diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h index 8dc9ce01bb..b56d714322 100644 --- a/src/include/catalog/pg_opfamily.h +++ b/src/include/catalog/pg_opfamily.h @@ -55,7 +55,8 @@ DECLARE_UNIQUE_INDEX_PKEY(pg_opfamily_oid_index, 2755, OpfamilyOidIndexId, on pg #ifdef EXPOSE_TO_CLIENT_CODE -#define IsBooleanOpfamily(opfamily) \ +/* This does not account for non-core opclasses that might accept boolean */ +#define IsBuiltinBooleanOpfamily(opfamily) \ ((opfamily) == BOOL_BTREE_FAM_OID || (opfamily) == BOOL_HASH_FAM_OID) #endif /* EXPOSE_TO_CLIENT_CODE */
Re: Bug: When user-defined AM is used, the index path cannot be selected correctly
From
Tom Lane
Date:
I wrote: > I think we need something more like the attached. Meh -- serves me right for not doing check-world before sending. The patch causes some plans to change in the btree_gin and btree_gist modules; which is good, because that shows that the patch is actually doing what it's supposed to. The fact that your patch didn't make the cfbot unhappy implies that it wasn't triggering for those modules. I think the reason is that you did + ((amid) >= FirstNormalObjectId && \ + OidIsValid(GetDefaultOpClass(BOOLOID, (amid)))) \ so that the FirstNormalObjectId cutoff was applied to the AM's OID, not to the opfamily OID, causing it to do the wrong thing for extension opclasses over built-in AMs. The good news is this means we don't need to worry about making a test case ... regards, tom lane diff --git a/contrib/btree_gin/expected/bool.out b/contrib/btree_gin/expected/bool.out index efb3e1e327..207a3f2328 100644 --- a/contrib/btree_gin/expected/bool.out +++ b/contrib/btree_gin/expected/bool.out @@ -87,13 +87,15 @@ EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i<=true ORDER BY i; (6 rows) EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i=true ORDER BY i; - QUERY PLAN ------------------------------ + QUERY PLAN +------------------------------------------- Sort Sort Key: i - -> Seq Scan on test_bool + -> Bitmap Heap Scan on test_bool Filter: i -(4 rows) + -> Bitmap Index Scan on idx_bool + Index Cond: (i = true) +(6 rows) EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i>=true ORDER BY i; QUERY PLAN diff --git a/contrib/btree_gist/expected/bool.out b/contrib/btree_gist/expected/bool.out index c67a9685d5..29390f079c 100644 --- a/contrib/btree_gist/expected/bool.out +++ b/contrib/btree_gist/expected/bool.out @@ -71,7 +71,7 @@ SELECT * FROM booltmp WHERE a; QUERY PLAN ------------------------------------------ Index Only Scan using boolidx on booltmp - Filter: a + Index Cond: (a = true) (2 rows) SELECT * FROM booltmp WHERE a; @@ -85,7 +85,7 @@ SELECT * FROM booltmp WHERE NOT a; QUERY PLAN ------------------------------------------ Index Only Scan using boolidx on booltmp - Filter: (NOT a) + Index Cond: (a = false) (2 rows) SELECT * FROM booltmp WHERE NOT a; diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index 045ff2e487..63a8eef45c 100644 --- a/src/backend/optimizer/path/indxpath.c +++ b/src/backend/optimizer/path/indxpath.c @@ -153,6 +153,7 @@ static IndexClause *match_clause_to_indexcol(PlannerInfo *root, RestrictInfo *rinfo, int indexcol, IndexOptInfo *index); +static bool IsBooleanOpfamily(Oid opfamily); static IndexClause *match_boolean_index_clause(PlannerInfo *root, RestrictInfo *rinfo, int indexcol, IndexOptInfo *index); @@ -2342,6 +2343,23 @@ match_clause_to_indexcol(PlannerInfo *root, return NULL; } +/* + * IsBooleanOpfamily + * Detect whether an opfamily supports boolean equality as an operator. + * + * If the opfamily OID is in the range of built-in objects, we can rely + * on hard-wired knowledge of which built-in opfamilies support this. + * For extension opfamilies, there's no choice but to do a catcache lookup. + */ +static bool +IsBooleanOpfamily(Oid opfamily) +{ + if (opfamily < FirstNormalObjectId) + return IsBuiltinBooleanOpfamily(opfamily); + else + return op_in_opfamily(BooleanEqualOperator, opfamily); +} + /* * match_boolean_index_clause * Recognize restriction clauses that can be matched to a boolean index. diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c index 1fa7fc99b5..b9c1ce0a64 100644 --- a/src/backend/optimizer/path/pathkeys.c +++ b/src/backend/optimizer/path/pathkeys.c @@ -1191,8 +1191,13 @@ partkey_is_bool_constant_for_query(RelOptInfo *partrel, int partkeycol) PartitionScheme partscheme = partrel->part_scheme; ListCell *lc; - /* If the partkey isn't boolean, we can't possibly get a match */ - if (!IsBooleanOpfamily(partscheme->partopfamily[partkeycol])) + /* + * If the partkey isn't boolean, we can't possibly get a match. + * + * Partitioning currently can only use built-in AMs, so checking for + * built-in boolean opclasses is good enough. + */ + if (!IsBuiltinBooleanOpfamily(partscheme->partopfamily[partkeycol])) return false; /* Check each restriction clause for the partitioned rel */ diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c index bf9fe5b7aa..b5403149d7 100644 --- a/src/backend/partitioning/partprune.c +++ b/src/backend/partitioning/partprune.c @@ -3596,7 +3596,11 @@ match_boolean_partition_clause(Oid partopfamily, Expr *clause, Expr *partkey, *outconst = NULL; - if (!IsBooleanOpfamily(partopfamily)) + /* + * Partitioning currently can only use built-in AMs, so checking for + * built-in boolean opclasses is good enough. + */ + if (!IsBuiltinBooleanOpfamily(partopfamily)) return PARTCLAUSE_UNSUPPORTED; if (IsA(clause, BooleanTest)) diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h index 8dc9ce01bb..b56d714322 100644 --- a/src/include/catalog/pg_opfamily.h +++ b/src/include/catalog/pg_opfamily.h @@ -55,7 +55,8 @@ DECLARE_UNIQUE_INDEX_PKEY(pg_opfamily_oid_index, 2755, OpfamilyOidIndexId, on pg #ifdef EXPOSE_TO_CLIENT_CODE -#define IsBooleanOpfamily(opfamily) \ +/* This does not account for non-core opclasses that might accept boolean */ +#define IsBuiltinBooleanOpfamily(opfamily) \ ((opfamily) == BOOL_BTREE_FAM_OID || (opfamily) == BOOL_HASH_FAM_OID) #endif /* EXPOSE_TO_CLIENT_CODE */