Thread: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18344 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 16.2 Operating system: Ubuntu 22.04 Description: The following query: CREATE TABLE t (b bool, i int) PARTITION BY RANGE (b, i); CREATE TABLE tp PARTITION OF t FOR VALUES FROM (false, 0) TO (false, 1); SELECT * FROM t WHERE b IS NOT true; fails with ERROR: invalid strategy number 0. Reproduced on REL_12_STABLE .. master. The first bad commit for this anomaly is e0693faf7.
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > The following query: > CREATE TABLE t (b bool, i int) PARTITION BY RANGE (b, i); > CREATE TABLE tp PARTITION OF t FOR VALUES FROM (false, 0) TO (false, 1); > SELECT * FROM t WHERE b IS NOT true; > fails with ERROR: invalid strategy number 0. > Reproduced on REL_12_STABLE .. master. > The first bad commit for this anomaly is e0693faf7. What seems to be happening is that gen_prune_step_op is getting op_is_ne = true and doing this: /* * For clauses that contain an <> operator, set opstrategy to * InvalidStrategy to signal get_matching_list_bounds to do the right * thing. */ opstep->opstrategy = op_is_ne ? InvalidStrategy : opstrategy; but then we're failing in get_matching_range_bounds, ie somebody taught get_matching_list_bounds to do the right thing but not any of the other code paths. I'm also wondering how we got there in the first place. It looks like match_boolean_partition_clause thinks it can translate "b IS NOT true" to "b <> true", which is flat wrong --- it gives the wrong result for null. regards, tom lane
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
David Rowley
Date:
On Fri, 16 Feb 2024 at 05:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > What seems to be happening is that gen_prune_step_op is getting > op_is_ne = true and doing this: > > /* > * For clauses that contain an <> operator, set opstrategy to > * InvalidStrategy to signal get_matching_list_bounds to do the right > * thing. > */ > opstep->opstrategy = op_is_ne ? InvalidStrategy : opstrategy; > > but then we're failing in get_matching_range_bounds, ie somebody > taught get_matching_list_bounds to do the right thing but not > any of the other code paths. hmm, yeah. I'm just trying to wrap my head around if this can even work for RANGE partitioned tables. > I'm also wondering how we got there in the first place. It looks like > match_boolean_partition_clause thinks it can translate "b IS NOT true" > to "b <> true", which is flat wrong --- it gives the wrong result for > null. Thought I'd fixed that in e0693faf7, but looks like I only tested it with DEFAULT partitions, not NULL partitions. A fairly simple fix for that part: /* Always include the default partition if any. */ result->scan_default = partition_bound_has_default(boundinfo); + /* Likewise for the NULL partition, if any */ + result->scan_null = partition_bound_accepts_nulls(boundinfo); I'll look at the RANGE <> bool stuff. David
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
David Rowley
Date:
On Fri, 16 Feb 2024 at 05:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > What seems to be happening is that gen_prune_step_op is getting > op_is_ne = true and doing this: > > /* > * For clauses that contain an <> operator, set opstrategy to > * InvalidStrategy to signal get_matching_list_bounds to do the right > * thing. > */ > opstep->opstrategy = op_is_ne ? InvalidStrategy : opstrategy; > > but then we're failing in get_matching_range_bounds, ie somebody > taught get_matching_list_bounds to do the right thing but not > any of the other code paths. Yeah, prior to e0693faf7, we always did: partclause->op_is_ne = false; so the code you quoted would always use the equality opstrategy, therefore wouldn't hit the "if (opstrategy == InvalidStrategy)" block in get_matching_list_bounds(). The old code wrongly assumed "bool IS NOT true" was the same as "bool IS false" and vice-versa. I had thought I could fix this by making it use the same code that we do for cases like int4col <> 123, but: a) only get_matching_list_bounds() knows how to handle those, not the equivalent hash and range versions and; b) bool is not true matches NULLs, but int4col <> 123 does not. So, I'm a little unsure if we should try and make bool IS NOT clauses prune partitions at all. It was a bug in the original code that thought we could do that, and it looks like I didn't do a good job of fixing that. I see three options: 1) Make get_matching_list_bounds() work for bool IS NOT clauses by properly including the NULL partition when handling a bool IS NOT clause. 2) Just don't generate a pruning step for bool IS NOT clauses. 3) Just always include the NULL partition in get_matching_list_bounds's "if (opstrategy == InvalidStrategy)" block. I don't quite see how to do #1 as we don't have an easy way to tell if we're handling bool IS NOT clauses inside get_matching_list_bounds(). Maybe we could check the comparison function is btboolcmp. That's kinda cruddy. We don't have oid_symbol for pg_proc.dat as we do in pg_operator.dat, so there's no #define for the pg_proc entry's Oid. If we do #2, I'm concerned we'll regress someone's workload by including the other bool partition. Likewise, for #3, we'll include the NULL partition for non-boolean cases, which could cause someone problems. The attached does #3 plus disables "bool IS NOT" pruning for RANGE and HASH to avoid the reported "invalid strategy number 0." error. Maybe we could do #1 by checking for partsupfunc.fn_oid == 1693 in the back branches and come up with a cleaner way in master by adding a new field to PartitionPruneStepOp. Keen to get feedback on this one. Also included Amit and Álvaro as they might have another idea. I also noticed that the following code returns PARTCLAUSE_NOMATCH: if (part_scheme->strategy != PARTITION_STRATEGY_LIST) return PARTCLAUSE_NOMATCH; I think that should return PARTCLAUSE_UNSUPPORTED. But it's really only an inefficiency rather than a bug, I think. David
Attachment
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
David Rowley
Date:
On Sat, 17 Feb 2024 at 01:32, David Rowley <dgrowleyml@gmail.com> wrote: > I see three options: > > 1) Make get_matching_list_bounds() work for bool IS NOT clauses by > properly including the NULL partition when handling a bool IS NOT > clause. > 2) Just don't generate a pruning step for bool IS NOT clauses. > 3) Just always include the NULL partition in > get_matching_list_bounds's "if (opstrategy == InvalidStrategy)" block. It turns out there's a 4th, and much better option that allows this just to work without any weirdness. The method used in partprune.c to handle "partkey IN ('const1', 'const2')" is to transform that into "partkey = 'const1' OR partkey = 'const2'". Whenever we see a ScalarArrayOpExpr with consts, we just form such an OR clause and recursively generate pruning steps for the OR clause. That'll end up creating two pruning steps and combining them with a PARTPRUNE_COMBINE_UNION. We can do the same for BooleanTests. Given a clause such as: "partkey IS NOT false", we can just generate the clause "partkey IS true OR partkey IS NULL" and recursively generate steps for that. I've attached a draft patch. I'll work on this more after I sleep. I'm tempted to go a bit further in master only and add support for bool IS NOT UNKNOWN and bool IS UNKNOWN using the same method. David
Attachment
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > We can do the same for BooleanTests. Given a clause such as: "partkey > IS NOT false", we can just generate the clause "partkey IS true OR > partkey IS NULL" and recursively generate steps for that. +1 ... sounds clean and clearly correct. > I'm tempted to go a bit further in master only and add support for > bool IS NOT UNKNOWN and bool IS UNKNOWN using the same method. These are the same as IS NOT NULL and IS NULL, so I don't see the need for an OR? regards, tom lane
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
David Rowley
Date:
On Mon, 19 Feb 2024 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > We can do the same for BooleanTests. Given a clause such as: "partkey > > IS NOT false", we can just generate the clause "partkey IS true OR > > partkey IS NULL" and recursively generate steps for that. > > +1 ... sounds clean and clearly correct. Here's a more complete patch for this. I included some tests for LIST and RANGE partitioned tables. I did manual testing for HASH, and was on the fence about covering that too. I did try the following using the table from the tests: select * from boolrangep where a is not true and not b and c = 25 and a is not null; When will be effectively transformed into: select * from boolrangep where (a is false or a is null) and not b and c = 25 and a is not null; It seems that's unable to prune the NULL partition but that mostly seems to be due to a limitation of the current design. I'm not sure it's worth going to any additional trouble to make that work. It seems a bit unlikely, especially so given how long the BooleanTest pruning stuff was broken for before anyone noticed. > > I'm tempted to go a bit further in master only and add support for > > bool IS NOT UNKNOWN and bool IS UNKNOWN using the same method. > > These are the same as IS NOT NULL and IS NULL, so I don't see the > need for an OR? Uh, yeah. True. That makes it even more simple. Just use PARTCLAUSE_MATCH_NULLNESS. David
Attachment
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
Tender Wang
Date:
David Rowley <dgrowleyml@gmail.com> 于2024年2月19日周一 07:49写道:
On Mon, 19 Feb 2024 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > We can do the same for BooleanTests. Given a clause such as: "partkey
> > IS NOT false", we can just generate the clause "partkey IS true OR
> > partkey IS NULL" and recursively generate steps for that.
>
> +1 ... sounds clean and clearly correct.
Here's a more complete patch for this. I included some tests for LIST
and RANGE partitioned tables. I did manual testing for HASH, and was
on the fence about covering that too.
I did try the following using the table from the tests:
select * from boolrangep where a is not true and not b and c = 25 and
a is not null;
When will be effectively transformed into:
select * from boolrangep where (a is false or a is null) and not b and
c = 25 and a is not null;
It seems that's unable to prune the NULL partition but that mostly
seems to be due to a limitation of the current design. I'm not sure
it's worth going to any additional trouble to make that work. It
seems a bit unlikely, especially so given how long the BooleanTest
pruning stuff was broken for before anyone noticed.
> > I'm tempted to go a bit further in master only and add support for
> > bool IS NOT UNKNOWN and bool IS UNKNOWN using the same method.
>
> These are the same as IS NOT NULL and IS NULL, so I don't see the
> need for an OR?
Uh, yeah. True. That makes it even more simple. Just use
PARTCLAUSE_MATCH_NULLNESS.
David
After git apply fix_partprune_BooleanTests.patch on master, I got below warnings:
partprune.c: In function ‘match_clause_to_partition_key’:
../../../src/include/nodes/nodes.h:221:25: warning: initialization of ‘BooleanTest *’ {aka ‘struct BooleanTest *’} from incompatible pointer type ‘Expr *’ {aka ‘struct Expr *’} [-Wincompatible-pointer-types]
221 | #define copyObject(obj) ((typeof(obj)) copyObjectImpl(obj))
| ^
partprune.c:1824:32: note: in expansion of macro ‘copyObject’
1824 | BooleanTest *new_booltest = copyObject(clause);
../../../src/include/nodes/nodes.h:221:25: warning: initialization of ‘BooleanTest *’ {aka ‘struct BooleanTest *’} from incompatible pointer type ‘Expr *’ {aka ‘struct Expr *’} [-Wincompatible-pointer-types]
221 | #define copyObject(obj) ((typeof(obj)) copyObjectImpl(obj))
| ^
partprune.c:1824:32: note: in expansion of macro ‘copyObject’
1824 | BooleanTest *new_booltest = copyObject(clause);
Maybe this: BooleanTest *new_booltest = (BooleanTest *) copyObject(clause);
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
Amit Langote
Date:
Hi David, On Mon, Feb 19, 2024 at 8:49 AM David Rowley <dgrowleyml@gmail.com> wrote: > On Mon, 19 Feb 2024 at 05:25, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > David Rowley <dgrowleyml@gmail.com> writes: > > > We can do the same for BooleanTests. Given a clause such as: "partkey > > > IS NOT false", we can just generate the clause "partkey IS true OR > > > partkey IS NULL" and recursively generate steps for that. > > > > +1 ... sounds clean and clearly correct. > > Here's a more complete patch for this. Thanks for working on this. Overall, I too like this idea. Though I noticed that this approach will effectively disable pruning with a clause on the 2nd key column, if any, present in the query: CREATE TABLE t (b bool, i int) PARTITION BY RANGE (b, i); CREATE TABLE tp PARTITION OF t FOR VALUES FROM (false, 0) TO (false, 1); CREATE TABLE tp2 PARTITION OF t FOR VALUES FROM (false, 1) TO (false, 2); CREATE TABLE tp3 PARTITION OF t FOR VALUES FROM (true, 0) TO (true, 1); CREATE TABLE tp4 PARTITION OF t FOR VALUES FROM (true, 1) TO (true, 2); -- tp2 should be pruned, but is not. explain SELECT * FROM t WHERE b IS NOT true and i = 0; QUERY PLAN -------------------------------------------------------------- Append (cost=0.00..81.81 rows=12 width=5) -> Seq Scan on tp t_1 (cost=0.00..40.88 rows=6 width=5) Filter: ((b IS NOT TRUE) AND (i = 0)) -> Seq Scan on tp2 t_2 (cost=0.00..40.88 rows=6 width=5) Filter: ((b IS NOT TRUE) AND (i = 0)) (5 rows) -- like it is in this case explain SELECT * FROM t WHERE b IS false and i = 0; QUERY PLAN ----------------------------------------------------- Seq Scan on tp t (cost=0.00..40.88 rows=6 width=5) Filter: ((b IS FALSE) AND (i = 0)) (2 rows) I guess we'll have to live with that, because the generate_opsteps code that generates multi-column pruning steps only supports scenarios where each key's matched clause is a simple comparison, not, for example, where it is an OR expression. -- Thanks, Amit Langote
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
Alexander Lakhin
Date:
Hello David, 19.02.2024 02:49, David Rowley wrote: > > Here's a more complete patch for this. I included some tests for LIST > and RANGE partitioned tables. I did manual testing for HASH, and was > on the fence about covering that too. > Thank you for the fix! Beside that, I'm a bit confused by the opstrategy description for get_matching_range_bounds(). Above that function we have: * 'opstrategy' if non-zero must be a btree strategy number. But as we could see, zero opstrategy is not valid for the function (so "if non-zero" is meaningless here?), unlike opstrategy for get_matching_list_bounds(), which has the same description, but the latter function contains: /* Special case handling of values coming from a <> operator clause. */ if (opstrategy == InvalidStrategy) ... Best regards, Alexander
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
David Rowley
Date:
On Tue, 20 Feb 2024 at 16:00, Alexander Lakhin <exclusion@gmail.com> wrote: > Beside that, I'm a bit confused by the opstrategy description for > get_matching_range_bounds(). > Above that function we have: > * 'opstrategy' if non-zero must be a btree strategy number. > > But as we could see, zero opstrategy is not valid for the function (so > "if non-zero" is meaningless here?), unlike opstrategy for > get_matching_list_bounds(), which has the same description, but the latter > function contains: > /* Special case handling of values coming from a <> operator clause. */ > if (opstrategy == InvalidStrategy) Yeah, that seems worth fixing in master as, I agree, the comment is wrong. Possibly, we considered supporting <> for RANGE partitioning at some point, I don't recall. I was also working on a fix for what I mentioned in [1], which, I think, is master-only material. I'd say we can fix the comment as part of that. The patch for both is attached. David [1] https://postgr.es/m/CAApHDvqriy8mPOFJ_Bd66YGXJ4+XULpv-4YdB+ePdCQFztyisA@mail.gmail.com
Attachment
Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
From
David Rowley
Date:
On Tue, 20 Feb 2024 at 16:50, David Rowley <dgrowleyml@gmail.com> wrote: > > On Tue, 20 Feb 2024 at 16:00, Alexander Lakhin <exclusion@gmail.com> wrote: > > Beside that, I'm a bit confused by the opstrategy description for > > get_matching_range_bounds(). > > Above that function we have: > > * 'opstrategy' if non-zero must be a btree strategy number. > > > Yeah, that seems worth fixing in master as, I agree, the comment is > wrong. Possibly, we considered supporting <> for RANGE partitioning > at some point, I don't recall. > > I was also working on a fix for what I mentioned in [1], which, I > think, is master-only material. I'd say we can fix the comment as > part of that. > > The patch for both is attached. I've pushed this patch. David