Thread: Partition pruning on parameters grouped into an array does not prune properly
Partition pruning on parameters grouped into an array does not prune properly
From
Andrei Lepikhov
Date:
Hi, Working on improving partition pruning [1] I found a case that I may treat like a bug. I mean the case with expression on a partitioning column like: id = ARRAY[$1,$2] Basically, pruning on ARRAY[$1,$2] works in the case of single column (see correct-pruning-example.sql in attachment): PREPARE test (int, int) AS SELECT * FROM array_prune WHERE id = ANY(ARRAY[$1,$2]); EXPLAIN (COSTS OFF) EXECUTE test(1,2); Append Subplans Removed: 1 -> Seq Scan on array_prune_t0 array_prune_1 Filter: (id = ANY (ARRAY[$1, $2])) But if we partition on HASH(x,y) it is not working (see incorrect-pruning-example.sql): PREPARE test2 (int,int) AS SELECT 1 FROM array_prune WHERE id1 = ANY(ARRAY[$1]) AND id2 = ANY(ARRAY[$2]); EXPLAIN (COSTS OFF) EXECUTE test2(1,-1); Append -> Seq Scan on array_prune_t0 array_prune_1 Filter: ((id1 = ANY (ARRAY[$1])) AND (id2 = ANY (ARRAY[$2]))) -> Seq Scan on array_prune_t1 array_prune_2 Filter: ((id1 = ANY (ARRAY[$1])) AND (id2 = ANY (ARRAY[$2]))) Although its analogue works nice: PREPARE test3 (int,int) AS SELECT 1 FROM array_prune WHERE id1 = $1 AND id2 = $2; EXPLAIN (COSTS OFF) EXECUTE test3(1,-1); Append Subplans Removed: 1 -> Seq Scan on array_prune_t0 array_prune_1 Filter: ((id1 = $1) AND (id2 = $2)) So, before diving into the partitioning depths, someone may quickly say it is not a bug, but I am missing something. Some hidden semantics? [1] Prune partitions by ScalarArrayOpExpr with an array parameter (partkey = ANY($1)) https://www.postgresql.org/message-id/b8cdd20f-b34b-42b9-8c7c-dae864b7b3b2@gmail.com -- regards, Andrei Lepikhov
Attachment
Re: Partition pruning on parameters grouped into an array does not prune properly
From
David Rowley
Date:
On Thu, 27 Mar 2025 at 04:19, Andrei Lepikhov <lepihov@gmail.com> wrote: > But if we partition on HASH(x,y) it is not working (see > incorrect-pruning-example.sql): > > PREPARE test2 (int,int) AS > SELECT 1 FROM array_prune > WHERE id1 = ANY(ARRAY[$1]) AND id2 = ANY(ARRAY[$2]); > EXPLAIN (COSTS OFF) EXECUTE test2(1,-1); > > Append > -> Seq Scan on array_prune_t0 array_prune_1 > Filter: ((id1 = ANY (ARRAY[$1])) AND (id2 = ANY (ARRAY[$2]))) > -> Seq Scan on array_prune_t1 array_prune_2 > Filter: ((id1 = ANY (ARRAY[$1])) AND (id2 = ANY (ARRAY[$2]))) It is a bug. This is down to how match_clause_to_partition_key() handles ScalarArrayOpExpr. To save some complexity in the handling of ScalarArrayOpExpr, these get transformed into OpExprs, one for each item in the ScalarArrayOpExpr. Look for the call to make_opclause() in match_clause_to_partition_key(). Just a few lines down, you see that we recursively call gen_partprune_steps_internal() to pass down the OpExprs that we just generated. The problem is that the recursive call only contains the OpExprs generated for one of the ScalarArrayOpExpr, gen_prune_steps_from_opexps() requires equality quals (or at least an key IS NULL qual) for all partitioned keys for hash partitioning, otherwise it'll bail out on the following: if (part_scheme->strategy == PARTITION_STRATEGY_HASH && clauselist == NIL && !bms_is_member(i, nullkeys)) return NIL; I wonder if we need to redesign this to not do that recursive processing and instead have it so match_clause_to_partition_key() can generate multiple PartClauseInfos. If we've matched to the ScalarArrayOpExpr then I think each generated PartClauseInfo should have the same PartClauseMatchStatus. That would also get rid of the (kinda silly) overhead we have of having to match the ScalarArrayOpExpr to the partition key, then generating OpExprs and having to match those again, even though we know they will match. I suspect the fix for this might be a bit invasive to backpatch. Maybe it's something we can give a bit more clear thought to after the freeze is over. David
Re: Partition pruning on parameters grouped into an array does not prune properly
From
Andrei Lepikhov
Date:
On 3/27/25 01:58, David Rowley wrote: > I wonder if we need to redesign this to not do that recursive > processing and instead have it so match_clause_to_partition_key() can > generate multiple PartClauseInfos. If we've matched to the > ScalarArrayOpExpr then I think each generated PartClauseInfo should > have the same PartClauseMatchStatus. That would also get rid of the > (kinda silly) overhead we have of having to match the > ScalarArrayOpExpr to the partition key, then generating OpExprs and > having to match those again, even though we know they will match. > > I suspect the fix for this might be a bit invasive to backpatch. Maybe > it's something we can give a bit more clear thought to after the > freeze is over. Thank you for the explanation! Why does the pruning machinery only include the OpExpr pruning operation? Often, when preparing for pruning steps, we don’t know the exact number of values we will encounter during the initial or execution pruning stages. I believe it would be beneficial to have an iterator - something similar to the predicate_implied_by function - that can iteratively extract values from an array. This would allow pruning in practical scenarios, such as the following: CREATE OR REPLACE FUNCTION some_business_logic(val integer) RETURNS integer[] AS $$ BEGIN IF txid_current() % 2 = 0 THEN RETURN ARRAY[val]; ELSE RETURN ARRAY[val + 1]; END IF; END; $$ LANGUAGE plpgsql STRICT STABLE; PREPARE test (int) AS SELECT * FROM array_prune WHERE id = ANY (some_business_logic($1)); EXPLAIN (ANALYZE, COSTS OFF) EXECUTE test(1); Also in that case we wouldn't need to decompose a ScalarArrayOpExpr to the list of OpExpr clauses to prune partitions. -- regards, Andrei Lepikhov
Re: Partition pruning on parameters grouped into an array does not prune properly
From
Andrei Lepikhov
Date:
On 3/27/25 01:58, David Rowley wrote: > I suspect the fix for this might be a bit invasive to backpatch. Maybe > it's something we can give a bit more clear thought to after the > freeze is over. One more issue I think may be addressed (or just considered) here is the following: CREATE TABLE parted (a int, b int) PARTITION BY RANGE (a, b); CREATE TABLE part1 PARTITION OF parted FOR VALUES FROM (1, 1) TO (1, 10); CREATE TABLE part2 PARTITION OF parted FOR VALUES FROM (2, 1) TO (2, 10); INSERT INTO parted (VALUES (1, 2)); INSERT INTO parted VALUES (2, 2); EXPLAIN (COSTS OFF) SELECT * FROM parted WHERE a > 1 AND b < 1; EXPLAIN (COSTS OFF) SELECT * FROM parted WHERE a > 1 AND b > 10; /* Seq Scan on part2 parted Filter: ((a > 1) AND (b < 1)) Seq Scan on part2 parted Filter: ((a > 1) AND (b > 10)) */ I think partition part2 could be pruned like in the following example: EXPLAIN (COSTS OFF) SELECT * FROM parted WHERE a > 2 AND b > 10; /* Result One-Time Filter: false */ -- regards, Andrei Lepikhov