Re: [HACKERS] path toward faster partition pruning - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] path toward faster partition pruning
Date
Msg-id 60280612-9319-fbb4-e71d-d023b5a7ab61@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] path toward faster partition pruning  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Responses Re: [HACKERS] path toward faster partition pruning  (Jesper Pedersen <jesper.pedersen@redhat.com>)
List pgsql-hackers
Thanks Jesper.

On 2017/11/23 3:56, Jesper Pedersen wrote:
> Hi Amit,
> 
> On 11/22/2017 03:59 AM, Amit Langote wrote:
>> Fixed in the attached.  No other changes beside that.
>>
> 
> I have been using the following script to look at the patch
> 
> -- test.sql --

[ ... ]

> EXPLAIN (ANALYZE) SELECT t1.a, t1.b, t2.c, t2.d FROM t1 INNER JOIN t2 ON
> t2.c = t1.b WHERE t2.d = 1;
> 
> I just wanted to highlight that the "JOIN ON" partition isn't pruned - the
> "WHERE" one is.

Did you mean to write ON t2.d = t1.b?  If so, equivalence class mechanism
will give rise to a t1.b = 1 and hence help prune t1's partition as well:

EXPLAIN (COSTS OFF)
SELECT t1.a, t1.b, t2.c, t2.d
FROM t1 INNER JOIN t2 ON t2.d = t1.b
WHERE t2.d = 1;                       QUERY PLAN
-----------------------------------------------------------Nested Loop  ->  Append        ->  Bitmap Heap Scan on
t1_p00             Recheck Cond: (b = 1)              ->  Bitmap Index Scan on idx_t1_b_a_p00                    Index
Cond:(b = 1)  ->  Materialize        ->  Append              ->  Bitmap Heap Scan on t2_p00                    Recheck
Cond:(d = 1)                    ->  Bitmap Index Scan on idx_t2_d_p00                          Index Cond: (d = 1)
 

In your original query, you use ON t2.c = t1.b, whereby there is no
"constant" value to perform partition pruning with.  t2.c is unknown until
the join actually executes.

> BEGIN;
> EXPLAIN (ANALYZE) UPDATE t1 SET a = 1 WHERE b = 1;
> ROLLBACK;
>
> BEGIN;
> EXPLAIN (ANALYZE) DELETE FROM t1 WHERE b = 1;
> ROLLBACK;
> 
> Should pruning of partitions for UPDATEs (where the partition key isn't
> updated) and DELETEs be added to the TODO list?

Note that partition pruning *does* work for UPDATE and DELETE, but only if
you use list/range partitioning.  The reason it doesn't work in this case
(t1 is hash partitioned) is that the pruning is still based on constraint
exclusion in the UPDATE/DELETE case and constraint exclusion cannot handle
hash partitioning.

See example below that uses list partitioning:

drop table t1, t2;
create table t1 (a int, b int) partition by list (b);
create table t1_p0 partition of t1 for values in (0);
create table t1_p1 partition of t1 for values in (1);
create table t1_p2 partition of t1 for values in (2);
create table t1_p3 partition of t1 for values in (3);

create table t2 (c int, d int) partition by list (d);
create table t2_p0 partition of t2 for values in (0);
create table t2_p1 partition of t2 for values in (1);
create table t2_p2 partition of t2 for values in (2);
create table t2_p3 partition of t2 for values in (3);

explain (costs off) update t1 set a = 1 where b = 1;      QUERY PLAN
=------------------------Update on t1  Update on t1_p1  ->  Seq Scan on t1_p1        Filter: (b = 1)
(4 rows)

explain (costs off) delete from t1 where b = 1;      QUERY PLAN
=------------------------Delete on t1  Delete on t1_p1  ->  Seq Scan on t1_p1        Filter: (b = 1)
(4 rows)


I can see how that seems a bit odd.  If you use hash partitioning,
UPDATE/DELETE do not benefit from partition-pruning, even though SELECT
does.  That's because SELECT uses the new partition-pruning method (this
patch set) which supports hash partitioning, whereas UPDATE and DELETE use
constraint exclusion which doesn't.  It would be a good idea to make even
UPDATE and DELETE use the new method thus bringing everyone on the same
page, but that requires us to make some pretty non-trivial changes to how
UPDATE/DELETE planning works for inheritance/partitioned tables, which we
should undertake separately, imho.

Thanks,
Amit



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Partition-wise aggregation/grouping
Next
From: Amit Langote
Date:
Subject: Re: [HACKERS] UPDATE of partition key