Potential partition pruning regression on PostgreSQL 18 - Mailing list pgsql-performance

From: Cándido Antonio Martínez Descalzo
Subject: Potential partition pruning regression on PostgreSQL 18
Msg-id: CAH5YaUwVUWETTyVECTnhs7C=CVwi+uMSQH=cOkwAUqMdvXdwWA@mail.gmail.com
Responses: Re: Potential partition pruning regression on PostgreSQL 18
List: pgsql-performance
Hi all,

We noticed that one of our queries unexpectedly stopped applying partition pruning on PG18, although pruning is applied on PG16 and PG17. We have reproduced the issue on both Linux and macOS.

Failing to apply partition pruning significantly impacts the performance of these queries.

We recreated the issue using a simplified schema and query. Details on the schema, the query, and the resulting plans on PG17 and PG18 are provided below. Some changes to the query restore partition pruning on PG18, specifically:
  • Replacing the view and its date condition with a sub-query or CTE carrying the same condition (updated query and plan provided further below)
  • Keeping the view but using a single "group by" instead of multiple grouping sets (updated query and plan provided further below)
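
As a quick check when reproducing this, pruning can be verified without executing the full query; a plain explain already shows which partitions remain in the plan. A sketch, using the schema defined below:

```sql
-- Plan-time pruning check: when pruning works, only the
-- monthly_data_202602 partition appears in the plan; when it fails,
-- all three monthly_data_* partitions are scanned.
explain (costs off)
select m.external_ref, t.tag_1, t.tag_2,
       sum(m.duration), sum(m.counter), sum(m.amount)
from monthly_data_view m
join entity_tags t
  on m.entity_id = t.entity_id
 and m.month between t.from_day and t.to_day
where m.month between '2026-02-01'::date and '2026-02-28'::date
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);
```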

Does anybody know of a documented behaviour change in PG18 that could explain this, or whether this is a known issue?

Many thanks,

Cándido Martínez
ninehq

This is the schema used:
create table entity (
  id integer primary key,
  name varchar(255) unique not null
);

insert into entity (id, name)
select i, 'Entity ' || i from generate_series(1, 1000, 1) g(i);

create table entity_tags (
  entity_id integer not null references entity(id),
  from_day date not null,
  to_day date not null,
  tag_1 text not null,
  tag_2 text not null,
  primary key (entity_id, from_day)
);

insert into entity_tags
select id, '2025-01-01'::date, '9999-12-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 0;

insert into entity_tags
select id, '2025-01-01'::date, '2026-01-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 1;

insert into entity_tags
select id, '2026-02-01'::date, '9999-12-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 1;

insert into entity_tags
select id, '2025-01-01'::date, '2026-02-28'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 2;

insert into entity_tags
select id, '2026-03-01'::date, '9999-12-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 2;

create table monthly_data (
  month date not null,
  external_ref text not null,
  entity_id integer not null references entity(id),
  duration integer not null,
  counter integer not null,
  amount integer not null
) partition by RANGE (month);
create index on monthly_data (external_ref);
create index on monthly_data (entity_id);
create view monthly_data_view as select * from monthly_data;
create table monthly_data_202601 partition of monthly_data for values from ('2026-01-01') to ('2026-02-01');
create table monthly_data_202602 partition of monthly_data for values from ('2026-02-01') to ('2026-03-01');
create table monthly_data_202603 partition of monthly_data for values from ('2026-03-01') to ('2026-04-01');

insert into monthly_data
with m as (
select d::date as month from generate_series('2026-01-01'::date, '2026-03-31'::date, '1 month') g(d)
)
select m.month, 'ext-' || random(1, 50000), random(1, 1000), random(1, 1000), random(1, 1000), random(1, 100)
from generate_series(1, 3000000, 1) g(i), m;

analyze entity, entity_tags, monthly_data;

And this is the query:
select m.external_ref, t.tag_1, t.tag_2, sum(m.duration) as duration, sum(m.counter) as counter, sum(m.amount) as amount
from monthly_data_view m
join entity_tags t on m.entity_id = t.entity_id and m.month between t.from_day and t.to_day
where m.month between '2026-02-01'::date and '2026-02-28'::date
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);

PostgreSQL 17 plan (partition pruning applied):
GroupAggregate  (cost=94584.40..253820.84 rows=1105572 width=49) (actual time=642.913..2291.658 rows=2271176 loops=1)
  Output: monthly_data.external_ref, t.tag_1, t.tag_2, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Group Key: monthly_data.external_ref, t.tag_1
  Group Key: monthly_data.external_ref
  Sort Key: monthly_data.external_ref, t.tag_2
    Group Key: monthly_data.external_ref, t.tag_2
  Buffers: shared hit=32066 read=13, temp read=36690 written=36703
  I/O Timings: shared read=0.697, temp read=32.232 write=197.328
  ->  Gather Merge  (cost=94584.40..159286.08 rows=555539 width=37) (actual time=642.904..977.809 rows=3000000 loops=1)
        Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=32066 read=13, temp read=18345 written=18351
        I/O Timings: shared read=0.697, temp read=18.407 write=130.461
        ->  Sort  (cost=93584.38..94163.07 rows=231475 width=37) (actual time=622.100..709.953 rows=1000000 loops=3)
              Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
              Sort Key: monthly_data.external_ref, t.tag_1
              Sort Method: external merge  Disk: 52096kB
              Buffers: shared hit=32066 read=13, temp read=18345 written=18351
              I/O Timings: shared read=0.697, temp read=18.407 write=130.461
              Worker 0:  actual time=614.585..706.233 rows=976888 loops=1
                Sort Method: external merge  Disk: 47792kB
                Buffers: shared hit=10526, temp read=5974 written=5976
                I/O Timings: temp read=6.759 write=49.156
              Worker 1:  actual time=609.153..697.519 rows=958096 loops=1
                Sort Method: external merge  Disk: 46872kB
                Buffers: shared hit=10388, temp read=5859 written=5861
                I/O Timings: temp read=5.899 write=43.593
              ->  Nested Loop  (cost=0.29..72959.38 rows=231475 width=37) (actual time=0.139..248.122 rows=1000000 loops=3)
                    Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
                    Buffers: shared hit=32050 read=13
                    I/O Timings: shared read=0.697
                    Worker 0:  actual time=0.061..243.302 rows=976888 loops=1
                      Buffers: shared hit=10518
                    Worker 1:  actual time=0.058..246.889 rows=958096 loops=1
                      Buffers: shared hit=10380
                    ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data  (cost=0.00..40809.00 rows=1250000 width=29) (actual time=0.014..64.695 rows=1000000 loops=3)
                          Output: monthly_data.external_ref, monthly_data.duration, monthly_data.counter, monthly_data.amount, monthly_data.entity_id, monthly_data.month
                          Filter: ((monthly_data.month >= '2026-02-01'::date) AND (monthly_data.month <= '2026-02-28'::date))
                          Buffers: shared hit=22059
                          Worker 0:  actual time=0.017..64.085 rows=976888 loops=1
                            Buffers: shared hit=7183
                          Worker 1:  actual time=0.018..67.602 rows=958096 loops=1
                            Buffers: shared hit=7045
                    ->  Memoize  (cost=0.29..0.31 rows=1 width=28) (actual time=0.000..0.000 rows=1 loops=3000000)
                          Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                          Cache Key: monthly_data.month, monthly_data.entity_id
                          Cache Mode: binary
                          Hits: 1064016  Misses: 1000  Evictions: 0  Overflows: 0  Memory Usage: 133kB
                          Buffers: shared hit=9991 read=13
                          I/O Timings: shared read=0.697
                          Worker 0:  actual time=0.000..0.000 rows=1 loops=976888
                            Hits: 975888  Misses: 1000  Evictions: 0  Overflows: 0  Memory Usage: 133kB
                            Buffers: shared hit=3335
                          Worker 1:  actual time=0.000..0.000 rows=1 loops=958096
                            Hits: 957096  Misses: 1000  Evictions: 0  Overflows: 0  Memory Usage: 133kB
                            Buffers: shared hit=3335
                          ->  Index Scan using entity_tags_pkey on public.entity_tags t  (cost=0.28..0.30 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=3000)
                                Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                                Index Cond: ((t.entity_id = monthly_data.entity_id) AND (t.from_day <= monthly_data.month))
                                Filter: (monthly_data.month <= t.to_day)
                                Rows Removed by Filter: 0
                                Buffers: shared hit=9991 read=13
                                I/O Timings: shared read=0.697
                                Worker 0:  actual time=0.002..0.002 rows=1 loops=1000
                                  Buffers: shared hit=3335
                                Worker 1:  actual time=0.001..0.002 rows=1 loops=1000
                                  Buffers: shared hit=3335


PostgreSQL 18 plan (no partition pruning):
HashAggregate  (cost=229746.36..242370.87 rows=12200 width=72) (actual time=1621.794..2508.533 rows=2262361.00 loops=1)
  Output: monthly_data.external_ref, t.tag_1, t.tag_2, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Hash Key: monthly_data.external_ref, t.tag_1
  Hash Key: monthly_data.external_ref
  Hash Key: monthly_data.external_ref, t.tag_2
  Batches: 13  Memory Usage: 54433kB  Disk Usage: 250536kB
  Buffers: shared hit=66216, temp read=31017 written=58146
  I/O Timings: temp read=29.524 write=118.672
  ->  Gather  (cost=1050.51..222800.52 rows=555667 width=60) (actual time=93.721..192.443 rows=3000000.00 loops=1)
        Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=66216
        ->  Hash Join  (cost=50.51..166233.82 rows=231528 width=60) (actual time=63.866..320.145 rows=1000000.00 loops=3)
              Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
              Hash Cond: (monthly_data.entity_id = t.entity_id)
              Join Filter: ((monthly_data.month >= t.from_day) AND (monthly_data.month <= t.to_day))
              Rows Removed by Join Filter: 667154
              Buffers: shared hit=66216
              Worker 0:  actual time=0.852..768.416 rows=2995648.00 loops=1
                Buffers: shared hit=22040
              Worker 1:  actual time=97.229..97.847 rows=2176.00 loops=1
                Buffers: shared hit=22088
              ->  Parallel Append  (cost=0.00..128677.01 rows=1250002 width=52) (actual time=63.442..158.419 rows=1000000.00 loops=3)
                    Buffers: shared hit=66177
                    Worker 0:  actual time=0.032..284.520 rows=2995648.00 loops=1
                      Buffers: shared hit=22027
                    Worker 1:  actual time=96.963..97.184 rows=2176.00 loops=1
                      Buffers: shared hit=22075
                    ->  Parallel Seq Scan on public.monthly_data_202601 monthly_data_1  (cost=0.00..40809.00 rows=1 width=52) (actual time=96.957..96.957 rows=0.00 loops=1)
                          Output: monthly_data_1.external_ref, monthly_data_1.duration, monthly_data_1.counter, monthly_data_1.amount, monthly_data_1.entity_id, monthly_data_1.month
                          Filter: ((monthly_data_1.month >= '2026-02-01'::date) AND (monthly_data_1.month <= '2026-02-28'::date))
                          Rows Removed by Filter: 3000000
                          Buffers: shared hit=22059
                          Worker 1:  actual time=96.957..96.957 rows=0.00 loops=1
                            Buffers: shared hit=22059
                    ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data_2  (cost=0.00..40809.00 rows=1250000 width=52) (actual time=0.013..62.957 rows=1000000.00 loops=3)
                          Output: monthly_data_2.external_ref, monthly_data_2.duration, monthly_data_2.counter, monthly_data_2.amount, monthly_data_2.entity_id, monthly_data_2.month
                          Filter: ((monthly_data_2.month >= '2026-02-01'::date) AND (monthly_data_2.month <= '2026-02-28'::date))
                          Buffers: shared hit=22059
                          Worker 0:  actual time=0.032..188.573 rows=2995648.00 loops=1
                            Buffers: shared hit=22027
                          Worker 1:  actual time=0.005..0.153 rows=2176.00 loops=1
                            Buffers: shared hit=16
                    ->  Parallel Seq Scan on public.monthly_data_202603 monthly_data_3  (cost=0.00..40809.00 rows=1 width=52) (actual time=93.328..93.328 rows=0.00 loops=1)
                          Output: monthly_data_3.external_ref, monthly_data_3.duration, monthly_data_3.counter, monthly_data_3.amount, monthly_data_3.entity_id, monthly_data_3.month
                          Filter: ((monthly_data_3.month >= '2026-02-01'::date) AND (monthly_data_3.month <= '2026-02-28'::date))
                          Rows Removed by Filter: 3000000
                          Buffers: shared hit=22059
              ->  Hash  (cost=29.67..29.67 rows=1667 width=28) (actual time=0.412..0.412 rows=1667.00 loops=3)
                    Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                    Buckets: 2048  Batches: 1  Memory Usage: 120kB
                    Buffers: shared hit=39
                    Worker 0:  actual time=0.807..0.807 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    Worker 1:  actual time=0.248..0.248 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    ->  Seq Scan on public.entity_tags t  (cost=0.00..29.67 rows=1667 width=28) (actual time=0.058..0.222 rows=1667.00 loops=3)
                          Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                          Buffers: shared hit=39
                          Worker 0:  actual time=0.104..0.435 rows=1667.00 loops=1
                            Buffers: shared hit=13
                          Worker 1:  actual time=0.058..0.137 rows=1667.00 loops=1
                            Buffers: shared hit=13

On PG18, replacing the monthly_data_view and month condition with a sub-query or CTE restores partition pruning:
with m as (
select * from monthly_data where month between '2026-02-01'::date and '2026-02-28'::date
)
select m.external_ref, t.tag_1, t.tag_2, sum(m.duration) as duration, sum(m.counter) as counter, sum(m.amount) as amount
from m
join entity_tags t on m.entity_id = t.entity_id and m.month between t.from_day and t.to_day
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);
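
The inline sub-query form is equivalent and also restores pruning in our tests; the plan shown next is for the CTE version:

```sql
-- Inline sub-query variant of the CTE above; this also restores
-- pruning on PG18 in our tests.
select m.external_ref, t.tag_1, t.tag_2,
       sum(m.duration) as duration, sum(m.counter) as counter,
       sum(m.amount) as amount
from (
  select * from monthly_data
  where month between '2026-02-01'::date and '2026-02-28'::date
) m
join entity_tags t
  on m.entity_id = t.entity_id
 and m.month between t.from_day and t.to_day
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);
```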
HashAggregate  (cost=141878.30..154502.80 rows=12200 width=72) (actual time=1583.549..2502.394 rows=2262361.00 loops=1)
  Output: monthly_data.external_ref, t.tag_1, t.tag_2, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Hash Key: monthly_data.external_ref, t.tag_1
  Hash Key: monthly_data.external_ref
  Hash Key: monthly_data.external_ref, t.tag_2
  Batches: 13  Memory Usage: 54433kB  Disk Usage: 250552kB
  Buffers: shared hit=22098, temp read=31016 written=58135
  I/O Timings: temp read=27.912 write=116.172
  ->  Gather  (cost=1050.51..134932.46 rows=555667 width=60) (actual time=1.314..105.099 rows=3000000.00 loops=1)
        Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=22098
        ->  Hash Join  (cost=50.51..78365.76 rows=231528 width=60) (actual time=0.783..239.677 rows=1000000.00 loops=3)
              Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
              Hash Cond: (monthly_data.entity_id = t.entity_id)
              Join Filter: ((monthly_data.month >= t.from_day) AND (monthly_data.month <= t.to_day))
              Rows Removed by Join Filter: 667154
              Buffers: shared hit=22098
              Worker 0:  actual time=0.726..698.352 rows=2969536.00 loops=1
                Buffers: shared hit=21848
              Worker 1:  actual time=0.653..16.148 rows=26112.00 loops=1
                Buffers: shared hit=205
              ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data  (cost=0.00..40809.00 rows=1250000 width=52) (actual time=0.022..68.714 rows=1000000.00 loops=3)
                    Output: monthly_data.external_ref, monthly_data.duration, monthly_data.counter, monthly_data.amount, monthly_data.entity_id, monthly_data.month
                    Filter: ((monthly_data.month >= '2026-02-01'::date) AND (monthly_data.month <= '2026-02-28'::date))
                    Buffers: shared hit=22059
                    Worker 0:  actual time=0.030..199.783 rows=2969536.00 loops=1
                      Buffers: shared hit=21835
                    Worker 1:  actual time=0.023..5.233 rows=26112.00 loops=1
                      Buffers: shared hit=192
              ->  Hash  (cost=29.67..29.67 rows=1667 width=28) (actual time=0.749..0.749 rows=1667.00 loops=3)
                    Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                    Buckets: 2048  Batches: 1  Memory Usage: 120kB
                    Buffers: shared hit=39
                    Worker 0:  actual time=0.679..0.679 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    Worker 1:  actual time=0.621..0.622 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    ->  Seq Scan on public.entity_tags t  (cost=0.00..29.67 rows=1667 width=28) (actual time=0.058..0.388 rows=1667.00 loops=3)
                          Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                          Buffers: shared hit=39
                          Worker 0:  actual time=0.092..0.420 rows=1667.00 loops=1
                            Buffers: shared hit=13
                          Worker 1:  actual time=0.072..0.321 rows=1667.00 loops=1
                            Buffers: shared hit=13


On PG18, pruning is also restored by keeping the view but performing a single "group by" instead of multiple grouping sets:
select t.tag_1, sum(m.duration) as duration, sum(m.counter) as counter, sum(m.amount) as amount
from monthly_data_view m
join entity_tags t on m.entity_id = t.entity_id and m.month between t.from_day and t.to_day
where m.month between '2026-02-01'::date and '2026-02-28'::date
group by t.tag_1;

Finalize GroupAggregate  (cost=81682.97..81698.65 rows=50 width=32) (actual time=356.116..358.029 rows=50.00 loops=1)
  Output: t.tag_1, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Group Key: t.tag_1
  Buffers: shared hit=22114
  ->  Gather Merge  (cost=81682.97..81696.95 rows=120 width=32) (actual time=356.111..358.009 rows=150.00 loops=1)
        Output: t.tag_1, (PARTIAL sum(monthly_data.duration)), (PARTIAL sum(monthly_data.counter)), (PARTIAL sum(monthly_data.amount))
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=22114
        ->  Sort  (cost=80682.95..80683.07 rows=50 width=32) (actual time=349.568..349.570 rows=50.00 loops=3)
              Output: t.tag_1, (PARTIAL sum(monthly_data.duration)), (PARTIAL sum(monthly_data.counter)), (PARTIAL sum(monthly_data.amount))
              Sort Key: t.tag_1
              Sort Method: quicksort  Memory: 27kB
              Buffers: shared hit=22114
              Worker 0:  actual time=346.658..346.660 rows=50.00 loops=1
                Sort Method: quicksort  Memory: 27kB
                Buffers: shared hit=7385
              Worker 1:  actual time=346.663..346.665 rows=50.00 loops=1
                Sort Method: quicksort  Memory: 27kB
                Buffers: shared hit=7235
              ->  Partial HashAggregate  (cost=80681.04..80681.54 rows=50 width=32) (actual time=349.530..349.533 rows=50.00 loops=3)
                    Output: t.tag_1, PARTIAL sum(monthly_data.duration), PARTIAL sum(monthly_data.counter), PARTIAL sum(monthly_data.amount)
                    Group Key: t.tag_1
                    Batches: 1  Memory Usage: 32kB
                    Buffers: shared hit=22098
                    Worker 0:  actual time=346.608..346.611 rows=50.00 loops=1
                      Batches: 1  Memory Usage: 32kB
                      Buffers: shared hit=7377
                    Worker 1:  actual time=346.615..346.618 rows=50.00 loops=1
                      Batches: 1  Memory Usage: 32kB
                      Buffers: shared hit=7227
                    ->  Hash Join  (cost=50.51..78365.76 rows=231528 width=20) (actual time=0.936..260.236 rows=1000000.00 loops=3)
                          Output: t.tag_1, monthly_data.duration, monthly_data.counter, monthly_data.amount
                          Hash Cond: (monthly_data.entity_id = t.entity_id)
                          Join Filter: ((monthly_data.month >= t.from_day) AND (monthly_data.month <= t.to_day))
                          Rows Removed by Join Filter: 667154
                          Buffers: shared hit=22098
                          Worker 0:  actual time=1.031..261.125 rows=1001480.00 loops=1
                            Buffers: shared hit=7377
                          Worker 1:  actual time=0.947..259.326 rows=981104.00 loops=1
                            Buffers: shared hit=7227
                          ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data  (cost=0.00..40809.00 rows=1250000 width=20) (actual time=0.027..79.622 rows=1000000.00 loops=3)
                                Output: monthly_data.duration, monthly_data.counter, monthly_data.amount, monthly_data.entity_id, monthly_data.month
                                Filter: ((monthly_data.month >= '2026-02-01'::date) AND (monthly_data.month <= '2026-02-28'::date))
                                Buffers: shared hit=22059
                                Worker 0:  actual time=0.030..80.173 rows=1001480.00 loops=1
                                  Buffers: shared hit=7364
                                Worker 1:  actual time=0.031..82.531 rows=981104.00 loops=1
                                  Buffers: shared hit=7214
                          ->  Hash  (cost=29.67..29.67 rows=1667 width=20) (actual time=0.895..0.895 rows=1667.00 loops=3)
                                Output: t.tag_1, t.entity_id, t.from_day, t.to_day
                                Buckets: 2048  Batches: 1  Memory Usage: 106kB
                                Buffers: shared hit=39
                                Worker 0:  actual time=0.983..0.983 rows=1667.00 loops=1
                                  Buffers: shared hit=13
                                Worker 1:  actual time=0.898..0.898 rows=1667.00 loops=1
                                  Buffers: shared hit=13
                                ->  Seq Scan on public.entity_tags t  (cost=0.00..29.67 rows=1667 width=20) (actual time=0.081..0.542 rows=1667.00 loops=3)
                                      Output: t.tag_1, t.entity_id, t.from_day, t.to_day
                                      Buffers: shared hit=39
                                      Worker 0:  actual time=0.118..0.540 rows=1667.00 loops=1
                                        Buffers: shared hit=13
                                      Worker 1:  actual time=0.117..0.483 rows=1667.00 loops=1
                                        Buffers: shared hit=13
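
One caveat for anyone reproducing this: plan-time pruning can also be switched off by configuration, so it is worth confirming the setting before comparing plans across versions:

```sql
-- Plan-time partition pruning is controlled by this GUC (default: on).
-- If it is off, none of the plans above will prune, on any version.
show enable_partition_pruning;
```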
