Potential partition pruning regression on PostgreSQL 18 - Mailing list pgsql-performance

From: Cándido Antonio Martínez Descalzo
Subject: Potential partition pruning regression on PostgreSQL 18
Msg-id: CAH5YaUwVUWETTyVECTnhs7C=CVwi+uMSQH=cOkwAUqMdvXdwWA@mail.gmail.com
Responses: Re: Potential partition pruning regression on PostgreSQL 18
List: pgsql-performance
Hi all,

We noticed that one of our queries unexpectedly stopped applying partition pruning on PG18, although pruning is applied on PG16 and PG17. We have reproduced the issue on both Linux and macOS.

Failing to apply partition pruning significantly impacts the performance of these queries.

We recreated the issue using a simplified schema and query. Details on the schema, the query, and the resulting plans on PG17 and PG18 are provided below. Some changes to the query restore partition pruning on PG18, specifically:
  • Replacing the view and its date condition with a sub-query or CTE carrying the same condition (updated query and plan provided further below)
  • Keeping the view but using a single "group by" instead of multiple grouping sets (updated query and plan provided further below)
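
As a quick check when reproducing this, pruning can be verified without executing the full query; a plain explain already shows which partitions remain in the plan. A sketch, using the schema defined below:

```sql
-- Plan-time pruning check: when pruning works, only the
-- monthly_data_202602 partition appears in the plan; when it fails,
-- all three monthly_data_* partitions are scanned.
explain (costs off)
select m.external_ref, t.tag_1, t.tag_2,
       sum(m.duration), sum(m.counter), sum(m.amount)
from monthly_data_view m
join entity_tags t
  on m.entity_id = t.entity_id
 and m.month between t.from_day and t.to_day
where m.month between '2026-02-01'::date and '2026-02-28'::date
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);
```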

Does anybody know of a documented behaviour change in PG18 that could explain this, or whether this is a known issue?

Many thanks,

Cándido Martínez
ninehq

This is the schema used:
create table entity (
  id integer primary key,
  name varchar(255) unique not null
);

insert into entity (id, name)
select i, 'Entity ' || i from generate_series(1, 1000, 1) g(i);

create table entity_tags (
  entity_id integer not null references entity(id),
  from_day date not null,
  to_day date not null,
  tag_1 text not null,
  tag_2 text not null,
  primary key (entity_id, from_day)
);

insert into entity_tags
select id, '2025-01-01'::date, '9999-12-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 0;

insert into entity_tags
select id, '2025-01-01'::date, '2026-01-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 1;

insert into entity_tags
select id, '2026-02-01'::date, '9999-12-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 1;

insert into entity_tags
select id, '2025-01-01'::date, '2026-02-28'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 2;

insert into entity_tags
select id, '2026-03-01'::date, '9999-12-31'::date, 'Tag 1-' || random(1,50), 'Tag 2-' || random(1, 10)
from entity where id % 3 = 2;

create table monthly_data (
  month date not null,
  external_ref text not null,
  entity_id integer not null references entity(id),
  duration integer not null,
  counter integer not null,
  amount integer not null
) partition by RANGE (month);
create index on monthly_data (external_ref);
create index on monthly_data (entity_id);
create view monthly_data_view as select * from monthly_data;
create table monthly_data_202601 partition of monthly_data for values from ('2026-01-01') to ('2026-02-01');
create table monthly_data_202602 partition of monthly_data for values from ('2026-02-01') to ('2026-03-01');
create table monthly_data_202603 partition of monthly_data for values from ('2026-03-01') to ('2026-04-01');

insert into monthly_data
with m as (
select d::date as month from generate_series('2026-01-01'::date, '2026-03-31'::date, '1 month') g(d)
)
select m.month, 'ext-' || random(1, 50000), random(1, 1000), random(1, 1000), random(1, 1000), random(1, 100)
from generate_series(1, 3000000, 1) g(i), m;

analyze entity, entity_tags, monthly_data;

And this is the query:
select m.external_ref, t.tag_1, t.tag_2, sum(m.duration) as duration, sum(m.counter) as counter, sum(m.amount) as amount
from monthly_data_view m
join entity_tags t on m.entity_id = t.entity_id and m.month between t.from_day and t.to_day
where m.month between '2026-02-01'::date and '2026-02-28'::date
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);

PostgreSQL 17 plan (partition pruning applied):
GroupAggregate  (cost=94584.40..253820.84 rows=1105572 width=49) (actual time=642.913..2291.658 rows=2271176 loops=1)
  Output: monthly_data.external_ref, t.tag_1, t.tag_2, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Group Key: monthly_data.external_ref, t.tag_1
  Group Key: monthly_data.external_ref
  Sort Key: monthly_data.external_ref, t.tag_2
    Group Key: monthly_data.external_ref, t.tag_2
  Buffers: shared hit=32066 read=13, temp read=36690 written=36703
  I/O Timings: shared read=0.697, temp read=32.232 write=197.328
  ->  Gather Merge  (cost=94584.40..159286.08 rows=555539 width=37) (actual time=642.904..977.809 rows=3000000 loops=1)
        Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=32066 read=13, temp read=18345 written=18351
        I/O Timings: shared read=0.697, temp read=18.407 write=130.461
        ->  Sort  (cost=93584.38..94163.07 rows=231475 width=37) (actual time=622.100..709.953 rows=1000000 loops=3)
              Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
              Sort Key: monthly_data.external_ref, t.tag_1
              Sort Method: external merge  Disk: 52096kB
              Buffers: shared hit=32066 read=13, temp read=18345 written=18351
              I/O Timings: shared read=0.697, temp read=18.407 write=130.461
              Worker 0:  actual time=614.585..706.233 rows=976888 loops=1
                Sort Method: external merge  Disk: 47792kB
                Buffers: shared hit=10526, temp read=5974 written=5976
                I/O Timings: temp read=6.759 write=49.156
              Worker 1:  actual time=609.153..697.519 rows=958096 loops=1
                Sort Method: external merge  Disk: 46872kB
                Buffers: shared hit=10388, temp read=5859 written=5861
                I/O Timings: temp read=5.899 write=43.593
              ->  Nested Loop  (cost=0.29..72959.38 rows=231475 width=37) (actual time=0.139..248.122 rows=1000000 loops=3)
                    Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
                    Buffers: shared hit=32050 read=13
                    I/O Timings: shared read=0.697
                    Worker 0:  actual time=0.061..243.302 rows=976888 loops=1
                      Buffers: shared hit=10518
                    Worker 1:  actual time=0.058..246.889 rows=958096 loops=1
                      Buffers: shared hit=10380
                    ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data  (cost=0.00..40809.00 rows=1250000 width=29) (actual time=0.014..64.695 rows=1000000 loops=3)
                          Output: monthly_data.external_ref, monthly_data.duration, monthly_data.counter, monthly_data.amount, monthly_data.entity_id, monthly_data.month
                          Filter: ((monthly_data.month >= '2026-02-01'::date) AND (monthly_data.month <= '2026-02-28'::date))
                          Buffers: shared hit=22059
                          Worker 0:  actual time=0.017..64.085 rows=976888 loops=1
                            Buffers: shared hit=7183
                          Worker 1:  actual time=0.018..67.602 rows=958096 loops=1
                            Buffers: shared hit=7045
                    ->  Memoize  (cost=0.29..0.31 rows=1 width=28) (actual time=0.000..0.000 rows=1 loops=3000000)
                          Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                          Cache Key: monthly_data.month, monthly_data.entity_id
                          Cache Mode: binary
                          Hits: 1064016  Misses: 1000  Evictions: 0  Overflows: 0  Memory Usage: 133kB
                          Buffers: shared hit=9991 read=13
                          I/O Timings: shared read=0.697
                          Worker 0:  actual time=0.000..0.000 rows=1 loops=976888
                            Hits: 975888  Misses: 1000  Evictions: 0  Overflows: 0  Memory Usage: 133kB
                            Buffers: shared hit=3335
                          Worker 1:  actual time=0.000..0.000 rows=1 loops=958096
                            Hits: 957096  Misses: 1000  Evictions: 0  Overflows: 0  Memory Usage: 133kB
                            Buffers: shared hit=3335
                          ->  Index Scan using entity_tags_pkey on public.entity_tags t  (cost=0.28..0.30 rows=1 width=28) (actual time=0.002..0.002 rows=1 loops=3000)
                                Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                                Index Cond: ((t.entity_id = monthly_data.entity_id) AND (t.from_day <= monthly_data.month))
                                Filter: (monthly_data.month <= t.to_day)
                                Rows Removed by Filter: 0
                                Buffers: shared hit=9991 read=13
                                I/O Timings: shared read=0.697
                                Worker 0:  actual time=0.002..0.002 rows=1 loops=1000
                                  Buffers: shared hit=3335
                                Worker 1:  actual time=0.001..0.002 rows=1 loops=1000
                                  Buffers: shared hit=3335


PostgreSQL 18 plan (no partition pruning):
HashAggregate  (cost=229746.36..242370.87 rows=12200 width=72) (actual time=1621.794..2508.533 rows=2262361.00 loops=1)
  Output: monthly_data.external_ref, t.tag_1, t.tag_2, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Hash Key: monthly_data.external_ref, t.tag_1
  Hash Key: monthly_data.external_ref
  Hash Key: monthly_data.external_ref, t.tag_2
  Batches: 13  Memory Usage: 54433kB  Disk Usage: 250536kB
  Buffers: shared hit=66216, temp read=31017 written=58146
  I/O Timings: temp read=29.524 write=118.672
  ->  Gather  (cost=1050.51..222800.52 rows=555667 width=60) (actual time=93.721..192.443 rows=3000000.00 loops=1)
        Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=66216
        ->  Hash Join  (cost=50.51..166233.82 rows=231528 width=60) (actual time=63.866..320.145 rows=1000000.00 loops=3)
              Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
              Hash Cond: (monthly_data.entity_id = t.entity_id)
              Join Filter: ((monthly_data.month >= t.from_day) AND (monthly_data.month <= t.to_day))
              Rows Removed by Join Filter: 667154
              Buffers: shared hit=66216
              Worker 0:  actual time=0.852..768.416 rows=2995648.00 loops=1
                Buffers: shared hit=22040
              Worker 1:  actual time=97.229..97.847 rows=2176.00 loops=1
                Buffers: shared hit=22088
              ->  Parallel Append  (cost=0.00..128677.01 rows=1250002 width=52) (actual time=63.442..158.419 rows=1000000.00 loops=3)
                    Buffers: shared hit=66177
                    Worker 0:  actual time=0.032..284.520 rows=2995648.00 loops=1
                      Buffers: shared hit=22027
                    Worker 1:  actual time=96.963..97.184 rows=2176.00 loops=1
                      Buffers: shared hit=22075
                    ->  Parallel Seq Scan on public.monthly_data_202601 monthly_data_1  (cost=0.00..40809.00 rows=1 width=52) (actual time=96.957..96.957 rows=0.00 loops=1)
                          Output: monthly_data_1.external_ref, monthly_data_1.duration, monthly_data_1.counter, monthly_data_1.amount, monthly_data_1.entity_id, monthly_data_1.month
                          Filter: ((monthly_data_1.month >= '2026-02-01'::date) AND (monthly_data_1.month <= '2026-02-28'::date))
                          Rows Removed by Filter: 3000000
                          Buffers: shared hit=22059
                          Worker 1:  actual time=96.957..96.957 rows=0.00 loops=1
                            Buffers: shared hit=22059
                    ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data_2  (cost=0.00..40809.00 rows=1250000 width=52) (actual time=0.013..62.957 rows=1000000.00 loops=3)
                          Output: monthly_data_2.external_ref, monthly_data_2.duration, monthly_data_2.counter, monthly_data_2.amount, monthly_data_2.entity_id, monthly_data_2.month
                          Filter: ((monthly_data_2.month >= '2026-02-01'::date) AND (monthly_data_2.month <= '2026-02-28'::date))
                          Buffers: shared hit=22059
                          Worker 0:  actual time=0.032..188.573 rows=2995648.00 loops=1
                            Buffers: shared hit=22027
                          Worker 1:  actual time=0.005..0.153 rows=2176.00 loops=1
                            Buffers: shared hit=16
                    ->  Parallel Seq Scan on public.monthly_data_202603 monthly_data_3  (cost=0.00..40809.00 rows=1 width=52) (actual time=93.328..93.328 rows=0.00 loops=1)
                          Output: monthly_data_3.external_ref, monthly_data_3.duration, monthly_data_3.counter, monthly_data_3.amount, monthly_data_3.entity_id, monthly_data_3.month
                          Filter: ((monthly_data_3.month >= '2026-02-01'::date) AND (monthly_data_3.month <= '2026-02-28'::date))
                          Rows Removed by Filter: 3000000
                          Buffers: shared hit=22059
              ->  Hash  (cost=29.67..29.67 rows=1667 width=28) (actual time=0.412..0.412 rows=1667.00 loops=3)
                    Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                    Buckets: 2048  Batches: 1  Memory Usage: 120kB
                    Buffers: shared hit=39
                    Worker 0:  actual time=0.807..0.807 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    Worker 1:  actual time=0.248..0.248 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    ->  Seq Scan on public.entity_tags t  (cost=0.00..29.67 rows=1667 width=28) (actual time=0.058..0.222 rows=1667.00 loops=3)
                          Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                          Buffers: shared hit=39
                          Worker 0:  actual time=0.104..0.435 rows=1667.00 loops=1
                            Buffers: shared hit=13
                          Worker 1:  actual time=0.058..0.137 rows=1667.00 loops=1
                            Buffers: shared hit=13

On PG18, replacing the monthly_data_view and month condition with a sub-query or CTE restores partition pruning:
with m as (
select * from monthly_data where month between '2026-02-01'::date and '2026-02-28'::date
)
select m.external_ref, t.tag_1, t.tag_2, sum(m.duration) as duration, sum(m.counter) as counter, sum(m.amount) as amount
from m
join entity_tags t on m.entity_id = t.entity_id and m.month between t.from_day and t.to_day
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);
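
The inline sub-query form is equivalent and also restores pruning in our tests; the plan shown next is for the CTE version:

```sql
-- Inline sub-query variant of the CTE above; this also restores
-- pruning on PG18 in our tests.
select m.external_ref, t.tag_1, t.tag_2,
       sum(m.duration) as duration, sum(m.counter) as counter,
       sum(m.amount) as amount
from (
  select * from monthly_data
  where month between '2026-02-01'::date and '2026-02-28'::date
) m
join entity_tags t
  on m.entity_id = t.entity_id
 and m.month between t.from_day and t.to_day
group by m.external_ref, grouping sets ((), t.tag_1, t.tag_2);
```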
HashAggregate  (cost=141878.30..154502.80 rows=12200 width=72) (actual time=1583.549..2502.394 rows=2262361.00 loops=1)
  Output: monthly_data.external_ref, t.tag_1, t.tag_2, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Hash Key: monthly_data.external_ref, t.tag_1
  Hash Key: monthly_data.external_ref
  Hash Key: monthly_data.external_ref, t.tag_2
  Batches: 13  Memory Usage: 54433kB  Disk Usage: 250552kB
  Buffers: shared hit=22098, temp read=31016 written=58135
  I/O Timings: temp read=27.912 write=116.172
  ->  Gather  (cost=1050.51..134932.46 rows=555667 width=60) (actual time=1.314..105.099 rows=3000000.00 loops=1)
        Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=22098
        ->  Hash Join  (cost=50.51..78365.76 rows=231528 width=60) (actual time=0.783..239.677 rows=1000000.00 loops=3)
              Output: monthly_data.external_ref, t.tag_1, t.tag_2, monthly_data.duration, monthly_data.counter, monthly_data.amount
              Hash Cond: (monthly_data.entity_id = t.entity_id)
              Join Filter: ((monthly_data.month >= t.from_day) AND (monthly_data.month <= t.to_day))
              Rows Removed by Join Filter: 667154
              Buffers: shared hit=22098
              Worker 0:  actual time=0.726..698.352 rows=2969536.00 loops=1
                Buffers: shared hit=21848
              Worker 1:  actual time=0.653..16.148 rows=26112.00 loops=1
                Buffers: shared hit=205
              ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data  (cost=0.00..40809.00 rows=1250000 width=52) (actual time=0.022..68.714 rows=1000000.00 loops=3)
                    Output: monthly_data.external_ref, monthly_data.duration, monthly_data.counter, monthly_data.amount, monthly_data.entity_id, monthly_data.month
                    Filter: ((monthly_data.month >= '2026-02-01'::date) AND (monthly_data.month <= '2026-02-28'::date))
                    Buffers: shared hit=22059
                    Worker 0:  actual time=0.030..199.783 rows=2969536.00 loops=1
                      Buffers: shared hit=21835
                    Worker 1:  actual time=0.023..5.233 rows=26112.00 loops=1
                      Buffers: shared hit=192
              ->  Hash  (cost=29.67..29.67 rows=1667 width=28) (actual time=0.749..0.749 rows=1667.00 loops=3)
                    Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                    Buckets: 2048  Batches: 1  Memory Usage: 120kB
                    Buffers: shared hit=39
                    Worker 0:  actual time=0.679..0.679 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    Worker 1:  actual time=0.621..0.622 rows=1667.00 loops=1
                      Buffers: shared hit=13
                    ->  Seq Scan on public.entity_tags t  (cost=0.00..29.67 rows=1667 width=28) (actual time=0.058..0.388 rows=1667.00 loops=3)
                          Output: t.tag_1, t.tag_2, t.entity_id, t.from_day, t.to_day
                          Buffers: shared hit=39
                          Worker 0:  actual time=0.092..0.420 rows=1667.00 loops=1
                            Buffers: shared hit=13
                          Worker 1:  actual time=0.072..0.321 rows=1667.00 loops=1
                            Buffers: shared hit=13


On PG18, pruning is also restored by keeping the view but performing a single "group by" instead of multiple grouping sets:
select t.tag_1, sum(m.duration) as duration, sum(m.counter) as counter, sum(m.amount) as amount
from monthly_data_view m
join entity_tags t on m.entity_id = t.entity_id and m.month between t.from_day and t.to_day
where m.month between '2026-02-01'::date and '2026-02-28'::date
group by t.tag_1;

Finalize GroupAggregate  (cost=81682.97..81698.65 rows=50 width=32) (actual time=356.116..358.029 rows=50.00 loops=1)
  Output: t.tag_1, sum(monthly_data.duration), sum(monthly_data.counter), sum(monthly_data.amount)
  Group Key: t.tag_1
  Buffers: shared hit=22114
  ->  Gather Merge  (cost=81682.97..81696.95 rows=120 width=32) (actual time=356.111..358.009 rows=150.00 loops=1)
        Output: t.tag_1, (PARTIAL sum(monthly_data.duration)), (PARTIAL sum(monthly_data.counter)), (PARTIAL sum(monthly_data.amount))
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=22114
        ->  Sort  (cost=80682.95..80683.07 rows=50 width=32) (actual time=349.568..349.570 rows=50.00 loops=3)
              Output: t.tag_1, (PARTIAL sum(monthly_data.duration)), (PARTIAL sum(monthly_data.counter)), (PARTIAL sum(monthly_data.amount))
              Sort Key: t.tag_1
              Sort Method: quicksort  Memory: 27kB
              Buffers: shared hit=22114
              Worker 0:  actual time=346.658..346.660 rows=50.00 loops=1
                Sort Method: quicksort  Memory: 27kB
                Buffers: shared hit=7385
              Worker 1:  actual time=346.663..346.665 rows=50.00 loops=1
                Sort Method: quicksort  Memory: 27kB
                Buffers: shared hit=7235
              ->  Partial HashAggregate  (cost=80681.04..80681.54 rows=50 width=32) (actual time=349.530..349.533 rows=50.00 loops=3)
                    Output: t.tag_1, PARTIAL sum(monthly_data.duration), PARTIAL sum(monthly_data.counter), PARTIAL sum(monthly_data.amount)
                    Group Key: t.tag_1
                    Batches: 1  Memory Usage: 32kB
                    Buffers: shared hit=22098
                    Worker 0:  actual time=346.608..346.611 rows=50.00 loops=1
                      Batches: 1  Memory Usage: 32kB
                      Buffers: shared hit=7377
                    Worker 1:  actual time=346.615..346.618 rows=50.00 loops=1
                      Batches: 1  Memory Usage: 32kB
                      Buffers: shared hit=7227
                    ->  Hash Join  (cost=50.51..78365.76 rows=231528 width=20) (actual time=0.936..260.236 rows=1000000.00 loops=3)
                          Output: t.tag_1, monthly_data.duration, monthly_data.counter, monthly_data.amount
                          Hash Cond: (monthly_data.entity_id = t.entity_id)
                          Join Filter: ((monthly_data.month >= t.from_day) AND (monthly_data.month <= t.to_day))
                          Rows Removed by Join Filter: 667154
                          Buffers: shared hit=22098
                          Worker 0:  actual time=1.031..261.125 rows=1001480.00 loops=1
                            Buffers: shared hit=7377
                          Worker 1:  actual time=0.947..259.326 rows=981104.00 loops=1
                            Buffers: shared hit=7227
                          ->  Parallel Seq Scan on public.monthly_data_202602 monthly_data  (cost=0.00..40809.00 rows=1250000 width=20) (actual time=0.027..79.622 rows=1000000.00 loops=3)
                                Output: monthly_data.duration, monthly_data.counter, monthly_data.amount, monthly_data.entity_id, monthly_data.month
                                Filter: ((monthly_data.month >= '2026-02-01'::date) AND (monthly_data.month <= '2026-02-28'::date))
                                Buffers: shared hit=22059
                                Worker 0:  actual time=0.030..80.173 rows=1001480.00 loops=1
                                  Buffers: shared hit=7364
                                Worker 1:  actual time=0.031..82.531 rows=981104.00 loops=1
                                  Buffers: shared hit=7214
                          ->  Hash  (cost=29.67..29.67 rows=1667 width=20) (actual time=0.895..0.895 rows=1667.00 loops=3)
                                Output: t.tag_1, t.entity_id, t.from_day, t.to_day
                                Buckets: 2048  Batches: 1  Memory Usage: 106kB
                                Buffers: shared hit=39
                                Worker 0:  actual time=0.983..0.983 rows=1667.00 loops=1
                                  Buffers: shared hit=13
                                Worker 1:  actual time=0.898..0.898 rows=1667.00 loops=1
                                  Buffers: shared hit=13
                                ->  Seq Scan on public.entity_tags t  (cost=0.00..29.67 rows=1667 width=20) (actual time=0.081..0.542 rows=1667.00 loops=3)
                                      Output: t.tag_1, t.entity_id, t.from_day, t.to_day
                                      Buffers: shared hit=39
                                      Worker 0:  actual time=0.118..0.540 rows=1667.00 loops=1
                                        Buffers: shared hit=13
                                      Worker 1:  actual time=0.117..0.483 rows=1667.00 loops=1
                                        Buffers: shared hit=13
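
One caveat for anyone reproducing this: plan-time pruning can also be switched off by configuration, so it is worth confirming the setting before comparing plans across versions:

```sql
-- Plan-time partition pruning is controlled by this GUC (default: on).
-- If it is off, none of the plans above will prune, on any version.
show enable_partition_pruning;
```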
