BUG #18109: enable_partitionwise_aggregate does not apply when inserting or updating

From PG Bug reporting form
Subject BUG #18109: enable_partitionwise_aggregate does not apply when inserting or updating
Date
Msg-id 18109-d3ff785fb25ad31e@postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18109
Logged by:          Stephen Flavin
Email address:      stephen_flav@hotmail.com
PostgreSQL version: 15.3
Operating system:   macOS Ventura 13.4.1
Description:

I have a partitioned table which contains some raw data that I need to
aggregate and upsert into another table (partitioned by the same key).
The plan is to aggregate the raw table at a regular interval: when the
aggregation runs, a procedure renames the existing raw table and creates a
new empty one in its place, which cuts off writes to the old data and allows
the apps to continue writing while the aggregation and upsert are running.
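
A minimal sketch of that swap step (illustrative only; the real procedure
also recreates the hash partitions and the `datapoints_old` name is just for
the example):
```
-- illustrative sketch of the rename-and-recreate swap described above
begin;
alter table datapoints rename to datapoints_old;
create unlogged table datapoints (like datapoints_old) partition by hash (id);
-- ...recreate the 16 hash partitions of the new datapoints here...
commit;
-- apps keep writing to the fresh datapoints while the aggregation
-- and upsert read from datapoints_old
```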

The problem is that the procedure running the aggregates takes too long, so
too much lag builds up and the procedure gets slower and slower over time.

I've been looking at `enable_partitionwise_aggregate`, which significantly
speeds up the aggregation. However, it seems that as soon as I combine the
select that normally launches the parallel partitionwise aggregation with an
insert, the plan is forced to be sequential, even if I wrap the aggregation
query in a materialised CTE.
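
The materialised-CTE form I mean is roughly this (sketch; the insert is
still planned with a plain sequential Append):
```
-- sketch: wrapping the aggregate in a materialised CTE does not help;
-- the insert is still planned sequentially
explain analyse
with agg as materialized (
    select id, tag, day, sum(count) as count
    from datapoints
    group by id, tag, day
)
insert into aggregated_datapoints
select id, tag, day, count from agg;
```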

reproduction steps:
```
create unlogged table datapoints (id bigint, tag text, day date, count bigint)
    partition by hash (id);
create unlogged table aggregated_datapoints (like datapoints, primary key (id, tag, day))
    partition by hash (id);

-- generate partitions
do
$do$
declare
    i smallint;
begin
    for i in select generate_series(0, 15)
    loop
        execute $$
            create unlogged table datapoints_$$ || i || $$
            partition of datapoints
            for values with (modulus 16, remainder $$ || i || $$);

            create unlogged table aggregated_datapoints_$$ || i || $$
            partition of aggregated_datapoints
            for values with (modulus 16, remainder $$ || i || $$);
        $$;
    end loop;
end
$do$;

set work_mem = '128MB';
set force_parallel_mode = 'on';
set max_parallel_workers = 16;
-- max_worker_processes = 16 (set in postgresql.conf and restarted)
set max_parallel_workers_per_gather = 16;
SET parallel_leader_participation = 'off';
set enable_partitionwise_aggregate = 'on';

begin;
-- seed data
insert into datapoints
select
    (random()*(16*10))::int,
    ((random()*(16*10))::int)::text,
    to_timestamp((random()*16)::int)::date,
    (random()*100)::int
from generate_series(1, 1000000 * 16); -- 1M per partition
-- INSERT 0 16000000
-- Time: 7395.269 ms (00:07.395)

explain analyse select
  id,
  tag,
  day,
  sum(count)
from datapoints
group by id, tag, day;
--  Gather  (cost=24216.48..172624.93 rows=1039570 width=76) (actual time=250.718..607.141 rows=25921 loops=1)
--    Workers Planned: 5
--    Workers Launched: 5
--    ->  Parallel Append  (cost=23216.48..67667.93 rows=207914 width=76) (actual time=258.961..571.139 rows=5184 loops=5)
--  Planning Time: 1.651 ms
--  Execution Time: 623.554 ms

rollback; -- to ensure there's no caching going on

begin;
-- seed data
insert into datapoints
select
    (random()*(16*10))::int,
    ((random()*(16*10))::int)::text,
    to_timestamp((random()*16)::int)::date,
    (random()*100)::int
from generate_series(1, 1000000 * 16); -- 1M per partition
-- INSERT 0 16000000
-- Time: 7395.269 ms (00:07.395)

explain analyse insert into aggregated_datapoints
select
  id,
  tag,
  day,
  sum(count)
from datapoints
group by id, tag, day;
--  Insert on aggregated_datapoints  (cost=12093.01..314337.65 rows=0 width=0) (actual time=6973.353..6973.361 rows=0 loops=1)
--    ->  Subquery Scan on "*SELECT*"  (cost=12093.01..314337.65 rows=516734 width=52) (actual time=379.111..6939.722 rows=25921 loops=1)
--          ->  Append  (cost=12093.01..307878.48 rows=516734 width=76) (actual time=379.110..6937.932 rows=25921 loops=1)
--  Planning Time: 0.166 ms
--  Execution Time: 7001.725 ms
rollback;
```
Two side questions here:
 1. I can't seem to get the parallel aggregation to use all 16 available
workers; is there some additional config I need to bump so that more workers
are planned?
 2. The best scenario would be for the insert itself to be partitionwise,
since both the raw and aggregated tables have the same keys in each
partition, but I'm not sure if that's possible without manually running the
inserts in parallel on each partition (roughly the sketch below)?
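
By running the inserts manually I mean roughly this, repeated for each of
the 16 partition pairs from concurrent sessions (sketch only):
```
-- sketch: one of 16 per-partition upserts, each run in its own session
insert into aggregated_datapoints_0 as a
select id, tag, day, sum(count)
from datapoints_0
group by id, tag, day
on conflict (id, tag, day) do update
    set count = a.count + excluded.count;
```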

