Re: A reloption for partitioned tables - parallel_workers - Mailing list pgsql-hackers
From | Laurenz Albe |
---|---|
Subject | Re: A reloption for partitioned tables - parallel_workers |
Date | |
Msg-id | 05f271ec2076d651e98f95fd9fc63784e4f34a57.camel@cybertec.at |
In response to | Re: A reloption for partitioned tables - parallel_workers (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: A reloption for partitioned tables - parallel_workers
|
List | pgsql-hackers |
On Wed, 2021-03-24 at 14:14 +1300, David Rowley wrote:
> On Fri, 19 Mar 2021 at 02:07, Amit Langote <amitlangote09@gmail.com> wrote:
> > Attached a new version rebased over c8f78b616, with the grouping
> > relation partitioning enhancements as a separate patch 0001. Sorry
> > about the delay.
>
> I had a quick look at this and wondered if the partitioned table's
> parallel workers shouldn't be limited to the sum of the parallel
> workers of the Append's subpaths?
>
> It seems a bit weird to me that the following case requests 4 workers:
>
> # create table lp (a int) partition by list(a);
> # create table lp1 partition of lp for values in(1);
> # insert into lp select 1 from generate_series(1,10000000) x;
> # alter table lp1 set (parallel_workers = 2);
> # alter table lp set (parallel_workers = 4);
> # set max_parallel_workers_per_Gather = 8;
> # explain select count(*) from lp;
>                                         QUERY PLAN
> -------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=97331.63..97331.64 rows=1 width=8)
>    ->  Gather  (cost=97331.21..97331.62 rows=4 width=8)
>          Workers Planned: 4
>          ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
>                ->  Parallel Seq Scan on lp1 lp  (cost=0.00..85914.57 rows=4166657 width=0)
> (5 rows)
>
> I can see a good argument that there should only be 2 workers here.

Good point, I agree.

> If someone sets the partitioned table's parallel_workers high so that
> they get a large number of workers when no partitions are pruned
> during planning, do they really want the same number of workers in
> queries where a large number of partitions are pruned?
>
> This problem gets a bit more complex in generic plans where the
> planner can't prune anything but run-time pruning prunes many
> partitions. I'm not so sure what to do about that, but the problem
> does exist today to a lesser extent with the current method of
> determining the append parallel workers.

Also a good point. That would require changing the actual number of
parallel workers at execution time, but that is tricky.

If we go with your suggestion above, we'd have to disambiguate if the
number of workers is set because a partition is large enough to warrant
a parallel scan (then it shouldn't be reduced if the executor prunes
partitions) or if it is because of the number of partitions (then it
should be reduced).

Currently, we don't reduce parallelism if the executor prunes partitions,
so this could be seen as an independent problem.

I don't know if Seamus is still working on that; if not, we might mark it
as "returned with feedback".

Perhaps Amit's patch 0001 should go in independently.

I'll mark the patch as "waiting for author".

Yours,
Laurenz Albe
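
For illustration only, the capping rule David suggests can be sketched as follows. This is not code from the patch or from the PostgreSQL planner; the function name `capped_append_workers` and its parameters are hypothetical, and the per-subpath worker counts are simply taken as given inputs. The sketch assumes the partitioned table's `parallel_workers` setting is limited both by the sum of the workers of the Append's remaining subpaths and by `max_parallel_workers_per_gather`.

```python
def capped_append_workers(parent_workers, subpath_workers, max_per_gather):
    """Hypothetical helper, not a PostgreSQL function.

    parent_workers:  parallel_workers set on the partitioned table
    subpath_workers: workers of each Append subpath that survives pruning
    max_per_gather:  value of max_parallel_workers_per_gather
    """
    # Cap the parent's setting by the sum over the remaining subpaths,
    # then by the per-Gather limit.
    cap = sum(subpath_workers)
    return min(parent_workers, cap, max_per_gather)


# Values from the example in the thread: lp has parallel_workers = 4,
# its only partition lp1 has parallel_workers = 2, and
# max_parallel_workers_per_gather = 8.
print(capped_append_workers(4, [2], 8))  # prints 2
```

Under these assumptions the example from the thread would request 2 workers rather than 4. The sketch only covers pruning at plan time; it does not address the run-time pruning case discussed above, where the worker count would have to change during execution.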