Re: A reloption for partitioned tables - parallel_workers - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: A reloption for partitioned tables - parallel_workers |
Date | |
Msg-id | CA+HiwqHMmU=GwUvpWEScQJzDgSF9-voZco8C5ttX_BzYqEuB6w@mail.gmail.com |
In response to | Re: A reloption for partitioned tables - parallel_workers (Laurenz Albe <laurenz.albe@cybertec.at>) |
Responses | Re: A reloption for partitioned tables - parallel_workers |
List | pgsql-hackers |
On Fri, Apr 2, 2021 at 11:36 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Wed, 2021-03-24 at 14:14 +1300, David Rowley wrote:
> > On Fri, 19 Mar 2021 at 02:07, Amit Langote <amitlangote09@gmail.com> wrote:
> > > Attached a new version rebased over c8f78b616, with the grouping
> > > relation partitioning enhancements as a separate patch 0001.  Sorry
> > > about the delay.
> >
> > I had a quick look at this and wondered if the partitioned table's
> > parallel workers shouldn't be limited to the sum of the parallel
> > workers of the Append's subpaths?
> >
> > It seems a bit weird to me that the following case requests 4 workers:
> >
> > # create table lp (a int) partition by list(a);
> > # create table lp1 partition of lp for values in(1);
> > # insert into lp select 1 from generate_series(1,10000000) x;
> > # alter table lp1 set (parallel_workers = 2);
> > # alter table lp set (parallel_workers = 4);
> > # set max_parallel_workers_per_Gather = 8;
> > # explain select count(*) from lp;
> >                                         QUERY PLAN
> > -------------------------------------------------------------------------------------------
> >  Finalize Aggregate  (cost=97331.63..97331.64 rows=1 width=8)
> >    ->  Gather  (cost=97331.21..97331.62 rows=4 width=8)
> >          Workers Planned: 4
> >          ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
> >                ->  Parallel Seq Scan on lp1 lp  (cost=0.00..85914.57 rows=4166657 width=0)
> > (5 rows)
> >
> > I can see a good argument that there should only be 2 workers here.
>
> Good point, I agree.
>
> > If someone sets the partitioned table's parallel_workers high so that
> > they get a large number of workers when no partitions are pruned
> > during planning, do they really want the same number of workers in
> > queries where a large number of partitions are pruned?
> >
> > This problem gets a bit more complex in generic plans where the
> > planner can't prune anything but run-time pruning prunes many
> > partitions. I'm not so sure what to do about that, but the problem
> > does exist today to a lesser extent with the current method of
> > determining the append parallel workers.
>
> Also a good point.  That would require changing the actual number of
> parallel workers at execution time, but that is tricky.
> If we go with your suggestion above, we'd have to disambiguate if
> the number of workers is set because a partition is large enough
> to warrant a parallel scan (then it shouldn't be reduced if the executor
> prunes partitions) or if it is because of the number of partitions
> (then it should be reduced).

Maybe we really want a parallel_append_workers for partitioned tables,
instead of piggybacking on parallel_workers?

> I don't know if Seamus is still working on that; if not, we might
> mark it as "returned with feedback".

I have to agree given the time left.

> Perhaps Amit's patch 0001 should go in independently.

Perhaps, but maybe we should wait until something really needs that.

--
Amit Langote
EDB: http://www.enterprisedb.com
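
For illustration only, the capping rule David suggests (limit the
partitioned table's workers to the sum of the workers of the Append's
subpaths) can be sketched in a few lines of Python.  This is not
PostgreSQL code: the function name append_workers and its parameters are
hypothetical, and the behaviour when the parent has no reloption is an
assumption made for the sketch, since the actual planner logic is more
involved.

```python
from typing import Optional, Sequence


def append_workers(parent_workers: Optional[int],
                   subpath_workers: Sequence[int],
                   max_per_gather: int) -> int:
    """Hypothetical illustration of the cap proposed in the thread.

    parent_workers  -- parallel_workers reloption on the partitioned
                       table, or None if it is not set
    subpath_workers -- workers requested by each surviving (unpruned)
                       partition under the Append
    max_per_gather  -- value of max_parallel_workers_per_gather
    """
    total = sum(subpath_workers)
    # Assumption for this sketch: with no reloption on the parent,
    # fall back to the sum of the subpaths' workers.
    if parent_workers is None:
        requested = total
    else:
        # The proposed rule: never request more than the subpaths can use.
        requested = min(parent_workers, total)
    # The per-Gather limit always applies last.
    return min(requested, max_per_gather)


# The case from the thread: lp has parallel_workers = 4, its only
# partition lp1 has parallel_workers = 2, and the per-Gather limit is 8.
print(append_workers(4, [2], 8))  # prints 2
```

With three unpruned partitions requesting 2 workers each, the same call
would return the parent's setting of 4.  Note that this sketch does not
address the run-time pruning case raised above, because the number of
surviving partitions is only known at execution time.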