Re: A reloption for partitioned tables - parallel_workers - Mailing list pgsql-hackers
From | Seamus Abshere |
---|---|
Subject | Re: A reloption for partitioned tables - parallel_workers |
Date | |
Msg-id | 1139f8a8-c6ca-4168-b5dd-1314bae37340@www.fastmail.com Whole thread Raw |
In response to | Re: A reloption for partitioned tables - parallel_workers (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: A reloption for partitioned tables - parallel_workers
RE: A reloption for partitioned tables - parallel_workers |
List | pgsql-hackers |
hi, Here we go, my first patch... solves https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com Best, Seamus On Mon, Feb 15, 2021, at 3:53 AM, Amit Langote wrote: > Hi Seamus, > > On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus@abshere.net> wrote: > > It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if they're madeup of fancy column store partitions (see https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com). > > > > Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck: > > > > diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c > > index cd3fdd259c..f1ade035ac 100644 > > --- a/src/backend/optimizer/path/allpaths.c > > +++ b/src/backend/optimizer/path/allpaths.c > > @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages, > > * If the user has set the parallel_workers reloption, use that; otherwise > > * select a default number of workers. > > */ > > + // I want to affect this > > if (rel->rel_parallel_workers != -1) > > parallel_workers = rel->rel_parallel_workers; > > else > > > > so I do this > > > > diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c > > index c687d3ee9e..597b209bfb 100644 > > --- a/src/backend/access/common/reloptions.c > > +++ b/src/backend/access/common/reloptions.c > > @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum options, bool validate) > > bytea * > > partitioned_table_reloptions(Datum reloptions, bool validate) > > { > > - /* > > - * There are no options for partitioned tables yet, but this is able to do > > - * some validation. > > - */ > > + static const relopt_parse_elt tab[] = { > > + {"parallel_workers", RELOPT_TYPE_INT, > > + offsetof(StdRdOptions, parallel_workers)}, > > + }; > > + > > return (bytea *) build_reloptions(reloptions, validate, > > RELOPT_KIND_PARTITIONED, > > - 0, NULL, 0); > > + sizeof(StdRdOptions), > > + tab, lengthof(tab)); > > } > > > > That "works": > > > > postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 33); > > ALTER TABLE > > postgres=# select relname, relkind, reloptions from pg_class where relname = 'test_3pd_cstore_partitioned'; > > relname | relkind | reloptions > > -----------------------------+---------+----------------------- > > test_3pd_cstore_partitioned | p | {parallel_workers=33} > > (1 row) > > > > But it seems to be ignored: > > > > diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c > > index cd3fdd259c..c68835ce38 100644 > > --- a/src/backend/optimizer/path/allpaths.c > > +++ b/src/backend/optimizer/path/allpaths.c > > @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages, > > * If the user has set the parallel_workers reloption, use that; otherwise > > * select a default number of workers. > > */ > > + // I want to affect this, but this assertion always passes > > + Assert(rel->rel_parallel_workers == -1) > > if (rel->rel_parallel_workers != -1) > > parallel_workers = rel->rel_parallel_workers; > > else > > You may see by inspecting the callers of compute_parallel_worker() > that it never gets called on a partitioned table, only its leaf > partitions. Maybe you could try calling compute_parallel_worker() > somewhere in add_paths_to_append_rel(), which has this code to figure > out parallel_workers to use for a parallel Append path for a given > partitioned table: > > /* Find the highest number of workers requested for any subpath. */ > foreach(lc, partial_subpaths) > { > Path *path = lfirst(lc); > > parallel_workers = Max(parallel_workers, path->parallel_workers); > } > Assert(parallel_workers > 0); > > /* > * If the use of parallel append is permitted, always request at least > * log2(# of children) workers. We assume it can be useful to have > * extra workers in this case because they will be spread out across > * the children. The precise formula is just a guess, but we don't > * want to end up with a radically different answer for a table with N > * partitions vs. an unpartitioned table with the same data, so the > * use of some kind of log-scaling here seems to make some sense. > */ > if (enable_parallel_append) > { > parallel_workers = Max(parallel_workers, > fls(list_length(live_childrels))); > parallel_workers = Min(parallel_workers, > max_parallel_workers_per_gather); > } > Assert(parallel_workers > 0); > > /* Generate a partial append path. */ > appendpath = create_append_path(root, rel, NIL, partial_subpaths, > NIL, NULL, parallel_workers, > enable_parallel_append, > -1); > > Note that the 'rel' in this code refers to the partitioned table for > which an Append path is being considered, so compute_parallel_worker() > using that 'rel' would use the partitioned table's > rel_parallel_workers as you are trying to do. > > -- > Amit Langote > EDB: http://www.enterprisedb.com >
Attachment
pgsql-hackers by date: