Thread: A reloption for partitioned tables - parallel_workers

A reloption for partitioned tables - parallel_workers

From
"Seamus Abshere"
Date:
hi,

It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if they're
made up of fancy column store partitions (see
https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).

Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index cd3fdd259c..f1ade035ac 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
      * If the user has set the parallel_workers reloption, use that; otherwise
      * select a default number of workers.
      */
+    // I want to affect this
     if (rel->rel_parallel_workers != -1)
         parallel_workers = rel->rel_parallel_workers;
     else

so I do this

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index c687d3ee9e..597b209bfb 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum options, bool validate)
 bytea *
 partitioned_table_reloptions(Datum reloptions, bool validate)
 {
-    /*
-     * There are no options for partitioned tables yet, but this is able to do
-     * some validation.
-     */
+    static const relopt_parse_elt tab[] = {
+        {"parallel_workers", RELOPT_TYPE_INT,
+        offsetof(StdRdOptions, parallel_workers)},
+    };
+
     return (bytea *) build_reloptions(reloptions, validate,
                                       RELOPT_KIND_PARTITIONED,
-                                      0, NULL, 0);
+                                      sizeof(StdRdOptions),
+                                      tab, lengthof(tab));
 }

That "works":

postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 33);
ALTER TABLE
postgres=# select relname, relkind, reloptions from pg_class where relname = 'test_3pd_cstore_partitioned';
           relname           | relkind |      reloptions       
-----------------------------+---------+-----------------------
 test_3pd_cstore_partitioned | p       | {parallel_workers=33}
(1 row)

But it seems to be ignored:

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index cd3fdd259c..c68835ce38 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
      * If the user has set the parallel_workers reloption, use that; otherwise
      * select a default number of workers.
      */
+    // I want to affect this, but this assertion always passes
+    Assert(rel->rel_parallel_workers == -1);
     if (rel->rel_parallel_workers != -1)
         parallel_workers = rel->rel_parallel_workers;
     else

Thanks and please forgive my code pasting etiquette, as this is my first post to pgsql-hackers and I'm not quite sure
what the right format is.

Thank you,
Seamus



Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
Hi Seamus,

On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus@abshere.net> wrote:
> It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if they're
> made up of fancy column store partitions (see
> https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).
>
> Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:
>
> diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
> index cd3fdd259c..f1ade035ac 100644
> --- a/src/backend/optimizer/path/allpaths.c
> +++ b/src/backend/optimizer/path/allpaths.c
> @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
>          * If the user has set the parallel_workers reloption, use that; otherwise
>          * select a default number of workers.
>          */
> +       // I want to affect this
>         if (rel->rel_parallel_workers != -1)
>                 parallel_workers = rel->rel_parallel_workers;
>         else
>
> so I do this
>
> diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
> index c687d3ee9e..597b209bfb 100644
> --- a/src/backend/access/common/reloptions.c
> +++ b/src/backend/access/common/reloptions.c
> @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum options, bool validate)
>  bytea *
>  partitioned_table_reloptions(Datum reloptions, bool validate)
>  {
> -       /*
> -        * There are no options for partitioned tables yet, but this is able to do
> -        * some validation.
> -        */
> +       static const relopt_parse_elt tab[] = {
> +               {"parallel_workers", RELOPT_TYPE_INT,
> +               offsetof(StdRdOptions, parallel_workers)},
> +       };
> +
>         return (bytea *) build_reloptions(reloptions, validate,
>                                                                           RELOPT_KIND_PARTITIONED,
> -                                                                         0, NULL, 0);
> +                                                                         sizeof(StdRdOptions),
> +                                                                         tab, lengthof(tab));
>  }
>
> That "works":
>
> postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 33);
> ALTER TABLE
> postgres=# select relname, relkind, reloptions from pg_class where relname = 'test_3pd_cstore_partitioned';
>            relname           | relkind |      reloptions
> -----------------------------+---------+-----------------------
>  test_3pd_cstore_partitioned | p       | {parallel_workers=33}
> (1 row)
>
> But it seems to be ignored:
>
> diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
> index cd3fdd259c..c68835ce38 100644
> --- a/src/backend/optimizer/path/allpaths.c
> +++ b/src/backend/optimizer/path/allpaths.c
> @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
>          * If the user has set the parallel_workers reloption, use that; otherwise
>          * select a default number of workers.
>          */
> +       // I want to affect this, but this assertion always passes
> > +       Assert(rel->rel_parallel_workers == -1);
>         if (rel->rel_parallel_workers != -1)
>                 parallel_workers = rel->rel_parallel_workers;
>         else

You may see by inspecting the callers of compute_parallel_worker()
that it never gets called on a partitioned table, only its leaf
partitions.  Maybe you could try calling compute_parallel_worker()
somewhere in add_paths_to_append_rel(), which has this code to figure
out parallel_workers to use for a parallel Append path for a given
partitioned table:

        /* Find the highest number of workers requested for any subpath. */
        foreach(lc, partial_subpaths)
        {
            Path       *path = lfirst(lc);

            parallel_workers = Max(parallel_workers, path->parallel_workers);
        }
        Assert(parallel_workers > 0);

        /*
         * If the use of parallel append is permitted, always request at least
         * log2(# of children) workers.  We assume it can be useful to have
         * extra workers in this case because they will be spread out across
         * the children.  The precise formula is just a guess, but we don't
         * want to end up with a radically different answer for a table with N
         * partitions vs. an unpartitioned table with the same data, so the
         * use of some kind of log-scaling here seems to make some sense.
         */
        if (enable_parallel_append)
        {
            parallel_workers = Max(parallel_workers,
                                   fls(list_length(live_childrels)));
            parallel_workers = Min(parallel_workers,
                                   max_parallel_workers_per_gather);
        }
        Assert(parallel_workers > 0);

        /* Generate a partial append path. */
        appendpath = create_append_path(root, rel, NIL, partial_subpaths,
                                        NIL, NULL, parallel_workers,
                                        enable_parallel_append,
                                        -1);

Note that the 'rel' in this code refers to the partitioned table for
which an Append path is being considered, so compute_parallel_worker()
using that 'rel' would use the partitioned table's
rel_parallel_workers as you are trying to do.

--
Amit Langote
EDB: http://www.enterprisedb.com



Re: A reloption for partitioned tables - parallel_workers

From
"Seamus Abshere"
Date:
hi,

Here we go, my first patch... solves
https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com

Best,
Seamus

On Mon, Feb 15, 2021, at 3:53 AM, Amit Langote wrote:
> Hi Seamus,
> 
> On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus@abshere.net> wrote:
> > It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if they're
> > made up of fancy column store partitions (see
> > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).
> >
> > Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:
> >
> > diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
> > index cd3fdd259c..f1ade035ac 100644
> > --- a/src/backend/optimizer/path/allpaths.c
> > +++ b/src/backend/optimizer/path/allpaths.c
> > @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
> >          * If the user has set the parallel_workers reloption, use that; otherwise
> >          * select a default number of workers.
> >          */
> > +       // I want to affect this
> >         if (rel->rel_parallel_workers != -1)
> >                 parallel_workers = rel->rel_parallel_workers;
> >         else
> >
> > so I do this
> >
> > diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
> > index c687d3ee9e..597b209bfb 100644
> > --- a/src/backend/access/common/reloptions.c
> > +++ b/src/backend/access/common/reloptions.c
> > @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum options, bool validate)
> >  bytea *
> >  partitioned_table_reloptions(Datum reloptions, bool validate)
> >  {
> > -       /*
> > -        * There are no options for partitioned tables yet, but this is able to do
> > -        * some validation.
> > -        */
> > +       static const relopt_parse_elt tab[] = {
> > +               {"parallel_workers", RELOPT_TYPE_INT,
> > +               offsetof(StdRdOptions, parallel_workers)},
> > +       };
> > +
> >         return (bytea *) build_reloptions(reloptions, validate,
> >                                                                           RELOPT_KIND_PARTITIONED,
> > -                                                                         0, NULL, 0);
> > +                                                                         sizeof(StdRdOptions),
> > +                                                                         tab, lengthof(tab));
> >  }
> >
> > That "works":
> >
> > postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 33);
> > ALTER TABLE
> > postgres=# select relname, relkind, reloptions from pg_class where relname = 'test_3pd_cstore_partitioned';
> >            relname           | relkind |      reloptions
> > -----------------------------+---------+-----------------------
> >  test_3pd_cstore_partitioned | p       | {parallel_workers=33}
> > (1 row)
> >
> > But it seems to be ignored:
> >
> > diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
> > index cd3fdd259c..c68835ce38 100644
> > --- a/src/backend/optimizer/path/allpaths.c
> > +++ b/src/backend/optimizer/path/allpaths.c
> > @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
> >          * If the user has set the parallel_workers reloption, use that; otherwise
> >          * select a default number of workers.
> >          */
> > +       // I want to affect this, but this assertion always passes
> > > +       Assert(rel->rel_parallel_workers == -1);
> >         if (rel->rel_parallel_workers != -1)
> >                 parallel_workers = rel->rel_parallel_workers;
> >         else
> 
> You may see by inspecting the callers of compute_parallel_worker()
> that it never gets called on a partitioned table, only its leaf
> partitions.  Maybe you could try calling compute_parallel_worker()
> somewhere in add_paths_to_append_rel(), which has this code to figure
> out parallel_workers to use for a parallel Append path for a given
> partitioned table:
> 
>         /* Find the highest number of workers requested for any subpath. */
>         foreach(lc, partial_subpaths)
>         {
>             Path       *path = lfirst(lc);
> 
>             parallel_workers = Max(parallel_workers, path->parallel_workers);
>         }
>         Assert(parallel_workers > 0);
> 
>         /*
>          * If the use of parallel append is permitted, always request at least
>          * log2(# of children) workers.  We assume it can be useful to have
>          * extra workers in this case because they will be spread out across
>          * the children.  The precise formula is just a guess, but we don't
>          * want to end up with a radically different answer for a table with N
>          * partitions vs. an unpartitioned table with the same data, so the
>          * use of some kind of log-scaling here seems to make some sense.
>          */
>         if (enable_parallel_append)
>         {
>             parallel_workers = Max(parallel_workers,
>                                    fls(list_length(live_childrels)));
>             parallel_workers = Min(parallel_workers,
>                                    max_parallel_workers_per_gather);
>         }
>         Assert(parallel_workers > 0);
> 
>         /* Generate a partial append path. */
>         appendpath = create_append_path(root, rel, NIL, partial_subpaths,
>                                         NIL, NULL, parallel_workers,
>                                         enable_parallel_append,
>                                         -1);
> 
> Note that the 'rel' in this code refers to the partitioned table for
> which an Append path is being considered, so compute_parallel_worker()
> using that 'rel' would use the partitioned table's
> rel_parallel_workers as you are trying to do.
> 
> -- 
> Amit Langote
> EDB: http://www.enterprisedb.com
>
Attachment

Re: A reloption for partitioned tables - parallel_workers

From
Laurenz Albe
Date:
On Mon, 2021-02-15 at 17:53 +0900, Amit Langote wrote:
> On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus@abshere.net> wrote:
> > It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables,
> >  at least if they're made up of fancy column store partitions (see
> >  https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).
> > Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:
> 
> You may see by inspecting the callers of compute_parallel_worker()
> that it never gets called on a partitioned table, only its leaf
> partitions.  Maybe you could try calling compute_parallel_worker()
> somewhere in add_paths_to_append_rel(), which has this code to figure
> out parallel_workers to use for a parallel Append path for a given
> partitioned table:
> 
>         /* Find the highest number of workers requested for any subpath. */
>         foreach(lc, partial_subpaths)
>         {
>             Path       *path = lfirst(lc);
> 
>             parallel_workers = Max(parallel_workers, path->parallel_workers);
>         }
>         Assert(parallel_workers > 0);
> 
>         /*
>          * If the use of parallel append is permitted, always request at least
>          * log2(# of children) workers.  We assume it can be useful to have
>          * extra workers in this case because they will be spread out across
>          * the children.  The precise formula is just a guess, but we don't
>          * want to end up with a radically different answer for a table with N
>          * partitions vs. an unpartitioned table with the same data, so the
>          * use of some kind of log-scaling here seems to make some sense.
>          */
>         if (enable_parallel_append)
>         {
>             parallel_workers = Max(parallel_workers,
>                                    fls(list_length(live_childrels)));
>             parallel_workers = Min(parallel_workers,
>                                    max_parallel_workers_per_gather);
>         }
>         Assert(parallel_workers > 0);
> 
> Note that the 'rel' in this code refers to the partitioned table for
> which an Append path is being considered, so compute_parallel_worker()
> using that 'rel' would use the partitioned table's
> rel_parallel_workers as you are trying to do.

Note that there is a second chunk of code quite like that one a few
lines down from there that would also have to be modified.

I am +1 on allowing the degree of parallelism on a parallel append to
be overridden.  If "parallel_workers" on the partitioned table is an
option for that, it might be a simple solution.  On the other hand,
perhaps it would be less confusing to have a different storage
parameter name rather than having "parallel_workers" do double duty.

Also, since there is a design rule that storage parameters can only be used
on partitions, we would have to change that - is that a problem for anybody?


There is another related consideration that doesn't need to be addressed
by this patch, but that is somewhat related: if the executor prunes some
partitions, the degree of parallelism is unaffected, right?
So if the planner decides to use 24 workers for 25 partitions, and the
executor discards all but one of these partition scans, we would end up
with 24 workers scanning a single partition.

I am not sure how that could be improved.

Yours,
Laurenz Albe




Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
Hi Seamus,

Please see my reply below.

On Tue, Feb 16, 2021 at 1:35 AM Seamus Abshere <seamus@abshere.net> wrote:
> On Mon, Feb 15, 2021, at 3:53 AM, Amit Langote wrote:
> > On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus@abshere.net> wrote:
> > > It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if
> > > they're made up of fancy column store partitions (see
> > > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).
> > >
> > > Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:
>
> > You may see by inspecting the callers of compute_parallel_worker()
> > that it never gets called on a partitioned table, only its leaf
> > partitions.  Maybe you could try calling compute_parallel_worker()
> > somewhere in add_paths_to_append_rel(), which has this code to figure
> > out parallel_workers to use for a parallel Append path for a given
> > partitioned table:
> >
> >         /* Find the highest number of workers requested for any subpath. */
> >         foreach(lc, partial_subpaths)
> >         {
> >             Path       *path = lfirst(lc);
> >
> >             parallel_workers = Max(parallel_workers, path->parallel_workers);
> >         }
> >         Assert(parallel_workers > 0);
> >
> >         /*
> >          * If the use of parallel append is permitted, always request at least
> >          * log2(# of children) workers.  We assume it can be useful to have
> >          * extra workers in this case because they will be spread out across
> >          * the children.  The precise formula is just a guess, but we don't
> >          * want to end up with a radically different answer for a table with N
> >          * partitions vs. an unpartitioned table with the same data, so the
> >          * use of some kind of log-scaling here seems to make some sense.
> >          */
> >         if (enable_parallel_append)
> >         {
> >             parallel_workers = Max(parallel_workers,
> >                                    fls(list_length(live_childrels)));
> >             parallel_workers = Min(parallel_workers,
> >                                    max_parallel_workers_per_gather);
> >         }
> >         Assert(parallel_workers > 0);
> >
> >         /* Generate a partial append path. */
> >         appendpath = create_append_path(root, rel, NIL, partial_subpaths,
> >                                         NIL, NULL, parallel_workers,
> >                                         enable_parallel_append,
> >                                         -1);
> >
> > Note that the 'rel' in this code refers to the partitioned table for
> > which an Append path is being considered, so compute_parallel_worker()
> > using that 'rel' would use the partitioned table's
> > rel_parallel_workers as you are trying to do.
>
> Here we go, my first patch... solves
> https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com

Thanks for sending the patch here.

It seems you haven't made enough changes for the reloptions code to
recognize parallel_workers as valid for partitioned tables, because
even with the patch applied, I get this:

create table rp (a int) partition by range (a);
create table rp1 partition of rp for values from (minvalue) to (0);
create table rp2 partition of rp for values from (0) to (maxvalue);
alter table rp set (parallel_workers = 1);
ERROR:  unrecognized parameter "parallel_workers"

You need this:

diff --git a/src/backend/access/common/reloptions.c
b/src/backend/access/common/reloptions.c
index 029a73325e..9eb8a0c10d 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -377,7 +377,7 @@ static relopt_int intRelOpts[] =
        {
            "parallel_workers",
            "Number of parallel processes that can be used per
executor node for this relation.",
-           RELOPT_KIND_HEAP,
+           RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
            ShareUpdateExclusiveLock
        },
        -1, 0, 1024

which tells reloptions parsing code that parallel_workers is
acceptable for both heap and partitioned relations.

Other comments on the patch:

1. This function calling style, where the first argument is not placed
on the same line as the function itself, is not very common in
Postgres:

+       /* First see if there is a root-level setting for parallel_workers */
+       parallel_workers = compute_parallel_worker(
+           rel,
+           -1,
+           -1,
+           max_parallel_workers_per_gather
+       );

This makes the new code look very different from the rest of the
codebase.  Better to stick to existing styles.

2. It might be a good idea to use this opportunity to add a function,
say compute_append_parallel_workers(), for the code that does what the
function name says.  Then the patch will simply add the new
compute_parallel_worker() call at the top of that function.

3. I think we should consider the Append parent relation's
parallel_workers ONLY if it is a partitioned relation, because it
doesn't make a lot of sense for other types of parent relations.  So
the new code should look like this:

if (IS_PARTITIONED_REL(rel))
    parallel_workers = compute_parallel_worker(rel, -1, -1,
                                               max_parallel_workers_per_gather);

4. Maybe it also doesn't make sense to consider the parent relation's
parallel_workers if Parallel Append is disabled
(enable_parallel_append = off).  That's because with a simple
(non-parallel) Append running under Gather, all launched parallel
workers process the same partition before moving to the next one.
OTOH, one's intention of setting parallel_workers on the parent
partitioned table would most likely be to distribute workers across
partitions, which is only possible with parallel Append
(enable_parallel_append = on).  So, the new code should look like
this:

if (IS_PARTITIONED_REL(rel) && enable_parallel_append)
    parallel_workers = compute_parallel_worker(rel, -1, -1,
                                               max_parallel_workers_per_gather);

BTW, please consider bottom-posting like I've done in this reply,
because that makes it easier to follow discussions involving patch
reviews that can potentially take many emails to reach conclusions.





--
Amit Langote
EDB: http://www.enterprisedb.com



Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
Hi,

On Tue, Feb 16, 2021 at 1:06 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Mon, 2021-02-15 at 17:53 +0900, Amit Langote wrote:
> > On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus@abshere.net> wrote:
> > > It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables,
> > >  at least if they're made up of fancy column store partitions (see
> > >  https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com).
> > > Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:
> >
> > You may see by inspecting the callers of compute_parallel_worker()
> > that it never gets called on a partitioned table, only its leaf
> > partitions.  Maybe you could try calling compute_parallel_worker()
> > somewhere in add_paths_to_append_rel(), which has this code to figure
> > out parallel_workers to use for a parallel Append path for a given
> > partitioned table:
> >
> >         /* Find the highest number of workers requested for any subpath. */
> >         foreach(lc, partial_subpaths)
> >         {
> >             Path       *path = lfirst(lc);
> >
> >             parallel_workers = Max(parallel_workers, path->parallel_workers);
> >         }
> >         Assert(parallel_workers > 0);
> >
> >         /*
> >          * If the use of parallel append is permitted, always request at least
> >          * log2(# of children) workers.  We assume it can be useful to have
> >          * extra workers in this case because they will be spread out across
> >          * the children.  The precise formula is just a guess, but we don't
> >          * want to end up with a radically different answer for a table with N
> >          * partitions vs. an unpartitioned table with the same data, so the
> >          * use of some kind of log-scaling here seems to make some sense.
> >          */
> >         if (enable_parallel_append)
> >         {
> >             parallel_workers = Max(parallel_workers,
> >                                    fls(list_length(live_childrels)));
> >             parallel_workers = Min(parallel_workers,
> >                                    max_parallel_workers_per_gather);
> >         }
> >         Assert(parallel_workers > 0);
> >
> > Note that the 'rel' in this code refers to the partitioned table for
> > which an Append path is being considered, so compute_parallel_worker()
> > using that 'rel' would use the partitioned table's
> > rel_parallel_workers as you are trying to do.
>
> Note that there is a second chunk of code quite like that one a few
> lines down from there that would also have to be modified.
>
> I am +1 on allowing to override the degree of parallelism on a parallel
> append.  If "parallel_workers" on the partitioned table is an option for
> that, it might be a simple solution.  On the other hand, perhaps it would
> be less confusing to have a different storage parameter name rather than
> having "parallel_workers" do double duty.
>
> Also, since there is a design rule that storage parameters can only be used
> on partitions, we would have to change that - is that a problem for anybody?

I am not aware of a rule that suggests that parallel_workers is always
interpreted using storage-level considerations.  If that is indeed a
popular interpretation at this point, then yes, we should be open to
considering a new name for the parameter that this patch wants to add.

Maybe parallel_append_workers?  Perhaps not a bad idea in this patch's
case, but considering that we may want to expand the support of
cross-partition parallelism to operations other than querying, maybe
something else?

This reminds me of something I forgot to mention in my review of the
patch -- it should also update the documentation of parallel_workers
on the CREATE TABLE page to mention that it will be interpreted a bit
differently for partitioned tables than for regular storage-bearing
relations.  Workers specified for partitioned tables would be
distributed by the executor over its partitions, unlike with
storage-bearing relations, where the distribution of specified workers
is controlled by the AM using storage-level considerations.

> There is another related consideration that doesn't need to be addressed
> by this patch, but that is somewhat related: if the executor prunes some
> partitions, the degree of parallelism is unaffected, right?

That's correct.  Launched workers could be fewer than planned, but that
would not have anything to do with executor pruning.

> So if the planner decides to use 24 workers for 25 partitions, and the
> executor discards all but one of these partition scans, we would end up
> with 24 workers scanning a single partition.

I do remember pondering this when testing my patches to improve the
performance of executing a generic plan to scan a partitioned table
where runtime pruning is possible.  Here is an example:

create table hp (a int) partition by hash (a);
select 'create table hp' || i || ' partition of hp for values with
(modulus 100, remainder ' || i || ');' from generate_series(0, 99) i;
\gexec
insert into hp select generate_series(0, 1000000);
alter table hp set (parallel_workers = 16);
set plan_cache_mode to force_generic_plan ;
set max_parallel_workers_per_gather to 16;
prepare q as select * from hp where a = $1;

explain (analyze, verbose) execute q (1);
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..14426.50 rows=5675 width=4) (actual time=2.370..25.002 rows=1 loops=1)
   Output: hp.a
   Workers Planned: 16
   Workers Launched: 7
   ->  Parallel Append  (cost=0.00..12859.00 rows=400 width=4) (actual time=0.006..0.384 rows=0 loops=8)
         Worker 0:  actual time=0.001..0.001 rows=0 loops=1
         Worker 1:  actual time=0.001..0.001 rows=0 loops=1
         Worker 2:  actual time=0.001..0.001 rows=0 loops=1
         Worker 3:  actual time=0.001..0.001 rows=0 loops=1
         Worker 4:  actual time=0.001..0.001 rows=0 loops=1
         Worker 5:  actual time=0.001..0.001 rows=0 loops=1
         Worker 6:  actual time=0.001..0.001 rows=0 loops=1
         Subplans Removed: 99
         ->  Parallel Seq Scan on public.hp40 hp_1  (cost=0.00..126.50 rows=33 width=4) (actual time=0.041..3.060 rows=1 loops=1)
               Output: hp_1.a
               Filter: (hp_1.a = $1)
               Rows Removed by Filter: 9813
 Planning Time: 5.543 ms
 Execution Time: 25.139 ms
(19 rows)

deallocate q;
set max_parallel_workers_per_gather to 0;
prepare q as select * from hp where a = $1;
explain (analyze, verbose) execute q (1);
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..18754.88 rows=5675 width=4) (actual time=0.029..2.474 rows=1 loops=1)
   Subplans Removed: 99
   ->  Seq Scan on public.hp40 hp_1  (cost=0.00..184.25 rows=56 width=4) (actual time=0.028..2.471 rows=1 loops=1)
         Output: hp_1.a
         Filter: (hp_1.a = $1)
         Rows Removed by Filter: 9813
 Planning Time: 2.231 ms
 Execution Time: 2.535 ms
(8 rows)

Comparing the Execution Times above, it's clear that Gather and
workers are pure overhead in this case.

That said, in cases where one expects runtime pruning to be useful, the
plan itself is unlikely to be parallelized in the first place; for
example, when the individual partition scans are Index Scans.

deallocate q;
create index on hp (a);
alter table hp set (parallel_workers = 16);
analyze;
set max_parallel_workers_per_gather to 16;
prepare q as select * from hp where a = $1;
explain (analyze, verbose) execute q (1);
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.29..430.75 rows=100 width=4) (actual time=0.043..0.046 rows=1 loops=1)
   Subplans Removed: 99
   ->  Index Only Scan using hp40_a_idx on public.hp40 hp_1  (cost=0.29..4.30 rows=1 width=4) (actual time=0.042..0.044 rows=1 loops=1)
         Output: hp_1.a
         Index Cond: (hp_1.a = $1)
         Heap Fetches: 0
 Planning Time: 13.769 ms
 Execution Time: 0.115 ms
(8 rows)

> I am not sure how that could be improved.

The planner currently ignores runtime pruning optimization when
assigning costs to the Append path, so fixing that would be a good
start.  There are efforts underway for that, such as [1].

-- 
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://commitfest.postgresql.org/32/2829/



Re: A reloption for partitioned tables - parallel_workers

From
Laurenz Albe
Date:
On Tue, 2021-02-16 at 16:29 +0900, Amit Langote wrote:
> > I am +1 on allowing to override the degree of parallelism on a parallel
> > append.  If "parallel_workers" on the partitioned table is an option for
> > that, it might be a simple solution.  On the other hand, perhaps it would
> > be less confusing to have a different storage parameter name rather than
> > having "parallel_workers" do double duty.
> > Also, since there is a design rule that storage parameters can only be used
> > on partitions, we would have to change that - is that a problem for anybody?
> 
> I am not aware of a rule that suggests that parallel_workers is always
> interpreted using storage-level considerations.  If that is indeed a
> popular interpretation at this point, then yes, we should be open to
> considering a new name for the parameter that this patch wants to add.

Well, https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
says:

 "Specifying these parameters for partitioned tables is not supported,
  but you may specify them for individual leaf partitions."

If we re-purpose "parallel_workers" like this, we'd have to change this.

Then for a normal table, "parallel_workers" would mean how many workers
work on a parallel table scan.  For a partitioned table, it determines
how many workers work on a parallel append.

Perhaps that is similar enough that it is not confusing.

Yours,
Laurenz Albe




Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
On Tue, Feb 16, 2021 at 11:02 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Tue, 2021-02-16 at 16:29 +0900, Amit Langote wrote:
> > > I am +1 on allowing to override the degree of parallelism on a parallel
> > > append.  If "parallel_workers" on the partitioned table is an option for
> > > that, it might be a simple solution.  On the other hand, perhaps it would
> > > be less confusing to have a different storage parameter name rather than
> > > having "parallel_workers" do double duty.
> > > Also, since there is a design rule that storage parameters can only be used
> > > on partitions, we would have to change that - is that a problem for anybody?
> >
> > I am not aware of a rule that suggests that parallel_workers is always
> > interpreted using storage-level considerations.  If that is indeed a
> > popular interpretation at this point, then yes, we should be open to
> > considering a new name for the parameter that this patch wants to add.
>
> Well, https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
> says:
>
>  "Specifying these parameters for partitioned tables is not supported,
>   but you may specify them for individual leaf partitions."
>
> If we re-purpose "parallel_workers" like this, we'd have to change this.

Right, as I mentioned in my reply, the patch will need to update the
documentation.

> Then for a normal table, "parallel_workers" would mean how many workers
> work on a parallel table scan.  For a partitioned table, it determines
> how many workers work on a parallel append.
>
> Perhaps that is similar enough that it is not confusing.

I tend to agree with that.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



RE: A reloption for partitioned tables - parallel_workers

From
"Hou, Zhijie"
Date:
> hi,
>
> Here we go, my first patch... solves
> https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com
>

Hi,

partitioned_table_reloptions(Datum reloptions, bool validate)
 {
+    static const relopt_parse_elt tab[] = {
+        {"parallel_workers", RELOPT_TYPE_INT,
+        offsetof(StdRdOptions, parallel_workers)},
+    };
+
     return (bytea *) build_reloptions(reloptions, validate,
                                       RELOPT_KIND_PARTITIONED,
-                                      0, NULL, 0);
+                                      sizeof(StdRdOptions),
+                                      tab, lengthof(tab));
 }

I noticed that you plan to store parallel_workers in the same struct (StdRdOptions) that normal heap relations use.
It seems better to store it in a separate struct.

And as commit 1bbd608 said:
----
> Splitting things has the advantage to
> make the information stored in rd_options include only the necessary
> information, reducing the amount of memory used for a relcache entry
> with partitioned tables if new reloptions are introduced at this level.
----

What do you think?

Best regards,
Houzj




Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
On Thu, Feb 18, 2021 at 6:06 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
> > Here we go, my first patch... solves
> > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com
> >
>
> Hi,
>
> partitioned_table_reloptions(Datum reloptions, bool validate)
>  {
> +       static const relopt_parse_elt tab[] = {
> +               {"parallel_workers", RELOPT_TYPE_INT,
> +               offsetof(StdRdOptions, parallel_workers)},
> +       };
> +
>         return (bytea *) build_reloptions(reloptions, validate,
>                                                                           RELOPT_KIND_PARTITIONED,
> -                                                                         0, NULL, 0);
> +                                                                         sizeof(StdRdOptions),
> +                                                                         tab, lengthof(tab));
>  }
>
> I noticed that you plan to store the parallel_workers in the same struct(StdRdOptions) as normal heap relation.
> It seems better to store it in a separate struct.
>
> And as commit 1bbd608 said:
> ----
> > Splitting things has the advantage to
> > make the information stored in rd_options include only the necessary
> > information, reducing the amount of memory used for a relcache entry
> > with partitioned tables if new reloptions are introduced at this level.
> ----
>
> What do you think?

That may be a good idea.  So instead of referring to the
parallel_workers in StdRdOptions, define a new one, say,
PartitionedTableRdOptions as follows and refer to it in
partitioned_table_reloptions():

typedef struct PartitionedTableRdOptions
{
    int32       vl_len_;        /* varlena header (do not touch directly!) */
    int         parallel_workers;   /* max number of parallel workers */
} PartitionedTableRdOptions;

-- 
Amit Langote
EDB: http://www.enterprisedb.com



Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
On Tue, Feb 16, 2021 at 3:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Tue, Feb 16, 2021 at 1:35 AM Seamus Abshere <seamus@abshere.net> wrote:
> > Here we go, my first patch... solves
> > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com
>
> Thanks for sending the patch here.
>
> It seems you haven't made enough changes for reloptions code to
> recognize parallel_workers as valid for partitioned tables, because
> even with the patch applied, I get this:
>
> create table rp (a int) partition by range (a);
> create table rp1 partition of rp for values from (minvalue) to (0);
> create table rp2 partition of rp for values from (0) to (maxvalue);
> alter table rp set (parallel_workers = 1);
> ERROR:  unrecognized parameter "parallel_workers"
>
> You need this:
>
> diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
> index 029a73325e..9eb8a0c10d 100644
> --- a/src/backend/access/common/reloptions.c
> +++ b/src/backend/access/common/reloptions.c
> @@ -377,7 +377,7 @@ static relopt_int intRelOpts[] =
>         {
>             "parallel_workers",
>             "Number of parallel processes that can be used per executor node for this relation.",
> -           RELOPT_KIND_HEAP,
> +           RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
>             ShareUpdateExclusiveLock
>         },
>         -1, 0, 1024
>
> which tells reloptions parsing code that parallel_workers is
> acceptable for both heap and partitioned relations.
>
> Other comments on the patch:
>
> * This function calling style, where the first argument is not placed
> on the same line as the function itself, is not very common in
> Postgres:
>
> +       /* First see if there is a root-level setting for parallel_workers */
> +       parallel_workers = compute_parallel_worker(
> +           rel,
> +           -1,
> +           -1,
> +           max_parallel_workers_per_gather
> +
>
> This makes the new code look very different from the rest of the
> codebase.  Better to stick to existing styles.
>
> 2. It might be a good idea to use this opportunity to add a function,
> say compute_append_parallel_workers(), for the code that does what the
> function name says.  Then the patch will simply add the new
> compute_parallel_worker() call at the top of that function.
>
> 3. I think we should consider the Append parent relation's
> parallel_workers ONLY if it is a partitioned relation, because it
> doesn't make a lot of sense for other types of parent relations.  So
> the new code should look like this:
>
> if (IS_PARTITIONED_REL(rel))
>     parallel_workers = compute_parallel_worker(rel, -1, -1, max_parallel_workers_per_gather);
>
> 4. Maybe it also doesn't make sense to consider the parent relation's
> parallel_workers if Parallel Append is disabled
> (enable_parallel_append = off).  That's because with a simple
> (non-parallel) Append running under Gather, all launched parallel
> workers process the same partition before moving to the next one.
> OTOH, one's intention of setting parallel_workers on the parent
> partitioned table would most likely be to distribute workers across
> partitions, which is only possible with parallel Append
> (enable_parallel_append = on).  So, the new code should look like
> this:
>
> if (IS_PARTITIONED_REL(rel) && enable_parallel_append)
>     parallel_workers = compute_parallel_worker(rel, -1, -1, max_parallel_workers_per_gather);

Here is an updated version of Seamus' patch that takes into
account these and other comments received on this thread so far.
It may warrant adding some tests too, but I haven't done that.

Seamus, please register this patch in the next commit-fest:
https://commitfest.postgresql.org/32/

If you haven't already, you will need to create a community account to
use that site.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

Re: A reloption for partitioned tables - parallel_workers

From
"Seamus Abshere"
Date:
hi Amit,

Thanks so much for doing this. I had created

https://commitfest.postgresql.org/32/2987/

and it looks like it now shows your patch as the one to use. Let me know if I should do anything else.

Best,
Seamus

On Fri, Feb 19, 2021, at 2:30 AM, Amit Langote wrote:
> On Tue, Feb 16, 2021 at 3:05 PM Amit Langote <amitlangote09@gmail.com> wrote:
> > On Tue, Feb 16, 2021 at 1:35 AM Seamus Abshere <seamus@abshere.net> wrote:
> > > Here we go, my first patch... solves
> > > https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com
> >
> > Thanks for sending the patch here.
> >
> > It seems you haven't made enough changes for reloptions code to
> > recognize parallel_workers as valid for partitioned tables, because
> > even with the patch applied, I get this:
> >
> > create table rp (a int) partition by range (a);
> > create table rp1 partition of rp for values from (minvalue) to (0);
> > create table rp2 partition of rp for values from (0) to (maxvalue);
> > alter table rp set (parallel_workers = 1);
> > ERROR:  unrecognized parameter "parallel_workers"
> >
> > You need this:
> >
> > diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
> > index 029a73325e..9eb8a0c10d 100644
> > --- a/src/backend/access/common/reloptions.c
> > +++ b/src/backend/access/common/reloptions.c
> > @@ -377,7 +377,7 @@ static relopt_int intRelOpts[] =
> >         {
> >             "parallel_workers",
> >             "Number of parallel processes that can be used per executor node for this relation.",
> > -           RELOPT_KIND_HEAP,
> > +           RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
> >             ShareUpdateExclusiveLock
> >         },
> >         -1, 0, 1024
> >
> > which tells reloptions parsing code that parallel_workers is
> > acceptable for both heap and partitioned relations.
> >
> > Other comments on the patch:
> >
> > * This function calling style, where the first argument is not placed
> > on the same line as the function itself, is not very common in
> > Postgres:
> >
> > +       /* First see if there is a root-level setting for parallel_workers */
> > +       parallel_workers = compute_parallel_worker(
> > +           rel,
> > +           -1,
> > +           -1,
> > +           max_parallel_workers_per_gather
> > +
> >
> > This makes the new code look very different from the rest of the
> > codebase.  Better to stick to existing styles.
> >
> > 2. It might be a good idea to use this opportunity to add a function,
> > say compute_append_parallel_workers(), for the code that does what the
> > function name says.  Then the patch will simply add the new
> > compute_parallel_worker() call at the top of that function.
> >
> > 3. I think we should consider the Append parent relation's
> > parallel_workers ONLY if it is a partitioned relation, because it
> > doesn't make a lot of sense for other types of parent relations.  So
> > the new code should look like this:
> >
> > if (IS_PARTITIONED_REL(rel))
> >     parallel_workers = compute_parallel_worker(rel, -1, -1, max_parallel_workers_per_gather);
> >
> > 4. Maybe it also doesn't make sense to consider the parent relation's
> > parallel_workers if Parallel Append is disabled
> > (enable_parallel_append = off).  That's because with a simple
> > (non-parallel) Append running under Gather, all launched parallel
> > workers process the same partition before moving to the next one.
> > OTOH, one's intention of setting parallel_workers on the parent
> > partitioned table would most likely be to distribute workers across
> > partitions, which is only possible with parallel Append
> > (enable_parallel_append = on).  So, the new code should look like
> > this:
> >
> > if (IS_PARTITIONED_REL(rel) && enable_parallel_append)
> >     parallel_workers = compute_parallel_worker(rel, -1, -1, max_parallel_workers_per_gather);
> 
> Here is an updated version of Seamus' patch that takes into
> account these and other comments received on this thread so far.
> It may warrant adding some tests too, but I haven't done that.
> 
> Seamus, please register this patch in the next commit-fest:
> https://commitfest.postgresql.org/32/
> 
> If you haven't already, you will need to create a community account to
> use that site.
> 
> -- 
> Amit Langote
> EDB: http://www.enterprisedb.com
> 
> Attachments:
> * v2-0001-Allow-setting-parallel_workers-on-partitioned-tab.patch



Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
On Fri, Mar 5, 2021 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Fri, 2021-03-05 at 22:55 +0900, Amit Langote wrote:
> > On Fri, Mar 5, 2021 at 10:47 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > > On Wed, 2021-03-03 at 17:58 +0900, Amit Langote wrote:
> > > > For example, with the attached PoC patch:
> > >
> > > I have incorporated your POC patch and added a regression test.
> > >
> > > I didn't test it thoroughly though.
> >
> > Thanks.  Although, I wonder if we should rather consider it a
> > standalone patch to fix a partition planning code deficiency.
>
> Oh - I didn't realize that your patch was independent.

Attached a new version rebased over c8f78b616, with the grouping
relation partitioning enhancements as a separate patch 0001.  Sorry
about the delay.

I'd also like to change compute_append_parallel_workers(), as also
mentioned upthread, such that disabling Parallel Append by setting
parallel_workers=0 on a parent partitioned table does not also disable
the partitions themselves being scanned in parallel even though under
an Append.  I didn't get time today to work on that though.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

Re: A reloption for partitioned tables - parallel_workers

From
David Rowley
Date:
On Fri, 19 Mar 2021 at 02:07, Amit Langote <amitlangote09@gmail.com> wrote:
> Attached a new version rebased over c8f78b616, with the grouping
> relation partitioning enhancements as a separate patch 0001.  Sorry
> about the delay.

I had a quick look at this and wondered if the partitioned table's
parallel workers shouldn't be limited to the sum of the parallel
workers of the Append's subpaths?

It seems a bit weird to me that the following case requests 4 workers:

# create table lp (a int) partition by list(a);
# create table lp1 partition of lp for values in(1);
# insert into lp select 1 from generate_series(1,10000000) x;
# alter table lp1 set (parallel_workers = 2);
# alter table lp set (parallel_workers = 4);
# set max_parallel_workers_per_Gather = 8;
# explain select count(*) from lp;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=97331.63..97331.64 rows=1 width=8)
   ->  Gather  (cost=97331.21..97331.62 rows=4 width=8)
         Workers Planned: 4
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
               ->  Parallel Seq Scan on lp1 lp  (cost=0.00..85914.57 rows=4166657 width=0)
(5 rows)

I can see a good argument that there should only be 2 workers here.

If someone sets the partitioned table's parallel_workers high so that
they get a large number of workers when no partitions are pruned
during planning, do they really want the same number of workers in
queries where a large number of partitions are pruned?

This problem gets a bit more complex in generic plans where the
planner can't prune anything but run-time pruning prunes many
partitions. I'm not so sure what to do about that, but the problem
does exist today to a lesser extent with the current method of
determining the append parallel workers.

David



Re: A reloption for partitioned tables - parallel_workers

From
Laurenz Albe
Date:
On Wed, 2021-03-24 at 14:14 +1300, David Rowley wrote:
> On Fri, 19 Mar 2021 at 02:07, Amit Langote <amitlangote09@gmail.com> wrote:
> > Attached a new version rebased over c8f78b616, with the grouping
> > relation partitioning enhancements as a separate patch 0001.  Sorry
> > about the delay.
> 
> I had a quick look at this and wondered if the partitioned table's
> parallel workers shouldn't be limited to the sum of the parallel
> workers of the Append's subpaths?
> 
> It seems a bit weird to me that the following case requests 4 workers:
> 
> # create table lp (a int) partition by list(a);
> # create table lp1 partition of lp for values in(1);
> # insert into lp select 1 from generate_series(1,10000000) x;
> # alter table lp1 set (parallel_workers = 2);
> # alter table lp set (parallel_workers = 4);
> # set max_parallel_workers_per_Gather = 8;
> # explain select count(*) from lp;
>                                         QUERY PLAN
> -------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=97331.63..97331.64 rows=1 width=8)
>    ->  Gather  (cost=97331.21..97331.62 rows=4 width=8)
>          Workers Planned: 4
>          ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
>                ->  Parallel Seq Scan on lp1 lp  (cost=0.00..85914.57 rows=4166657 width=0)
> (5 rows)
> 
> I can see a good argument that there should only be 2 workers here.

Good point, I agree.

> If someone sets the partitioned table's parallel_workers high so that
> they get a large number of workers when no partitions are pruned
> during planning, do they really want the same number of workers in
> queries where a large number of partitions are pruned?
> 
> This problem gets a bit more complex in generic plans where the
> planner can't prune anything but run-time pruning prunes many
> partitions. I'm not so sure what to do about that, but the problem
> does exist today to a lesser extent with the current method of
> determining the append parallel workers.

Also a good point.  That would require changing the actual number of
parallel workers at execution time, but that is tricky.
If we go with your suggestion above, we'd have to disambiguate whether
the number of workers was set because a partition is large enough
to warrant a parallel scan (then it shouldn't be reduced if the executor
prunes partitions) or because of the number of partitions
(then it should be reduced).

Currently, we don't reduce parallelism if the executor prunes
partitions, so this could be seen as an independent problem.

I don't know if Seamus is still working on that; if not, we might
mark it as "returned with feedback".

Perhaps Amit's patch 0001 should go in independently.

I'll mark the patch as "waiting for author".

Yours,
Laurenz Albe




Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
On Fri, Apr 2, 2021 at 11:36 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Wed, 2021-03-24 at 14:14 +1300, David Rowley wrote:
> > On Fri, 19 Mar 2021 at 02:07, Amit Langote <amitlangote09@gmail.com> wrote:
> > > Attached a new version rebased over c8f78b616, with the grouping
> > > relation partitioning enhancements as a separate patch 0001.  Sorry
> > > about the delay.
> >
> > I had a quick look at this and wondered if the partitioned table's
> > parallel workers shouldn't be limited to the sum of the parallel
> > workers of the Append's subpaths?
> >
> > It seems a bit weird to me that the following case requests 4 workers:
> >
> > # create table lp (a int) partition by list(a);
> > # create table lp1 partition of lp for values in(1);
> > # insert into lp select 1 from generate_series(1,10000000) x;
> > # alter table lp1 set (parallel_workers = 2);
> > # alter table lp set (parallel_workers = 4);
> > # set max_parallel_workers_per_Gather = 8;
> > # explain select count(*) from lp;
> >                                         QUERY PLAN
> > -------------------------------------------------------------------------------------------
> >  Finalize Aggregate  (cost=97331.63..97331.64 rows=1 width=8)
> >    ->  Gather  (cost=97331.21..97331.62 rows=4 width=8)
> >          Workers Planned: 4
> >          ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
> >                ->  Parallel Seq Scan on lp1 lp  (cost=0.00..85914.57 rows=4166657 width=0)
> > (5 rows)
> >
> > I can see a good argument that there should only be 2 workers here.
>
> Good point, I agree.
>
> > If someone sets the partitioned table's parallel_workers high so that
> > they get a large number of workers when no partitions are pruned
> > during planning, do they really want the same number of workers in
> > queries where a large number of partitions are pruned?
> >
> > This problem gets a bit more complex in generic plans where the
> > planner can't prune anything but run-time pruning prunes many
> > partitions. I'm not so sure what to do about that, but the problem
> > does exist today to a lesser extent with the current method of
> > determining the append parallel workers.
>
> Also a good point.  That would require changing the actual number of
> parallel workers at execution time, but that is tricky.
> If we go with your suggestion above, we'd have to disambiguate if
> the number of workers is set because a partition is large enough
> to warrant a parallel scan (then it shouldn't be reduced if the executor
> prunes partitions) or if it is because of the number of partitions
> (then it should be reduced).

Maybe we really want a parallel_append_workers for partitioned tables,
instead of piggybacking on parallel_workers?

> I don't know if Seamus is still working on that; if not, we might
> mark it as "returned with feedback".

I have to agree given the time left.

> Perhaps Amit's patch 0001 should go in independently.

Perhaps, but maybe we should wait until something really needs that.

--
Amit Langote
EDB: http://www.enterprisedb.com



Re: A reloption for partitioned tables - parallel_workers

From
Daniel Gustafsson
Date:
> On 6 Apr 2021, at 09:46, Amit Langote <amitlangote09@gmail.com> wrote:
> On Fri, Apr 2, 2021 at 11:36 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

>> I don't know if Seamus is still working on that; if not, we might
>> mark it as "returned with feedback".
>
> I have to agree given the time left.

This thread has stalled and the patch no longer applies.  I propose that we
mark this Returned with Feedback, is that Ok with you Amit?

--
Daniel Gustafsson        https://vmware.com/




Re: A reloption for partitioned tables - parallel_workers

From
Laurenz Albe
Date:
On Fri, 2021-09-03 at 18:24 +0200, Daniel Gustafsson wrote:
> > On 6 Apr 2021, at 09:46, Amit Langote <amitlangote09@gmail.com> wrote:
> > On Fri, Apr 2, 2021 at 11:36 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> 
> > > I don't know if Seamus is still working on that; if not, we might
> > > mark it as "returned with feedback".
> > 
> > I have to agree given the time left.
> 
> This thread has stalled and the patch no longer applies.  I propose that we
> mark this Returned with Feedback, is that Ok with you Amit?

+1.  That requires more thought.

Yours,
Laurenz Albe




Re: A reloption for partitioned tables - parallel_workers

From
Amit Langote
Date:
On Sat, Sep 4, 2021 at 3:10 Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2021-09-03 at 18:24 +0200, Daniel Gustafsson wrote:
> > On 6 Apr 2021, at 09:46, Amit Langote <amitlangote09@gmail.com> wrote:
> > On Fri, Apr 2, 2021 at 11:36 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> > > I don't know if Seamus is still working on that; if not, we might
> > > mark it as "returned with feedback".
> >
> > I have to agree given the time left.
>
> This thread has stalled and the patch no longer applies.  I propose that we
> mark this Returned with Feedback, is that Ok with you Amit?

+1.  That requires more thought.

Yes, I think so too.
--

Re: A reloption for partitioned tables - parallel_workers

From
Daniel Gustafsson
Date:
> On 4 Sep 2021, at 01:17, Amit Langote <amitlangote09@gmail.com> wrote:
> On Sat, Sep 4, 2021 at 3:10 Laurenz Albe <laurenz.albe@cybertec.at <mailto:laurenz.albe@cybertec.at>> wrote:
> On Fri, 2021-09-03 at 18:24 +0200, Daniel Gustafsson wrote:

> > This thread has stalled and the patch no longer applies.  I propose that we
> > mark this Returned with Feedback, is that Ok with you Amit?
>
> +1.  That requires more thought.
>
> Yes, I think so too.

Done that way, it can always be resubmitted in a later CF.

--
Daniel Gustafsson        https://vmware.com/