Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CA+hUKG+D6rA=dbYntrcwoA8kYWB5LAQ0uay+q1u7QLZo98OLJw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)
List pgsql-hackers
On Sun, Oct 11, 2020 at 12:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> + /*
> + * For Parallel INSERT, provided no tuples are returned from workers
> + * to gather/leader node, don't add a cost-per-row, as each worker
> + * parallelly inserts the tuples that result from its chunk of plan
> + * execution. This change may make the parallel plan cheap among all
> + * other plans, and influence the planner to consider this parallel
> + * plan.
> + */
> + if (!(IsA(path->subpath, ModifyTablePath) &&
> + castNode(ModifyTablePath, path->subpath)->operation == CMD_INSERT &&
> + castNode(ModifyTablePath, path->subpath)->returningLists != NULL))
> + {
> + run_cost += parallel_tuple_cost * path->path.rows;
> + }
>
> Isn't the last condition in above check "castNode(ModifyTablePath,
> path->subpath)->returningLists != NULL" should be
> "castNode(ModifyTablePath, path->subpath)->returningLists == NULL"
> instead? Because otherwise when there is returning list it won't count
> the cost for passing tuples via Gather node. This might be reason of
> what Thomas has seen in his recent email [2].

Yeah, I think this is trying to fix the problem too late.  Instead, we
should fix the incorrect row estimates so we don't have to fudge it
later like that.  For example, this should be estimating rows=0:

postgres=# explain analyze insert into s select * from t t1 join t t2 using (i);
...
 Insert on s  (cost=30839.08..70744.45 rows=1000226 width=4) (actual
time=2940.560..2940.562 rows=0 loops=1)

I think that should be done with something like this:

--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3583,16 +3583,11 @@ create_modifytable_path(PlannerInfo *root,
RelOptInfo *rel,
                if (lc == list_head(subpaths))  /* first node? */
                        pathnode->path.startup_cost = subpath->startup_cost;
                pathnode->path.total_cost += subpath->total_cost;
-               pathnode->path.rows += subpath->rows;
+               if (returningLists != NIL)
+                       pathnode->path.rows += subpath->rows;
                total_size += subpath->pathtarget->width * subpath->rows;
        }

-       /*
-        * Set width to the average width of the subpath outputs.  XXX this is
-        * totally wrong: we should report zero if no RETURNING, else an average
-        * of the RETURNING tlist widths.  But it's what happened historically,
-        * and improving it is a task for another day.
-        */
        if (pathnode->path.rows > 0)
                total_size /= pathnode->path.rows;
        pathnode->path.pathtarget->width = rint(total_size);



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Next
From: Noah Misch
Date:
Subject: Re: powerpc pg_atomic_compare_exchange_u32_impl: error: comparison of integer expressions of different signedness (Re: pgsql: For all ppc compilers, implement compare_exchange and) fetch_add