Re: fix cost subqueryscan wrong parallel cost - Mailing list pgsql-hackers

From Tom Lane
Subject Re: fix cost subqueryscan wrong parallel cost
Date
Msg-id 440376.1651261083@sss.pgh.pa.us
In response to Re: fix cost subqueryscan wrong parallel cost  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: fix cost subqueryscan wrong parallel cost  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
I wrote:
> So perhaps we should do it more like the attached, which produces
> this plan for the UNION case:

sigh ... actually attached this time.

            regards, tom lane

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b787c6f81a..18749e842d 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1395,6 +1395,7 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
 {
     Cost        startup_cost;
     Cost        run_cost;
+    List       *qpquals;
     QualCost    qpqual_cost;
     Cost        cpu_per_tuple;

@@ -1402,11 +1403,24 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
     Assert(baserel->relid > 0);
     Assert(baserel->rtekind == RTE_SUBQUERY);

-    /* Mark the path with the correct row estimate */
+    /*
+     * We compute the rowcount estimate as the subplan's estimate times the
+     * selectivity of relevant restriction clauses.  In simple cases this will
+     * come out the same as baserel->rows; but when dealing with parallelized
+     * paths we must do it like this to get the right answer.
+     */
     if (param_info)
-        path->path.rows = param_info->ppi_rows;
+        qpquals = list_concat_copy(param_info->ppi_clauses,
+                                   baserel->baserestrictinfo);
     else
-        path->path.rows = baserel->rows;
+        qpquals = baserel->baserestrictinfo;
+
+    path->path.rows = clamp_row_est(path->subpath->rows *
+                                    clauselist_selectivity(root,
+                                                           qpquals,
+                                                           0,
+                                                           JOIN_INNER,
+                                                           NULL));

     /*
      * Cost of path is cost of evaluating the subplan, plus cost of evaluating
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 21c429226f..0d8d77140a 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -1487,14 +1487,12 @@ explain (costs off) select * from t union select * from t order by 1,3;
    ->  Unique
          ->  Sort
                Sort Key: t.a, t.b, t.c
-               ->  Append
-                     ->  Gather
-                           Workers Planned: 2
+               ->  Gather
+                     Workers Planned: 2
+                     ->  Parallel Append
                            ->  Parallel Seq Scan on t
-                     ->  Gather
-                           Workers Planned: 2
                            ->  Parallel Seq Scan on t t_1
-(13 rows)
+(11 rows)

 -- Full sort, not just incremental sort can be pushed below a gather merge path
 -- by generate_useful_gather_paths.
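
The rowcount rule described in the new cost_subqueryscan comment can be illustrated with a small standalone sketch. This is not PostgreSQL code: sketch_clamp_row_est, sketch_subqueryscan_rows, and SKETCH_MAX_ROWS are hypothetical names, the clamp is a simplified stand-in for the planner's clamp_row_est, and the selectivity is passed in as a plain number instead of being derived from restriction clauses.

```c
#include <assert.h>

/* Hypothetical cap, chosen so the integer cast below cannot overflow. */
#define SKETCH_MAX_ROWS 1e15

/*
 * Simplified stand-in for a row-estimate clamp (an assumption, not the
 * planner's actual function): never below one row, never above the cap,
 * rounded to a whole number of rows.
 */
double
sketch_clamp_row_est(double rows)
{
    if (rows != rows || rows > SKETCH_MAX_ROWS)   /* NaN or too large */
        return SKETCH_MAX_ROWS;
    if (rows <= 1.0)
        return 1.0;
    return (double) (long long) (rows + 0.5);     /* round half up */
}

/*
 * Row estimate for a subquery scan: start from the subpath's own estimate
 * (which may be a per-worker figure for a parallel subpath) and scale it by
 * the selectivity of the quals applied at the scan.
 */
double
sketch_subqueryscan_rows(double subpath_rows, double selectivity)
{
    assert(selectivity >= 0.0 && selectivity <= 1.0);
    return sketch_clamp_row_est(subpath_rows * selectivity);
}
```

With made-up numbers, a partial subpath estimated at 4000 rows per worker and a qual selectivity of 0.25 yields 1000 rows, while a non-parallel subpath estimated at 10000 rows yields 2500 for the same quals. Starting from the subpath's estimate rather than a fixed whole-relation figure is what lets the two cases differ.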
