Thread: Improve planner cost estimations for alternative subplans

Improve planner cost estimations for alternative subplans

From
Alexey Bashtanov
Date:
Hello,

Currently, in case of alternative subplans that do hashed vs per-row 
lookups,
the per-row estimate is used when planning the rest of the query.
It's also coded in not quite an explicit way.

In [1] we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the 
planner,
and as a result it doesn't elaborate on choosing the optimal one.

The patch is to fix it. Our linear model for costs cannot quite accommodate
the piecewise linear matter of alternative subplans,
so it is based on ugly heuristics and still cannot be very precise,
but I think it's better than the current one.

Thoughts?

Best, Alex

[1] 
https://www.postgresql.org/message-id/flat/ff42b25b-ff03-27f8-ed11-b8255d658cd5%40imap.cc

Attachment

Re: Improve planner cost estimations for alternative subplans

From
Melanie Plageman
Date:

On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:

In [1] we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the
planner,
and as a result it doesn't elaborate on choosing the optimal one.


I've just started looking at this patch today, but I was wondering if
you might include a test case which minimally reproduces the original
problem you had.
The only plan diff I see is in updatable_views.sql, and it doesn't
illustrate the problem as well as a more straightforward SELECT query
with EXISTS sublink might.

After putting in some logging, I see that there are only a
few non-catalog queries which exercise this code path.
This query from groupingsets.sql is an example of one such query:

select ten, sum(distinct four) from onek a
group by grouping sets((ten,four),(ten))
having exists (select 1 from onek b where sum(distinct a.four) = b.four);

But, the chosen plan for this query stays the same.

It would be helpful to see a query where a different plan is chosen
because of this change that is not from updatable_views.sql.

--
Melanie Plageman

Re: Improve planner cost estimations for alternative subplans

From
Melanie Plageman
Date:


On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:

In [1] we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the
planner,
and as a result it doesn't elaborate on choosing the optimal one.


Did this geometric average method result in choosing the desired plan for
this case?
 
The patch is to fix it. Our linear model for costs cannot quite accommodate
the piecewise linear matter of alternative subplans,
so it is based on ugly heuristics and still cannot be very precise,
but I think it's better than the current one.

Thoughts?


Is there another place in planner where two alternatives are averaged
together and that cost is used?

To me, it feels a little bit weird that we are averaging together the
startup cost of a plan which will always have a 0 startup cost and a
plan that will always have a non-zero startup cost and the per tuple
cost of a plan that will always have a negligible per tuple cost and one
that might have a very large per tuple cost.

I guess it feels different because instead of comparing alternatives you
are blending them.

I don't have any academic basis for saying that the alternatives costs
shouldn't be averaged together for use in the rest of the plan, so I
could definitely be wrong.
 
--
Melanie Plageman

Re: Improve planner cost estimations for alternative subplans

From
Tomas Vondra
Date:
On Wed, Jun 17, 2020 at 06:21:58PM -0700, Melanie Plageman wrote:
>On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:
>
>>
>> In [1] we found a situation where it leads to a suboptimal plan,
>> as it bloats the overall cost into large figures,
>> a decision related to an outer part of the plan look negligible to the
>> planner,
>> and as a result it doesn't elaborate on choosing the optimal one.
>>
>>
>Did this geometric average method result in choosing the desired plan for
>this case?
>
>
>> The patch is to fix it. Our linear model for costs cannot quite accommodate
>> the piecewise linear matter of alternative subplans,
>> so it is based on ugly heuristics and still cannot be very precise,
>> but I think it's better than the current one.
>>
>> Thoughts?
>>
>>
>Is there another place in planner where two alternatives are averaged
>together and that cost is used?
>
>To me, it feels a little bit weird that we are averaging together the
>startup cost of a plan which will always have a 0 startup cost and a
>plan that will always have a non-zero startup cost and the per tuple
>cost of a plan that will always have a negligible per tuple cost and one
>that might have a very large per tuple cost.
>
>I guess it feels different because instead of comparing alternatives you
>are blending them.
>
>I don't have any academic basis for saying that the alternatives costs
>shouldn't be averaged together for use in the rest of the plan, so I
>could definitely be wrong.
>

I agree it feels weird. Even if it actually improved the problematic
case, I think it'll be quite hard to convince ourselves this helps in
general. For example, for cases that actually end up using the first
plan, this is bound to make the estimates worse. I find it hard to
believe it won't cause regressions in at least some cases.

Maybe this heuristics really is better than the old one, but I think we
need to understand why - a single query probably is not enough.

I think the crucial limitation here is that we don't know which of the
alternative plans will be used. Is there a chance to improve this,
perhaps by making some sort of guess?

I'm not particularly familiar with AlternativeSubPlans, but I see we're
picking the one in nodeSubplan.c based on plan_rows. Can't we do the
same thing in cost_qual_eval_walker?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Improve planner cost estimations for alternative subplans

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I'm not particularly familiar with AlternativeSubPlans, but I see we're
> picking the one in nodeSubplan.c based on plan_rows. Can't we do the
> same thing in cost_qual_eval_walker?

Nope.  The entire reason why we have that kluge is that we don't know
until much later how many times we expect to execute the subplan.
AlternativeSubPlan allows the decision which subplan form to use to be
postponed till runtime; but when we're doing things like estimating the
cost and selectivity of a where-clause, we don't know that.

Maybe there's some way to recast things to avoid that problem,
but I have little clue what it'd look like.

I agree that averaging together the costs of the alternatives seems
wrong in principle.  It's going to be one or the other, not some
quantum-mechanical superposition.  Maybe there's a case for taking the
min costs (if you're feeling lucky) or the max costs (if you're not),
on the belief that the executor will/will not pick the choice that
contributes least to the total query cost.  But I have a feeling that
either of those would distort our estimates too much.  The existing
costing behavior at least can be seen to match *one* actual runtime
behavior.

            regards, tom lane



Re: Improve planner cost estimations for alternative subplans

From
Tom Lane
Date:
I wrote:
> Nope.  The entire reason why we have that kluge is that we don't know
> until much later how many times we expect to execute the subplan.
> AlternativeSubPlan allows the decision which subplan form to use to be
> postponed till runtime; but when we're doing things like estimating the
> cost and selectivity of a where-clause, we don't know that.

> Maybe there's some way to recast things to avoid that problem,
> but I have little clue what it'd look like.

Actually ... maybe it's not that bad.  Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity.  Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.

I experimented with adding a number-of-evaluations parameter to
cost_qual_eval, and found that the majority of call sites do have
something realistic they can pass.  The attached very-much-WIP
patch shows my results so far.  There's a lot of loose ends:

* Any call site using COST_QUAL_EVAL_DUMMY_NUM_EVALS is a potential spot
for future improvement.  The only one that seems like it might be
fundamentally unsolvable is cost_subplan's usage; but that would only
matter if a subplan's testexpr contains an AlternativeSubPlan, which is
probably a negligible case in practice.  The other ones seem to just
require more refactoring than I cared to do on a Sunday afternoon.

* I did not do anything for postgres_fdw.c beyond making it compile.
We can surely do better there, but it might require some rethinking
of the way that plan costs get cached.

* The merge and hash join costsize.c functions estimate costs of qpquals
(i.e. quals to be applied at the join that are not being used as merge
or hash conditions) by computing cost_qual_eval of the whole
joinrestrictlist and then subtracting off the cost of the merge or hash
quals.  This is kind of broken if we want to use different num_eval
estimates for the qpquals and the merge/hash quals, which I think we do.
This probably just needs some refactoring to fix.  We also need to save
the relevant rowcounts in the join Path nodes so that createplan.c can
do the right thing.

* I think it might be possible to improve the situation in
get_agg_clause_costs() if we're willing to postpone collection
of the actual aggregate costs till later.  This'd require two
passes over the aggregate expressions, but that seems like it
might not be terribly expensive.  (I'd be inclined to also look
at the idea of merging duplicate agg calls at plan time not
run time, if we refactor that.)

* I had to increase the number of table rows in one updatable_views.sql
test to keep the plans the same.  Without that, the planner realized
that a seqscan would be cheaper than an indexscan.  The result wasn't
wrong exactly, but it failed to prove that leakproof quals could be
used as indexquals, so I think we need to keep the plan choice the same.

Anyway, this is kind of invasive, but I think it shouldn't really
add noticeable costs as long as we save relevant rowcounts rather
than recomputing them in createplan.c.  Is it worth doing?  I dunno.
AlternativeSubPlan is pretty much a backwater, I think --- if it
were interesting performance-wise to a lot of people, more would
have been done with it by now.

            regards, tom lane

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 9fc53cad68..1149c298ba 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -656,7 +656,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
                                                      JOIN_INNER,
                                                      NULL);

-    cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+    cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds,
+                   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

     /*
      * Set # of retrieved rows and cached relation costs to some negative
@@ -2766,7 +2767,8 @@ estimate_path_cost_size(PlannerInfo *root,
         /* Add in the eval cost of the locally-checked quals */
         startup_cost += fpinfo->local_conds_cost.startup;
         total_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
-        cost_qual_eval(&local_cost, local_param_join_conds, root);
+        cost_qual_eval(&local_cost, local_param_join_conds,
+                       COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
         startup_cost += local_cost.startup;
         total_cost += local_cost.per_tuple * retrieved_rows;

@@ -2782,7 +2784,8 @@ estimate_path_cost_size(PlannerInfo *root,
         {
             QualCost    tlist_cost;

-            cost_qual_eval(&tlist_cost, fdw_scan_tlist, root);
+            cost_qual_eval(&tlist_cost, fdw_scan_tlist,
+                           COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
             startup_cost -= tlist_cost.startup;
             total_cost -= tlist_cost.startup;
             total_cost -= tlist_cost.per_tuple * rows;
@@ -2870,9 +2873,11 @@ estimate_path_cost_size(PlannerInfo *root,
             /*
              * Calculate the cost of clauses pushed down to the foreign server
              */
-            cost_qual_eval(&remote_conds_cost, fpinfo->remote_conds, root);
+            cost_qual_eval(&remote_conds_cost, fpinfo->remote_conds,
+                           COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
             /* Calculate the cost of applying join clauses */
-            cost_qual_eval(&join_cost, fpinfo->joinclauses, root);
+            cost_qual_eval(&join_cost, fpinfo->joinclauses,
+                           COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

             /*
              * Startup cost includes startup cost of joining relations and the
@@ -3019,7 +3024,8 @@ estimate_path_cost_size(PlannerInfo *root,
                 QualCost    remote_cost;

                 /* Add in the eval cost of the remotely-checked quals */
-                cost_qual_eval(&remote_cost, fpinfo->remote_conds, root);
+                cost_qual_eval(&remote_cost, fpinfo->remote_conds,
+                               COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
                 startup_cost += remote_cost.startup;
                 run_cost += remote_cost.per_tuple * numGroups;
                 /* Add in the eval cost of the locally-checked quals */
@@ -5514,7 +5520,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
                                                      0,
                                                      JOIN_INNER,
                                                      NULL);
-    cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+    cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds,
+                   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

     /*
      * If we are going to estimate costs locally, estimate the join clause
@@ -5912,7 +5919,8 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
                                                      JOIN_INNER,
                                                      NULL);

-    cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+    cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds,
+                   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

     /* Estimate the cost of push down */
     estimate_path_cost_size(root, grouped_rel, NIL, NIL, NULL,
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index af87f172a7..008efbac06 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -1380,7 +1380,7 @@ amcostestimate (PlannerInfo *root,
  * Also, we charge for evaluation of the indexquals at each index row.
  * All the costs are assumed to be paid incrementally during the scan.
  */
-cost_qual_eval(&index_qual_cost, path->indexquals, root);
+cost_qual_eval(&index_qual_cost, path->indexquals, numIndexTuples, root);
 *indexStartupCost = index_qual_cost.startup;
 *indexTotalCost = seq_page_cost * numIndexPages +
     (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 4ff3c7a2fd..e3e1e80096 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -91,6 +91,7 @@
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
 #include "optimizer/restrictinfo.h"
+#include "optimizer/subselect.h"
 #include "parser/parsetree.h"
 #include "utils/lsyscache.h"
 #include "utils/selfuncs.h"
@@ -145,6 +146,8 @@ bool        enable_partition_pruning = true;
 typedef struct
 {
     PlannerInfo *root;
+    double        num_evals;
+    bool        num_evals_used;
     QualCost    total;
 } cost_qual_eval_context;

@@ -176,7 +179,7 @@ static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
                                                     List **restrictlist);
 static Cost append_nonpartial_cost(List *subpaths, int numpaths,
                                    int parallel_workers);
-static void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
+static void set_rel_width(PlannerInfo *root, RelOptInfo *rel, double num_evals);
 static double relation_byte_size(double tuples, int width);
 static double page_size(double tuples, int width);
 static double get_parallel_divisor(Path *path);
@@ -722,7 +725,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
      * What we want here is cpu_tuple_cost plus the evaluation costs of any
      * qual clauses that we have to evaluate as qpquals.
      */
-    cost_qual_eval(&qpqual_cost, qpquals, root);
+    cost_qual_eval(&qpqual_cost, qpquals, tuples_fetched, root);

     startup_cost += qpqual_cost.startup;
     cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
@@ -1247,7 +1250,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
      * The TID qual expressions will be computed once, any other baserestrict
      * quals once per retrieved tuple.
      */
-    cost_qual_eval(&tid_qual_cost, tidquals, root);
+    cost_qual_eval(&tid_qual_cost, tidquals, 1, root);

     /* fetch estimated page cost for tablespace containing table */
     get_tablespace_page_costs(baserel->reltablespace,
@@ -1365,7 +1368,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
      * estimates for functions tend to be, there's not a lot of point in that
      * refinement right now.
      */
-    cost_qual_eval_node(&exprcost, (Node *) rte->functions, root);
+    cost_qual_eval_node(&exprcost, (Node *) rte->functions, 1, root);

     startup_cost += exprcost.startup + exprcost.per_tuple;

@@ -1421,7 +1424,7 @@ cost_tablefuncscan(Path *path, PlannerInfo *root,
      * estimates for tablefuncs tend to be, there's not a lot of point in that
      * refinement right now.
      */
-    cost_qual_eval_node(&exprcost, (Node *) rte->tablefunc, root);
+    cost_qual_eval_node(&exprcost, (Node *) rte->tablefunc, 1, root);

     startup_cost += exprcost.startup + exprcost.per_tuple;

@@ -2467,7 +2470,7 @@ cost_agg(Path *path, PlannerInfo *root,
     {
         QualCost    qual_cost;

-        cost_qual_eval(&qual_cost, quals, root);
+        cost_qual_eval(&qual_cost, quals, output_tuples, root);
         startup_cost += qual_cost.startup;
         total_cost += qual_cost.startup + output_tuples * qual_cost.per_tuple;

@@ -2526,7 +2529,8 @@ cost_windowagg(Path *path, PlannerInfo *root,
         wfunccost = argcosts.per_tuple;

         /* also add the input expressions' cost to per-input-row costs */
-        cost_qual_eval_node(&argcosts, (Node *) wfunc->args, root);
+        cost_qual_eval_node(&argcosts, (Node *) wfunc->args,
+                            input_tuples, root);
         startup_cost += argcosts.startup;
         wfunccost += argcosts.per_tuple;

@@ -2534,7 +2538,8 @@ cost_windowagg(Path *path, PlannerInfo *root,
          * Add the filter's cost to per-input-row costs.  XXX We should reduce
          * input expression costs according to filter selectivity.
          */
-        cost_qual_eval_node(&argcosts, (Node *) wfunc->aggfilter, root);
+        cost_qual_eval_node(&argcosts, (Node *) wfunc->aggfilter,
+                            input_tuples, root);
         startup_cost += argcosts.startup;
         wfunccost += argcosts.per_tuple;

@@ -2594,7 +2599,7 @@ cost_group(Path *path, PlannerInfo *root,
     {
         QualCost    qual_cost;

-        cost_qual_eval(&qual_cost, quals, root);
+        cost_qual_eval(&qual_cost, quals, output_tuples, root);
         startup_cost += qual_cost.startup;
         total_cost += qual_cost.startup + output_tuples * qual_cost.per_tuple;

@@ -2874,7 +2879,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
     }

     /* CPU costs */
-    cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, root);
+    cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, ntuples, root);
     startup_cost += restrict_qual_cost.startup;
     cpu_per_tuple = cpu_tuple_cost + restrict_qual_cost.per_tuple;
     run_cost += cpu_per_tuple * ntuples;
@@ -3200,12 +3205,23 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
     if (!enable_mergejoin)
         startup_cost += disable_cost;

+    /*
+     * Get approx # tuples passing the mergequals.  We use approx_tuple_count
+     * here because we need an estimate done with JOIN_INNER semantics.
+     */
+    mergejointuples = approx_tuple_count(root, &path->jpath, mergeclauses);
+
     /*
      * Compute cost of the mergequals and qpquals (other restriction clauses)
-     * separately.
+     * separately.  We use mergejointuples for num_evals for all the quals,
+     * else the arithmetic below would be bogus.  That's the right thing for
+     * the qpquals, and it seems unlikely that there'd be subplans in the
+     * mergequals.
      */
-    cost_qual_eval(&merge_qual_cost, mergeclauses, root);
-    cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo, root);
+    cost_qual_eval(&merge_qual_cost, mergeclauses,
+                   mergejointuples, root);
+    cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo,
+                   mergejointuples, root);
     qp_qual_cost.startup -= merge_qual_cost.startup;
     qp_qual_cost.per_tuple -= merge_qual_cost.per_tuple;

@@ -3224,12 +3240,6 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
     else
         path->skip_mark_restore = false;

-    /*
-     * Get approx # tuples passing the mergequals.  We use approx_tuple_count
-     * here because we need an estimate done with JOIN_INNER semantics.
-     */
-    mergejointuples = approx_tuple_count(root, &path->jpath, mergeclauses);
-
     /*
      * When there are equal merge keys in the outer relation, the mergejoin
      * must rescan any matching tuples in the inner relation. This means
@@ -3729,13 +3739,12 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
         startup_cost += disable_cost;

     /*
-     * Compute cost of the hashquals and qpquals (other restriction clauses)
-     * separately.
+     * Compute cost of the hashquals.  We don't expend effort on estimating
+     * how often they'll be evaluated before doing so, reasoning that it's
+     * unlikely they'll contain any subplans.
      */
-    cost_qual_eval(&hash_qual_cost, hashclauses, root);
-    cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo, root);
-    qp_qual_cost.startup -= hash_qual_cost.startup;
-    qp_qual_cost.per_tuple -= hash_qual_cost.per_tuple;
+    cost_qual_eval(&hash_qual_cost, hashclauses,
+                   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

     /* CPU costs */

@@ -3812,6 +3821,16 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
         hashjointuples = approx_tuple_count(root, &path->jpath, hashclauses);
     }

+    /*
+     * Compute cost of the qpquals (join quals that aren't hash quals) by
+     * costing the joinrestrictinfo list and then subtracting hash_qual_cost.
+     * XXX This could give a bogus answer if the hash quals contain subplans.
+     */
+    cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo,
+                   hashjointuples, root);
+    qp_qual_cost.startup -= hash_qual_cost.startup;
+    qp_qual_cost.per_tuple -= hash_qual_cost.per_tuple;
+
     /*
      * For each tuple that gets through the hashjoin proper, we charge
      * cpu_tuple_cost plus the cost of evaluating additional restriction
@@ -3846,6 +3865,7 @@ cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan)
     /* Figure any cost for evaluating the testexpr */
     cost_qual_eval(&sp_cost,
                    make_ands_implicit((Expr *) subplan->testexpr),
+                   COST_QUAL_EVAL_DUMMY_NUM_EVALS,
                    root);

     if (subplan->useHashTable)
@@ -4037,14 +4057,21 @@ cost_rescan(PlannerInfo *root, Path *path,
  *        preferred since it allows caching of the results.)
  *        The result includes both a one-time (startup) component,
  *        and a per-evaluation component.
+ *
+ * The caller is also asked to provide an estimated number of evaluations.
+ * That's often fairly bogus, since we may not have any good estimate yet.
+ * We use it only to decide which entry of an AlternativeSubPlan to consider.
  */
 void
-cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root)
+cost_qual_eval(QualCost *cost, List *quals,
+               double num_evals, PlannerInfo *root)
 {
     cost_qual_eval_context context;
     ListCell   *l;

     context.root = root;
+    context.num_evals = num_evals;
+    context.num_evals_used = false;
     context.total.startup = 0;
     context.total.per_tuple = 0;

@@ -4065,11 +4092,14 @@ cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root)
  *        As above, for a single RestrictInfo or expression.
  */
 void
-cost_qual_eval_node(QualCost *cost, Node *qual, PlannerInfo *root)
+cost_qual_eval_node(QualCost *cost, Node *qual,
+                    double num_evals, PlannerInfo *root)
 {
     cost_qual_eval_context context;

     context.root = root;
+    context.num_evals = num_evals;
+    context.num_evals_used = false;
     context.total.startup = 0;
     context.total.per_tuple = 0;

@@ -4087,8 +4117,9 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
     /*
      * RestrictInfo nodes contain an eval_cost field reserved for this
      * routine's use, so that it's not necessary to evaluate the qual clause's
-     * cost more than once.  If the clause's cost hasn't been computed yet,
-     * the field's startup value will contain -1.
+     * cost more than once (unless the clause contains an AlternativeSubPlan).
+     * If the clause's cost hasn't been cached, the field's startup value will
+     * contain -1.
      */
     if (IsA(node, RestrictInfo))
     {
@@ -4099,6 +4130,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
             cost_qual_eval_context locContext;

             locContext.root = context->root;
+            locContext.num_evals = context->num_evals;
+            locContext.num_evals_used = false;
             locContext.total.startup = 0;
             locContext.total.per_tuple = 0;

@@ -4121,6 +4154,18 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
                 locContext.total.startup += locContext.total.per_tuple;
                 locContext.total.per_tuple = 0;
             }
+
+            /*
+             * If we made use of num_evals, we can't cache the result.
+             */
+            if (locContext.num_evals_used)
+            {
+                context->total.startup += locContext.total.startup;
+                context->total.per_tuple += locContext.total.per_tuple;
+                /* do NOT recurse into children */
+                return false;
+            }
+
             rinfo->eval_cost = locContext.total;
         }
         context->total.startup += rinfo->eval_cost.startup;
@@ -4218,14 +4263,19 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
     else if (IsA(node, ArrayCoerceExpr))
     {
         ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) node;
-        QualCost    perelemcost;
-
-        cost_qual_eval_node(&perelemcost, (Node *) acoerce->elemexpr,
-                            context->root);
-        context->total.startup += perelemcost.startup;
-        if (perelemcost.per_tuple > 0)
-            context->total.per_tuple += perelemcost.per_tuple *
+        cost_qual_eval_context elemContext;
+
+        elemContext.root = context->root;
+        elemContext.num_evals = context->num_evals;
+        elemContext.num_evals_used = false;
+        elemContext.total.startup = 0;
+        elemContext.total.per_tuple = 0;
+        cost_qual_eval_walker((Node *) acoerce->elemexpr, &elemContext);
+        context->total.startup += elemContext.total.startup;
+        if (elemContext.total.per_tuple > 0)
+            context->total.per_tuple += elemContext.total.per_tuple *
                 estimate_array_length((Node *) acoerce->arg);
+        context->num_evals_used |= elemContext.num_evals_used;
     }
     else if (IsA(node, RowCompareExpr))
     {
@@ -4282,15 +4332,15 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
     else if (IsA(node, AlternativeSubPlan))
     {
         /*
-         * Arbitrarily use the first alternative plan for costing.  (We should
-         * certainly only include one alternative, and we don't yet have
-         * enough information to know which one the executor is most likely to
-         * use.)
+         * Make use of the caller's estimated number of evaluations to decide
+         * which subplan should be used.
          */
         AlternativeSubPlan *asplan = (AlternativeSubPlan *) node;
+        SubPlan    *subplan;

-        return cost_qual_eval_walker((Node *) linitial(asplan->subplans),
-                                     context);
+        subplan = SS_choose_alternative_subplan(asplan, context->num_evals);
+        context->num_evals_used = true;
+        return cost_qual_eval_walker((Node *) subplan, context);
     }
     else if (IsA(node, PlaceHolderVar))
     {
@@ -4332,7 +4382,9 @@ get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
     if (param_info)
     {
         /* Include costs of pushed-down clauses */
-        cost_qual_eval(qpqual_cost, param_info->ppi_clauses, root);
+        /* XXX baserel->tuples is wrong for a few callers */
+        cost_qual_eval(qpqual_cost, param_info->ppi_clauses,
+                       baserel->tuples, root);

         qpqual_cost->startup += baserel->baserestrictcost.startup;
         qpqual_cost->per_tuple += baserel->baserestrictcost.per_tuple;
@@ -4640,9 +4692,10 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)

     rel->rows = clamp_row_est(nrows);

-    cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
+    cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo,
+                   rel->tuples, root);

-    set_rel_width(root, rel);
+    set_rel_width(root, rel, rel->tuples);
 }

 /*
@@ -5410,9 +5463,10 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)

     rel->rows = 1000;            /* entirely bogus default estimate */

-    cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
+    cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo,
+                   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

-    set_rel_width(root, rel);
+    set_rel_width(root, rel, COST_QUAL_EVAL_DUMMY_NUM_EVALS);
 }


@@ -5426,6 +5480,7 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
  * we'd need to pass upwards in case of a sort, hash, etc.
  *
  * This function also sets reltarget->cost, so it's a bit misnamed now.
+ * (Because it estimates costs, it also requires a num_evals estimate.)
  *
  * NB: this works best on plain relations because it prefers to look at
  * real Vars.  For subqueries, set_subquery_size_estimates will already have
@@ -5438,7 +5493,7 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
  * building join relations or post-scan/join pathtargets.
  */
 static void
-set_rel_width(PlannerInfo *root, RelOptInfo *rel)
+set_rel_width(PlannerInfo *root, RelOptInfo *rel, double num_evals)
 {
     Oid            reloid = planner_rt_fetch(rel->relid, root)->relid;
     int32        tuple_width = 0;
@@ -5523,7 +5578,7 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
             QualCost    cost;

             tuple_width += phinfo->ph_width;
-            cost_qual_eval_node(&cost, (Node *) phv->phexpr, root);
+            cost_qual_eval_node(&cost, (Node *) phv->phexpr, num_evals, root);
             rel->reltarget->cost.startup += cost.startup;
             rel->reltarget->cost.per_tuple += cost.per_tuple;
         }
@@ -5541,7 +5596,7 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
             Assert(item_width > 0);
             tuple_width += item_width;
             /* Not entirely clear if we need to account for cost, but do so */
-            cost_qual_eval_node(&cost, node, root);
+            cost_qual_eval_node(&cost, node, num_evals, root);
             rel->reltarget->cost.startup += cost.startup;
             rel->reltarget->cost.per_tuple += cost.per_tuple;
         }
@@ -5656,7 +5711,8 @@ set_pathtarget_cost_width(PlannerInfo *root, PathTarget *target)
             tuple_width += item_width;

             /* Account for cost, too */
-            cost_qual_eval_node(&cost, node, root);
+            /* XXX have caller pass rowcount est */
+            cost_qual_eval_node(&cost, node, COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
             target->cost.startup += cost.startup;
             target->cost.per_tuple += cost.per_tuple;
         }
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index eb9543f6ad..6848aaff74 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -165,7 +165,8 @@ static Node *fix_indexqual_clause(PlannerInfo *root,
                                   Node *clause, List *indexcolnos);
 static Node *fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol);
 static List *get_switched_clauses(List *clauses, Relids outerrelids);
-static List *order_qual_clauses(PlannerInfo *root, List *clauses);
+static List *order_qual_clauses(PlannerInfo *root, List *clauses,
+                                double num_evals);
 static void copy_generic_path_info(Plan *dest, Path *src);
 static void copy_plan_costsize(Plan *dest, Plan *src);
 static void label_sort_with_costsize(PlannerInfo *root, Sort *plan,
@@ -941,8 +942,11 @@ get_gating_quals(PlannerInfo *root, List *quals)
     if (!root->hasPseudoConstantQuals)
         return NIL;

-    /* Sort into desirable execution order while still in RestrictInfo form */
-    quals = order_qual_clauses(root, quals);
+    /*
+     * Sort into desirable execution order while still in RestrictInfo form.
+     * Use num_evals = 1 since we only care about the gating quals here.
+     */
+    quals = order_qual_clauses(root, quals, 1);

     /* Pull out any pseudoconstant quals from the RestrictInfo list */
     return extract_actual_clauses(quals, true);
@@ -1454,7 +1458,7 @@ create_group_result_plan(PlannerInfo *root, GroupResultPath *best_path)
     tlist = build_path_tlist(root, &best_path->path);

     /* best_path->quals is just bare clauses */
-    quals = order_qual_clauses(root, best_path->quals);
+    quals = order_qual_clauses(root, best_path->quals, 1);

     plan = make_result(tlist, (Node *) quals, NULL);

@@ -2055,7 +2059,7 @@ create_group_plan(PlannerInfo *root, GroupPath *best_path)

     tlist = build_path_tlist(root, &best_path->path);

-    quals = order_qual_clauses(root, best_path->qual);
+    quals = order_qual_clauses(root, best_path->qual, subplan->plan_rows);

     plan = make_group(tlist,
                       quals,
@@ -2132,7 +2136,7 @@ create_agg_plan(PlannerInfo *root, AggPath *best_path)

     tlist = build_path_tlist(root, &best_path->path);

-    quals = order_qual_clauses(root, best_path->qual);
+    quals = order_qual_clauses(root, best_path->qual, subplan->plan_rows);

     plan = make_agg(tlist, quals,
                     best_path->aggstrategy,
@@ -2771,14 +2775,15 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
                     List *tlist, List *scan_clauses)
 {
     SeqScan    *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;

     /* it should be a base rel... */
     Assert(scan_relid > 0);
-    Assert(best_path->parent->rtekind == RTE_RELATION);
+    Assert(rel->rtekind == RTE_RELATION);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -2809,7 +2814,8 @@ create_samplescan_plan(PlannerInfo *root, Path *best_path,
                        List *tlist, List *scan_clauses)
 {
     SampleScan *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;
     TableSampleClause *tsc;

@@ -2821,7 +2827,7 @@ create_samplescan_plan(PlannerInfo *root, Path *best_path,
     Assert(tsc != NULL);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -2865,7 +2871,8 @@ create_indexscan_plan(PlannerInfo *root,
     Scan       *scan_plan;
     List       *indexclauses = best_path->indexclauses;
     List       *indexorderbys = best_path->indexorderbys;
-    Index        baserelid = best_path->path.parent->relid;
+    RelOptInfo *rel = best_path->path.parent;
+    Index        baserelid = rel->relid;
     Oid            indexoid = best_path->indexinfo->indexoid;
     List       *qpqual;
     List       *stripped_indexquals;
@@ -2876,7 +2883,7 @@ create_indexscan_plan(PlannerInfo *root,

     /* it should be a base rel... */
     Assert(baserelid > 0);
-    Assert(best_path->path.parent->rtekind == RTE_RELATION);
+    Assert(rel->rtekind == RTE_RELATION);

     /*
      * Extract the index qual expressions (stripped of RestrictInfos) from the
@@ -2938,7 +2945,9 @@ create_indexscan_plan(PlannerInfo *root,
     }

     /* Sort clauses into best execution order */
-    qpqual = order_qual_clauses(root, qpqual);
+    qpqual = order_qual_clauses(root, qpqual,
+                                clamp_row_est(best_path->indexselectivity *
+                                              rel->tuples));

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     qpqual = extract_actual_clauses(qpqual, false);
@@ -3100,7 +3109,7 @@ create_bitmap_scan_plan(PlannerInfo *root,
     }

     /* Sort clauses into best execution order */
-    qpqual = order_qual_clauses(root, qpqual);
+    qpqual = order_qual_clauses(root, qpqual, bitmapqualplan->plan_rows);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     qpqual = extract_actual_clauses(qpqual, false);
@@ -3371,12 +3380,13 @@ create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
                     List *tlist, List *scan_clauses)
 {
     TidScan    *scan_plan;
-    Index        scan_relid = best_path->path.parent->relid;
+    RelOptInfo *rel = best_path->path.parent;
+    Index        scan_relid = rel->relid;
     List       *tidquals = best_path->tidquals;

     /* it should be a base rel... */
     Assert(scan_relid > 0);
-    Assert(best_path->path.parent->rtekind == RTE_RELATION);
+    Assert(rel->rtekind == RTE_RELATION);

     /*
      * The qpqual list must contain all restrictions not enforced by the
@@ -3418,7 +3428,8 @@ create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
     }

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    /* XXX need selectivity of tidquals to compute num_evals properly */
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo lists to bare expressions; ignore pseudoconstants */
     tidquals = extract_actual_clauses(tidquals, false);
@@ -3484,7 +3495,7 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
     subplan = create_plan(rel->subroot, best_path->subpath);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, subplan->plan_rows);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3518,7 +3529,8 @@ create_functionscan_plan(PlannerInfo *root, Path *best_path,
                          List *tlist, List *scan_clauses)
 {
     FunctionScan *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;
     List       *functions;

@@ -3529,7 +3541,7 @@ create_functionscan_plan(PlannerInfo *root, Path *best_path,
     functions = rte->functions;

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3561,7 +3573,8 @@ create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
                           List *tlist, List *scan_clauses)
 {
     TableFuncScan *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;
     TableFunc  *tablefunc;

@@ -3572,7 +3585,7 @@ create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
     tablefunc = rte->tablefunc;

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3604,7 +3617,8 @@ create_valuesscan_plan(PlannerInfo *root, Path *best_path,
                        List *tlist, List *scan_clauses)
 {
     ValuesScan *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;
     List       *values_lists;

@@ -3615,7 +3629,7 @@ create_valuesscan_plan(PlannerInfo *root, Path *best_path,
     values_lists = rte->values_lists;

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3648,7 +3662,8 @@ create_ctescan_plan(PlannerInfo *root, Path *best_path,
                     List *tlist, List *scan_clauses)
 {
     CteScan    *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;
     SubPlan    *ctesplan = NULL;
     int            plan_id;
@@ -3711,7 +3726,7 @@ create_ctescan_plan(PlannerInfo *root, Path *best_path,
     cte_param_id = linitial_int(ctesplan->setParam);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3742,7 +3757,8 @@ create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
                                 List *tlist, List *scan_clauses)
 {
     NamedTuplestoreScan *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;

     Assert(scan_relid > 0);
@@ -3750,7 +3766,7 @@ create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
     Assert(rte->rtekind == RTE_NAMEDTUPLESTORE);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3781,7 +3797,8 @@ create_resultscan_plan(PlannerInfo *root, Path *best_path,
                        List *tlist, List *scan_clauses)
 {
     Result       *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;

     Assert(scan_relid > 0);
@@ -3789,7 +3806,7 @@ create_resultscan_plan(PlannerInfo *root, Path *best_path,
     Assert(rte->rtekind == RTE_RESULT);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3818,7 +3835,8 @@ create_worktablescan_plan(PlannerInfo *root, Path *best_path,
                           List *tlist, List *scan_clauses)
 {
     WorkTableScan *scan_plan;
-    Index        scan_relid = best_path->parent->relid;
+    RelOptInfo *rel = best_path->parent;
+    Index        scan_relid = rel->relid;
     RangeTblEntry *rte;
     Index        levelsup;
     PlannerInfo *cteroot;
@@ -3848,7 +3866,7 @@ create_worktablescan_plan(PlannerInfo *root, Path *best_path,
         elog(ERROR, "could not find param ID for CTE \"%s\"", rte->ctename);

     /* Sort clauses into best execution order */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
     scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3908,7 +3926,7 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
      * Sort clauses into best execution order.  We do this first since the FDW
      * might have more info than we do and wish to adjust the ordering.
      */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /*
      * Let the FDW perform its processing on the restriction clauses and
@@ -4039,7 +4057,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
      * Sort clauses into the best execution order, although custom-scan
      * provider can reorder them again.
      */
-    scan_clauses = order_qual_clauses(root, scan_clauses);
+    scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);

     /*
      * Invoke custom plan provider to create the Plan node represented by the
@@ -4117,7 +4135,9 @@ create_nestloop_plan(PlannerInfo *root,
     root->curOuterRels = saveOuterRels;

     /* Sort join qual clauses into best execution order */
-    joinrestrictclauses = order_qual_clauses(root, joinrestrictclauses);
+    /* XXX this underestimates the number of evaluations */
+    joinrestrictclauses = order_qual_clauses(root, joinrestrictclauses,
+                                             best_path->path.parent->rows);

     /* Get the join qual clauses (in plain expression form) */
     /* Any pseudoconstant clauses are ignored here */
@@ -4205,7 +4225,9 @@ create_mergejoin_plan(PlannerInfo *root,

     /* Sort join qual clauses into best execution order */
     /* NB: do NOT reorder the mergeclauses */
-    joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo);
+    /* XXX this underestimates the number of evaluations */
+    joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo,
+                                     best_path->jpath.path.parent->rows);

     /* Get the join qual clauses (in plain expression form) */
     /* Any pseudoconstant clauses are ignored here */
@@ -4506,7 +4528,9 @@ create_hashjoin_plan(PlannerInfo *root,
                                      CP_SMALL_TLIST);

     /* Sort join qual clauses into best execution order */
-    joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo);
+    /* XXX this underestimates the number of evaluations */
+    joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo,
+                                     best_path->jpath.path.parent->rows);
     /* There's no point in sorting the hash clauses ... */

     /* Get the join qual clauses (in plain expression form) */
@@ -5045,7 +5069,7 @@ get_switched_clauses(List *clauses, Relids outerrelids)
  * selectivity in the ordering would likely do the wrong thing.
  */
 static List *
-order_qual_clauses(PlannerInfo *root, List *clauses)
+order_qual_clauses(PlannerInfo *root, List *clauses, double num_evals)
 {
     typedef struct
     {
@@ -5074,7 +5098,7 @@ order_qual_clauses(PlannerInfo *root, List *clauses)
         Node       *clause = (Node *) lfirst(lc);
         QualCost    qcost;

-        cost_qual_eval_node(&qcost, clause, root);
+        cost_qual_eval_node(&qcost, clause, num_evals, root);
         items[i].clause = clause;
         items[i].cost = qcost.per_tuple;
         if (IsA(clause, RestrictInfo))
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4131019fc9..4fd2017cf7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -5894,7 +5894,8 @@ make_sort_input_target(PlannerInfo *root,
                  */
                 QualCost    cost;

-                cost_qual_eval_node(&cost, (Node *) expr, root);
+                cost_qual_eval_node(&cost, (Node *) expr,
+                                    COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);

                 /*
                  * We arbitrarily define "expensive" as "more than 10X
@@ -6316,7 +6317,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
      * the sort, since tuplesort.c will have to re-evaluate the index
      * expressions each time.  (XXX that's pretty inefficient...)
      */
-    cost_qual_eval(&indexExprCost, indexInfo->indexprs, root);
+    cost_qual_eval(&indexExprCost, indexInfo->indexprs, rel->tuples, root);
     comparisonCost = 2.0 * (indexExprCost.startup + indexExprCost.per_tuple);

     /* Estimate the cost of seq scan + sort */
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index b02fcb9bfe..b6b761c9b3 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -148,7 +148,8 @@ get_first_col_type(Plan *plan, Oid *coltype, int32 *coltypmod,
  *
  * The result is whatever we need to substitute in place of the SubLink node
  * in the executable expression.  If we're going to do the subplan as a
- * regular subplan, this will be the constructed SubPlan node.  If we're going
+ * regular subplan, this will be the constructed SubPlan node, or possibly
+ * an AlternativeSubPlan node with a list of options.  If we're going
  * to do the subplan as an InitPlan, the SubPlan node instead goes into
  * root->init_plans, and what we return here is an expression tree
  * representing the InitPlan's result: usually just a Param node representing
@@ -246,7 +247,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
      * likely to be better (it depends on the expected number of executions of
      * the EXISTS qual, and we are much too early in planning the outer query
      * to be able to guess that).  So we generate both plans, if possible, and
-     * leave it to the executor to decide which to use.
+     * wait till later to decide which to use.
      */
     if (simple_exists && IsA(result, SubPlan))
     {
@@ -298,7 +299,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
                 /* build_subplan won't have filled in paramIds */
                 hashplan->paramIds = paramIds;

-                /* Leave it to the executor to decide which plan to use */
+                /* Build an AlternativeSubPlan, which we'll resolve later */
                 asplan = makeNode(AlternativeSubPlan);
                 asplan->subplans = list_make2(result, hashplan);
                 result = (Node *) asplan;
@@ -2863,6 +2864,39 @@ finalize_agg_primnode(Node *node, finalize_primnode_context *context)
                                   (void *) context);
 }

+/*
+ * SS_choose_alternative_subplan - choose one subplan using num_evals
+ *
+ * This function resolves the choice that make_subplan() previously punted on.
+ * We call it later in planning when we have a better idea of how many outer
+ * rows the subplan will be invoked for.
+ */
+SubPlan *
+SS_choose_alternative_subplan(AlternativeSubPlan *asplan,
+                              double num_evals)
+{
+    SubPlan    *subplan1;
+    SubPlan    *subplan2;
+    Cost        cost1;
+    Cost        cost2;
+
+    /*
+     * make_subplan() saved enough info so that we don't have to work very
+     * hard to estimate the total cost, given the number-of-calls estimate.
+     */
+    Assert(list_length(asplan->subplans) == 2);
+    subplan1 = (SubPlan *) linitial(asplan->subplans);
+    subplan2 = (SubPlan *) lsecond(asplan->subplans);
+
+    cost1 = subplan1->startup_cost + num_evals * subplan1->per_call_cost;
+    cost2 = subplan2->startup_cost + num_evals * subplan2->per_call_cost;
+
+    if (cost1 < cost2)
+        return subplan1;
+    else
+        return subplan2;
+}
+
 /*
  * SS_make_initplan_output_param - make a Param for an initPlan's output
  *
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 0c6fe0115a..592773b717 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -371,7 +371,8 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
         if (!DO_AGGSPLIT_COMBINE(context->aggsplit))
         {
             /* add the input expressions' cost to per-input-row costs */
-            cost_qual_eval_node(&argcosts, (Node *) aggref->args, context->root);
+            cost_qual_eval_node(&argcosts, (Node *) aggref->args,
+                                COST_QUAL_EVAL_DUMMY_NUM_EVALS, context->root);
             costs->transCost.startup += argcosts.startup;
             costs->transCost.per_tuple += argcosts.per_tuple;

@@ -385,7 +386,7 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
             if (aggref->aggfilter)
             {
                 cost_qual_eval_node(&argcosts, (Node *) aggref->aggfilter,
-                                    context->root);
+                                    COST_QUAL_EVAL_DUMMY_NUM_EVALS, context->root);
                 costs->transCost.startup += argcosts.startup;
                 costs->transCost.per_tuple += argcosts.per_tuple;
             }
@@ -398,7 +399,7 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
         if (aggref->aggdirectargs)
         {
             cost_qual_eval_node(&argcosts, (Node *) aggref->aggdirectargs,
-                                context->root);
+                                1, context->root);
             costs->finalCost.startup += argcosts.startup;
             costs->finalCost.per_tuple += argcosts.per_tuple;
         }
@@ -4625,7 +4626,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
              */
             if (contain_subplans(param))
                 goto fail;
-            cost_qual_eval(&eval_cost, list_make1(param), NULL);
+            cost_qual_eval_node(&eval_cost, param, 1, NULL);
             if (eval_cost.startup + eval_cost.per_tuple >
                 10 * cpu_operator_cost)
                 goto fail;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e845a4b1ae..ac4c7a991d 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1477,8 +1477,8 @@ create_group_result_path(PlannerInfo *root, RelOptInfo *rel,
     {
         QualCost    qual_cost;

-        cost_qual_eval(&qual_cost, havingqual, root);
         /* havingqual is evaluated once at startup */
+        cost_qual_eval(&qual_cost, havingqual, 1, root);
         pathnode->path.startup_cost += qual_cost.startup + qual_cost.per_tuple;
         pathnode->path.total_cost += qual_cost.startup + qual_cost.per_tuple;
     }
@@ -3247,7 +3247,7 @@ create_minmaxagg_path(PlannerInfo *root,
     {
         QualCost    qual_cost;

-        cost_qual_eval(&qual_cost, quals, root);
+        cost_qual_eval(&qual_cost, quals, 1, root);
         pathnode->path.startup_cost += qual_cost.startup;
         pathnode->path.total_cost += qual_cost.startup + qual_cost.per_tuple;
     }
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index 6abdc0299b..49460c4b04 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -439,6 +439,9 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
                  * charging the PHV's cost for some join paths.  For now, live
                  * with that; but we might want to improve it later by
                  * refiguring the reltarget costs for each pair of inputs.
+                 *
+                 * Unfortunately, the joinrel's size isn't set yet, so we
+                 * cannot use that in cost_qual_eval_node().
                  */
                 if (!bms_is_subset(phinfo->ph_eval_at, outer_rel->relids) &&
                     !bms_is_subset(phinfo->ph_eval_at, inner_rel->relids))
@@ -446,7 +449,7 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
                     QualCost    cost;

                     cost_qual_eval_node(&cost, (Node *) phinfo->ph_var->phexpr,
-                                        root);
+                                        COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
                     joinrel->reltarget->cost.startup += cost.startup;
                     joinrel->reltarget->cost.per_tuple += cost.per_tuple;
                 }
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index be08eb4814..ef732db13c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5985,7 +5985,7 @@ index_other_operands_eval_cost(PlannerInfo *root, List *indexquals)
             other_operand = NULL;    /* keep compiler quiet */
         }

-        cost_qual_eval_node(&index_qual_cost, other_operand, root);
+        cost_qual_eval_node(&index_qual_cost, other_operand, 1, root);
         qual_arg_cost += index_qual_cost.startup + index_qual_cost.per_tuple;
     }
     return qual_arg_cost;
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 92e70ec0d9..489b0ff4bd 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -31,6 +31,12 @@

 #define DEFAULT_EFFECTIVE_CACHE_SIZE  524288    /* measured in pages */

+/*
+ * If a caller of cost_qual_eval() or cost_qual_eval_node() has no good number
+ * to use for num_evals, use this.
+ */
+#define COST_QUAL_EVAL_DUMMY_NUM_EVALS 1000
+
 typedef enum
 {
     CONSTRAINT_EXCLUSION_OFF,    /* do not use c_e */
@@ -166,8 +172,10 @@ extern void cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
                               Cost input_startup_cost, Cost input_total_cost,
                               double *rows);
 extern void cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan);
-extern void cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root);
-extern void cost_qual_eval_node(QualCost *cost, Node *qual, PlannerInfo *root);
+extern void cost_qual_eval(QualCost *cost, List *quals, double num_evals,
+                           PlannerInfo *root);
+extern void cost_qual_eval_node(QualCost *cost, Node *qual, double num_evals,
+                                PlannerInfo *root);
 extern void compute_semi_anti_join_factors(PlannerInfo *root,
                                            RelOptInfo *joinrel,
                                            RelOptInfo *outerrel,
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index d6a872bd2c..332c6c5b19 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -30,6 +30,8 @@ extern void SS_identify_outer_params(PlannerInfo *root);
 extern void SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel);
 extern void SS_attach_initplans(PlannerInfo *root, Plan *plan);
 extern void SS_finalize_plan(PlannerInfo *root, Plan *plan);
+extern SubPlan *SS_choose_alternative_subplan(AlternativeSubPlan *asplan,
+                                              double num_evals);
 extern Param *SS_make_initplan_output_param(PlannerInfo *root,
                                             Oid resulttype, int32 resulttypmod,
                                             Oid resultcollation);
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 5de53f2782..04819419fa 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -2261,17 +2261,17 @@ NOTICE:  drop cascades to view rw_view1
 CREATE TABLE t1 (a int, b float, c text);
 CREATE INDEX t1_a_idx ON t1(a);
 INSERT INTO t1
-SELECT i,i,'t1' FROM generate_series(1,10) g(i);
+SELECT i,i,'t1' FROM generate_series(1,30) g(i);
 ANALYZE t1;
 CREATE TABLE t11 (d text) INHERITS (t1);
 CREATE INDEX t11_a_idx ON t11(a);
 INSERT INTO t11
-SELECT i,i,'t11','t11d' FROM generate_series(1,10) g(i);
+SELECT i,i,'t11','t11d' FROM generate_series(1,30) g(i);
 ANALYZE t11;
 CREATE TABLE t12 (e int[]) INHERITS (t1);
 CREATE INDEX t12_a_idx ON t12(a);
 INSERT INTO t12
-SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t12;
 CREATE TABLE t111 () INHERITS (t11, t12);
 NOTICE:  merging multiple inherited definitions of column "a"
@@ -2279,7 +2279,7 @@ NOTICE:  merging multiple inherited definitions of column "b"
 NOTICE:  merging multiple inherited definitions of column "c"
 CREATE INDEX t111_a_idx ON t111(a);
 INSERT INTO t111
-SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t111;
 CREATE VIEW v1 WITH (security_barrier=true) AS
 SELECT *, (SELECT d FROM t11 WHERE t11.a = t1.a LIMIT 1) AS d
@@ -2407,21 +2407,101 @@ NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 TABLE t1; -- verify all a<=5 are intact
  a | b |  c
diff --git a/src/test/regress/sql/updatable_views.sql b/src/test/regress/sql/updatable_views.sql
index 64f23d0902..12e972d5e0 100644
--- a/src/test/regress/sql/updatable_views.sql
+++ b/src/test/regress/sql/updatable_views.sql
@@ -1077,25 +1077,25 @@ DROP TABLE base_tbl CASCADE;
 CREATE TABLE t1 (a int, b float, c text);
 CREATE INDEX t1_a_idx ON t1(a);
 INSERT INTO t1
-SELECT i,i,'t1' FROM generate_series(1,10) g(i);
+SELECT i,i,'t1' FROM generate_series(1,30) g(i);
 ANALYZE t1;

 CREATE TABLE t11 (d text) INHERITS (t1);
 CREATE INDEX t11_a_idx ON t11(a);
 INSERT INTO t11
-SELECT i,i,'t11','t11d' FROM generate_series(1,10) g(i);
+SELECT i,i,'t11','t11d' FROM generate_series(1,30) g(i);
 ANALYZE t11;

 CREATE TABLE t12 (e int[]) INHERITS (t1);
 CREATE INDEX t12_a_idx ON t12(a);
 INSERT INTO t12
-SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t12;

 CREATE TABLE t111 () INHERITS (t11, t12);
 CREATE INDEX t111_a_idx ON t111(a);
 INSERT INTO t111
-SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t111;

 CREATE VIEW v1 WITH (security_barrier=true) AS

Re: Improve planner cost estimations for alternative subplans

From
Alexey Bashtanov
Date:
Hi Melanie,

Sorry for the delay.

I've just started looking at this patch today, but I was wondering if
you might include a test case which minimally reproduces the original
problem you had.
I could reproduce it with an easier generated data set, please see attached.

However, to be honest with you, while searching I encountered a few examples of the opposite behavior,
when the patched version was slower than the master branch.
So I'm not so sure whether we should use the patch, maybe we should rather consider Tom's approach.

Best, Alex
Attachment

Re: Improve planner cost estimations for alternative subplans

From
Alexey Bashtanov
Date:
Hi Tom,

sorry for the delay,
> I experimented with adding a number-of-evaluations parameter to
> cost_qual_eval, and found that the majority of call sites do have
> something realistic they can pass.  The attached very-much-WIP
> patch shows my results so far.  There's a lot of loose ends:
I like the idea, so if we alternative subplans remain there
I think we should implement it.

Best, Alex



Re: Improve planner cost estimations for alternative subplans

From
Andy Fan
Date:


On Mon, Jun 22, 2020 at 9:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Nope.  The entire reason why we have that kluge is that we don't know
> until much later how many times we expect to execute the subplan.
> AlternativeSubPlan allows the decision which subplan form to use to be
> postponed till runtime; but when we're doing things like estimating the
> cost and selectivity of a where-clause, we don't know that.

> Maybe there's some way to recast things to avoid that problem,
> but I have little clue what it'd look like.

Actually ... maybe it's not that bad.  Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity.  Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.


I read your idea of "ripping out all executor support for AlternativeSubPlan
 and instead having the planner replace an AlternativeSubPlan with
 the desired specific SubPlan somewhere late in planning, possibly setrefs.c."
in [1].  I was thinking that if we can do such a replacement sooner, 
for example once we know the num_calls for the subplans, Unknown if it
is possible though.  If we can, then we can handle the issue here as well.

The attached is a very PoC version,  I'm not sure if it is the right direction
to go.  I'm sorry that I still need more time to understand your solution
below but I'm too excited about your original idea. 


 
I experimented with adding a number-of-evaluations parameter to
cost_qual_eval, and found that the majority of call sites do have
something realistic they can pass.  The attached very-much-WIP
patch shows my results so far.  There's a lot of loose ends:

* Any call site using COST_QUAL_EVAL_DUMMY_NUM_EVALS is a potential spot
for future improvement.  The only one that seems like it might be
fundamentally unsolvable is cost_subplan's usage; but that would only
matter if a subplan's testexpr contains an AlternativeSubPlan, which is
probably a negligible case in practice.  The other ones seem to just
require more refactoring than I cared to do on a Sunday afternoon.

* I did not do anything for postgres_fdw.c beyond making it compile.
We can surely do better there, but it might require some rethinking
of the way that plan costs get cached.

* The merge and hash join costsize.c functions estimate costs of qpquals
(i.e. quals to be applied at the join that are not being used as merge
or hash conditions) by computing cost_qual_eval of the whole
joinrestrictlist and then subtracting off the cost of the merge or hash
quals.  This is kind of broken if we want to use different num_eval
estimates for the qpquals and the merge/hash quals, which I think we do.
This probably just needs some refactoring to fix.  We also need to save
the relevant rowcounts in the join Path nodes so that createplan.c can
do the right thing.

* I think it might be possible to improve the situation in
get_agg_clause_costs() if we're willing to postpone collection
of the actual aggregate costs till later.  This'd require two
passes over the aggregate expressions, but that seems like it
might not be terribly expensive.  (I'd be inclined to also look
at the idea of merging duplicate agg calls at plan time not
run time, if we refactor that.)

* I had to increase the number of table rows in one updatable_views.sql
test to keep the plans the same.  Without that, the planner realized
that a seqscan would be cheaper than an indexscan.  The result wasn't
wrong exactly, but it failed to prove that leakproof quals could be
used as indexquals, so I think we need to keep the plan choice the same.

Anyway, this is kind of invasive, but I think it shouldn't really
add noticeable costs as long as we save relevant rowcounts rather
than recomputing them in createplan.c.  Is it worth doing?  I dunno.
AlternativeSubPlan is pretty much a backwater, I think --- if it
were interesting performance-wise to a lot of people, more would
have been done with it by now.

                        regards, tom lane



--
Best Regards
Andy Fan
Attachment

Re: Improve planner cost estimations for alternative subplans

From
Andy Fan
Date:


On Mon, Aug 17, 2020 at 10:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:


On Mon, Jun 22, 2020 at 9:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Nope.  The entire reason why we have that kluge is that we don't know
> until much later how many times we expect to execute the subplan.
> AlternativeSubPlan allows the decision which subplan form to use to be
> postponed till runtime; but when we're doing things like estimating the
> cost and selectivity of a where-clause, we don't know that.

> Maybe there's some way to recast things to avoid that problem,
> but I have little clue what it'd look like.

Actually ... maybe it's not that bad.  Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity.  Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.


I read your idea of "ripping out all executor support for AlternativeSubPlan
 and instead having the planner replace an AlternativeSubPlan with
 the desired specific SubPlan somewhere late in planning, possibly setrefs.c."
in [1].  I was thinking that if we can do such a replacement sooner, 
for example once we know the num_calls for the subplans, Unknown if it
is possible though.  If we can, then we can handle the issue here as well.

The attached is a very PoC version,  I'm not sure if it is the right direction
to go.  

The idea behind it is if we have a RelOptInfo which have some  AlternativeSubPlan,
and assume these subplans have some correlated vars which can be expressed as
deps_relids.  Then we can convert the AlternativeSubPlan to SubPlan once 
bms_is_subset(deps_relids,  rel->relids).  My patch is able to fix the issue reported
here and it only converts the AlternativeSubPlan in rel->reltarget for demo purpose. 

--
Best Regards
Andy Fan

Re: Improve planner cost estimations for alternative subplans

From
Andy Fan
Date:


On Wed, Aug 26, 2020 at 4:21 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:


On Mon, Aug 17, 2020 at 10:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:


On Mon, Jun 22, 2020 at 9:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Nope.  The entire reason why we have that kluge is that we don't know
> until much later how many times we expect to execute the subplan.
> AlternativeSubPlan allows the decision which subplan form to use to be
> postponed till runtime; but when we're doing things like estimating the
> cost and selectivity of a where-clause, we don't know that.

> Maybe there's some way to recast things to avoid that problem,
> but I have little clue what it'd look like.

Actually ... maybe it's not that bad.  Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity.  Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.


I read your idea of "ripping out all executor support for AlternativeSubPlan
 and instead having the planner replace an AlternativeSubPlan with
 the desired specific SubPlan somewhere late in planning, possibly setrefs.c."
in [1].  I was thinking that if we can do such a replacement sooner, 
for example once we know the num_calls for the subplans, Unknown if it
is possible though.  If we can, then we can handle the issue here as well.

The attached is a very PoC version,  I'm not sure if it is the right direction
to go.  

The idea behind it is if we have a RelOptInfo which have some  AlternativeSubPlan,
and assume these subplans have some correlated vars which can be expressed as
deps_relids.  Then we can convert the AlternativeSubPlan to SubPlan once 
bms_is_subset(subplan->deps_relids,  rel->relids). 

The way of figuring out subplan->deps_relids was wrong in my patch,  I will fix it later.
But the general idea is the same. 
 
My patch is able to fix the issue reported
here and it only converts the AlternativeSubPlan in rel->reltarget for demo purpose. 

--
Best Regards
Andy Fan


--
Best Regards
Andy Fan