Improve planner cost estimations for alternative subplans

Started by Alexey Bashtanovover 5 years ago11 messages
#1Alexey Bashtanov
bashtanov@imap.cc
1 attachment(s)

Hello,

Currently, in case of alternative subplans that do hashed vs per-row
lookups,
the per-row estimate is used when planning the rest of the query.
It's also coded in not quite an explicit way.

In [1]/messages/by-id/ff42b25b-ff03-27f8-ed11-b8255d658cd5@imap.cc we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the
planner,
and as a result it doesn't elaborate on choosing the optimal one.

The patch is to fix it. Our linear model for costs cannot quite accommodate
the piecewise linear matter of alternative subplans,
so it is based on ugly heuristics and still cannot be very precise,
but I think it's better than the current one.

Thoughts?

Best, Alex

[1]: /messages/by-id/ff42b25b-ff03-27f8-ed11-b8255d658cd5@imap.cc
/messages/by-id/ff42b25b-ff03-27f8-ed11-b8255d658cd5@imap.cc

Attachments:

alternatives-cost-as-geom-mean-v2.patchtext/x-patch; charset=UTF-8; name=alternatives-cost-as-geom-mean-v2.patchDownload
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b976afb69d..5e2c5ec822 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -92,6 +92,7 @@
 #include "optimizer/planmain.h"
 #include "optimizer/restrictinfo.h"
 #include "parser/parsetree.h"
+#include "utils/float.h"
 #include "utils/lsyscache.h"
 #include "utils/selfuncs.h"
 #include "utils/spccache.h"
@@ -4283,15 +4284,57 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	else if (IsA(node, AlternativeSubPlan))
 	{
 		/*
-		 * Arbitrarily use the first alternative plan for costing.  (We should
-		 * certainly only include one alternative, and we don't yet have
-		 * enough information to know which one the executor is most likely to
-		 * use.)
+		 * Estimate the cost as some mean of the alternatives.
+		 * It's not uncommon for the alternative costs to be of different
+		 * orders of magnitude, so arithmetic mean would be too biased towards
+		 * the slower one. That's why for per-tuple coefficient we use geometric
+		 * mean.
+		 *
+		 * It's tempting to use geometric mean for startup costs too, but
+		 * one of them can be zero, which will result in substantial
+		 * underestimation. So instead we estimate the cost of returning *first*
+		 * tuple as geometric mean of single-tuple costs for the alternatives.
 		 */
-		AlternativeSubPlan *asplan = (AlternativeSubPlan *) node;
+		List   *subplans = ((AlternativeSubPlan *) node)->subplans;
+		Cost 	per_tuple_mean = 1;
+		Cost 	first_tuple_mean = 1;
+		Cost 	startup_min = get_float8_infinity();
+		Cost 	startup_max = 0;
+		Cost 	startup_mean;
+		ListCell   *lc;
+
+		foreach(lc, subplans)
+		{
+			cost_qual_eval_context subplanContext;
+
+			subplanContext.root = context->root;
+			subplanContext.total.startup = 0;
+			subplanContext.total.per_tuple = 0;
 
-		return cost_qual_eval_walker((Node *) linitial(asplan->subplans),
-									 context);
+			cost_qual_eval_walker((Node *) lfirst(lc), &subplanContext);
+
+			Assert(subplanContext.total.startup >= 0);
+			Assert(subplanContext.total.per_tuple >= 0);
+
+			per_tuple_mean *= subplanContext.total.per_tuple;
+			first_tuple_mean *= subplanContext.total.startup + subplanContext.total.per_tuple;
+			startup_min = float8_min(startup_min, subplanContext.total.startup);
+			startup_max = float8_max(startup_min, subplanContext.total.startup);
+		}
+		per_tuple_mean =
+			pow(per_tuple_mean, 1.0 / list_length(subplans));
+		first_tuple_mean =
+			pow(first_tuple_mean, 1.0 / list_length(subplans));
+		startup_mean = first_tuple_mean - per_tuple_mean;
+		/* make sure this calculation makes a sane value */
+		startup_mean = float8_max(startup_mean, startup_min);
+		startup_mean = float8_min(startup_mean, startup_max);
+
+		context->total.startup += startup_mean;
+		context->total.per_tuple += per_tuple_mean;
+
+		/* We've already recursed into children, skip the generic recursion */
+		return false;
 	}
 	else if (IsA(node, PlaceHolderVar))
 	{
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 5de53f2782..98787e75e0 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -2301,17 +2301,16 @@ SELECT * FROM v1 WHERE a=8;
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a < 7 AND a != 6;
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                                         QUERY PLAN                                                                          
+-------------------------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.t1
    Update on public.t1
    Update on public.t11 t1_1
    Update on public.t12 t1_2
    Update on public.t111 t1_3
-   ->  Index Scan using t1_a_idx on public.t1
+   ->  Seq Scan on public.t1
          Output: 100, t1.b, t1.c, t1.ctid
-         Index Cond: ((t1.a > 5) AND (t1.a < 7))
-         Filter: ((t1.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1.a) AND leakproof(t1.a))
+         Filter: ((t1.a > 5) AND (t1.a < 7) AND (t1.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1.a) AND leakproof(t1.a))
          SubPlan 1
            ->  Append
                  ->  Seq Scan on public.t12 t12_1
@@ -2324,19 +2323,16 @@ UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a < 7 AND a != 6;
                        Output: t12_4.a
                  ->  Seq Scan on public.t111 t12_5
                        Output: t12_5.a
-   ->  Index Scan using t11_a_idx on public.t11 t1_1
+   ->  Seq Scan on public.t11 t1_1
          Output: 100, t1_1.b, t1_1.c, t1_1.d, t1_1.ctid
-         Index Cond: ((t1_1.a > 5) AND (t1_1.a < 7))
-         Filter: ((t1_1.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_1.a) AND leakproof(t1_1.a))
-   ->  Index Scan using t12_a_idx on public.t12 t1_2
+         Filter: ((t1_1.a > 5) AND (t1_1.a < 7) AND (t1_1.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_1.a) AND leakproof(t1_1.a))
+   ->  Seq Scan on public.t12 t1_2
          Output: 100, t1_2.b, t1_2.c, t1_2.e, t1_2.ctid
-         Index Cond: ((t1_2.a > 5) AND (t1_2.a < 7))
-         Filter: ((t1_2.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_2.a) AND leakproof(t1_2.a))
-   ->  Index Scan using t111_a_idx on public.t111 t1_3
+         Filter: ((t1_2.a > 5) AND (t1_2.a < 7) AND (t1_2.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_2.a) AND leakproof(t1_2.a))
+   ->  Seq Scan on public.t111 t1_3
          Output: 100, t1_3.b, t1_3.c, t1_3.d, t1_3.e, t1_3.ctid
-         Index Cond: ((t1_3.a > 5) AND (t1_3.a < 7))
-         Filter: ((t1_3.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_3.a) AND leakproof(t1_3.a))
-(33 rows)
+         Filter: ((t1_3.a > 5) AND (t1_3.a < 7) AND (t1_3.a <> 6) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_3.a) AND leakproof(t1_3.a))
+(29 rows)
 
 UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a < 7 AND a != 6;
 SELECT * FROM v1 WHERE a=100; -- Nothing should have been changed to 100
@@ -2351,17 +2347,16 @@ SELECT * FROM t1 WHERE a=100; -- Nothing should have been changed to 100
 
 EXPLAIN (VERBOSE, COSTS OFF)
 UPDATE v1 SET a=a+1 WHERE snoop(a) AND leakproof(a) AND a = 8;
-                                               QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
+                                                                QUERY PLAN                                                                 
+-------------------------------------------------------------------------------------------------------------------------------------------
  Update on public.t1
    Update on public.t1
    Update on public.t11 t1_1
    Update on public.t12 t1_2
    Update on public.t111 t1_3
-   ->  Index Scan using t1_a_idx on public.t1
+   ->  Seq Scan on public.t1
          Output: (t1.a + 1), t1.b, t1.c, t1.ctid
-         Index Cond: ((t1.a > 5) AND (t1.a = 8))
-         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1.a) AND leakproof(t1.a))
+         Filter: ((t1.a > 5) AND (t1.a = 8) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1.a) AND leakproof(t1.a))
          SubPlan 1
            ->  Append
                  ->  Seq Scan on public.t12 t12_1
@@ -2374,19 +2369,16 @@ UPDATE v1 SET a=a+1 WHERE snoop(a) AND leakproof(a) AND a = 8;
                        Output: t12_4.a
                  ->  Seq Scan on public.t111 t12_5
                        Output: t12_5.a
-   ->  Index Scan using t11_a_idx on public.t11 t1_1
+   ->  Seq Scan on public.t11 t1_1
          Output: (t1_1.a + 1), t1_1.b, t1_1.c, t1_1.d, t1_1.ctid
-         Index Cond: ((t1_1.a > 5) AND (t1_1.a = 8))
-         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_1.a) AND leakproof(t1_1.a))
-   ->  Index Scan using t12_a_idx on public.t12 t1_2
+         Filter: ((t1_1.a > 5) AND (t1_1.a = 8) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_1.a) AND leakproof(t1_1.a))
+   ->  Seq Scan on public.t12 t1_2
          Output: (t1_2.a + 1), t1_2.b, t1_2.c, t1_2.e, t1_2.ctid
-         Index Cond: ((t1_2.a > 5) AND (t1_2.a = 8))
-         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_2.a) AND leakproof(t1_2.a))
-   ->  Index Scan using t111_a_idx on public.t111 t1_3
+         Filter: ((t1_2.a > 5) AND (t1_2.a = 8) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_2.a) AND leakproof(t1_2.a))
+   ->  Seq Scan on public.t111 t1_3
          Output: (t1_3.a + 1), t1_3.b, t1_3.c, t1_3.d, t1_3.e, t1_3.ctid
-         Index Cond: ((t1_3.a > 5) AND (t1_3.a = 8))
-         Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_3.a) AND leakproof(t1_3.a))
-(33 rows)
+         Filter: ((t1_3.a > 5) AND (t1_3.a = 8) AND (alternatives: SubPlan 1 or hashed SubPlan 2) AND snoop(t1_3.a) AND leakproof(t1_3.a))
+(29 rows)
 
 UPDATE v1 SET a=a+1 WHERE snoop(a) AND leakproof(a) AND a = 8;
 NOTICE:  snooped value: 8
#2Melanie Plageman
melanieplageman@gmail.com
In reply to: Alexey Bashtanov (#1)
Re: Improve planner cost estimations for alternative subplans

On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:

In [1] we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the
planner,
and as a result it doesn't elaborate on choosing the optimal one.

I've just started looking at this patch today, but I was wondering if
you might include a test case which minimally reproduces the original
problem you had.
The only plan diff I see is in updatable_views.sql, and it doesn't
illustrate the problem as well as a more straightforward SELECT query
with EXISTS sublink might.

After putting in some logging, I see that there are only a
few non-catalog queries which exercise this code path.
This query from groupingsets.sql is an example of one such query:

select ten, sum(distinct four) from onek a
group by grouping sets((ten,four),(ten))
having exists (select 1 from onek b where sum(distinct a.four) = b.four);

But, the chosen plan for this query stays the same.

It would be helpful to see a query where a different plan is chosen
because of this change that is not from updatable_views.sql.

--
Melanie Plageman

#3Melanie Plageman
melanieplageman@gmail.com
In reply to: Alexey Bashtanov (#1)
Re: Improve planner cost estimations for alternative subplans

On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:

In [1] we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the
planner,
and as a result it doesn't elaborate on choosing the optimal one.

Did this geometric average method result in choosing the desired plan for
this case?

The patch is to fix it. Our linear model for costs cannot quite accommodate
the piecewise linear matter of alternative subplans,
so it is based on ugly heuristics and still cannot be very precise,
but I think it's better than the current one.

Thoughts?

Is there another place in planner where two alternatives are averaged
together and that cost is used?

To me, it feels a little bit weird that we are averaging together the
startup cost of a plan which will always have a 0 startup cost and a
plan that will always have a non-zero startup cost and the per tuple
cost of a plan that will always have a negligible per tuple cost and one
that might have a very large per tuple cost.

I guess it feels different because instead of comparing alternatives you
are blending them.

I don't have any academic basis for saying that the alternatives costs
shouldn't be averaged together for use in the rest of the plan, so I
could definitely be wrong.

--
Melanie Plageman

#4Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Melanie Plageman (#3)
Re: Improve planner cost estimations for alternative subplans

On Wed, Jun 17, 2020 at 06:21:58PM -0700, Melanie Plageman wrote:

On Fri, Jun 5, 2020 at 9:08 AM Alexey Bashtanov <bashtanov@imap.cc> wrote:

In [1] we found a situation where it leads to a suboptimal plan,
as it bloats the overall cost into large figures,
a decision related to an outer part of the plan look negligible to the
planner,
and as a result it doesn't elaborate on choosing the optimal one.

Did this geometric average method result in choosing the desired plan for
this case?

The patch is to fix it. Our linear model for costs cannot quite accommodate
the piecewise linear matter of alternative subplans,
so it is based on ugly heuristics and still cannot be very precise,
but I think it's better than the current one.

Thoughts?

Is there another place in planner where two alternatives are averaged
together and that cost is used?

To me, it feels a little bit weird that we are averaging together the
startup cost of a plan which will always have a 0 startup cost and a
plan that will always have a non-zero startup cost and the per tuple
cost of a plan that will always have a negligible per tuple cost and one
that might have a very large per tuple cost.

I guess it feels different because instead of comparing alternatives you
are blending them.

I don't have any academic basis for saying that the alternatives costs
shouldn't be averaged together for use in the rest of the plan, so I
could definitely be wrong.

I agree it feels weird. Even if it actually improved the problematic
case, I think it'll be quite hard to convince ourselves this helps in
general. For example, for cases that actually end up using the first
plan, this is bound to make the estimates worse. I find it hard to
believe it won't cause regressions in at least some cases.

Maybe this heuristics really is better than the old one, but I think we
need to understand why - a single query probably is not enough.

I think the crucial limitation here is that we don't know which of the
alternative plans will be used. Is there a chance to improve this,
perhaps by making some sort of guess?

I'm not particularly familiar with AlternativeSubPlans, but I see we're
picking the one in nodeSubplan.c based on plan_rows. Can't we do the
same thing in cost_qual_eval_walker?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#4)
Re: Improve planner cost estimations for alternative subplans

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

I'm not particularly familiar with AlternativeSubPlans, but I see we're
picking the one in nodeSubplan.c based on plan_rows. Can't we do the
same thing in cost_qual_eval_walker?

Nope. The entire reason why we have that kluge is that we don't know
until much later how many times we expect to execute the subplan.
AlternativeSubPlan allows the decision which subplan form to use to be
postponed till runtime; but when we're doing things like estimating the
cost and selectivity of a where-clause, we don't know that.

Maybe there's some way to recast things to avoid that problem,
but I have little clue what it'd look like.

I agree that averaging together the costs of the alternatives seems
wrong in principle. It's going to be one or the other, not some
quantum-mechanical superposition. Maybe there's a case for taking the
min costs (if you're feeling lucky) or the max costs (if you're not),
on the belief that the executor will/will not pick the choice that
contributes least to the total query cost. But I have a feeling that
either of those would distort our estimates too much. The existing
costing behavior at least can be seen to match *one* actual runtime
behavior.

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#5)
1 attachment(s)
Re: Improve planner cost estimations for alternative subplans

I wrote:

Nope. The entire reason why we have that kluge is that we don't know
until much later how many times we expect to execute the subplan.
AlternativeSubPlan allows the decision which subplan form to use to be
postponed till runtime; but when we're doing things like estimating the
cost and selectivity of a where-clause, we don't know that.

Maybe there's some way to recast things to avoid that problem,
but I have little clue what it'd look like.

Actually ... maybe it's not that bad. Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity. Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.

I experimented with adding a number-of-evaluations parameter to
cost_qual_eval, and found that the majority of call sites do have
something realistic they can pass. The attached very-much-WIP
patch shows my results so far. There's a lot of loose ends:

* Any call site using COST_QUAL_EVAL_DUMMY_NUM_EVALS is a potential spot
for future improvement. The only one that seems like it might be
fundamentally unsolvable is cost_subplan's usage; but that would only
matter if a subplan's testexpr contains an AlternativeSubPlan, which is
probably a negligible case in practice. The other ones seem to just
require more refactoring than I cared to do on a Sunday afternoon.

* I did not do anything for postgres_fdw.c beyond making it compile.
We can surely do better there, but it might require some rethinking
of the way that plan costs get cached.

* The merge and hash join costsize.c functions estimate costs of qpquals
(i.e. quals to be applied at the join that are not being used as merge
or hash conditions) by computing cost_qual_eval of the whole
joinrestrictlist and then subtracting off the cost of the merge or hash
quals. This is kind of broken if we want to use different num_eval
estimates for the qpquals and the merge/hash quals, which I think we do.
This probably just needs some refactoring to fix. We also need to save
the relevant rowcounts in the join Path nodes so that createplan.c can
do the right thing.

* I think it might be possible to improve the situation in
get_agg_clause_costs() if we're willing to postpone collection
of the actual aggregate costs till later. This'd require two
passes over the aggregate expressions, but that seems like it
might not be terribly expensive. (I'd be inclined to also look
at the idea of merging duplicate agg calls at plan time not
run time, if we refactor that.)

* I had to increase the number of table rows in one updatable_views.sql
test to keep the plans the same. Without that, the planner realized
that a seqscan would be cheaper than an indexscan. The result wasn't
wrong exactly, but it failed to prove that leakproof quals could be
used as indexquals, so I think we need to keep the plan choice the same.

Anyway, this is kind of invasive, but I think it shouldn't really
add noticeable costs as long as we save relevant rowcounts rather
than recomputing them in createplan.c. Is it worth doing? I dunno.
AlternativeSubPlan is pretty much a backwater, I think --- if it
were interesting performance-wise to a lot of people, more would
have been done with it by now.

regards, tom lane

Attachments:

better-alternativesubplan-cost-ests-1.patchtext/x-diff; charset=us-ascii; name=better-alternativesubplan-cost-ests-1.patchDownload
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 9fc53cad68..1149c298ba 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -656,7 +656,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 													 JOIN_INNER,
 													 NULL);
 
-	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds,
+				   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
 	/*
 	 * Set # of retrieved rows and cached relation costs to some negative
@@ -2766,7 +2767,8 @@ estimate_path_cost_size(PlannerInfo *root,
 		/* Add in the eval cost of the locally-checked quals */
 		startup_cost += fpinfo->local_conds_cost.startup;
 		total_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
-		cost_qual_eval(&local_cost, local_param_join_conds, root);
+		cost_qual_eval(&local_cost, local_param_join_conds,
+					   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 		startup_cost += local_cost.startup;
 		total_cost += local_cost.per_tuple * retrieved_rows;
 
@@ -2782,7 +2784,8 @@ estimate_path_cost_size(PlannerInfo *root,
 		{
 			QualCost	tlist_cost;
 
-			cost_qual_eval(&tlist_cost, fdw_scan_tlist, root);
+			cost_qual_eval(&tlist_cost, fdw_scan_tlist,
+						   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 			startup_cost -= tlist_cost.startup;
 			total_cost -= tlist_cost.startup;
 			total_cost -= tlist_cost.per_tuple * rows;
@@ -2870,9 +2873,11 @@ estimate_path_cost_size(PlannerInfo *root,
 			/*
 			 * Calculate the cost of clauses pushed down to the foreign server
 			 */
-			cost_qual_eval(&remote_conds_cost, fpinfo->remote_conds, root);
+			cost_qual_eval(&remote_conds_cost, fpinfo->remote_conds,
+						   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 			/* Calculate the cost of applying join clauses */
-			cost_qual_eval(&join_cost, fpinfo->joinclauses, root);
+			cost_qual_eval(&join_cost, fpinfo->joinclauses,
+						   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
 			/*
 			 * Startup cost includes startup cost of joining relations and the
@@ -3019,7 +3024,8 @@ estimate_path_cost_size(PlannerInfo *root,
 				QualCost	remote_cost;
 
 				/* Add in the eval cost of the remotely-checked quals */
-				cost_qual_eval(&remote_cost, fpinfo->remote_conds, root);
+				cost_qual_eval(&remote_cost, fpinfo->remote_conds,
+							   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 				startup_cost += remote_cost.startup;
 				run_cost += remote_cost.per_tuple * numGroups;
 				/* Add in the eval cost of the locally-checked quals */
@@ -5514,7 +5520,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 													 0,
 													 JOIN_INNER,
 													 NULL);
-	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds,
+				   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
 	/*
 	 * If we are going to estimate costs locally, estimate the join clause
@@ -5912,7 +5919,8 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 													 JOIN_INNER,
 													 NULL);
 
-	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+	cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds,
+				   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
 	/* Estimate the cost of push down */
 	estimate_path_cost_size(root, grouped_rel, NIL, NIL, NULL,
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index af87f172a7..008efbac06 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -1380,7 +1380,7 @@ amcostestimate (PlannerInfo *root,
  * Also, we charge for evaluation of the indexquals at each index row.
  * All the costs are assumed to be paid incrementally during the scan.
  */
-cost_qual_eval(&amp;index_qual_cost, path-&gt;indexquals, root);
+cost_qual_eval(&amp;index_qual_cost, path-&gt;indexquals, numIndexTuples, root);
 *indexStartupCost = index_qual_cost.startup;
 *indexTotalCost = seq_page_cost * numIndexPages +
     (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 4ff3c7a2fd..e3e1e80096 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -91,6 +91,7 @@
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
 #include "optimizer/restrictinfo.h"
+#include "optimizer/subselect.h"
 #include "parser/parsetree.h"
 #include "utils/lsyscache.h"
 #include "utils/selfuncs.h"
@@ -145,6 +146,8 @@ bool		enable_partition_pruning = true;
 typedef struct
 {
 	PlannerInfo *root;
+	double		num_evals;
+	bool		num_evals_used;
 	QualCost	total;
 } cost_qual_eval_context;
 
@@ -176,7 +179,7 @@ static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
 													List **restrictlist);
 static Cost append_nonpartial_cost(List *subpaths, int numpaths,
 								   int parallel_workers);
-static void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
+static void set_rel_width(PlannerInfo *root, RelOptInfo *rel, double num_evals);
 static double relation_byte_size(double tuples, int width);
 static double page_size(double tuples, int width);
 static double get_parallel_divisor(Path *path);
@@ -722,7 +725,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 	 * What we want here is cpu_tuple_cost plus the evaluation costs of any
 	 * qual clauses that we have to evaluate as qpquals.
 	 */
-	cost_qual_eval(&qpqual_cost, qpquals, root);
+	cost_qual_eval(&qpqual_cost, qpquals, tuples_fetched, root);
 
 	startup_cost += qpqual_cost.startup;
 	cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
@@ -1247,7 +1250,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
 	 */
-	cost_qual_eval(&tid_qual_cost, tidquals, root);
+	cost_qual_eval(&tid_qual_cost, tidquals, 1, root);
 
 	/* fetch estimated page cost for tablespace containing table */
 	get_tablespace_page_costs(baserel->reltablespace,
@@ -1365,7 +1368,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
 	 * estimates for functions tend to be, there's not a lot of point in that
 	 * refinement right now.
 	 */
-	cost_qual_eval_node(&exprcost, (Node *) rte->functions, root);
+	cost_qual_eval_node(&exprcost, (Node *) rte->functions, 1, root);
 
 	startup_cost += exprcost.startup + exprcost.per_tuple;
 
@@ -1421,7 +1424,7 @@ cost_tablefuncscan(Path *path, PlannerInfo *root,
 	 * estimates for tablefuncs tend to be, there's not a lot of point in that
 	 * refinement right now.
 	 */
-	cost_qual_eval_node(&exprcost, (Node *) rte->tablefunc, root);
+	cost_qual_eval_node(&exprcost, (Node *) rte->tablefunc, 1, root);
 
 	startup_cost += exprcost.startup + exprcost.per_tuple;
 
@@ -2467,7 +2470,7 @@ cost_agg(Path *path, PlannerInfo *root,
 	{
 		QualCost	qual_cost;
 
-		cost_qual_eval(&qual_cost, quals, root);
+		cost_qual_eval(&qual_cost, quals, output_tuples, root);
 		startup_cost += qual_cost.startup;
 		total_cost += qual_cost.startup + output_tuples * qual_cost.per_tuple;
 
@@ -2526,7 +2529,8 @@ cost_windowagg(Path *path, PlannerInfo *root,
 		wfunccost = argcosts.per_tuple;
 
 		/* also add the input expressions' cost to per-input-row costs */
-		cost_qual_eval_node(&argcosts, (Node *) wfunc->args, root);
+		cost_qual_eval_node(&argcosts, (Node *) wfunc->args,
+							input_tuples, root);
 		startup_cost += argcosts.startup;
 		wfunccost += argcosts.per_tuple;
 
@@ -2534,7 +2538,8 @@ cost_windowagg(Path *path, PlannerInfo *root,
 		 * Add the filter's cost to per-input-row costs.  XXX We should reduce
 		 * input expression costs according to filter selectivity.
 		 */
-		cost_qual_eval_node(&argcosts, (Node *) wfunc->aggfilter, root);
+		cost_qual_eval_node(&argcosts, (Node *) wfunc->aggfilter,
+							input_tuples, root);
 		startup_cost += argcosts.startup;
 		wfunccost += argcosts.per_tuple;
 
@@ -2594,7 +2599,7 @@ cost_group(Path *path, PlannerInfo *root,
 	{
 		QualCost	qual_cost;
 
-		cost_qual_eval(&qual_cost, quals, root);
+		cost_qual_eval(&qual_cost, quals, output_tuples, root);
 		startup_cost += qual_cost.startup;
 		total_cost += qual_cost.startup + output_tuples * qual_cost.per_tuple;
 
@@ -2874,7 +2879,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	}
 
 	/* CPU costs */
-	cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, root);
+	cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, ntuples, root);
 	startup_cost += restrict_qual_cost.startup;
 	cpu_per_tuple = cpu_tuple_cost + restrict_qual_cost.per_tuple;
 	run_cost += cpu_per_tuple * ntuples;
@@ -3200,12 +3205,23 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	if (!enable_mergejoin)
 		startup_cost += disable_cost;
 
+	/*
+	 * Get approx # tuples passing the mergequals.  We use approx_tuple_count
+	 * here because we need an estimate done with JOIN_INNER semantics.
+	 */
+	mergejointuples = approx_tuple_count(root, &path->jpath, mergeclauses);
+
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
-	 * separately.
+	 * separately.  We use mergejointuples for num_evals for all the quals,
+	 * else the arithmetic below would be bogus.  That's the right thing for
+	 * the qpquals, and it seems unlikely that there'd be subplans in the
+	 * mergequals.
 	 */
-	cost_qual_eval(&merge_qual_cost, mergeclauses, root);
-	cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo, root);
+	cost_qual_eval(&merge_qual_cost, mergeclauses,
+				   mergejointuples, root);
+	cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo,
+				   mergejointuples, root);
 	qp_qual_cost.startup -= merge_qual_cost.startup;
 	qp_qual_cost.per_tuple -= merge_qual_cost.per_tuple;
 
@@ -3224,12 +3240,6 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	else
 		path->skip_mark_restore = false;
 
-	/*
-	 * Get approx # tuples passing the mergequals.  We use approx_tuple_count
-	 * here because we need an estimate done with JOIN_INNER semantics.
-	 */
-	mergejointuples = approx_tuple_count(root, &path->jpath, mergeclauses);
-
 	/*
 	 * When there are equal merge keys in the outer relation, the mergejoin
 	 * must rescan any matching tuples in the inner relation. This means
@@ -3729,13 +3739,12 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 		startup_cost += disable_cost;
 
 	/*
-	 * Compute cost of the hashquals and qpquals (other restriction clauses)
-	 * separately.
+	 * Compute cost of the hashquals.  We don't expend effort on estimating
+	 * how often they'll be evaluated before doing so, reasoning that it's
+	 * unlikely they'll contain any subplans.
 	 */
-	cost_qual_eval(&hash_qual_cost, hashclauses, root);
-	cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo, root);
-	qp_qual_cost.startup -= hash_qual_cost.startup;
-	qp_qual_cost.per_tuple -= hash_qual_cost.per_tuple;
+	cost_qual_eval(&hash_qual_cost, hashclauses,
+				   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
 	/* CPU costs */
 
@@ -3812,6 +3821,16 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 		hashjointuples = approx_tuple_count(root, &path->jpath, hashclauses);
 	}
 
+	/*
+	 * Compute cost of the qpquals (join quals that aren't hash quals) by
+	 * costing the joinrestrictinfo list and then subtracting hash_qual_cost.
+	 * XXX This could give a bogus answer if the hash quals contain subplans.
+	 */
+	cost_qual_eval(&qp_qual_cost, path->jpath.joinrestrictinfo,
+				   hashjointuples, root);
+	qp_qual_cost.startup -= hash_qual_cost.startup;
+	qp_qual_cost.per_tuple -= hash_qual_cost.per_tuple;
+
 	/*
 	 * For each tuple that gets through the hashjoin proper, we charge
 	 * cpu_tuple_cost plus the cost of evaluating additional restriction
@@ -3846,6 +3865,7 @@ cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan)
 	/* Figure any cost for evaluating the testexpr */
 	cost_qual_eval(&sp_cost,
 				   make_ands_implicit((Expr *) subplan->testexpr),
+				   COST_QUAL_EVAL_DUMMY_NUM_EVALS,
 				   root);
 
 	if (subplan->useHashTable)
@@ -4037,14 +4057,21 @@ cost_rescan(PlannerInfo *root, Path *path,
  *		preferred since it allows caching of the results.)
  *		The result includes both a one-time (startup) component,
  *		and a per-evaluation component.
+ *
+ * The caller is also asked to provide an estimated number of evaluations.
+ * That's often fairly bogus, since we may not have any good estimate yet.
+ * We use it only to decide which entry of an AlternativeSubPlan to consider.
  */
 void
-cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root)
+cost_qual_eval(QualCost *cost, List *quals,
+			   double num_evals, PlannerInfo *root)
 {
 	cost_qual_eval_context context;
 	ListCell   *l;
 
 	context.root = root;
+	context.num_evals = num_evals;
+	context.num_evals_used = false;
 	context.total.startup = 0;
 	context.total.per_tuple = 0;
 
@@ -4065,11 +4092,14 @@ cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root)
  *		As above, for a single RestrictInfo or expression.
  */
 void
-cost_qual_eval_node(QualCost *cost, Node *qual, PlannerInfo *root)
+cost_qual_eval_node(QualCost *cost, Node *qual,
+					double num_evals, PlannerInfo *root)
 {
 	cost_qual_eval_context context;
 
 	context.root = root;
+	context.num_evals = num_evals;
+	context.num_evals_used = false;
 	context.total.startup = 0;
 	context.total.per_tuple = 0;
 
@@ -4087,8 +4117,9 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	/*
 	 * RestrictInfo nodes contain an eval_cost field reserved for this
 	 * routine's use, so that it's not necessary to evaluate the qual clause's
-	 * cost more than once.  If the clause's cost hasn't been computed yet,
-	 * the field's startup value will contain -1.
+	 * cost more than once (unless the clause contains an AlternativeSubPlan).
+	 * If the clause's cost hasn't been cached, the field's startup value will
+	 * contain -1.
 	 */
 	if (IsA(node, RestrictInfo))
 	{
@@ -4099,6 +4130,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 			cost_qual_eval_context locContext;
 
 			locContext.root = context->root;
+			locContext.num_evals = context->num_evals;
+			locContext.num_evals_used = false;
 			locContext.total.startup = 0;
 			locContext.total.per_tuple = 0;
 
@@ -4121,6 +4154,18 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 				locContext.total.startup += locContext.total.per_tuple;
 				locContext.total.per_tuple = 0;
 			}
+
+			/*
+			 * If we made use of num_evals, we can't cache the result.
+			 */
+			if (locContext.num_evals_used)
+			{
+				context->total.startup += locContext.total.startup;
+				context->total.per_tuple += locContext.total.per_tuple;
+				/* do NOT recurse into children */
+				return false;
+			}
+
 			rinfo->eval_cost = locContext.total;
 		}
 		context->total.startup += rinfo->eval_cost.startup;
@@ -4218,14 +4263,19 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	else if (IsA(node, ArrayCoerceExpr))
 	{
 		ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) node;
-		QualCost	perelemcost;
-
-		cost_qual_eval_node(&perelemcost, (Node *) acoerce->elemexpr,
-							context->root);
-		context->total.startup += perelemcost.startup;
-		if (perelemcost.per_tuple > 0)
-			context->total.per_tuple += perelemcost.per_tuple *
+		cost_qual_eval_context elemContext;
+
+		elemContext.root = context->root;
+		elemContext.num_evals = context->num_evals;
+		elemContext.num_evals_used = false;
+		elemContext.total.startup = 0;
+		elemContext.total.per_tuple = 0;
+		cost_qual_eval_walker((Node *) acoerce->elemexpr, &elemContext);
+		context->total.startup += elemContext.total.startup;
+		if (elemContext.total.per_tuple > 0)
+			context->total.per_tuple += elemContext.total.per_tuple *
 				estimate_array_length((Node *) acoerce->arg);
+		context->num_evals_used |= elemContext.num_evals_used;
 	}
 	else if (IsA(node, RowCompareExpr))
 	{
@@ -4282,15 +4332,15 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	else if (IsA(node, AlternativeSubPlan))
 	{
 		/*
-		 * Arbitrarily use the first alternative plan for costing.  (We should
-		 * certainly only include one alternative, and we don't yet have
-		 * enough information to know which one the executor is most likely to
-		 * use.)
+		 * Make use of the caller's estimated number of evaluations to decide
+		 * which subplan should be used.
 		 */
 		AlternativeSubPlan *asplan = (AlternativeSubPlan *) node;
+		SubPlan    *subplan;
 
-		return cost_qual_eval_walker((Node *) linitial(asplan->subplans),
-									 context);
+		subplan = SS_choose_alternative_subplan(asplan, context->num_evals);
+		context->num_evals_used = true;
+		return cost_qual_eval_walker((Node *) subplan, context);
 	}
 	else if (IsA(node, PlaceHolderVar))
 	{
@@ -4332,7 +4382,9 @@ get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 	if (param_info)
 	{
 		/* Include costs of pushed-down clauses */
-		cost_qual_eval(qpqual_cost, param_info->ppi_clauses, root);
+		/* XXX baserel->tuples is wrong for a few callers */
+		cost_qual_eval(qpqual_cost, param_info->ppi_clauses,
+					   baserel->tuples, root);
 
 		qpqual_cost->startup += baserel->baserestrictcost.startup;
 		qpqual_cost->per_tuple += baserel->baserestrictcost.per_tuple;
@@ -4640,9 +4692,10 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 
 	rel->rows = clamp_row_est(nrows);
 
-	cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
+	cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo,
+				   rel->tuples, root);
 
-	set_rel_width(root, rel);
+	set_rel_width(root, rel, rel->tuples);
 }
 
 /*
@@ -5410,9 +5463,10 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 
 	rel->rows = 1000;			/* entirely bogus default estimate */
 
-	cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
+	cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo,
+				   COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
-	set_rel_width(root, rel);
+	set_rel_width(root, rel, COST_QUAL_EVAL_DUMMY_NUM_EVALS);
 }
 
 
@@ -5426,6 +5480,7 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
  * we'd need to pass upwards in case of a sort, hash, etc.
  *
  * This function also sets reltarget->cost, so it's a bit misnamed now.
+ * (Because it estimates costs, it also requires a num_evals estimate.)
  *
  * NB: this works best on plain relations because it prefers to look at
  * real Vars.  For subqueries, set_subquery_size_estimates will already have
@@ -5438,7 +5493,7 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
  * building join relations or post-scan/join pathtargets.
  */
 static void
-set_rel_width(PlannerInfo *root, RelOptInfo *rel)
+set_rel_width(PlannerInfo *root, RelOptInfo *rel, double num_evals)
 {
 	Oid			reloid = planner_rt_fetch(rel->relid, root)->relid;
 	int32		tuple_width = 0;
@@ -5523,7 +5578,7 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
 			QualCost	cost;
 
 			tuple_width += phinfo->ph_width;
-			cost_qual_eval_node(&cost, (Node *) phv->phexpr, root);
+			cost_qual_eval_node(&cost, (Node *) phv->phexpr, num_evals, root);
 			rel->reltarget->cost.startup += cost.startup;
 			rel->reltarget->cost.per_tuple += cost.per_tuple;
 		}
@@ -5541,7 +5596,7 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
 			Assert(item_width > 0);
 			tuple_width += item_width;
 			/* Not entirely clear if we need to account for cost, but do so */
-			cost_qual_eval_node(&cost, node, root);
+			cost_qual_eval_node(&cost, node, num_evals, root);
 			rel->reltarget->cost.startup += cost.startup;
 			rel->reltarget->cost.per_tuple += cost.per_tuple;
 		}
@@ -5656,7 +5711,8 @@ set_pathtarget_cost_width(PlannerInfo *root, PathTarget *target)
 			tuple_width += item_width;
 
 			/* Account for cost, too */
-			cost_qual_eval_node(&cost, node, root);
+			/* XXX have caller pass rowcount est */
+			cost_qual_eval_node(&cost, node, COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 			target->cost.startup += cost.startup;
 			target->cost.per_tuple += cost.per_tuple;
 		}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index eb9543f6ad..6848aaff74 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -165,7 +165,8 @@ static Node *fix_indexqual_clause(PlannerInfo *root,
 								  Node *clause, List *indexcolnos);
 static Node *fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol);
 static List *get_switched_clauses(List *clauses, Relids outerrelids);
-static List *order_qual_clauses(PlannerInfo *root, List *clauses);
+static List *order_qual_clauses(PlannerInfo *root, List *clauses,
+								double num_evals);
 static void copy_generic_path_info(Plan *dest, Path *src);
 static void copy_plan_costsize(Plan *dest, Plan *src);
 static void label_sort_with_costsize(PlannerInfo *root, Sort *plan,
@@ -941,8 +942,11 @@ get_gating_quals(PlannerInfo *root, List *quals)
 	if (!root->hasPseudoConstantQuals)
 		return NIL;
 
-	/* Sort into desirable execution order while still in RestrictInfo form */
-	quals = order_qual_clauses(root, quals);
+	/*
+	 * Sort into desirable execution order while still in RestrictInfo form.
+	 * Use num_evals = 1 since we only care about the gating quals here.
+	 */
+	quals = order_qual_clauses(root, quals, 1);
 
 	/* Pull out any pseudoconstant quals from the RestrictInfo list */
 	return extract_actual_clauses(quals, true);
@@ -1454,7 +1458,7 @@ create_group_result_plan(PlannerInfo *root, GroupResultPath *best_path)
 	tlist = build_path_tlist(root, &best_path->path);
 
 	/* best_path->quals is just bare clauses */
-	quals = order_qual_clauses(root, best_path->quals);
+	quals = order_qual_clauses(root, best_path->quals, 1);
 
 	plan = make_result(tlist, (Node *) quals, NULL);
 
@@ -2055,7 +2059,7 @@ create_group_plan(PlannerInfo *root, GroupPath *best_path)
 
 	tlist = build_path_tlist(root, &best_path->path);
 
-	quals = order_qual_clauses(root, best_path->qual);
+	quals = order_qual_clauses(root, best_path->qual, subplan->plan_rows);
 
 	plan = make_group(tlist,
 					  quals,
@@ -2132,7 +2136,7 @@ create_agg_plan(PlannerInfo *root, AggPath *best_path)
 
 	tlist = build_path_tlist(root, &best_path->path);
 
-	quals = order_qual_clauses(root, best_path->qual);
+	quals = order_qual_clauses(root, best_path->qual, subplan->plan_rows);
 
 	plan = make_agg(tlist, quals,
 					best_path->aggstrategy,
@@ -2771,14 +2775,15 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 					List *tlist, List *scan_clauses)
 {
 	SeqScan    *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 
 	/* it should be a base rel... */
 	Assert(scan_relid > 0);
-	Assert(best_path->parent->rtekind == RTE_RELATION);
+	Assert(rel->rtekind == RTE_RELATION);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -2809,7 +2814,8 @@ create_samplescan_plan(PlannerInfo *root, Path *best_path,
 					   List *tlist, List *scan_clauses)
 {
 	SampleScan *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 	TableSampleClause *tsc;
 
@@ -2821,7 +2827,7 @@ create_samplescan_plan(PlannerInfo *root, Path *best_path,
 	Assert(tsc != NULL);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -2865,7 +2871,8 @@ create_indexscan_plan(PlannerInfo *root,
 	Scan	   *scan_plan;
 	List	   *indexclauses = best_path->indexclauses;
 	List	   *indexorderbys = best_path->indexorderbys;
-	Index		baserelid = best_path->path.parent->relid;
+	RelOptInfo *rel = best_path->path.parent;
+	Index		baserelid = rel->relid;
 	Oid			indexoid = best_path->indexinfo->indexoid;
 	List	   *qpqual;
 	List	   *stripped_indexquals;
@@ -2876,7 +2883,7 @@ create_indexscan_plan(PlannerInfo *root,
 
 	/* it should be a base rel... */
 	Assert(baserelid > 0);
-	Assert(best_path->path.parent->rtekind == RTE_RELATION);
+	Assert(rel->rtekind == RTE_RELATION);
 
 	/*
 	 * Extract the index qual expressions (stripped of RestrictInfos) from the
@@ -2938,7 +2945,9 @@ create_indexscan_plan(PlannerInfo *root,
 	}
 
 	/* Sort clauses into best execution order */
-	qpqual = order_qual_clauses(root, qpqual);
+	qpqual = order_qual_clauses(root, qpqual,
+								clamp_row_est(best_path->indexselectivity *
+											  rel->tuples));
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	qpqual = extract_actual_clauses(qpqual, false);
@@ -3100,7 +3109,7 @@ create_bitmap_scan_plan(PlannerInfo *root,
 	}
 
 	/* Sort clauses into best execution order */
-	qpqual = order_qual_clauses(root, qpqual);
+	qpqual = order_qual_clauses(root, qpqual, bitmapqualplan->plan_rows);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	qpqual = extract_actual_clauses(qpqual, false);
@@ -3371,12 +3380,13 @@ create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
 					List *tlist, List *scan_clauses)
 {
 	TidScan    *scan_plan;
-	Index		scan_relid = best_path->path.parent->relid;
+	RelOptInfo *rel = best_path->path.parent;
+	Index		scan_relid = rel->relid;
 	List	   *tidquals = best_path->tidquals;
 
 	/* it should be a base rel... */
 	Assert(scan_relid > 0);
-	Assert(best_path->path.parent->rtekind == RTE_RELATION);
+	Assert(rel->rtekind == RTE_RELATION);
 
 	/*
 	 * The qpqual list must contain all restrictions not enforced by the
@@ -3418,7 +3428,8 @@ create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
 	}
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	/* XXX need selectivity of tidquals to compute num_evals properly */
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo lists to bare expressions; ignore pseudoconstants */
 	tidquals = extract_actual_clauses(tidquals, false);
@@ -3484,7 +3495,7 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 	subplan = create_plan(rel->subroot, best_path->subpath);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, subplan->plan_rows);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3518,7 +3529,8 @@ create_functionscan_plan(PlannerInfo *root, Path *best_path,
 						 List *tlist, List *scan_clauses)
 {
 	FunctionScan *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 	List	   *functions;
 
@@ -3529,7 +3541,7 @@ create_functionscan_plan(PlannerInfo *root, Path *best_path,
 	functions = rte->functions;
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3561,7 +3573,8 @@ create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
 						  List *tlist, List *scan_clauses)
 {
 	TableFuncScan *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 	TableFunc  *tablefunc;
 
@@ -3572,7 +3585,7 @@ create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
 	tablefunc = rte->tablefunc;
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3604,7 +3617,8 @@ create_valuesscan_plan(PlannerInfo *root, Path *best_path,
 					   List *tlist, List *scan_clauses)
 {
 	ValuesScan *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 	List	   *values_lists;
 
@@ -3615,7 +3629,7 @@ create_valuesscan_plan(PlannerInfo *root, Path *best_path,
 	values_lists = rte->values_lists;
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3648,7 +3662,8 @@ create_ctescan_plan(PlannerInfo *root, Path *best_path,
 					List *tlist, List *scan_clauses)
 {
 	CteScan    *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 	SubPlan    *ctesplan = NULL;
 	int			plan_id;
@@ -3711,7 +3726,7 @@ create_ctescan_plan(PlannerInfo *root, Path *best_path,
 	cte_param_id = linitial_int(ctesplan->setParam);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3742,7 +3757,8 @@ create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
 								List *tlist, List *scan_clauses)
 {
 	NamedTuplestoreScan *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 
 	Assert(scan_relid > 0);
@@ -3750,7 +3766,7 @@ create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
 	Assert(rte->rtekind == RTE_NAMEDTUPLESTORE);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3781,7 +3797,8 @@ create_resultscan_plan(PlannerInfo *root, Path *best_path,
 					   List *tlist, List *scan_clauses)
 {
 	Result	   *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;
 
 	Assert(scan_relid > 0);
@@ -3789,7 +3806,7 @@ create_resultscan_plan(PlannerInfo *root, Path *best_path,
 	Assert(rte->rtekind == RTE_RESULT);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3818,7 +3835,8 @@ create_worktablescan_plan(PlannerInfo *root, Path *best_path,
 						  List *tlist, List *scan_clauses)
 {
 	WorkTableScan *scan_plan;
-	Index		scan_relid = best_path->parent->relid;
+	RelOptInfo *rel = best_path->parent;
+	Index		scan_relid = rel->relid;
 	RangeTblEntry *rte;
 	Index		levelsup;
 	PlannerInfo *cteroot;
@@ -3848,7 +3866,7 @@ create_worktablescan_plan(PlannerInfo *root, Path *best_path,
 		elog(ERROR, "could not find param ID for CTE \"%s\"", rte->ctename);
 
 	/* Sort clauses into best execution order */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
 	scan_clauses = extract_actual_clauses(scan_clauses, false);
@@ -3908,7 +3926,7 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	 * Sort clauses into best execution order.  We do this first since the FDW
 	 * might have more info than we do and wish to adjust the ordering.
 	 */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/*
 	 * Let the FDW perform its processing on the restriction clauses and
@@ -4039,7 +4057,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 	 * Sort clauses into the best execution order, although custom-scan
 	 * provider can reorder them again.
 	 */
-	scan_clauses = order_qual_clauses(root, scan_clauses);
+	scan_clauses = order_qual_clauses(root, scan_clauses, rel->tuples);
 
 	/*
 	 * Invoke custom plan provider to create the Plan node represented by the
@@ -4117,7 +4135,9 @@ create_nestloop_plan(PlannerInfo *root,
 	root->curOuterRels = saveOuterRels;
 
 	/* Sort join qual clauses into best execution order */
-	joinrestrictclauses = order_qual_clauses(root, joinrestrictclauses);
+	/* XXX this underestimates the number of evaluations */
+	joinrestrictclauses = order_qual_clauses(root, joinrestrictclauses,
+											 best_path->path.parent->rows);
 
 	/* Get the join qual clauses (in plain expression form) */
 	/* Any pseudoconstant clauses are ignored here */
@@ -4205,7 +4225,9 @@ create_mergejoin_plan(PlannerInfo *root,
 
 	/* Sort join qual clauses into best execution order */
 	/* NB: do NOT reorder the mergeclauses */
-	joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo);
+	/* XXX this underestimates the number of evaluations */
+	joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo,
+									 best_path->jpath.path.parent->rows);
 
 	/* Get the join qual clauses (in plain expression form) */
 	/* Any pseudoconstant clauses are ignored here */
@@ -4506,7 +4528,9 @@ create_hashjoin_plan(PlannerInfo *root,
 									 CP_SMALL_TLIST);
 
 	/* Sort join qual clauses into best execution order */
-	joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo);
+	/* XXX this underestimates the number of evaluations */
+	joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo,
+									 best_path->jpath.path.parent->rows);
 	/* There's no point in sorting the hash clauses ... */
 
 	/* Get the join qual clauses (in plain expression form) */
@@ -5045,7 +5069,7 @@ get_switched_clauses(List *clauses, Relids outerrelids)
  * selectivity in the ordering would likely do the wrong thing.
  */
 static List *
-order_qual_clauses(PlannerInfo *root, List *clauses)
+order_qual_clauses(PlannerInfo *root, List *clauses, double num_evals)
 {
 	typedef struct
 	{
@@ -5074,7 +5098,7 @@ order_qual_clauses(PlannerInfo *root, List *clauses)
 		Node	   *clause = (Node *) lfirst(lc);
 		QualCost	qcost;
 
-		cost_qual_eval_node(&qcost, clause, root);
+		cost_qual_eval_node(&qcost, clause, num_evals, root);
 		items[i].clause = clause;
 		items[i].cost = qcost.per_tuple;
 		if (IsA(clause, RestrictInfo))
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4131019fc9..4fd2017cf7 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -5894,7 +5894,8 @@ make_sort_input_target(PlannerInfo *root,
 				 */
 				QualCost	cost;
 
-				cost_qual_eval_node(&cost, (Node *) expr, root);
+				cost_qual_eval_node(&cost, (Node *) expr,
+									COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 
 				/*
 				 * We arbitrarily define "expensive" as "more than 10X
@@ -6316,7 +6317,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	 * the sort, since tuplesort.c will have to re-evaluate the index
 	 * expressions each time.  (XXX that's pretty inefficient...)
 	 */
-	cost_qual_eval(&indexExprCost, indexInfo->indexprs, root);
+	cost_qual_eval(&indexExprCost, indexInfo->indexprs, rel->tuples, root);
 	comparisonCost = 2.0 * (indexExprCost.startup + indexExprCost.per_tuple);
 
 	/* Estimate the cost of seq scan + sort */
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index b02fcb9bfe..b6b761c9b3 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -148,7 +148,8 @@ get_first_col_type(Plan *plan, Oid *coltype, int32 *coltypmod,
  *
  * The result is whatever we need to substitute in place of the SubLink node
  * in the executable expression.  If we're going to do the subplan as a
- * regular subplan, this will be the constructed SubPlan node.  If we're going
+ * regular subplan, this will be the constructed SubPlan node, or possibly
+ * an AlternativeSubPlan node with a list of options.  If we're going
  * to do the subplan as an InitPlan, the SubPlan node instead goes into
  * root->init_plans, and what we return here is an expression tree
  * representing the InitPlan's result: usually just a Param node representing
@@ -246,7 +247,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	 * likely to be better (it depends on the expected number of executions of
 	 * the EXISTS qual, and we are much too early in planning the outer query
 	 * to be able to guess that).  So we generate both plans, if possible, and
-	 * leave it to the executor to decide which to use.
+	 * wait till later to decide which to use.
 	 */
 	if (simple_exists && IsA(result, SubPlan))
 	{
@@ -298,7 +299,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 				/* build_subplan won't have filled in paramIds */
 				hashplan->paramIds = paramIds;
 
-				/* Leave it to the executor to decide which plan to use */
+				/* Build an AlternativeSubPlan, which we'll resolve later */
 				asplan = makeNode(AlternativeSubPlan);
 				asplan->subplans = list_make2(result, hashplan);
 				result = (Node *) asplan;
@@ -2863,6 +2864,39 @@ finalize_agg_primnode(Node *node, finalize_primnode_context *context)
 								  (void *) context);
 }
 
+/*
+ * SS_choose_alternative_subplan - choose one subplan using num_evals
+ *
+ * This function resolves the choice that make_subplan() previously punted on.
+ * We call it later in planning when we have a better idea of how many outer
+ * rows the subplan will be invoked for.
+ */
+SubPlan *
+SS_choose_alternative_subplan(AlternativeSubPlan *asplan,
+							  double num_evals)
+{
+	SubPlan    *subplan1;
+	SubPlan    *subplan2;
+	Cost		cost1;
+	Cost		cost2;
+
+	/*
+	 * make_subplan() saved enough info so that we don't have to work very
+	 * hard to estimate the total cost, given the number-of-calls estimate.
+	 */
+	Assert(list_length(asplan->subplans) == 2);
+	subplan1 = (SubPlan *) linitial(asplan->subplans);
+	subplan2 = (SubPlan *) lsecond(asplan->subplans);
+
+	cost1 = subplan1->startup_cost + num_evals * subplan1->per_call_cost;
+	cost2 = subplan2->startup_cost + num_evals * subplan2->per_call_cost;
+
+	if (cost1 < cost2)
+		return subplan1;
+	else
+		return subplan2;
+}
+
 /*
  * SS_make_initplan_output_param - make a Param for an initPlan's output
  *
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 0c6fe0115a..592773b717 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -371,7 +371,8 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
 		if (!DO_AGGSPLIT_COMBINE(context->aggsplit))
 		{
 			/* add the input expressions' cost to per-input-row costs */
-			cost_qual_eval_node(&argcosts, (Node *) aggref->args, context->root);
+			cost_qual_eval_node(&argcosts, (Node *) aggref->args,
+								COST_QUAL_EVAL_DUMMY_NUM_EVALS, context->root);
 			costs->transCost.startup += argcosts.startup;
 			costs->transCost.per_tuple += argcosts.per_tuple;
 
@@ -385,7 +386,7 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
 			if (aggref->aggfilter)
 			{
 				cost_qual_eval_node(&argcosts, (Node *) aggref->aggfilter,
-									context->root);
+									COST_QUAL_EVAL_DUMMY_NUM_EVALS, context->root);
 				costs->transCost.startup += argcosts.startup;
 				costs->transCost.per_tuple += argcosts.per_tuple;
 			}
@@ -398,7 +399,7 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
 		if (aggref->aggdirectargs)
 		{
 			cost_qual_eval_node(&argcosts, (Node *) aggref->aggdirectargs,
-								context->root);
+								1, context->root);
 			costs->finalCost.startup += argcosts.startup;
 			costs->finalCost.per_tuple += argcosts.per_tuple;
 		}
@@ -4625,7 +4626,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid,
 			 */
 			if (contain_subplans(param))
 				goto fail;
-			cost_qual_eval(&eval_cost, list_make1(param), NULL);
+			cost_qual_eval_node(&eval_cost, param, 1, NULL);
 			if (eval_cost.startup + eval_cost.per_tuple >
 				10 * cpu_operator_cost)
 				goto fail;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e845a4b1ae..ac4c7a991d 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1477,8 +1477,8 @@ create_group_result_path(PlannerInfo *root, RelOptInfo *rel,
 	{
 		QualCost	qual_cost;
 
-		cost_qual_eval(&qual_cost, havingqual, root);
 		/* havingqual is evaluated once at startup */
+		cost_qual_eval(&qual_cost, havingqual, 1, root);
 		pathnode->path.startup_cost += qual_cost.startup + qual_cost.per_tuple;
 		pathnode->path.total_cost += qual_cost.startup + qual_cost.per_tuple;
 	}
@@ -3247,7 +3247,7 @@ create_minmaxagg_path(PlannerInfo *root,
 	{
 		QualCost	qual_cost;
 
-		cost_qual_eval(&qual_cost, quals, root);
+		cost_qual_eval(&qual_cost, quals, 1, root);
 		pathnode->path.startup_cost += qual_cost.startup;
 		pathnode->path.total_cost += qual_cost.startup + qual_cost.per_tuple;
 	}
diff --git a/src/backend/optimizer/util/placeholder.c b/src/backend/optimizer/util/placeholder.c
index 6abdc0299b..49460c4b04 100644
--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -439,6 +439,9 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
 				 * charging the PHV's cost for some join paths.  For now, live
 				 * with that; but we might want to improve it later by
 				 * refiguring the reltarget costs for each pair of inputs.
+				 *
+				 * Unfortunately, the joinrel's size isn't set yet, so we
+				 * cannot use that in cost_qual_eval_node().
 				 */
 				if (!bms_is_subset(phinfo->ph_eval_at, outer_rel->relids) &&
 					!bms_is_subset(phinfo->ph_eval_at, inner_rel->relids))
@@ -446,7 +449,7 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
 					QualCost	cost;
 
 					cost_qual_eval_node(&cost, (Node *) phinfo->ph_var->phexpr,
-										root);
+										COST_QUAL_EVAL_DUMMY_NUM_EVALS, root);
 					joinrel->reltarget->cost.startup += cost.startup;
 					joinrel->reltarget->cost.per_tuple += cost.per_tuple;
 				}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index be08eb4814..ef732db13c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5985,7 +5985,7 @@ index_other_operands_eval_cost(PlannerInfo *root, List *indexquals)
 			other_operand = NULL;	/* keep compiler quiet */
 		}
 
-		cost_qual_eval_node(&index_qual_cost, other_operand, root);
+		cost_qual_eval_node(&index_qual_cost, other_operand, 1, root);
 		qual_arg_cost += index_qual_cost.startup + index_qual_cost.per_tuple;
 	}
 	return qual_arg_cost;
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 92e70ec0d9..489b0ff4bd 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -31,6 +31,12 @@
 
 #define DEFAULT_EFFECTIVE_CACHE_SIZE  524288	/* measured in pages */
 
+/*
+ * If a caller of cost_qual_eval() or cost_qual_eval_node() has no good number
+ * to use for num_evals, use this.
+ */
+#define COST_QUAL_EVAL_DUMMY_NUM_EVALS 1000
+
 typedef enum
 {
 	CONSTRAINT_EXCLUSION_OFF,	/* do not use c_e */
@@ -166,8 +172,10 @@ extern void cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double *rows);
 extern void cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan);
-extern void cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root);
-extern void cost_qual_eval_node(QualCost *cost, Node *qual, PlannerInfo *root);
+extern void cost_qual_eval(QualCost *cost, List *quals, double num_evals,
+						   PlannerInfo *root);
+extern void cost_qual_eval_node(QualCost *cost, Node *qual, double num_evals,
+								PlannerInfo *root);
 extern void compute_semi_anti_join_factors(PlannerInfo *root,
 										   RelOptInfo *joinrel,
 										   RelOptInfo *outerrel,
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index d6a872bd2c..332c6c5b19 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -30,6 +30,8 @@ extern void SS_identify_outer_params(PlannerInfo *root);
 extern void SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel);
 extern void SS_attach_initplans(PlannerInfo *root, Plan *plan);
 extern void SS_finalize_plan(PlannerInfo *root, Plan *plan);
+extern SubPlan *SS_choose_alternative_subplan(AlternativeSubPlan *asplan,
+											  double num_evals);
 extern Param *SS_make_initplan_output_param(PlannerInfo *root,
 											Oid resulttype, int32 resulttypmod,
 											Oid resultcollation);
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 5de53f2782..04819419fa 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -2261,17 +2261,17 @@ NOTICE:  drop cascades to view rw_view1
 CREATE TABLE t1 (a int, b float, c text);
 CREATE INDEX t1_a_idx ON t1(a);
 INSERT INTO t1
-SELECT i,i,'t1' FROM generate_series(1,10) g(i);
+SELECT i,i,'t1' FROM generate_series(1,30) g(i);
 ANALYZE t1;
 CREATE TABLE t11 (d text) INHERITS (t1);
 CREATE INDEX t11_a_idx ON t11(a);
 INSERT INTO t11
-SELECT i,i,'t11','t11d' FROM generate_series(1,10) g(i);
+SELECT i,i,'t11','t11d' FROM generate_series(1,30) g(i);
 ANALYZE t11;
 CREATE TABLE t12 (e int[]) INHERITS (t1);
 CREATE INDEX t12_a_idx ON t12(a);
 INSERT INTO t12
-SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t12;
 CREATE TABLE t111 () INHERITS (t11, t12);
 NOTICE:  merging multiple inherited definitions of column "a"
@@ -2279,7 +2279,7 @@ NOTICE:  merging multiple inherited definitions of column "b"
 NOTICE:  merging multiple inherited definitions of column "c"
 CREATE INDEX t111_a_idx ON t111(a);
 INSERT INTO t111
-SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t111;
 CREATE VIEW v1 WITH (security_barrier=true) AS
 SELECT *, (SELECT d FROM t11 WHERE t11.a = t1.a LIMIT 1) AS d
@@ -2407,21 +2407,101 @@ NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 6
 NOTICE:  snooped value: 7
 NOTICE:  snooped value: 9
 NOTICE:  snooped value: 10
+NOTICE:  snooped value: 11
+NOTICE:  snooped value: 12
+NOTICE:  snooped value: 13
+NOTICE:  snooped value: 14
+NOTICE:  snooped value: 15
+NOTICE:  snooped value: 16
+NOTICE:  snooped value: 17
+NOTICE:  snooped value: 18
+NOTICE:  snooped value: 19
+NOTICE:  snooped value: 20
+NOTICE:  snooped value: 21
+NOTICE:  snooped value: 22
+NOTICE:  snooped value: 23
+NOTICE:  snooped value: 24
+NOTICE:  snooped value: 25
+NOTICE:  snooped value: 26
+NOTICE:  snooped value: 27
+NOTICE:  snooped value: 28
+NOTICE:  snooped value: 29
+NOTICE:  snooped value: 30
 NOTICE:  snooped value: 9
 TABLE t1; -- verify all a<=5 are intact
  a | b |  c   
diff --git a/src/test/regress/sql/updatable_views.sql b/src/test/regress/sql/updatable_views.sql
index 64f23d0902..12e972d5e0 100644
--- a/src/test/regress/sql/updatable_views.sql
+++ b/src/test/regress/sql/updatable_views.sql
@@ -1077,25 +1077,25 @@ DROP TABLE base_tbl CASCADE;
 CREATE TABLE t1 (a int, b float, c text);
 CREATE INDEX t1_a_idx ON t1(a);
 INSERT INTO t1
-SELECT i,i,'t1' FROM generate_series(1,10) g(i);
+SELECT i,i,'t1' FROM generate_series(1,30) g(i);
 ANALYZE t1;
 
 CREATE TABLE t11 (d text) INHERITS (t1);
 CREATE INDEX t11_a_idx ON t11(a);
 INSERT INTO t11
-SELECT i,i,'t11','t11d' FROM generate_series(1,10) g(i);
+SELECT i,i,'t11','t11d' FROM generate_series(1,30) g(i);
 ANALYZE t11;
 
 CREATE TABLE t12 (e int[]) INHERITS (t1);
 CREATE INDEX t12_a_idx ON t12(a);
 INSERT INTO t12
-SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t12','{1,2}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t12;
 
 CREATE TABLE t111 () INHERITS (t11, t12);
 CREATE INDEX t111_a_idx ON t111(a);
 INSERT INTO t111
-SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,10) g(i);
+SELECT i,i,'t111','t111d','{1,1,1}'::int[] FROM generate_series(1,30) g(i);
 ANALYZE t111;
 
 CREATE VIEW v1 WITH (security_barrier=true) AS
#7Alexey Bashtanov
bashtanov@imap.cc
In reply to: Melanie Plageman (#2)
1 attachment(s)
Re: Improve planner cost estimations for alternative subplans

Hi Melanie,

Sorry for the delay.

I've just started looking at this patch today, but I was wondering if
you might include a test case which minimally reproduces the original
problem you had.

I could reproduce it with an easier generated data set, please see attached.

However, to be honest with you, while searching I encountered a few
examples of the opposite behavior,
when the patched version was slower than the master branch.
So I'm not so sure whether we should use the patch, maybe we should
rather consider Tom's approach.

Best, Alex

Attachments:

altplan-example.sqlapplication/sql; name=altplan-example.sqlDownload
#8Alexey Bashtanov
bashtanov@imap.cc
In reply to: Tom Lane (#6)
Re: Improve planner cost estimations for alternative subplans

Hi Tom,

sorry for the delay,

I experimented with adding a number-of-evaluations parameter to
cost_qual_eval, and found that the majority of call sites do have
something realistic they can pass. The attached very-much-WIP
patch shows my results so far. There's a lot of loose ends:

I like the idea, so if we alternative subplans remain there
I think we should implement it.

Best, Alex

#9Andy Fan
zhihui.fan1213@gmail.com
In reply to: Tom Lane (#6)
1 attachment(s)
Re: Improve planner cost estimations for alternative subplans

On Mon, Jun 22, 2020 at 9:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Nope. The entire reason why we have that kluge is that we don't know
until much later how many times we expect to execute the subplan.
AlternativeSubPlan allows the decision which subplan form to use to be
postponed till runtime; but when we're doing things like estimating the
cost and selectivity of a where-clause, we don't know that.

Maybe there's some way to recast things to avoid that problem,
but I have little clue what it'd look like.

Actually ... maybe it's not that bad. Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity. Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.

I read your idea of "ripping out all executor support for AlternativeSubPlan
and instead having the planner replace an AlternativeSubPlan with
the desired specific SubPlan somewhere late in planning, possibly
setrefs.c."
in [1]/messages/by-id/1992952.1592785225@sss.pgh.pa.us. I was thinking that if we can do such a replacement sooner,
for example once we know the num_calls for the subplans, Unknown if it
is possible though. If we can, then we can handle the issue here as well.

The attached is a very PoC version, I'm not sure if it is the right
direction
to go. I'm sorry that I still need more time to understand your solution
below but I'm too excited about your original idea.

[1]: /messages/by-id/1992952.1592785225@sss.pgh.pa.us

I experimented with adding a number-of-evaluations parameter to
cost_qual_eval, and found that the majority of call sites do have
something realistic they can pass. The attached very-much-WIP
patch shows my results so far. There's a lot of loose ends:

* Any call site using COST_QUAL_EVAL_DUMMY_NUM_EVALS is a potential spot
for future improvement. The only one that seems like it might be
fundamentally unsolvable is cost_subplan's usage; but that would only
matter if a subplan's testexpr contains an AlternativeSubPlan, which is
probably a negligible case in practice. The other ones seem to just
require more refactoring than I cared to do on a Sunday afternoon.

* I did not do anything for postgres_fdw.c beyond making it compile.
We can surely do better there, but it might require some rethinking
of the way that plan costs get cached.

* The merge and hash join costsize.c functions estimate costs of qpquals
(i.e. quals to be applied at the join that are not being used as merge
or hash conditions) by computing cost_qual_eval of the whole
joinrestrictlist and then subtracting off the cost of the merge or hash
quals. This is kind of broken if we want to use different num_eval
estimates for the qpquals and the merge/hash quals, which I think we do.
This probably just needs some refactoring to fix. We also need to save
the relevant rowcounts in the join Path nodes so that createplan.c can
do the right thing.

* I think it might be possible to improve the situation in
get_agg_clause_costs() if we're willing to postpone collection
of the actual aggregate costs till later. This'd require two
passes over the aggregate expressions, but that seems like it
might not be terribly expensive. (I'd be inclined to also look
at the idea of merging duplicate agg calls at plan time not
run time, if we refactor that.)

* I had to increase the number of table rows in one updatable_views.sql
test to keep the plans the same. Without that, the planner realized
that a seqscan would be cheaper than an indexscan. The result wasn't
wrong exactly, but it failed to prove that leakproof quals could be
used as indexquals, so I think we need to keep the plan choice the same.

Anyway, this is kind of invasive, but I think it shouldn't really
add noticeable costs as long as we save relevant rowcounts rather
than recomputing them in createplan.c. Is it worth doing? I dunno.
AlternativeSubPlan is pretty much a backwater, I think --- if it
were interesting performance-wise to a lot of people, more would
have been done with it by now.

regards, tom lane

--
Best Regards
Andy Fan

Attachments:

v1-0001-Convert-the-AlternativeSubplan-to-Subplan-as-soon.patchapplication/octet-stream; name=v1-0001-Convert-the-AlternativeSubplan-to-Subplan-as-soon.patchDownload
From 85d1f50909864a2e56013ebfc20e619d801ff569 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Mon, 17 Aug 2020 21:49:40 +0800
Subject: [PATCH v1] Convert the AlternativeSubplan to Subplan as soon as we
 knows the num_calls

---
 src/backend/optimizer/path/allpaths.c  | 34 ++++++++++++++++++++++++++
 src/backend/optimizer/path/costsize.c  |  6 ++---
 src/backend/optimizer/plan/subselect.c | 28 +++++++++++++++++++++
 src/backend/optimizer/util/relnode.c   | 33 +++++++++++++++++++++++++
 src/include/optimizer/planner.h        |  1 +
 src/include/optimizer/subselect.h      |  4 ++-
 6 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6da0dcd61c..5aa466c80d 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -41,6 +41,7 @@
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
+#include "optimizer/subselect.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -767,6 +768,39 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 {
 	Relids		required_outer;
 
+	/*
+	 * We have known the size of the rel, if there is altPlan on this rel
+	 * we can choose now. In this demo, I just track the alt plan in targetlist.
+	 */
+	ListCell *lc;
+	bool recost = false;
+	foreach(lc, rel->reltarget->exprs)
+	{
+		if (IsA(lfirst(lc), PlaceHolderVar))
+		{
+			bool recost_expr = false;
+			PlaceHolderVar *phv = lfirst_node(PlaceHolderVar, lc);
+			if (bms_is_subset(phv->phrels, rel->relids) &&
+				IsA(phv->phexpr, AlternativeSubPlan))
+			{
+				phv->phexpr = (Expr *)choose_subplan((AlternativeSubPlan *)phv->phexpr,
+													 rel->rows,
+													 &recost_expr);
+				if (recost_expr)
+					recost = true;
+			}
+		}
+	}
+
+	if (recost)
+	{
+		set_rel_width(root, rel);
+	}
+
+	/*
+	 * TODO: we can also check subplan in other places
+	 */
+
 	/*
 	 * We don't support pushing join clauses into the quals of a seqscan, but
 	 * it could still have required parameterization due to LATERAL refs in
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index fda4b2c6e8..c730d1bacf 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -91,6 +91,7 @@
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
 #include "optimizer/restrictinfo.h"
+#include "optimizer/planner.h"
 #include "parser/parsetree.h"
 #include "utils/lsyscache.h"
 #include "utils/selfuncs.h"
@@ -175,7 +176,6 @@ static Selectivity get_foreign_key_join_selectivity(PlannerInfo *root,
 													List **restrictlist);
 static Cost append_nonpartial_cost(List *subpaths, int numpaths,
 								   int parallel_workers);
-static void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
 static double relation_byte_size(double tuples, int width);
 static double page_size(double tuples, int width);
 static double get_parallel_divisor(Path *path);
@@ -1774,7 +1774,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
 
 /*
  * cost_incremental_sort
- * 	Determines and returns the cost of sorting a relation incrementally, when
+ *	Determines and returns the cost of sorting a relation incrementally, when
  *  the input path is presorted by a prefix of the pathkeys.
  *
  * 'presorted_keys' is the number of leading pathkeys by which the input path
@@ -5438,7 +5438,7 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
  * The per-attribute width estimates are cached for possible re-use while
  * building join relations or post-scan/join pathtargets.
  */
-static void
+void
 set_rel_width(PlannerInfo *root, RelOptInfo *rel)
 {
 	Oid			reloid = planner_rt_fetch(rel->relid, root)->relid;
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 6eb794669f..721f8bd06d 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2959,3 +2959,31 @@ SS_make_initplan_from_plan(PlannerInfo *root,
 	/* Set costs of SubPlan using info from the plan tree */
 	cost_subplan(subroot, node, plan);
 }
+
+
+/*
+ * Choose the subplan from AlternativeSubPlan as soon as possible.
+ */
+SubPlan *
+choose_subplan(AlternativeSubPlan *altplan, double num_calls, bool *recost)
+{
+
+	double cost1, cost2;
+	SubPlan *subplan1, *subplan2, *res;
+	subplan1 = linitial_node(SubPlan, altplan->subplans);
+	subplan2 = lsecond_node(SubPlan, altplan->subplans);
+	cost1 = subplan1->startup_cost + num_calls * subplan1->per_call_cost;
+	cost2 = subplan2->startup_cost + num_calls * subplan2->per_call_cost;
+
+	if (cost1 < cost2)
+	{
+		res = subplan1;
+		*recost = false;
+	}
+	else
+	{
+		res = subplan2;
+		*recost = true;
+	}
+	return res;
+}
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index a203e6f1ff..318546a4c4 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -27,6 +27,7 @@
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
 #include "optimizer/restrictinfo.h"
+#include "optimizer/subselect.h"
 #include "optimizer/tlist.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
@@ -728,6 +729,38 @@ build_join_rel(PlannerInfo *root,
 	set_joinrel_size_estimates(root, joinrel, outer_rel, inner_rel,
 							   sjinfo, restrictlist);
 
+	/* Fix the AlternativeSubPlans */
+	{
+		ListCell *lc;
+		bool recost = false;
+		foreach(lc, joinrel->reltarget->exprs)
+		{
+			if (IsA(lfirst(lc), PlaceHolderVar))
+			{
+				bool recost_expr = false;
+				PlaceHolderVar *phv = lfirst_node(PlaceHolderVar, lc);
+				if (bms_is_subset(phv->phrels, joinrel->relids) &&
+					IsA(phv->phexpr, AlternativeSubPlan))
+				{
+					phv->phexpr = (Expr *)choose_subplan((AlternativeSubPlan *)phv->phexpr,
+														 joinrel->rows,
+														 &recost_expr);
+					if(recost_expr)
+						recost = true;
+					/* track how many exprs are changed, old_ones (AlternativeSubplans)
+					 * and new_ones (SubPlans)
+					 */
+				}
+			}
+		}
+
+		if (recost)
+		{
+			/* get the delta cost by old_costs and new_costs, and then
+			 * modify the cost of the  path in outer_rel and inner_rel.
+			 */
+		}
+	}
 	/*
 	 * Set the consider_parallel flag if this joinrel could potentially be
 	 * scanned within a parallel worker.  If this flag is false for either
diff --git a/src/include/optimizer/planner.h b/src/include/optimizer/planner.h
index beb7dbbcbe..a7a7b32eff 100644
--- a/src/include/optimizer/planner.h
+++ b/src/include/optimizer/planner.h
@@ -58,4 +58,5 @@ extern Path *get_cheapest_fractional_path(RelOptInfo *rel,
 
 extern Expr *preprocess_phv_expression(PlannerInfo *root, Expr *expr);
 
+extern void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PLANNER_H */
diff --git a/src/include/optimizer/subselect.h b/src/include/optimizer/subselect.h
index d6a872bd2c..136d60b7c0 100644
--- a/src/include/optimizer/subselect.h
+++ b/src/include/optimizer/subselect.h
@@ -36,5 +36,7 @@ extern Param *SS_make_initplan_output_param(PlannerInfo *root,
 extern void SS_make_initplan_from_plan(PlannerInfo *root,
 									   PlannerInfo *subroot, Plan *plan,
 									   Param *prm);
-
+extern SubPlan *choose_subplan(AlternativeSubPlan *altplan,
+							   double num_calls,
+							   bool *recost);
 #endif							/* SUBSELECT_H */
-- 
2.21.0

#10Andy Fan
zhihui.fan1213@gmail.com
In reply to: Andy Fan (#9)
Re: Improve planner cost estimations for alternative subplans

On Mon, Aug 17, 2020 at 10:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Mon, Jun 22, 2020 at 9:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Nope. The entire reason why we have that kluge is that we don't know
until much later how many times we expect to execute the subplan.
AlternativeSubPlan allows the decision which subplan form to use to be
postponed till runtime; but when we're doing things like estimating the
cost and selectivity of a where-clause, we don't know that.

Maybe there's some way to recast things to avoid that problem,
but I have little clue what it'd look like.

Actually ... maybe it's not that bad. Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity. Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.

I read your idea of "ripping out all executor support for
AlternativeSubPlan
and instead having the planner replace an AlternativeSubPlan with
the desired specific SubPlan somewhere late in planning, possibly
setrefs.c."
in [1]. I was thinking that if we can do such a replacement sooner,
for example once we know the num_calls for the subplans, Unknown if it
is possible though. If we can, then we can handle the issue here as well.

The attached is a very PoC version, I'm not sure if it is the right
direction
to go.

The idea behind it is if we have a RelOptInfo which have
some AlternativeSubPlan,
and assume these subplans have some correlated vars which can be expressed
as
deps_relids. Then we can convert the AlternativeSubPlan to SubPlan once
bms_is_subset(deps_relids, rel->relids). My patch is able to fix the
issue reported
here and it only converts the AlternativeSubPlan in rel->reltarget for demo
purpose.

--
Best Regards
Andy Fan

#11Andy Fan
zhihui.fan1213@gmail.com
In reply to: Andy Fan (#10)
Re: Improve planner cost estimations for alternative subplans

On Wed, Aug 26, 2020 at 4:21 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Mon, Aug 17, 2020 at 10:12 PM Andy Fan <zhihui.fan1213@gmail.com>
wrote:

On Mon, Jun 22, 2020 at 9:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Nope. The entire reason why we have that kluge is that we don't know
until much later how many times we expect to execute the subplan.
AlternativeSubPlan allows the decision which subplan form to use to be
postponed till runtime; but when we're doing things like estimating the
cost and selectivity of a where-clause, we don't know that.

Maybe there's some way to recast things to avoid that problem,
but I have little clue what it'd look like.

Actually ... maybe it's not that bad. Clearly there would be a
circularity issue for selectivity estimation, but all the alternatives
should have the same selectivity. Cost estimation is a different story:
by the time we need to do cost estimation for a subexpression, we do in
many cases have an idea how often the subexpression will be executed.

I read your idea of "ripping out all executor support for
AlternativeSubPlan
and instead having the planner replace an AlternativeSubPlan with
the desired specific SubPlan somewhere late in planning, possibly
setrefs.c."
in [1]. I was thinking that if we can do such a replacement sooner,
for example once we know the num_calls for the subplans, Unknown if it
is possible though. If we can, then we can handle the issue here as well.

The attached is a very PoC version, I'm not sure if it is the right
direction
to go.

The idea behind it is if we have a RelOptInfo which have
some AlternativeSubPlan,
and assume these subplans have some correlated vars which can be expressed
as
deps_relids. Then we can convert the AlternativeSubPlan to SubPlan once
bms_is_subset(subplan->deps_relids, rel->relids).

The way of figuring out subplan->deps_relids was wrong in my patch, I will
fix it later.
But the general idea is the same.

My patch is able to fix the issue reported
here and it only converts the AlternativeSubPlan in rel->reltarget for
demo purpose.

--
Best Regards
Andy Fan

--
Best Regards
Andy Fan