Making JIT more granular

Started by David Rowley over 5 years ago · 4 messages
#1 David Rowley
dgrowleyml@gmail.com
1 attachment(s)

Hi,

At the moment JIT compilation, if enabled, is applied to all
expressions in the entire plan. This can sometimes be a problem as
some expressions may be evaluated many times and warrant being
JITted, but others may be evaluated only a few times, or not at all.

This problem tends to become more pronounced when table partitioning
is involved, as the number of expressions in the plan grows with each
partition present in the plan. Some partitions may have many rows,
making it worthwhile to JIT their expressions, but others may have
few rows, or even no rows, in which case JIT is a waste of effort.

I recall a few cases where people have complained that JIT was too
slow. One case, in particular, is [1].

It would be nice if JIT were more granular about which parts of the
plan it is enabled for. So I went and did that in the attached patch.

The patch basically changes the plan-level consideration of whether
JIT should be enabled, and to what level, into a per-plan-node
consideration. So, instead of considering JIT based on the overall
total_cost of the plan, we now consider it based on each plan node's
total_cost.
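
As a minimal standalone illustration of that change (not patch code;
the per-node rule is condensed from plan_consider_jit() in the
attached patch, 100000 is just the default value of jit_above_cost,
and the two node costs are made up):

#include <stdbool.h>
#include <stdio.h>

/* default value of the jit_above_cost GUC */
static const double jit_above_cost = 100000.0;

/*
 * Patched behaviour, condensed: each plan node is compared against
 * jit_above_cost using its own total_cost.
 */
static bool
node_would_jit(double node_total_cost)
{
	return node_total_cost > jit_above_cost;
}

int
main(void)
{
	/* say the plan scans two partitions: one tiny, one huge */
	double		node_costs[] = {25.0, 250000.0};
	double		plan_total_cost = node_costs[0] + node_costs[1];

	/* master: one decision for the whole plan, so both scans get JITed */
	printf("master: JIT whole plan? %s\n",
		   plan_total_cost > jit_above_cost ? "yes" : "no");

	/* patched: the tiny scan is skipped, only the huge one is JITed */
	for (int i = 0; i < 2; i++)
		printf("patched: JIT node %d? %s\n",
			   i, node_would_jit(node_costs[i]) ? "yes" : "no");

	return 0;
}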

I was just playing around with a test case of:

create table listp(a int, b int) partition by list(a);
select 'create table listp' || x || ' partition of listp for values in(' || x || ');'
from generate_series(1,1000) x;
\gexec
insert into listp select 1,x from generate_series(1,100000000) x;
vacuum analyze listp;

explain (analyze, buffers) select count(*) from listp where b < 0;

I get:

master jit=on
JIT:
Functions: 3002
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 141.587 ms, Inlining 11.760 ms, Optimization
6518.664 ms, Emission 3152.266 ms, Total 9824.277 ms
Execution Time: 12588.292 ms
(2013 rows)

master jit=off
Execution Time: 3672.391 ms

patched jit=on
JIT:
Functions: 5
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.675 ms, Inlining 3.322 ms, Optimization 10.766
ms, Emission 5.892 ms, Total 20.655 ms
Execution Time: 2754.160 ms

This EXPLAIN format will need further work as each of those flags is
now per plan node rather than for the plan as a whole. I considered
just making each true/false a counter of the number of functions,
e.g. Inlined: 5, Optimized: 5, etc.

I understand from [2] that Andres has WIP code to improve the
performance of JIT compilation. That's really great, but I also
believe that no matter how fast we make it, it's going to be a waste
of effort unless the expressions are evaluated enough times for the
cheaper evaluations to pay off the compilation costs. It'll never be
a win when we evaluate certain expressions zero times. What Andres
has should allow us to drop the default JIT cost thresholds.

Happy to hear people's thoughts on this.

David

[1]: /messages/by-id/7736C40E-6DB5-4E7A-8FE3-4B2AB8E22793@elevated-dev.com
[2]: /messages/by-id/20200728212806.tu5ebmdbmfrvhoao@alap3.anarazel.de

Attachments:

granular_jit_v1.patch (application/octet-stream)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7a7177c550..5cb573f06f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5171,8 +5171,9 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </term>
       <listitem>
        <para>
-        Sets the query cost above which JIT compilation is activated, if
-        enabled (see <xref linkend="jit"/>).
+        Sets the cost threshold for which a given plan node will consider
+        performing JIT compilation for itself.  <xref linkend="jit"/> must also
+        be enabled.
         Performing <acronym>JIT</acronym> costs planning time but can
         accelerate query execution.
         Setting this to <literal>-1</literal> disables JIT compilation.
@@ -5189,10 +5190,11 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </term>
       <listitem>
        <para>
-        Sets the query cost above which JIT compilation attempts to inline
-        functions and operators.  Inlining adds planning time, but can
-        improve execution speed.  It is not meaningful to set this to less
-        than <varname>jit_above_cost</varname>.
+        Sets the cost threshold for which a given plan node will consider
+        inlining functions and operators using JIT compilation.  Inlining adds
+        additional overhead during executor start-up, but can improve
+        performance during execution.  It is not meaningful to set this to
+        less than <varname>jit_above_cost</varname>.
         Setting this to <literal>-1</literal> disables inlining.
         The default is <literal>500000</literal>.
        </para>
@@ -5207,13 +5209,14 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </term>
       <listitem>
        <para>
-        Sets the query cost above which JIT compilation applies expensive
-        optimizations.  Such optimization adds planning time, but can improve
-        execution speed.  It is not meaningful to set this to less
-        than <varname>jit_above_cost</varname>, and it is unlikely to be
-        beneficial to set it to more
-        than <varname>jit_inline_above_cost</varname>.
-        Setting this to <literal>-1</literal> disables expensive optimizations.
+        Sets the cost threshold for which a given plan node will consider
+        performing an optimization pass during JIT compilation for itself.
+        This compilation adds further overhead during executor start-up, but
+        can provide an additional boost to performance during execution.  It
+        is not meaningful to set this to less than
+        <varname>jit_above_cost</varname>, and it is unlikely to be beneficial
+        to set it to more than <varname>jit_inline_above_cost</varname>. 
+        Setting this to <literal>-1</literal> disables the optimization pass.
         The default is <literal>500000</literal>.
        </para>
       </listitem>
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index 88661217e9..a8ec488488 100644
--- a/src/backend/executor/nodeValuesscan.c
+++ b/src/backend/executor/nodeValuesscan.c
@@ -296,22 +296,8 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
 		if (estate->es_subplanstates &&
 			contain_subplans((Node *) exprs))
 		{
-			int			saved_jit_flags;
-
-			/*
-			 * As these expressions are only used once, disable JIT for them.
-			 * This is worthwhile because it's common to insert significant
-			 * amounts of data via VALUES().  Note that this doesn't prevent
-			 * use of JIT *within* a subplan, since that's initialized
-			 * separately; this just affects the upper-level subexpressions.
-			 */
-			saved_jit_flags = estate->es_jit_flags;
-			estate->es_jit_flags = PGJIT_NONE;
-
 			scanstate->exprstatelists[i] = ExecInitExprList(exprs,
 															&scanstate->ss.ps);
-
-			estate->es_jit_flags = saved_jit_flags;
 		}
 		i++;
 	}
diff --git a/src/backend/jit/jit.c b/src/backend/jit/jit.c
index 5ca3f922fe..362a9f7a11 100644
--- a/src/backend/jit/jit.c
+++ b/src/backend/jit/jit.c
@@ -165,11 +165,11 @@ jit_compile_expr(struct ExprState *state)
 		return false;
 
 	/* if no jitting should be performed at all */
-	if (!(state->parent->state->es_jit_flags & PGJIT_PERFORM))
+	if (!(state->parent->plan->jitFlags & PGJIT_PERFORM))
 		return false;
 
 	/* or if expressions aren't JITed */
-	if (!(state->parent->state->es_jit_flags & PGJIT_EXPR))
+	if (!(state->parent->plan->jitFlags & PGJIT_EXPR))
 		return false;
 
 	/* this also takes !jit_enabled into account */
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index cca5c117a0..65586da719 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -137,7 +137,7 @@ llvm_compile_expr(ExprState *state)
 		context = (LLVMJitContext *) parent->state->es_jit;
 	else
 	{
-		context = llvm_create_context(parent->state->es_jit_flags);
+		context = llvm_create_context(state->parent->plan->jitFlags);
 		parent->state->es_jit = &context->base;
 	}
 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 89c409de66..884ed9a60e 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -121,6 +121,7 @@ CopyPlanFields(const Plan *from, Plan *newnode)
 	COPY_SCALAR_FIELD(plan_width);
 	COPY_SCALAR_FIELD(parallel_aware);
 	COPY_SCALAR_FIELD(parallel_safe);
+	COPY_SCALAR_FIELD(jitFlags);
 	COPY_SCALAR_FIELD(plan_node_id);
 	COPY_NODE_FIELD(targetlist);
 	COPY_NODE_FIELD(qual);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index e2f177515d..c3e4cccb17 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -334,6 +334,7 @@ _outPlanInfo(StringInfo str, const Plan *node)
 	WRITE_INT_FIELD(plan_width);
 	WRITE_BOOL_FIELD(parallel_aware);
 	WRITE_BOOL_FIELD(parallel_safe);
+	WRITE_INT_FIELD(jitFlags);
 	WRITE_INT_FIELD(plan_node_id);
 	WRITE_NODE_FIELD(targetlist);
 	WRITE_NODE_FIELD(qual);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 42050ab719..dabedd4017 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1572,6 +1572,7 @@ ReadCommonPlan(Plan *local_node)
 	READ_INT_FIELD(plan_width);
 	READ_BOOL_FIELD(parallel_aware);
 	READ_BOOL_FIELD(parallel_safe);
+	READ_INT_FIELD(jitFlags);
 	READ_INT_FIELD(plan_node_id);
 	READ_NODE_FIELD(targetlist);
 	READ_NODE_FIELD(qual);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 99278eed93..1e7933f0d6 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -22,6 +22,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_class.h"
 #include "foreign/fdwapi.h"
+#include "jit/jit.h"
 #include "miscadmin.h"
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
@@ -75,6 +76,7 @@ static Plan *create_plan_recurse(PlannerInfo *root, Path *best_path,
 								 int flags);
 static Plan *create_scan_plan(PlannerInfo *root, Path *best_path,
 							  int flags);
+static void plan_consider_jit(PlannerInfo *root, Plan *plan);
 static List *build_path_tlist(PlannerInfo *root, Path *path);
 static bool use_physical_tlist(PlannerInfo *root, Path *path, int flags);
 static List *get_gating_quals(PlannerInfo *root, List *quals);
@@ -526,9 +528,53 @@ create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
 			break;
 	}
 
+	/* See about switching on JIT for this node */
+	plan_consider_jit(root, plan);
+
 	return plan;
 }
 
+static void
+plan_consider_jit(PlannerInfo *root, Plan *plan)
+{
+	plan->jitFlags = PGJIT_NONE;
+
+	/*
+	 * For values scans, expressions are only used once, so ensure we don't
+	 * enable JIT for them.
+	 */
+	if (IsA(plan, ValuesScan))
+		return;
+
+	 /* Determine which JIT options to enable for this plan node */
+	if (jit_enabled && jit_above_cost >= 0 &&
+		plan->total_cost > jit_above_cost)
+	{
+		plan->jitFlags |= PGJIT_PERFORM;
+
+		/*
+		 * Decide how much effort should be put into generating better code.
+		 */
+		if (jit_optimize_above_cost >= 0 &&
+			plan->total_cost > jit_optimize_above_cost)
+			plan->jitFlags |= PGJIT_OPT3;
+		if (jit_inline_above_cost >= 0 &&
+			plan->total_cost > jit_inline_above_cost)
+			plan->jitFlags |= PGJIT_INLINE;
+
+		/*
+		 * Decide which operations should be JITed.
+		 */
+		if (jit_expressions)
+			plan->jitFlags |= PGJIT_EXPR;
+		if (jit_tuple_deforming)
+			plan->jitFlags |= PGJIT_DEFORM;
+
+		/* Record the maximum flags used by any plan node */
+		root->glob->jitFlags |= plan->jitFlags;
+	}
+}
+
 /*
  * create_scan_plan
  *	 Create a scan plan for the parent relation of 'best_path'.
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b40a112c25..ef9806887d 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -353,6 +353,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 		glob->parallelModeOK = false;
 	}
 
+	glob->jitFlags = PGJIT_NONE;
+
 	/*
 	 * glob->parallelModeNeeded is normally set to false here and changed to
 	 * true during plan creation if a Gather or Gather Merge plan is actually
@@ -532,32 +534,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->utilityStmt = parse->utilityStmt;
 	result->stmt_location = parse->stmt_location;
 	result->stmt_len = parse->stmt_len;
-
-	result->jitFlags = PGJIT_NONE;
-	if (jit_enabled && jit_above_cost >= 0 &&
-		top_plan->total_cost > jit_above_cost)
-	{
-		result->jitFlags |= PGJIT_PERFORM;
-
-		/*
-		 * Decide how much effort should be put into generating better code.
-		 */
-		if (jit_optimize_above_cost >= 0 &&
-			top_plan->total_cost > jit_optimize_above_cost)
-			result->jitFlags |= PGJIT_OPT3;
-		if (jit_inline_above_cost >= 0 &&
-			top_plan->total_cost > jit_inline_above_cost)
-			result->jitFlags |= PGJIT_INLINE;
-
-		/*
-		 * Decide which operations should be JITed.
-		 */
-		if (jit_expressions)
-			result->jitFlags |= PGJIT_EXPR;
-		if (jit_tuple_deforming)
-			result->jitFlags |= PGJIT_DEFORM;
-	}
-
+	result->jitFlags = glob->jitFlags;
 	if (glob->partition_directory != NULL)
 		DestroyPartitionDirectory(glob->partition_directory);
 
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cf832d7f90..02b8a57c92 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -589,9 +589,9 @@ typedef struct EState
 	struct dsa_area *es_query_dsa;
 
 	/*
-	 * JIT information. es_jit_flags indicates whether JIT should be performed
-	 * and with which options.  es_jit is created on-demand when JITing is
-	 * performed.
+	 * JIT information. es_jit_flags indicates the possible set of JIT options
+	 * that each plan node may make use of.  es_jit is created on-demand when
+	 * JITing is performed.
 	 *
 	 * es_jit_worker_instr is the combined, on demand allocated,
 	 * instrumentation from all workers. The leader's instrumentation is kept
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 485d1b06c9..dcfc401fb0 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -146,6 +146,8 @@ typedef struct PlannerGlobal
 
 	char		maxParallelHazard;	/* worst PROPARALLEL hazard level */
 
+	int			jitFlags;		/* OR mask of jitFlags for each plan node */
+
 	PartitionDirectory partition_directory; /* partition descriptors */
 } PlannerGlobal;
 
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 83e01074ed..aa7547bf6c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -59,7 +59,8 @@ typedef struct PlannedStmt
 
 	bool		parallelModeNeeded; /* parallel mode required to execute? */
 
-	int			jitFlags;		/* which forms of JIT should be performed */
+	int			jitFlags;		/* OR mask of JIT flags which plan nodes may
+								 * use */
 
 	struct Plan *planTree;		/* tree of Plan nodes */
 
@@ -135,7 +136,10 @@ typedef struct Plan
 	bool		parallel_aware; /* engage parallel-aware logic? */
 	bool		parallel_safe;	/* OK to use as part of parallel plan? */
 
-	/*
+	int			jitFlags;		/* which forms of JIT should be performed on
+								 * this node */
+
+								 /*
 	 * Common structural data for all Plan types.
 	 */
 	int			plan_node_id;	/* unique across entire final plan tree */
#2 David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#1)
1 attachment(s)
Re: Making JIT more granular

(This is an old thread. See [1] if you're missing the original email.)

On Tue, 4 Aug 2020 at 14:01, David Rowley <dgrowleyml@gmail.com> wrote:

At the moment JIT compilation, if enabled, is applied to all
expressions in the entire plan. This can sometimes be a problem as
some expressions may be evaluated many times and warrant being
JITted, but others may be evaluated only a few times, or not at all.

This problem tends to become more pronounced when table partitioning
is involved, as the number of expressions in the plan grows with each
partition present in the plan. Some partitions may have many rows,
making it worthwhile to JIT their expressions, but others may have
few rows, or even no rows, in which case JIT is a waste of effort.

This patch recently came up again in [2], where Magnus proposed we
add a new GUC [3] to warn users when JIT compilation takes longer
than the specified fraction of execution time. Over there I mentioned
that I think it might be better to have a go at improving the JIT
costing so that it's aligned with the amount of JIT work there is to
do, rather than with the total cost of the plan, which says nothing
about how much there is to JIT compile.

In [4], Andres reminded me that I need to account for the number of
times a given plan is (re)scanned rather than just the total_cost of
the Plan node. There followed some discussion about how that might be
done.

I've loosely implemented this in the attached patch. In order to get
the information about the expected number of "loops" a given Plan node
will be subject to, I've modified create_plan() so that it passes this
value down recursively while creating the plan. Nodes such as Nested
Loop multiply the "est_calls" by the number of outer rows. For nodes
such as Material, I've made the estimated calls = 1.0. Memoize must
take into account the expected cache hit ratio, which I've had to
record as part of MemoizePath so that create_plan knows about that.
Altogether, this is a fair bit of churn for createplan.c, and it's
still only part of the way there. When planning subplans, we call
create_plan() right away, and since we plan subplans before the outer
plans, we have no idea how many times the subplan will be rescanned.
So to make this work fully, I think we'd need to modify the planner
so that we delay create_plan() for subplans until sometime after
we've planned the outer query.
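
To make that propagation concrete, here's a rough standalone sketch
of the est_calls bookkeeping described above. The Nested Loop and
Material cases follow the description in this email; the Memoize case
(scaling by the cache miss ratio) is my reading of the patch and
should be taken as an assumption rather than verbatim patch code:

#include <stdio.h>

typedef enum ChildContext
{
	CHILD_OF_NESTLOOP_INNER,	/* inner side of a Nested Loop */
	CHILD_OF_MATERIAL,			/* node below a Material */
	CHILD_OF_MEMOIZE,			/* node below a Memoize */
	CHILD_DEFAULT
} ChildContext;

/*
 * est_calls for a child node, given the parent's est_calls and how
 * often the parent is expected to rescan that child.
 */
static double
child_est_calls(ChildContext context, double parent_est_calls,
				double outer_rows, double est_hitratio)
{
	switch (context)
	{
		case CHILD_OF_NESTLOOP_INNER:
			/* rescanned once per outer row */
			return parent_est_calls * outer_rows;
		case CHILD_OF_MATERIAL:
			/* scanned once; rescans are served from the tuplestore */
			return 1.0;
		case CHILD_OF_MEMOIZE:
			/* only run on cache misses (assumption, see note above) */
			return parent_est_calls * (1.0 - est_hitratio);
		default:
			return parent_est_calls;
	}
}

int
main(void)
{
	/* inner side of a Nested Loop driven by 1000 outer rows */
	printf("%.0f\n", child_est_calls(CHILD_OF_NESTLOOP_INNER, 1.0, 1000, 0));

	/* node below a Memoize called 1000 times, 90% estimated hit ratio */
	printf("%.0f\n", child_est_calls(CHILD_OF_MEMOIZE, 1000, 0, 0.9));

	return 0;
}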

The reason that I'm posting about this now is mostly because I did
say I'd come back to this patch for v16, and I'm also feeling bad
that I -1'd Magnus' patch, which likely resulted in zero forward
progress being made on improving JIT and its costing situation for
v15.

The reason I've not completed this patch to fix the deficiencies
regarding subplans is that it's quite a bit of work and I don't
really want to do that right now. We might decide that JIT costing
should work in a completely different way that does not require
estimating how many times a plan node will be rescanned. I think
there's enough patch here to allow us to test this and then decide if
it's any good or not.

There's also maybe some controversy in the patch. I ended up modifying
EXPLAIN so that it shows loops=N as part of the estimated costs. I
understand there's likely to be fallout from doing that as there are
various tools around that this would likely break. I added that for a
couple of reasons: 1) I think it would be tricky to figure out why JIT
was or was not enabled without showing that in EXPLAIN, and 2) I
needed to display it somewhere for my testing so I could figure out if
I'd done something wrong when calculating the value during
create_plan().

This basically looks like:

postgres=# explain select * from h, h h1, h h2;
QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..12512550.00 rows=1000000000 width=12)
   ->  Nested Loop  (cost=0.00..12532.50 rows=1000000 width=8)
         ->  Seq Scan on h  (cost=0.00..15.00 rows=1000 width=4)
         ->  Materialize  (cost=0.00..20.00 rows=1000 width=4 loops=1000)
               ->  Seq Scan on h h1  (cost=0.00..15.00 rows=1000 width=4)
   ->  Materialize  (cost=0.00..20.00 rows=1000 width=4 loops=1000000)
         ->  Seq Scan on h h2  (cost=0.00..15.00 rows=1000 width=4)
(7 rows)

Just the same as EXPLAIN ANALYZE, I've coded loops= to only show when
there's more than 1 loop. You can also see that the node below
Materialize is not expected to be scanned multiple times. Technically
it could be when a parameter changes, but right now it seems more
trouble than it's worth to estimate that during create_plan(). There
is also some variation between the expected loops and the actual
loops where parallel workers are involved. In the estimate, this is
just the number of times an average worker is expected to invoke the
plan, whereas the actual "loops" is the sum of each worker's
invocations.

The other slight controversy that I can see in the patch is
repurposing the JIT cost GUCs and giving them a completely different
meaning than they had previously. I've left them as-is for now as I
didn't think renaming GUCs would ease the pain that DBAs would have to
endure as a result of this change.

Does anyone have any thoughts about this JIT costing? Is this an
improvement? Is there a better way?

David

[1]: /messages/by-id/CAApHDvpQJqLrNOSi8P1JLM8YE2C+ksKFpSdZg=q6sTbtQ-v=aw@mail.gmail.com
[2]: /messages/by-id/CAApHDvrEoQ5p61NjDCKVgEWaH0qm1KprYw2-7m8-6ZGGJ8A2Dw@mail.gmail.com
[3]: /messages/by-id/CABUevExR_9ZmkYj-aBvDreDKUinWLBBpORcmTbuPdNb5vGOLtA@mail.gmail.com
[4]: /messages/by-id/20220329231641.ai3qrzpdo2vqvwix@alap3.anarazel.de

Attachments:

granular_jit_v2.patch (text/plain; charset=US-ASCII)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 4773cadec0..ba52b48783 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -129,7 +129,8 @@ static ForeignScan *fileGetForeignPlan(PlannerInfo *root,
 									   ForeignPath *best_path,
 									   List *tlist,
 									   List *scan_clauses,
-									   Plan *outer_plan);
+									   Plan *outer_plan,
+									   double est_calls);
 static void fileExplainForeignScan(ForeignScanState *node, ExplainState *es);
 static void fileBeginForeignScan(ForeignScanState *node, int eflags);
 static TupleTableSlot *fileIterateForeignScan(ForeignScanState *node);
@@ -588,7 +589,8 @@ fileGetForeignPlan(PlannerInfo *root,
 				   ForeignPath *best_path,
 				   List *tlist,
 				   List *scan_clauses,
-				   Plan *outer_plan)
+				   Plan *outer_plan,
+				   double est_calls)
 {
 	Index		scan_relid = baserel->relid;
 
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0e5771c89d..f51bf8a649 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -335,7 +335,8 @@ static ForeignScan *postgresGetForeignPlan(PlannerInfo *root,
 										   ForeignPath *best_path,
 										   List *tlist,
 										   List *scan_clauses,
-										   Plan *outer_plan);
+										   Plan *outer_plan,
+										   double est_calls);
 static void postgresBeginForeignScan(ForeignScanState *node, int eflags);
 static TupleTableSlot *postgresIterateForeignScan(ForeignScanState *node);
 static void postgresReScanForeignScan(ForeignScanState *node);
@@ -1221,7 +1222,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 					   ForeignPath *best_path,
 					   List *tlist,
 					   List *scan_clauses,
-					   Plan *outer_plan)
+					   Plan *outer_plan,
+					   double est_calls)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
 	Index		scan_relid;
@@ -1385,7 +1387,8 @@ postgresGetForeignPlan(PlannerInfo *root,
 			 * a Result node atop the plan tree.
 			 */
 			outer_plan = change_plan_targetlist(outer_plan, fdw_scan_tlist,
-												best_path->path.parallel_safe);
+												best_path->path.parallel_safe,
+												est_calls);
 		}
 	}
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 03986946a8..643680bbc6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5603,8 +5603,11 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </term>
       <listitem>
        <para>
-        Sets the query cost above which JIT compilation is activated, if
-        enabled (see <xref linkend="jit"/>).
+        Sets the cost threshold for which a given plan node will consider
+        performing JIT compilation for itself.  The <literal>total_cost</literal>
+        of a given plan node multiplied by its estimated <literal>loops</literal>
+        must exceed this value before JIT compilation is considered for the
+        plan node.  <xref linkend="jit"/> must also be enabled.
         Performing <acronym>JIT</acronym> costs planning time but can
         accelerate query execution.
         Setting this to <literal>-1</literal> disables JIT compilation.
@@ -5621,12 +5624,13 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </term>
       <listitem>
        <para>
-        Sets the query cost above which JIT compilation attempts to inline
-        functions and operators.  Inlining adds planning time, but can
-        improve execution speed.  It is not meaningful to set this to less
-        than <varname>jit_above_cost</varname>.
-        Setting this to <literal>-1</literal> disables inlining.
-        The default is <literal>500000</literal>.
+        Sets the overall plan node cost above which JIT compilation attempts to
+        inline functions and operators.  Inlining adds additional executor
+        startup overheads, but can improve performance during execution.  It
+        is not meaningful to set this to less than
+        <varname>jit_above_cost</varname>.  Setting this to
+        <literal>-1</literal> disables inlining.  The default is
+        <literal>500000</literal>.
        </para>
       </listitem>
      </varlistentry>
@@ -5639,13 +5643,13 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </term>
       <listitem>
        <para>
-        Sets the query cost above which JIT compilation applies expensive
-        optimizations.  Such optimization adds planning time, but can improve
-        execution speed.  It is not meaningful to set this to less
-        than <varname>jit_above_cost</varname>, and it is unlikely to be
-        beneficial to set it to more
-        than <varname>jit_inline_above_cost</varname>.
-        Setting this to <literal>-1</literal> disables expensive optimizations.
+        Sets the overall plan node cost above which JIT compilation applies expensive
+        optimizations.  Such optimization adds executor startup overhead, but
+        can improve performance during execution.  It is not meaningful to set
+        this to less than <varname>jit_above_cost</varname>, and it is
+        unlikely to be beneficial to set it to more than
+        <varname>jit_inline_above_cost</varname>.
+        Setting this to <literal>-1</literal> disables the optimization pass.
         The default is <literal>500000</literal>.
        </para>
       </listitem>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index d2a2479822..fc36438df1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1583,9 +1583,17 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	{
 		if (es->format == EXPLAIN_FORMAT_TEXT)
 		{
-			appendStringInfo(es->str, "  (cost=%.2f..%.2f rows=%.0f width=%d)",
-							 plan->startup_cost, plan->total_cost,
-							 plan->plan_rows, plan->plan_width);
+			/* only display the expected loops if it's above 1.0 */
+			if (plan->est_calls <= 1.0)
+				appendStringInfo(es->str, "  (cost=%.2f..%.2f rows=%.0f width=%d)",
+								 plan->startup_cost, plan->total_cost,
+								 plan->plan_rows, plan->plan_width);
+			else
+				appendStringInfo(es->str, "  (cost=%.2f..%.2f rows=%.0f width=%d loops=%.0f)",
+								 plan->startup_cost, plan->total_cost,
+								 plan->plan_rows, plan->plan_width,
+								 plan->est_calls);
+
 		}
 		else
 		{
@@ -1597,6 +1605,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
 								 0, es);
 			ExplainPropertyInteger("Plan Width", NULL, plan->plan_width,
 								   es);
+			ExplainPropertyFloat("Plan Calls", NULL, plan->est_calls, 0, es);
 		}
 	}
 
@@ -1719,6 +1728,21 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->verbose)
 		show_plan_tlist(planstate, ancestors, es);
 
+	if (es->format == EXPLAIN_FORMAT_TEXT)
+	{
+		/*
+		 * If we did any jitting, indicate if we did any for this node or not.
+		 * When format is TEXT, only do this when VERBOSE is enabled.
+		 */
+		if (planstate->state->es_jit != NULL && es->verbose)
+			ExplainPropertyBool("JIT", plan->jit, es);
+	}
+	else
+	{
+		if (planstate->state->es_jit != NULL)
+			ExplainPropertyBool("JIT", plan->jit, es);
+	}
+
 	/* unique join */
 	switch (nodeTag(plan))
 	{
diff --git a/src/backend/executor/nodeValuesscan.c b/src/backend/executor/nodeValuesscan.c
index dda1c59b23..4854b16cfe 100644
--- a/src/backend/executor/nodeValuesscan.c
+++ b/src/backend/executor/nodeValuesscan.c
@@ -296,22 +296,8 @@ ExecInitValuesScan(ValuesScan *node, EState *estate, int eflags)
 		if (estate->es_subplanstates &&
 			contain_subplans((Node *) exprs))
 		{
-			int			saved_jit_flags;
-
-			/*
-			 * As these expressions are only used once, disable JIT for them.
-			 * This is worthwhile because it's common to insert significant
-			 * amounts of data via VALUES().  Note that this doesn't prevent
-			 * use of JIT *within* a subplan, since that's initialized
-			 * separately; this just affects the upper-level subexpressions.
-			 */
-			saved_jit_flags = estate->es_jit_flags;
-			estate->es_jit_flags = PGJIT_NONE;
-
 			scanstate->exprstatelists[i] = ExecInitExprList(exprs,
 															&scanstate->ss.ps);
-
-			estate->es_jit_flags = saved_jit_flags;
 		}
 		i++;
 	}
diff --git a/src/backend/jit/README b/src/backend/jit/README
index 5427bdf215..a4aa33b79b 100644
--- a/src/backend/jit/README
+++ b/src/backend/jit/README
@@ -266,27 +266,32 @@ generation, and later compiling larger parts of queries.
 When to JIT
 ===========
 
-Currently there are a number of GUCs that influence JITing:
-
-- jit_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost
-  get JITed, *without* optimization (expensive part), corresponding to
-  -O0. This commonly already results in significant speedups if
-  expression/deforming is a bottleneck (removing dynamic branches
-  mostly).
-- jit_optimize_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost
-  get JITed, *with* optimization (expensive part).
-- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has
-  higher cost.
-
-Whenever a query's total cost is above these limits, JITing is
-performed.
-
-Alternative costing models, e.g. by generating separate paths for
-parts of a query with lower cpu_* costs, are also a possibility, but
-it's doubtful the overhead of doing so is sufficient.  Another
-alternative would be to count the number of times individual
+Currently, there are a number of GUCs that influence JITing.  Each of these
+GUCs defines the cost that a given plan node must exceed to enable the given
+JIT feature.  The costs here are calculated by multiplying the total_cost of
+the plan node by the estimated number of rescans the planner expects the
+executor to perform on the given node.  We refer to this cost as the "overall"
+cost in the text below:
+
+- jit_above_cost = -1, 0-DBL_MAX - all plan nodes which have a overall cost
+  higher than this value get JITed, *without* optimization (expensive part),
+  corresponding to -O0. This commonly already results in significant speedups
+  if expression/deforming is a bottleneck (removing dynamic branches mostly).
+- jit_optimize_above_cost = -1, 0-DBL_MAX - perform an optimization pass
+  during JIT compilation when any plan node that is eligible for JIT reaches
+  this cost.
+- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried during JIT
+  compilation if any plan node has a higher overall cost than this value.
+
+It is important to remember that the optimize and inlining options are either
+enabled for all JIT enabled plan nodes or disabled.  If a single node reaches
+the required threshold then all JIT enabled nodes will be JITed using the same
+compilation options.
+
+An alternative would be to count the number of times individual
 expressions are estimated to be evaluated, and perform JITing of these
-individual expressions.
+individual expressions. This is probably more complexity than it would be
+worth.
 
 The obvious seeming approach of JITing expressions individually after
 a number of execution turns out not to work too well. Primarily
diff --git a/src/backend/jit/jit.c b/src/backend/jit/jit.c
index 18d168f1af..ade689bf22 100644
--- a/src/backend/jit/jit.c
+++ b/src/backend/jit/jit.c
@@ -172,6 +172,14 @@ jit_compile_expr(struct ExprState *state)
 	if (!(state->parent->state->es_jit_flags & PGJIT_EXPR))
 		return false;
 
+	/* don't jit if the plan node is missing */
+	if (state->parent->plan == NULL)
+		return false;
+
+	/* don't jit if it's not enabled for this plan node */
+	if (!state->parent->plan->jit)
+		return false;
+
 	/* this also takes !jit_enabled into account */
 	if (provider_init())
 		return provider.compile_expr(state);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 836f427ea8..e56db41047 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -128,6 +128,7 @@ CopyPlanFields(const Plan *from, Plan *newnode)
 	COPY_SCALAR_FIELD(parallel_aware);
 	COPY_SCALAR_FIELD(parallel_safe);
 	COPY_SCALAR_FIELD(async_capable);
+	COPY_SCALAR_FIELD(jit);
 	COPY_SCALAR_FIELD(plan_node_id);
 	COPY_NODE_FIELD(targetlist);
 	COPY_NODE_FIELD(qual);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 6a02f81ad5..27360d587c 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -341,6 +341,7 @@ _outPlanInfo(StringInfo str, const Plan *node)
 	WRITE_BOOL_FIELD(parallel_aware);
 	WRITE_BOOL_FIELD(parallel_safe);
 	WRITE_BOOL_FIELD(async_capable);
+	WRITE_BOOL_FIELD(jit);
 	WRITE_INT_FIELD(plan_node_id);
 	WRITE_NODE_FIELD(targetlist);
 	WRITE_NODE_FIELD(qual);
@@ -2117,6 +2118,7 @@ _outMemoizePath(StringInfo str, const MemoizePath *node)
 	WRITE_BOOL_FIELD(binary_mode);
 	WRITE_FLOAT_FIELD(calls, "%.0f");
 	WRITE_UINT_FIELD(est_entries);
+	WRITE_FLOAT_FIELD(est_hitratio, "%.6f");
 }
 
 static void
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ddf76ac778..137ffbcdd1 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1847,6 +1847,7 @@ ReadCommonPlan(Plan *local_node)
 	READ_BOOL_FIELD(parallel_aware);
 	READ_BOOL_FIELD(parallel_safe);
 	READ_BOOL_FIELD(async_capable);
+	READ_BOOL_FIELD(jit);
 	READ_INT_FIELD(plan_node_id);
 	READ_NODE_FIELD(targetlist);
 	READ_NODE_FIELD(qual);
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b787c6f81a..c6cf109392 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -2814,6 +2814,12 @@ cost_memoize_rescan(PlannerInfo *root, MemoizePath *mpath,
 	/* Ensure we don't go negative */
 	hit_ratio = Max(hit_ratio, 0.0);
 
+	/*
+	 * Since we've just gone to the bother of calculating the estimated hit
+	 * ratio, let's store that in the MemoizePath for later use.
+	 */
+	mpath->est_hitratio = hit_ratio;
+
 	/*
 	 * Set the total_cost accounting for the expected cache hit ratio.  We
 	 * also add on a cpu_operator_cost to account for a cache lookup. This
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 7905bc4654..9c51103df1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -22,6 +22,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_class.h"
 #include "foreign/fdwapi.h"
+#include "jit/jit.h"
 #include "miscadmin.h"
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
@@ -73,95 +74,130 @@
 
 
 static Plan *create_plan_recurse(PlannerInfo *root, Path *best_path,
-								 int flags);
+								 int flags, double est_calls);
 static Plan *create_scan_plan(PlannerInfo *root, Path *best_path,
-							  int flags);
+							  int flags, double est_calls);
+static void plan_consider_jit(PlannerInfo *root, Plan *plan);
 static List *build_path_tlist(PlannerInfo *root, Path *path);
 static bool use_physical_tlist(PlannerInfo *root, Path *path, int flags);
 static List *get_gating_quals(PlannerInfo *root, List *quals);
 static Plan *create_gating_plan(PlannerInfo *root, Path *path, Plan *plan,
-								List *gating_quals);
-static Plan *create_join_plan(PlannerInfo *root, JoinPath *best_path);
+								List *gating_quals, double est_calls);
+static Plan *create_join_plan(PlannerInfo *root, JoinPath *best_path,
+							  double est_calls);
 static bool mark_async_capable_plan(Plan *plan, Path *path);
 static Plan *create_append_plan(PlannerInfo *root, AppendPath *best_path,
-								int flags);
+								int flags, double est_calls);
 static Plan *create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
-									  int flags);
+									  int flags, double est_calls);
 static Result *create_group_result_plan(PlannerInfo *root,
-										GroupResultPath *best_path);
-static ProjectSet *create_project_set_plan(PlannerInfo *root, ProjectSetPath *best_path);
+										GroupResultPath *best_path,
+										double est_calls);
+static ProjectSet *create_project_set_plan(PlannerInfo *root, ProjectSetPath *best_path,
+										   double est_calls);
 static Material *create_material_plan(PlannerInfo *root, MaterialPath *best_path,
-									  int flags);
+									  int flags, double est_calls);
 static Memoize *create_memoize_plan(PlannerInfo *root, MemoizePath *best_path,
-									int flags);
+									int flags, double est_calls);
 static Plan *create_unique_plan(PlannerInfo *root, UniquePath *best_path,
-								int flags);
-static Gather *create_gather_plan(PlannerInfo *root, GatherPath *best_path);
+								int flags, double est_calls);
+static Gather *create_gather_plan(PlannerInfo *root, GatherPath *best_path,
+								  double est_calls);
 static Plan *create_projection_plan(PlannerInfo *root,
 									ProjectionPath *best_path,
-									int flags);
-static Plan *inject_projection_plan(Plan *subplan, List *tlist, bool parallel_safe);
-static Sort *create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags);
+									int flags, double est_calls);
+static Plan *inject_projection_plan(Plan *subplan, List *tlist, bool parallel_safe,
+									double est_calls);
+static Sort *create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags,
+							  double est_calls);
 static IncrementalSort *create_incrementalsort_plan(PlannerInfo *root,
-													IncrementalSortPath *best_path, int flags);
-static Group *create_group_plan(PlannerInfo *root, GroupPath *best_path);
+													IncrementalSortPath *best_path, int flags,
+													double est_calls);
+static Group *create_group_plan(PlannerInfo *root, GroupPath *best_path,
+								double est_calls);
 static Unique *create_upper_unique_plan(PlannerInfo *root, UpperUniquePath *best_path,
-										int flags);
-static Agg *create_agg_plan(PlannerInfo *root, AggPath *best_path);
-static Plan *create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path);
-static Result *create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path);
-static WindowAgg *create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path);
+										int flags, double est_calls);
+static Agg *create_agg_plan(PlannerInfo *root, AggPath *best_path,
+							double est_calls);
+static Plan *create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path,
+									  double est_calls);
+static Result *create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path,
+									 double est_calls);
+static WindowAgg *create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path,
+										double est_calls);
 static SetOp *create_setop_plan(PlannerInfo *root, SetOpPath *best_path,
-								int flags);
-static RecursiveUnion *create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path);
+								int flags, double est_calls);
+static RecursiveUnion *create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path,
+												  double est_calls);
 static LockRows *create_lockrows_plan(PlannerInfo *root, LockRowsPath *best_path,
-									  int flags);
-static ModifyTable *create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path);
+									  int flags, double est_calls);
+static ModifyTable *create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path,
+											double est_calls);
 static Limit *create_limit_plan(PlannerInfo *root, LimitPath *best_path,
-								int flags);
+								int flags, double est_calls);
 static SeqScan *create_seqscan_plan(PlannerInfo *root, Path *best_path,
-									List *tlist, List *scan_clauses);
+									List *tlist, List *scan_clauses,
+									double est_calls);
 static SampleScan *create_samplescan_plan(PlannerInfo *root, Path *best_path,
-										  List *tlist, List *scan_clauses);
+										  List *tlist, List *scan_clauses,
+										  double est_calls);
 static Scan *create_indexscan_plan(PlannerInfo *root, IndexPath *best_path,
-								   List *tlist, List *scan_clauses, bool indexonly);
+								   List *tlist, List *scan_clauses, bool indexonly,
+								   double est_calls);
 static BitmapHeapScan *create_bitmap_scan_plan(PlannerInfo *root,
 											   BitmapHeapPath *best_path,
-											   List *tlist, List *scan_clauses);
+											   List *tlist, List *scan_clauses,
+											   double est_calls);
 static Plan *create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual,
-								   List **qual, List **indexqual, List **indexECs);
+								   List **qual, List **indexqual, List **indexECs,
+								   double est_calls);
 static void bitmap_subplan_mark_shared(Plan *plan);
 static TidScan *create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
-									List *tlist, List *scan_clauses);
+									List *tlist, List *scan_clauses,
+									double est_calls);
 static TidRangeScan *create_tidrangescan_plan(PlannerInfo *root,
 											  TidRangePath *best_path,
 											  List *tlist,
-											  List *scan_clauses);
+											  List *scan_clauses,
+											  double est_calls);
 static SubqueryScan *create_subqueryscan_plan(PlannerInfo *root,
 											  SubqueryScanPath *best_path,
-											  List *tlist, List *scan_clauses);
+											  List *tlist, List *scan_clauses,
+											  double est_calls);
 static FunctionScan *create_functionscan_plan(PlannerInfo *root, Path *best_path,
-											  List *tlist, List *scan_clauses);
+											  List *tlist, List *scan_clauses,
+											  double est_calls);
 static ValuesScan *create_valuesscan_plan(PlannerInfo *root, Path *best_path,
-										  List *tlist, List *scan_clauses);
+										  List *tlist, List *scan_clauses,
+										  double est_calls);
 static TableFuncScan *create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
-												List *tlist, List *scan_clauses);
+												List *tlist, List *scan_clauses,
+												double est_calls);
 static CteScan *create_ctescan_plan(PlannerInfo *root, Path *best_path,
-									List *tlist, List *scan_clauses);
+									List *tlist, List *scan_clauses,
+									double est_calls);
 static NamedTuplestoreScan *create_namedtuplestorescan_plan(PlannerInfo *root,
-															Path *best_path, List *tlist, List *scan_clauses);
+															Path *best_path, List *tlist, List *scan_clauses,
+															double est_calls);
 static Result *create_resultscan_plan(PlannerInfo *root, Path *best_path,
-									  List *tlist, List *scan_clauses);
+									  List *tlist, List *scan_clauses,
+									  double est_calls);
 static WorkTableScan *create_worktablescan_plan(PlannerInfo *root, Path *best_path,
-												List *tlist, List *scan_clauses);
+												List *tlist, List *scan_clauses,
+												double est_calls);
 static ForeignScan *create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
-											List *tlist, List *scan_clauses);
+											List *tlist, List *scan_clauses,
+											double est_calls);
 static CustomScan *create_customscan_plan(PlannerInfo *root,
 										  CustomPath *best_path,
-										  List *tlist, List *scan_clauses);
-static NestLoop *create_nestloop_plan(PlannerInfo *root, NestPath *best_path);
-static MergeJoin *create_mergejoin_plan(PlannerInfo *root, MergePath *best_path);
-static HashJoin *create_hashjoin_plan(PlannerInfo *root, HashPath *best_path);
+										  List *tlist, List *scan_clauses,
+										  double est_calls);
+static NestLoop *create_nestloop_plan(PlannerInfo *root, NestPath *best_path,
+									  double est_calls);
+static MergeJoin *create_mergejoin_plan(PlannerInfo *root, MergePath *best_path,
+										double est_calls);
+static HashJoin *create_hashjoin_plan(PlannerInfo *root, HashPath *best_path,
+									  double est_calls);
 static Node *replace_nestloop_params(PlannerInfo *root, Node *expr);
 static Node *replace_nestloop_params_mutator(Node *node, PlannerInfo *root);
 static void fix_indexqual_references(PlannerInfo *root, IndexPath *index_path,
@@ -269,11 +305,13 @@ static Plan *prepare_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
 										AttrNumber **p_sortColIdx,
 										Oid **p_sortOperators,
 										Oid **p_collations,
-										bool **p_nullsFirst);
+										bool **p_nullsFirst,
+										double est_calls);
 static Sort *make_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
-									 Relids relids);
+									 Relids relids, double est_calls);
 static IncrementalSort *make_incrementalsort_from_pathkeys(Plan *lefttree,
-														   List *pathkeys, Relids relids, int nPresortedCols);
+														   List *pathkeys, Relids relids, int nPresortedCols,
+														   double est_calls);
 static Sort *make_sort_from_groupcols(List *groupcls,
 									  AttrNumber *grpColIdx,
 									  Plan *lefttree);
@@ -314,7 +352,8 @@ static ModifyTable *make_modifytable(PlannerInfo *root, Plan *subplan,
 									 List *rowMarks, OnConflictExpr *onconflict,
 									 List *mergeActionList, int epqParam);
 static GatherMerge *create_gather_merge_plan(PlannerInfo *root,
-											 GatherMergePath *best_path);
+											 GatherMergePath *best_path,
+											 double est_calls);
 
 
 /*
@@ -330,10 +369,12 @@ static GatherMerge *create_gather_merge_plan(PlannerInfo *root,
  *
  *	  best_path is the best access path
  *
+ *	  est_calls is the number of expected times that we'll (re)scan this plan
+ *
  *	  Returns a Plan tree.
  */
 Plan *
-create_plan(PlannerInfo *root, Path *best_path)
+create_plan(PlannerInfo *root, Path *best_path, double est_calls)
 {
 	Plan	   *plan;
 
@@ -345,7 +386,7 @@ create_plan(PlannerInfo *root, Path *best_path)
 	root->curOuterParams = NIL;
 
 	/* Recursively process the path tree, demanding the correct tlist result */
-	plan = create_plan_recurse(root, best_path, CP_EXACT_TLIST);
+	plan = create_plan_recurse(root, best_path, CP_EXACT_TLIST, est_calls);
 
 	/*
 	 * Make sure the topmost plan node's targetlist exposes the original
@@ -384,7 +425,7 @@ create_plan(PlannerInfo *root, Path *best_path)
  *	  Recursive guts of create_plan().
  */
 static Plan *
-create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
+create_plan_recurse(PlannerInfo *root, Path *best_path, int flags, double est_calls)
 {
 	Plan	   *plan;
 
@@ -409,136 +450,147 @@ create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
 		case T_NamedTuplestoreScan:
 		case T_ForeignScan:
 		case T_CustomScan:
-			plan = create_scan_plan(root, best_path, flags);
+			plan = create_scan_plan(root, best_path, flags, est_calls);
 			break;
 		case T_HashJoin:
 		case T_MergeJoin:
 		case T_NestLoop:
 			plan = create_join_plan(root,
-									(JoinPath *) best_path);
+									(JoinPath *) best_path, est_calls);
 			break;
 		case T_Append:
 			plan = create_append_plan(root,
 									  (AppendPath *) best_path,
-									  flags);
+									  flags, est_calls);
 			break;
 		case T_MergeAppend:
 			plan = create_merge_append_plan(root,
 											(MergeAppendPath *) best_path,
-											flags);
+											flags, est_calls);
 			break;
 		case T_Result:
 			if (IsA(best_path, ProjectionPath))
 			{
 				plan = create_projection_plan(root,
 											  (ProjectionPath *) best_path,
-											  flags);
+											  flags, est_calls);
 			}
 			else if (IsA(best_path, MinMaxAggPath))
 			{
 				plan = (Plan *) create_minmaxagg_plan(root,
-													  (MinMaxAggPath *) best_path);
+													  (MinMaxAggPath *) best_path,
+													  est_calls);
 			}
 			else if (IsA(best_path, GroupResultPath))
 			{
 				plan = (Plan *) create_group_result_plan(root,
-														 (GroupResultPath *) best_path);
+														 (GroupResultPath *) best_path,
+														 est_calls);
 			}
 			else
 			{
 				/* Simple RTE_RESULT base relation */
 				Assert(IsA(best_path, Path));
-				plan = create_scan_plan(root, best_path, flags);
+				plan = create_scan_plan(root, best_path, flags, est_calls);
 			}
 			break;
 		case T_ProjectSet:
 			plan = (Plan *) create_project_set_plan(root,
-													(ProjectSetPath *) best_path);
+													(ProjectSetPath *) best_path,
+													est_calls);
 			break;
 		case T_Material:
 			plan = (Plan *) create_material_plan(root,
 												 (MaterialPath *) best_path,
-												 flags);
+												 flags, est_calls);
 			break;
 		case T_Memoize:
 			plan = (Plan *) create_memoize_plan(root,
 												(MemoizePath *) best_path,
-												flags);
+												flags, est_calls);
 			break;
 		case T_Unique:
 			if (IsA(best_path, UpperUniquePath))
 			{
 				plan = (Plan *) create_upper_unique_plan(root,
 														 (UpperUniquePath *) best_path,
-														 flags);
+														 flags, est_calls);
 			}
 			else
 			{
 				Assert(IsA(best_path, UniquePath));
 				plan = create_unique_plan(root,
 										  (UniquePath *) best_path,
-										  flags);
+										  flags, est_calls);
 			}
 			break;
 		case T_Gather:
 			plan = (Plan *) create_gather_plan(root,
-											   (GatherPath *) best_path);
+											   (GatherPath *) best_path,
+											   est_calls);
 			break;
 		case T_Sort:
 			plan = (Plan *) create_sort_plan(root,
 											 (SortPath *) best_path,
-											 flags);
+											 flags, est_calls);
 			break;
 		case T_IncrementalSort:
 			plan = (Plan *) create_incrementalsort_plan(root,
 														(IncrementalSortPath *) best_path,
-														flags);
+														flags, est_calls);
 			break;
 		case T_Group:
 			plan = (Plan *) create_group_plan(root,
-											  (GroupPath *) best_path);
+											  (GroupPath *) best_path,
+											  est_calls);
 			break;
 		case T_Agg:
 			if (IsA(best_path, GroupingSetsPath))
 				plan = create_groupingsets_plan(root,
-												(GroupingSetsPath *) best_path);
+												(GroupingSetsPath *) best_path,
+												est_calls);
 			else
 			{
 				Assert(IsA(best_path, AggPath));
 				plan = (Plan *) create_agg_plan(root,
-												(AggPath *) best_path);
+												(AggPath *) best_path,
+												est_calls);
 			}
 			break;
 		case T_WindowAgg:
 			plan = (Plan *) create_windowagg_plan(root,
-												  (WindowAggPath *) best_path);
+												  (WindowAggPath *) best_path,
+												  est_calls);
 			break;
 		case T_SetOp:
 			plan = (Plan *) create_setop_plan(root,
 											  (SetOpPath *) best_path,
-											  flags);
+											  flags, est_calls);
 			break;
 		case T_RecursiveUnion:
 			plan = (Plan *) create_recursiveunion_plan(root,
-													   (RecursiveUnionPath *) best_path);
+													   (RecursiveUnionPath *) best_path,
+													   est_calls);
 			break;
 		case T_LockRows:
 			plan = (Plan *) create_lockrows_plan(root,
 												 (LockRowsPath *) best_path,
-												 flags);
+												 flags, est_calls);
 			break;
 		case T_ModifyTable:
 			plan = (Plan *) create_modifytable_plan(root,
-													(ModifyTablePath *) best_path);
+													(ModifyTablePath *) best_path,
+													est_calls);
 			break;
 		case T_Limit:
 			plan = (Plan *) create_limit_plan(root,
 											  (LimitPath *) best_path,
-											  flags);
+											  flags, est_calls);
 			break;
 		case T_GatherMerge:
 			plan = (Plan *) create_gather_merge_plan(root,
-													 (GatherMergePath *) best_path);
+													 (GatherMergePath *) best_path,
+													 est_calls);
 			break;
 		default:
 			elog(ERROR, "unrecognized node type: %d",
@@ -547,15 +599,75 @@ create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
 			break;
 	}
 
+	/* See about switching on JIT for this node */
+	plan_consider_jit(root, plan);
+
 	return plan;
 }
 
+static void
+plan_consider_jit(PlannerInfo *root, Plan *plan)
+{
+	int		jitflags = root->glob->jitFlags;
+
+	plan->jit = false;
+
+	/*
+	 * For values scans, expressions are only used once, so ensure we don't
+	 * enable JIT for them.
+	 */
+	if (IsA(plan, ValuesScan))
+		return;
+
+	 /* Determine which JIT options to enable for this plan node */
+	if (jit_enabled && jit_above_cost >= 0)
+	{
+		Cost	total_cost;
+
+		/*
+		 * Take into account the number of times that we expect to rescan a
+		 * given plan node.  For example, subplans being invoked under the
+		 * inside of a Nested Loop may be rescanned many times.  JITing these
+		 * may be more worthwhile.
+		 */
+		total_cost = plan->total_cost * plan->est_calls;
+
+		if (total_cost > jit_above_cost)
+		{
+			plan->jit = true;
+			jitflags |= PGJIT_PERFORM;
+
+			/*
+			 * Decide how much effort should be put into generating better code.
+			 */
+			if (jit_optimize_above_cost >= 0 &&
+				total_cost > jit_optimize_above_cost)
+				jitflags |= PGJIT_OPT3;
+			if (jit_inline_above_cost >= 0 &&
+				total_cost > jit_inline_above_cost)
+				jitflags |= PGJIT_INLINE;
+
+			/*
+			 * Decide which operations should be JITed.
+			 */
+			if (jit_expressions)
+				jitflags |= PGJIT_EXPR;
+			if (jit_tuple_deforming)
+				jitflags |= PGJIT_DEFORM;
+
+			/* Record the maximum flags used by any plan node */
+			root->glob->jitFlags |= jitflags;
+		}
+	}
+}
+
 /*
  * create_scan_plan
  *	 Create a scan plan for the parent relation of 'best_path'.
  */
 static Plan *
-create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
+create_scan_plan(PlannerInfo *root, Path *best_path, int flags,
+				 double est_calls)
 {
 	RelOptInfo *rel = best_path->parent;
 	List	   *scan_clauses;
@@ -660,14 +772,16 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 			plan = (Plan *) create_seqscan_plan(root,
 												best_path,
 												tlist,
-												scan_clauses);
+												scan_clauses,
+												est_calls);
 			break;
 
 		case T_SampleScan:
 			plan = (Plan *) create_samplescan_plan(root,
 												   best_path,
 												   tlist,
-												   scan_clauses);
+												   scan_clauses,
+												   est_calls);
 			break;
 
 		case T_IndexScan:
@@ -675,7 +789,8 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 												  (IndexPath *) best_path,
 												  tlist,
 												  scan_clauses,
-												  false);
+												  false,
+												  est_calls);
 			break;
 
 		case T_IndexOnlyScan:
@@ -683,98 +798,112 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 												  (IndexPath *) best_path,
 												  tlist,
 												  scan_clauses,
-												  true);
+												  true,
+												  est_calls);
 			break;
 
 		case T_BitmapHeapScan:
 			plan = (Plan *) create_bitmap_scan_plan(root,
 													(BitmapHeapPath *) best_path,
 													tlist,
-													scan_clauses);
+													scan_clauses,
+													est_calls);
 			break;
 
 		case T_TidScan:
 			plan = (Plan *) create_tidscan_plan(root,
 												(TidPath *) best_path,
 												tlist,
-												scan_clauses);
+												scan_clauses,
+												est_calls);
 			break;
 
 		case T_TidRangeScan:
 			plan = (Plan *) create_tidrangescan_plan(root,
 													 (TidRangePath *) best_path,
 													 tlist,
-													 scan_clauses);
+													 scan_clauses,
+													 est_calls);
 			break;
 
 		case T_SubqueryScan:
 			plan = (Plan *) create_subqueryscan_plan(root,
 													 (SubqueryScanPath *) best_path,
 													 tlist,
-													 scan_clauses);
+													 scan_clauses,
+													 est_calls);
 			break;
 
 		case T_FunctionScan:
 			plan = (Plan *) create_functionscan_plan(root,
 													 best_path,
 													 tlist,
-													 scan_clauses);
+													 scan_clauses,
+													 est_calls);
 			break;
 
 		case T_TableFuncScan:
 			plan = (Plan *) create_tablefuncscan_plan(root,
 													  best_path,
 													  tlist,
-													  scan_clauses);
+													  scan_clauses,
+													  est_calls);
 			break;
 
 		case T_ValuesScan:
 			plan = (Plan *) create_valuesscan_plan(root,
 												   best_path,
 												   tlist,
-												   scan_clauses);
+												   scan_clauses,
+												   est_calls);
 			break;
 
 		case T_CteScan:
 			plan = (Plan *) create_ctescan_plan(root,
 												best_path,
 												tlist,
-												scan_clauses);
+												scan_clauses,
+												est_calls);
 			break;
 
 		case T_NamedTuplestoreScan:
 			plan = (Plan *) create_namedtuplestorescan_plan(root,
 															best_path,
 															tlist,
-															scan_clauses);
+															scan_clauses,
+															est_calls);
 			break;
 
 		case T_Result:
 			plan = (Plan *) create_resultscan_plan(root,
 												   best_path,
 												   tlist,
-												   scan_clauses);
+												   scan_clauses,
+												   est_calls);
 			break;
 
 		case T_WorkTableScan:
 			plan = (Plan *) create_worktablescan_plan(root,
 													  best_path,
 													  tlist,
-													  scan_clauses);
+													  scan_clauses,
+													  est_calls);
 			break;
 
 		case T_ForeignScan:
 			plan = (Plan *) create_foreignscan_plan(root,
 													(ForeignPath *) best_path,
 													tlist,
-													scan_clauses);
+													scan_clauses,
+													est_calls);
 			break;
 
 		case T_CustomScan:
 			plan = (Plan *) create_customscan_plan(root,
 												   (CustomPath *) best_path,
 												   tlist,
-												   scan_clauses);
+												   scan_clauses,
+												   est_calls);
 			break;
 
 		default:
@@ -790,7 +919,8 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags)
 	 * quals.
 	 */
 	if (gating_clauses)
-		plan = create_gating_plan(root, best_path, plan, gating_clauses);
+		plan = create_gating_plan(root, best_path, plan, gating_clauses,
+								  est_calls);
 
 	return plan;
 }
@@ -1000,7 +1130,7 @@ get_gating_quals(PlannerInfo *root, List *quals)
  */
 static Plan *
 create_gating_plan(PlannerInfo *root, Path *path, Plan *plan,
-				   List *gating_quals)
+				   List *gating_quals, double est_calls)
 {
 	Plan	   *gplan;
 	Plan	   *splan;
@@ -1045,6 +1175,7 @@ create_gating_plan(PlannerInfo *root, Path *path, Plan *plan,
 	 * gating qual being true.
 	 */
 	copy_plan_costsize(gplan, plan);
+	gplan->est_calls = clamp_row_est(est_calls);
 
 	/* Gating quals could be unsafe, so better use the Path's safety flag */
 	gplan->parallel_safe = path->parallel_safe;
@@ -1058,7 +1189,7 @@ create_gating_plan(PlannerInfo *root, Path *path, Plan *plan,
  *	  inner and outer paths.
  */
 static Plan *
-create_join_plan(PlannerInfo *root, JoinPath *best_path)
+create_join_plan(PlannerInfo *root, JoinPath *best_path, double est_calls)
 {
 	Plan	   *plan;
 	List	   *gating_clauses;
@@ -1067,15 +1198,18 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path)
 	{
 		case T_MergeJoin:
 			plan = (Plan *) create_mergejoin_plan(root,
-												  (MergePath *) best_path);
+												  (MergePath *) best_path,
+												  est_calls);
 			break;
 		case T_HashJoin:
 			plan = (Plan *) create_hashjoin_plan(root,
-												 (HashPath *) best_path);
+												 (HashPath *) best_path,
+												 est_calls);
 			break;
 		case T_NestLoop:
 			plan = (Plan *) create_nestloop_plan(root,
-												 (NestPath *) best_path);
+												 (NestPath *) best_path,
+												 est_calls);
 			break;
 		default:
 			elog(ERROR, "unrecognized node type: %d",
@@ -1092,7 +1226,7 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path)
 	gating_clauses = get_gating_quals(root, best_path->joinrestrictinfo);
 	if (gating_clauses)
 		plan = create_gating_plan(root, (Path *) best_path, plan,
-								  gating_clauses);
+								  gating_clauses, est_calls);
 
 #ifdef NOT_USED
 
@@ -1107,6 +1241,8 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path)
 							   get_actual_clauses(get_loc_restrictinfo(best_path))));
 #endif
 
+	plan->est_calls = clamp_row_est(est_calls);
+
 	return plan;
 }
 
@@ -1173,7 +1309,8 @@ mark_async_capable_plan(Plan *plan, Path *path)
  *	  Returns a Plan node.
  */
 static Plan *
-create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
+create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags,
+				   double est_calls)
 {
 	Append	   *plan;
 	List	   *tlist = build_path_tlist(root, &best_path->path);
@@ -1250,7 +1387,8 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 										  &nodeSortColIdx,
 										  &nodeSortOperators,
 										  &nodeCollations,
-										  &nodeNullsFirst);
+										  &nodeNullsFirst,
+										  est_calls);
 		tlist_was_changed = (orig_tlist_length != list_length(plan->plan.targetlist));
 	}
 
@@ -1266,7 +1404,7 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		Plan	   *subplan;
 
 		/* Must insist that all children return the same tlist */
-		subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST);
+		subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST, est_calls);
 
 		/*
 		 * For ordered Appends, we must insert a Sort node if subplan isn't
@@ -1294,7 +1432,8 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 												 &sortColIdx,
 												 &sortOperators,
 												 &collations,
-												 &nullsFirst);
+												 &nullsFirst,
+												 est_calls);
 
 			/*
 			 * Check that we got the same sort key information.  We just
@@ -1370,6 +1509,7 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 	plan->part_prune_info = partpruneinfo;
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	/*
 	 * If prepare_sort_from_pathkeys added sort columns, but we were told to
@@ -1381,7 +1521,7 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
 		tlist = list_truncate(list_copy(plan->plan.targetlist),
 							  orig_tlist_length);
 		return inject_projection_plan((Plan *) plan, tlist,
-									  plan->plan.parallel_safe);
+									  plan->plan.parallel_safe, est_calls);
 	}
 	else
 		return (Plan *) plan;
@@ -1396,7 +1536,7 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
  */
 static Plan *
 create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
-						 int flags)
+						 int flags, double est_calls)
 {
 	MergeAppend *node = makeNode(MergeAppend);
 	Plan	   *plan = &node->plan;
@@ -1416,6 +1556,7 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	 * child plans, to make cross-checking the sort info easier.
 	 */
 	copy_generic_path_info(plan, (Path *) best_path);
+	plan->est_calls = clamp_row_est(est_calls);
 	plan->targetlist = tlist;
 	plan->qual = NIL;
 	plan->lefttree = NULL;
@@ -1436,7 +1577,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 									  &node->sortColIdx,
 									  &node->sortOperators,
 									  &node->collations,
-									  &node->nullsFirst);
+									  &node->nullsFirst,
+									  est_calls);
 	tlist_was_changed = (orig_tlist_length != list_length(plan->targetlist));
 
 	/*
@@ -1456,7 +1598,7 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 
 		/* Build the child plan */
 		/* Must insist that all children return the same tlist */
-		subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST);
+		subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST, est_calls);
 
 		/* Compute sort column info, and adjust subplan's tlist as needed */
 		subplan = prepare_sort_from_pathkeys(subplan, pathkeys,
@@ -1467,7 +1609,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 											 &sortColIdx,
 											 &sortOperators,
 											 &collations,
-											 &nullsFirst);
+											 &nullsFirst,
+											 est_calls);
 
 		/*
 		 * Check that we got the same sort key information.  We just Assert
@@ -1539,7 +1682,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
 	if (tlist_was_changed && (flags & (CP_EXACT_TLIST | CP_SMALL_TLIST)))
 	{
 		tlist = list_truncate(list_copy(plan->targetlist), orig_tlist_length);
-		return inject_projection_plan(plan, tlist, plan->parallel_safe);
+		return inject_projection_plan(plan, tlist, plan->parallel_safe,
+									  est_calls);
 	}
 	else
 		return plan;
@@ -1553,7 +1697,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
  *	  Returns a Plan node.
  */
 static Result *
-create_group_result_plan(PlannerInfo *root, GroupResultPath *best_path)
+create_group_result_plan(PlannerInfo *root, GroupResultPath *best_path,
+						 double est_calls)
 {
 	Result	   *plan;
 	List	   *tlist;
@@ -1567,6 +1712,7 @@ create_group_result_plan(PlannerInfo *root, GroupResultPath *best_path)
 	plan = make_result(tlist, (Node *) quals, NULL);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -1578,20 +1724,22 @@ create_group_result_plan(PlannerInfo *root, GroupResultPath *best_path)
  *	  Returns a Plan node.
  */
 static ProjectSet *
-create_project_set_plan(PlannerInfo *root, ProjectSetPath *best_path)
+create_project_set_plan(PlannerInfo *root, ProjectSetPath *best_path,
+						double est_calls)
 {
 	ProjectSet *plan;
 	Plan	   *subplan;
 	List	   *tlist;
 
 	/* Since we intend to project, we don't need to constrain child tlist */
-	subplan = create_plan_recurse(root, best_path->subpath, 0);
+	subplan = create_plan_recurse(root, best_path->subpath, 0, est_calls);
 
 	tlist = build_path_tlist(root, &best_path->path);
 
 	plan = make_project_set(tlist, subplan);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -1604,7 +1752,8 @@ create_project_set_plan(PlannerInfo *root, ProjectSetPath *best_path)
  *	  Returns a Plan node.
  */
 static Material *
-create_material_plan(PlannerInfo *root, MaterialPath *best_path, int flags)
+create_material_plan(PlannerInfo *root, MaterialPath *best_path, int flags,
+					 double est_calls)
 {
 	Material   *plan;
 	Plan	   *subplan;
@@ -1612,14 +1761,21 @@ create_material_plan(PlannerInfo *root, MaterialPath *best_path, int flags)
 	/*
 	 * We don't want any excess columns in the materialized tuples, so request
 	 * a smaller tlist.  Otherwise, since Material doesn't project, tlist
-	 * requirements pass through.
+	 * requirements pass through.  Here we also don't propagate the est_calls
+	 * to the subplan.  We assume that the Material node will only call its
+	 * subplan once and then return the cached result on each subsequent
+	 * execution.  This might not be true when a parameter change causes the
+	 * Material node to rescan its subplan, but that's hard to estimate here
+	 * and the current usage of est_calls does not seem important enough to
+	 * warrant expending much effort trying to calculate this accurately.
 	 */
 	subplan = create_plan_recurse(root, best_path->subpath,
-								  flags | CP_SMALL_TLIST);
+								  flags | CP_SMALL_TLIST, 1.0);
 
 	plan = make_material(subplan);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -1632,7 +1788,8 @@ create_material_plan(PlannerInfo *root, MaterialPath *best_path, int flags)
  *	  Returns a Plan node.
  */
 static Memoize *
-create_memoize_plan(PlannerInfo *root, MemoizePath *best_path, int flags)
+create_memoize_plan(PlannerInfo *root, MemoizePath *best_path, int flags,
+					double est_calls)
 {
 	Memoize    *plan;
 	Bitmapset  *keyparamids;
@@ -1645,8 +1802,13 @@ create_memoize_plan(PlannerInfo *root, MemoizePath *best_path, int flags)
 	int			nkeys;
 	int			i;
 
+	/*
+	 * The subplan's est_calls must account for the expected cache hit ratio,
+	 * since we'll only be calling the subplan when there's a cache miss.
+	 */
 	subplan = create_plan_recurse(root, best_path->subpath,
-								  flags | CP_SMALL_TLIST);
+								  flags | CP_SMALL_TLIST,
+								  est_calls * (1.0 - best_path->est_hitratio));
 
 	param_exprs = (List *) replace_nestloop_params(root, (Node *)
 												   best_path->param_exprs);
@@ -1674,6 +1836,7 @@ create_memoize_plan(PlannerInfo *root, MemoizePath *best_path, int flags)
 						best_path->est_entries, keyparamids);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -1686,7 +1849,8 @@ create_memoize_plan(PlannerInfo *root, MemoizePath *best_path, int flags)
  *	  Returns a Plan node.
  */
 static Plan *
-create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags)
+create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags,
+				   double est_calls)
 {
 	Plan	   *plan;
 	Plan	   *subplan;
@@ -1702,7 +1866,7 @@ create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags)
 	ListCell   *l;
 
 	/* Unique doesn't project, so tlist requirements pass through */
-	subplan = create_plan_recurse(root, best_path->subpath, flags);
+	subplan = create_plan_recurse(root, best_path->subpath, flags, est_calls);
 
 	/* Done if we don't need to do any actual unique-ifying */
 	if (best_path->umethod == UNIQUE_PATH_NOOP)
@@ -1753,7 +1917,8 @@ create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags)
 	/* Use change_plan_targetlist in case we need to insert a Result node */
 	if (newitems || best_path->umethod == UNIQUE_PATH_SORT)
 		subplan = change_plan_targetlist(subplan, newtlist,
-										 best_path->path.parallel_safe);
+										 best_path->path.parallel_safe,
+										 est_calls);
 
 	/*
 	 * Build control information showing which subplan output columns are to
@@ -1874,6 +2039,7 @@ create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags)
 
 	/* Copy cost data from Path to Plan */
 	copy_generic_path_info(plan, &best_path->path);
+	plan->est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -1885,7 +2051,7 @@ create_unique_plan(PlannerInfo *root, UniquePath *best_path, int flags)
  *	  for its subpaths.
  */
 static Gather *
-create_gather_plan(PlannerInfo *root, GatherPath *best_path)
+create_gather_plan(PlannerInfo *root, GatherPath *best_path, double est_calls)
 {
 	Gather	   *gather_plan;
 	Plan	   *subplan;
@@ -1897,7 +2063,8 @@ create_gather_plan(PlannerInfo *root, GatherPath *best_path)
 	 * can't travel through a tuple queue because it uses MinimalTuple
 	 * representation).
 	 */
-	subplan = create_plan_recurse(root, best_path->subpath, CP_EXACT_TLIST);
+	subplan = create_plan_recurse(root, best_path->subpath, CP_EXACT_TLIST,
+								  est_calls);
 
 	tlist = build_path_tlist(root, &best_path->path);
 
@@ -1909,6 +2076,7 @@ create_gather_plan(PlannerInfo *root, GatherPath *best_path)
 							  subplan);
 
 	copy_generic_path_info(&gather_plan->plan, &best_path->path);
+	gather_plan->plan.est_calls = clamp_row_est(est_calls);
 
 	/* use parallel mode for parallel plans. */
 	root->glob->parallelModeNeeded = true;
@@ -1923,7 +2091,8 @@ create_gather_plan(PlannerInfo *root, GatherPath *best_path)
  *	  plans for its subpaths.
  */
 static GatherMerge *
-create_gather_merge_plan(PlannerInfo *root, GatherMergePath *best_path)
+create_gather_merge_plan(PlannerInfo *root, GatherMergePath *best_path,
+						 double est_calls)
 {
 	GatherMerge *gm_plan;
 	Plan	   *subplan;
@@ -1931,13 +2100,15 @@ create_gather_merge_plan(PlannerInfo *root, GatherMergePath *best_path)
 	List	   *tlist = build_path_tlist(root, &best_path->path);
 
 	/* As with Gather, project away columns in the workers. */
-	subplan = create_plan_recurse(root, best_path->subpath, CP_EXACT_TLIST);
+	subplan = create_plan_recurse(root, best_path->subpath, CP_EXACT_TLIST,
+								  est_calls);
 
 	/* Create a shell for a GatherMerge plan. */
 	gm_plan = makeNode(GatherMerge);
 	gm_plan->plan.targetlist = tlist;
 	gm_plan->num_workers = best_path->num_workers;
 	copy_generic_path_info(&gm_plan->plan, &best_path->path);
+	gm_plan->plan.est_calls = clamp_row_est(est_calls);
 
 	/* Assign the rescan Param. */
 	gm_plan->rescan_param = assign_special_exec_param(root);
@@ -1954,7 +2125,8 @@ create_gather_merge_plan(PlannerInfo *root, GatherMergePath *best_path)
 										 &gm_plan->sortColIdx,
 										 &gm_plan->sortOperators,
 										 &gm_plan->collations,
-										 &gm_plan->nullsFirst);
+										 &gm_plan->nullsFirst,
+										 est_calls);
 
 
 	/*
@@ -1984,7 +2156,8 @@ create_gather_merge_plan(PlannerInfo *root, GatherMergePath *best_path)
  *	  but sometimes we can just let the subplan do the work.
  */
 static Plan *
-create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
+create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags,
+					   double est_calls)
 {
 	Plan	   *plan;
 	Plan	   *subplan;
@@ -2011,7 +2184,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
 		 * actually need to project.  However, we may still need to ensure
 		 * proper sortgroupref labels, if the caller cares about those.
 		 */
-		subplan = create_plan_recurse(root, best_path->subpath, 0);
+		subplan = create_plan_recurse(root, best_path->subpath, 0, est_calls);
 		tlist = subplan->targetlist;
 		if (flags & CP_LABEL_TLIST)
 			apply_pathtarget_labeling_to_tlist(tlist,
@@ -2026,7 +2199,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
 		 * produces.
 		 */
 		subplan = create_plan_recurse(root, best_path->subpath,
-									  CP_IGNORE_TLIST);
+									  CP_IGNORE_TLIST, est_calls);
 		Assert(is_projection_capable_plan(subplan));
 		tlist = build_path_tlist(root, &best_path->path);
 	}
@@ -2036,7 +2209,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
 		 * It looks like we need a result node, unless by good fortune the
 		 * requested tlist is exactly the one the child wants to produce.
 		 */
-		subplan = create_plan_recurse(root, best_path->subpath, 0);
+		subplan = create_plan_recurse(root, best_path->subpath, 0, est_calls);
 		tlist = build_path_tlist(root, &best_path->path);
 		needs_result_node = !tlist_same_exprs(tlist, subplan->targetlist);
 	}
@@ -2069,6 +2242,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
 		plan = (Plan *) make_result(tlist, NULL, subplan);
 
 		copy_generic_path_info(plan, (Path *) best_path);
+		plan->est_calls = clamp_row_est(est_calls);
 	}
 
 	return plan;
@@ -2086,7 +2260,8 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
  * to apply (since the tlist might be unsafe even if the child plan is safe).
  */
 static Plan *
-inject_projection_plan(Plan *subplan, List *tlist, bool parallel_safe)
+inject_projection_plan(Plan *subplan, List *tlist, bool parallel_safe,
+					   double est_calls)
 {
 	Plan	   *plan;
 
@@ -2100,6 +2275,7 @@ inject_projection_plan(Plan *subplan, List *tlist, bool parallel_safe)
 	 * consistent not more so.  Hence, just copy the subplan's cost.
 	 */
 	copy_plan_costsize(plan, subplan);
+	plan->est_calls = clamp_row_est(est_calls);
 	plan->parallel_safe = parallel_safe;
 
 	return plan;
@@ -2118,7 +2294,8 @@ inject_projection_plan(Plan *subplan, List *tlist, bool parallel_safe)
  * flag of the FDW's own Path node.
  */
 Plan *
-change_plan_targetlist(Plan *subplan, List *tlist, bool tlist_parallel_safe)
+change_plan_targetlist(Plan *subplan, List *tlist, bool tlist_parallel_safe,
+					   double est_calls)
 {
 	/*
 	 * If the top plan node can't do projections and its existing target list
@@ -2129,7 +2306,7 @@ change_plan_targetlist(Plan *subplan, List *tlist, bool tlist_parallel_safe)
 		!tlist_same_exprs(tlist, subplan->targetlist))
 		subplan = inject_projection_plan(subplan, tlist,
 										 subplan->parallel_safe &&
-										 tlist_parallel_safe);
+										 tlist_parallel_safe, est_calls);
 	else
 	{
 		/* Else we can just replace the plan node's tlist */
@@ -2146,7 +2323,8 @@ change_plan_targetlist(Plan *subplan, List *tlist, bool tlist_parallel_safe)
  *	  for its subpaths.
  */
 static Sort *
-create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags)
+create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags,
+				 double est_calls)
 {
 	Sort	   *plan;
 	Plan	   *subplan;
@@ -2157,7 +2335,7 @@ create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags)
 	 * requirements pass through.
 	 */
 	subplan = create_plan_recurse(root, best_path->subpath,
-								  flags | CP_SMALL_TLIST);
+								  flags | CP_SMALL_TLIST, est_calls);
 
 	/*
 	 * make_sort_from_pathkeys indirectly calls find_ec_member_matching_expr,
@@ -2167,9 +2345,11 @@ create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags)
 	 */
 	plan = make_sort_from_pathkeys(subplan, best_path->path.pathkeys,
 								   IS_OTHER_REL(best_path->subpath->parent) ?
-								   best_path->path.parent->relids : NULL);
+								   best_path->path.parent->relids : NULL,
+								   est_calls);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2181,21 +2361,22 @@ create_sort_plan(PlannerInfo *root, SortPath *best_path, int flags)
  */
 static IncrementalSort *
 create_incrementalsort_plan(PlannerInfo *root, IncrementalSortPath *best_path,
-							int flags)
+							int flags, double est_calls)
 {
 	IncrementalSort *plan;
 	Plan	   *subplan;
 
 	/* See comments in create_sort_plan() above */
 	subplan = create_plan_recurse(root, best_path->spath.subpath,
-								  flags | CP_SMALL_TLIST);
+								  flags | CP_SMALL_TLIST, est_calls);
 	plan = make_incrementalsort_from_pathkeys(subplan,
 											  best_path->spath.path.pathkeys,
 											  IS_OTHER_REL(best_path->spath.subpath->parent) ?
 											  best_path->spath.path.parent->relids : NULL,
-											  best_path->nPresortedCols);
+											  best_path->nPresortedCols, est_calls);
 
 	copy_generic_path_info(&plan->sort.plan, (Path *) best_path);
+	plan->sort.plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2207,7 +2388,7 @@ create_incrementalsort_plan(PlannerInfo *root, IncrementalSortPath *best_path,
  *	  for its subpaths.
  */
 static Group *
-create_group_plan(PlannerInfo *root, GroupPath *best_path)
+create_group_plan(PlannerInfo *root, GroupPath *best_path, double est_calls)
 {
 	Group	   *plan;
 	Plan	   *subplan;
@@ -2218,7 +2399,8 @@ create_group_plan(PlannerInfo *root, GroupPath *best_path)
 	 * Group can project, so no need to be terribly picky about child tlist,
 	 * but we do need grouping columns to be available
 	 */
-	subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
+	subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST,
+								  est_calls);
 
 	tlist = build_path_tlist(root, &best_path->path);
 
@@ -2235,6 +2417,7 @@ create_group_plan(PlannerInfo *root, GroupPath *best_path)
 					  subplan);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2246,7 +2429,8 @@ create_group_plan(PlannerInfo *root, GroupPath *best_path)
  *	  for its subpaths.
  */
 static Unique *
-create_upper_unique_plan(PlannerInfo *root, UpperUniquePath *best_path, int flags)
+create_upper_unique_plan(PlannerInfo *root, UpperUniquePath *best_path, int flags,
+						 double est_calls)
 {
 	Unique	   *plan;
 	Plan	   *subplan;
@@ -2256,13 +2440,14 @@ create_upper_unique_plan(PlannerInfo *root, UpperUniquePath *best_path, int flag
 	 * need grouping columns to be labeled.
 	 */
 	subplan = create_plan_recurse(root, best_path->subpath,
-								  flags | CP_LABEL_TLIST);
+								  flags | CP_LABEL_TLIST, est_calls);
 
 	plan = make_unique_from_pathkeys(subplan,
 									 best_path->path.pathkeys,
 									 best_path->numkeys);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2274,7 +2459,7 @@ create_upper_unique_plan(PlannerInfo *root, UpperUniquePath *best_path, int flag
  *	  for its subpaths.
  */
 static Agg *
-create_agg_plan(PlannerInfo *root, AggPath *best_path)
+create_agg_plan(PlannerInfo *root, AggPath *best_path, double est_calls)
 {
 	Agg		   *plan;
 	Plan	   *subplan;
@@ -2285,7 +2470,8 @@ create_agg_plan(PlannerInfo *root, AggPath *best_path)
 	 * Agg can project, so no need to be terribly picky about child tlist, but
 	 * we do need grouping columns to be available
 	 */
-	subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
+	subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST,
+								  est_calls);
 
 	tlist = build_path_tlist(root, &best_path->path);
 
@@ -2307,6 +2493,7 @@ create_agg_plan(PlannerInfo *root, AggPath *best_path)
 					subplan);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2358,7 +2545,8 @@ remap_groupColIdx(PlannerInfo *root, List *groupClause)
  *	  Returns a Plan node.
  */
 static Plan *
-create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path)
+create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path,
+						 double est_calls)
 {
 	Agg		   *plan;
 	Plan	   *subplan;
@@ -2376,7 +2564,8 @@ create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path)
 	 * Agg can project, so no need to be terribly picky about child tlist, but
 	 * we do need grouping columns to be available
 	 */
-	subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
+	subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST,
+								  est_calls);
 
 	/*
 	 * Compute the mapping from tleSortGroupRef to column index in the child's
@@ -2471,7 +2660,7 @@ create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path)
 				sort_plan->targetlist = NIL;
 				sort_plan->lefttree = NULL;
 			}
-
+			/* XXX do we need to record est_calls here? */
 			chain = lappend(chain, agg_plan);
 		}
 	}
@@ -2504,6 +2693,7 @@ create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path)
 
 		/* Copy cost data from Path to Plan */
 		copy_generic_path_info(&plan->plan, &best_path->path);
+		plan->plan.est_calls = clamp_row_est(est_calls);
 	}
 
 	return (Plan *) plan;
@@ -2516,7 +2706,8 @@ create_groupingsets_plan(PlannerInfo *root, GroupingSetsPath *best_path)
  *	  for its subpaths.
  */
 static Result *
-create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
+create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path,
+					  double est_calls)
 {
 	Result	   *plan;
 	List	   *tlist;
@@ -2536,7 +2727,7 @@ create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
 		 * Since we are entering a different planner context (subroot),
 		 * recurse to create_plan not create_plan_recurse.
 		 */
-		plan = create_plan(subroot, mminfo->path);
+		plan = create_plan(subroot, mminfo->path, est_calls);
 
 		plan = (Plan *) make_limit(plan,
 								   subparse->limitOffset,
@@ -2562,6 +2753,7 @@ create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
 	plan = make_result(tlist, (Node *) best_path->quals, NULL);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	/*
 	 * During setrefs.c, we'll need to replace references to the Agg nodes
@@ -2582,7 +2774,8 @@ create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
  *	  for its subpaths.
  */
 static WindowAgg *
-create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path)
+create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path,
+					  double est_calls)
 {
 	WindowAgg  *plan;
 	WindowClause *wc = best_path->winclause;
@@ -2607,7 +2800,7 @@ create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path)
 	 * course need grouping columns to be available.
 	 */
 	subplan = create_plan_recurse(root, best_path->subpath,
-								  CP_LABEL_TLIST | CP_SMALL_TLIST);
+								  CP_LABEL_TLIST | CP_SMALL_TLIST, est_calls);
 
 	tlist = build_path_tlist(root, &best_path->path);
 
@@ -2679,6 +2872,7 @@ create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path)
 						  subplan);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2690,7 +2884,8 @@ create_windowagg_plan(PlannerInfo *root, WindowAggPath *best_path)
  *	  for its subpaths.
  */
 static SetOp *
-create_setop_plan(PlannerInfo *root, SetOpPath *best_path, int flags)
+create_setop_plan(PlannerInfo *root, SetOpPath *best_path, int flags,
+				  double est_calls)
 {
 	SetOp	   *plan;
 	Plan	   *subplan;
@@ -2701,7 +2896,7 @@ create_setop_plan(PlannerInfo *root, SetOpPath *best_path, int flags)
 	 * need grouping columns to be labeled.
 	 */
 	subplan = create_plan_recurse(root, best_path->subpath,
-								  flags | CP_LABEL_TLIST);
+								  flags | CP_LABEL_TLIST, est_calls);
 
 	/* Convert numGroups to long int --- but 'ware overflow! */
 	numGroups = (long) Min(best_path->numGroups, (double) LONG_MAX);
@@ -2715,6 +2910,7 @@ create_setop_plan(PlannerInfo *root, SetOpPath *best_path, int flags)
 					  numGroups);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2726,7 +2922,8 @@ create_setop_plan(PlannerInfo *root, SetOpPath *best_path, int flags)
  *	  for its subpaths.
  */
 static RecursiveUnion *
-create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path)
+create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path,
+						   double est_calls)
 {
 	RecursiveUnion *plan;
 	Plan	   *leftplan;
@@ -2735,8 +2932,10 @@ create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path)
 	long		numGroups;
 
 	/* Need both children to produce same tlist, so force it */
-	leftplan = create_plan_recurse(root, best_path->leftpath, CP_EXACT_TLIST);
-	rightplan = create_plan_recurse(root, best_path->rightpath, CP_EXACT_TLIST);
+	leftplan = create_plan_recurse(root, best_path->leftpath, CP_EXACT_TLIST,
+								   est_calls);
+	rightplan = create_plan_recurse(root, best_path->rightpath, CP_EXACT_TLIST,
+									est_calls);
 
 	tlist = build_path_tlist(root, &best_path->path);
 
@@ -2751,6 +2950,7 @@ create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path)
 								numGroups);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2763,17 +2963,18 @@ create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path)
  */
 static LockRows *
 create_lockrows_plan(PlannerInfo *root, LockRowsPath *best_path,
-					 int flags)
+					 int flags, double est_calls)
 {
 	LockRows   *plan;
 	Plan	   *subplan;
 
 	/* LockRows doesn't project, so tlist requirements pass through */
-	subplan = create_plan_recurse(root, best_path->subpath, flags);
+	subplan = create_plan_recurse(root, best_path->subpath, flags, est_calls);
 
 	plan = make_lockrows(subplan, best_path->rowMarks, best_path->epqParam);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2785,14 +2986,15 @@ create_lockrows_plan(PlannerInfo *root, LockRowsPath *best_path,
  *	  Returns a Plan node.
  */
 static ModifyTable *
-create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path)
+create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path,
+						double est_calls)
 {
 	ModifyTable *plan;
 	Path	   *subpath = best_path->subpath;
 	Plan	   *subplan;
 
 	/* Subplan must produce exactly the specified tlist */
-	subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST);
+	subplan = create_plan_recurse(root, subpath, CP_EXACT_TLIST, est_calls);
 
 	/* Transfer resname/resjunk labeling, too, to keep executor happy */
 	apply_tlist_labeling(subplan->targetlist, root->processed_tlist);
@@ -2814,6 +3016,7 @@ create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path)
 							best_path->epqParam);
 
 	copy_generic_path_info(&plan->plan, &best_path->path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2825,7 +3028,8 @@ create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path)
  *	  for its subpaths.
  */
 static Limit *
-create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags)
+create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags,
+				  double est_calls)
 {
 	Limit	   *plan;
 	Plan	   *subplan;
@@ -2835,7 +3039,7 @@ create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags)
 	Oid		   *uniqCollations = NULL;
 
 	/* Limit doesn't project, so tlist requirements pass through */
-	subplan = create_plan_recurse(root, best_path->subpath, flags);
+	subplan = create_plan_recurse(root, best_path->subpath, flags, est_calls);
 
 	/* Extract information necessary for comparing rows for WITH TIES. */
 	if (best_path->limitOption == LIMIT_OPTION_WITH_TIES)
@@ -2868,6 +3072,7 @@ create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags)
 					  numUniqkeys, uniqColIdx, uniqOperators, uniqCollations);
 
 	copy_generic_path_info(&plan->plan, (Path *) best_path);
+	plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return plan;
 }
@@ -2887,7 +3092,7 @@ create_limit_plan(PlannerInfo *root, LimitPath *best_path, int flags)
  */
 static SeqScan *
 create_seqscan_plan(PlannerInfo *root, Path *best_path,
-					List *tlist, List *scan_clauses)
+					List *tlist, List *scan_clauses, double est_calls)
 {
 	SeqScan    *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -2914,6 +3119,7 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
 							 scan_relid);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -2925,7 +3131,7 @@ create_seqscan_plan(PlannerInfo *root, Path *best_path,
  */
 static SampleScan *
 create_samplescan_plan(PlannerInfo *root, Path *best_path,
-					   List *tlist, List *scan_clauses)
+					   List *tlist, List *scan_clauses, double est_calls)
 {
 	SampleScan *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -2960,6 +3166,7 @@ create_samplescan_plan(PlannerInfo *root, Path *best_path,
 								tsc);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -2979,7 +3186,7 @@ create_indexscan_plan(PlannerInfo *root,
 					  IndexPath *best_path,
 					  List *tlist,
 					  List *scan_clauses,
-					  bool indexonly)
+					  bool indexonly, double est_calls)
 {
 	Scan	   *scan_plan;
 	List	   *indexclauses = best_path->indexclauses;
@@ -3158,6 +3365,7 @@ create_indexscan_plan(PlannerInfo *root,
 											best_path->indexscandir);
 
 	copy_generic_path_info(&scan_plan->plan, &best_path->path);
+	scan_plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3171,7 +3379,7 @@ static BitmapHeapScan *
 create_bitmap_scan_plan(PlannerInfo *root,
 						BitmapHeapPath *best_path,
 						List *tlist,
-						List *scan_clauses)
+						List *scan_clauses, double est_calls)
 {
 	Index		baserelid = best_path->path.parent->relid;
 	Plan	   *bitmapqualplan;
@@ -3189,7 +3397,7 @@ create_bitmap_scan_plan(PlannerInfo *root,
 	/* Process the bitmapqual tree into a Plan tree and qual lists */
 	bitmapqualplan = create_bitmap_subplan(root, best_path->bitmapqual,
 										   &bitmapqualorig, &indexquals,
-										   &indexECs);
+										   &indexECs, est_calls);
 
 	if (best_path->path.parallel_aware)
 		bitmap_subplan_mark_shared(bitmapqualplan);
@@ -3273,6 +3481,7 @@ create_bitmap_scan_plan(PlannerInfo *root,
 									 baserelid);
 
 	copy_generic_path_info(&scan_plan->scan.plan, &best_path->path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3299,7 +3508,8 @@ create_bitmap_scan_plan(PlannerInfo *root,
  */
 static Plan *
 create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual,
-					  List **qual, List **indexqual, List **indexECs)
+					  List **qual, List **indexqual, List **indexECs,
+					  double est_calls)
 {
 	Plan	   *plan;
 
@@ -3328,7 +3538,7 @@ create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual,
 
 			subplan = create_bitmap_subplan(root, (Path *) lfirst(l),
 											&subqual, &subindexqual,
-											&subindexEC);
+											&subindexEC, est_calls);
 			subplans = lappend(subplans, subplan);
 			subquals = list_concat_unique(subquals, subqual);
 			subindexquals = list_concat_unique(subindexquals, subindexqual);
@@ -3375,7 +3585,7 @@ create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual,
 
 			subplan = create_bitmap_subplan(root, (Path *) lfirst(l),
 											&subqual, &subindexqual,
-											&subindexEC);
+											&subindexEC, est_calls);
 			subplans = lappend(subplans, subplan);
 			if (subqual == NIL)
 				const_true_subqual = true;
@@ -3440,7 +3650,7 @@ create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual,
 		/* Use the regular indexscan plan build machinery... */
 		iscan = castNode(IndexScan,
 						 create_indexscan_plan(root, ipath,
-											   NIL, NIL, false));
+											   NIL, NIL, false, est_calls));
 		/* then convert to a bitmap indexscan */
 		plan = (Plan *) make_bitmap_indexscan(iscan->scan.scanrelid,
 											  iscan->indexid,
@@ -3507,7 +3717,7 @@ create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual,
  */
 static TidScan *
 create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
-					List *tlist, List *scan_clauses)
+					List *tlist, List *scan_clauses, double est_calls)
 {
 	TidScan    *scan_plan;
 	Index		scan_relid = best_path->path.parent->relid;
@@ -3593,6 +3803,7 @@ create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
 							 tidquals);
 
 	copy_generic_path_info(&scan_plan->scan.plan, &best_path->path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3604,7 +3815,7 @@ create_tidscan_plan(PlannerInfo *root, TidPath *best_path,
  */
 static TidRangeScan *
 create_tidrangescan_plan(PlannerInfo *root, TidRangePath *best_path,
-						 List *tlist, List *scan_clauses)
+						 List *tlist, List *scan_clauses, double est_calls)
 {
 	TidRangeScan *scan_plan;
 	Index		scan_relid = best_path->path.parent->relid;
@@ -3658,6 +3869,7 @@ create_tidrangescan_plan(PlannerInfo *root, TidRangePath *best_path,
 								  tidrangequals);
 
 	copy_generic_path_info(&scan_plan->scan.plan, &best_path->path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3669,7 +3881,7 @@ create_tidrangescan_plan(PlannerInfo *root, TidRangePath *best_path,
  */
 static SubqueryScan *
 create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
-						 List *tlist, List *scan_clauses)
+						 List *tlist, List *scan_clauses, double est_calls)
 {
 	SubqueryScan *scan_plan;
 	RelOptInfo *rel = best_path->path.parent;
@@ -3685,7 +3897,7 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 	 * a different planner context (subroot), recurse to create_plan not
 	 * create_plan_recurse.
 	 */
-	subplan = create_plan(rel->subroot, best_path->subpath);
+	subplan = create_plan(rel->subroot, best_path->subpath, est_calls);
 
 	/* Sort clauses into best execution order */
 	scan_clauses = order_qual_clauses(root, scan_clauses);
@@ -3708,6 +3920,7 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
 								  subplan);
 
 	copy_generic_path_info(&scan_plan->scan.plan, &best_path->path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3719,7 +3932,7 @@ create_subqueryscan_plan(PlannerInfo *root, SubqueryScanPath *best_path,
  */
 static FunctionScan *
 create_functionscan_plan(PlannerInfo *root, Path *best_path,
-						 List *tlist, List *scan_clauses)
+						 List *tlist, List *scan_clauses, double est_calls)
 {
 	FunctionScan *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -3751,6 +3964,7 @@ create_functionscan_plan(PlannerInfo *root, Path *best_path,
 								  functions, rte->funcordinality);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3762,7 +3976,7 @@ create_functionscan_plan(PlannerInfo *root, Path *best_path,
  */
 static TableFuncScan *
 create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
-						  List *tlist, List *scan_clauses)
+						  List *tlist, List *scan_clauses, double est_calls)
 {
 	TableFuncScan *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -3794,6 +4008,7 @@ create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
 								   tablefunc);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3805,7 +4020,7 @@ create_tablefuncscan_plan(PlannerInfo *root, Path *best_path,
  */
 static ValuesScan *
 create_valuesscan_plan(PlannerInfo *root, Path *best_path,
-					   List *tlist, List *scan_clauses)
+					   List *tlist, List *scan_clauses, double est_calls)
 {
 	ValuesScan *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -3838,6 +4053,7 @@ create_valuesscan_plan(PlannerInfo *root, Path *best_path,
 								values_lists);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3849,7 +4065,7 @@ create_valuesscan_plan(PlannerInfo *root, Path *best_path,
  */
 static CteScan *
 create_ctescan_plan(PlannerInfo *root, Path *best_path,
-					List *tlist, List *scan_clauses)
+					List *tlist, List *scan_clauses, double est_calls)
 {
 	CteScan    *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -3932,6 +4148,7 @@ create_ctescan_plan(PlannerInfo *root, Path *best_path,
 							 plan_id, cte_param_id);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3944,7 +4161,8 @@ create_ctescan_plan(PlannerInfo *root, Path *best_path,
  */
 static NamedTuplestoreScan *
 create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
-								List *tlist, List *scan_clauses)
+								List *tlist, List *scan_clauses,
+								double est_calls)
 {
 	NamedTuplestoreScan *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -3971,6 +4189,7 @@ create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
 										 rte->enrname);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -3983,7 +4202,7 @@ create_namedtuplestorescan_plan(PlannerInfo *root, Path *best_path,
  */
 static Result *
 create_resultscan_plan(PlannerInfo *root, Path *best_path,
-					   List *tlist, List *scan_clauses)
+					   List *tlist, List *scan_clauses, double est_calls)
 {
 	Result	   *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -4009,6 +4228,7 @@ create_resultscan_plan(PlannerInfo *root, Path *best_path,
 	scan_plan = make_result(tlist, (Node *) scan_clauses, NULL);
 
 	copy_generic_path_info(&scan_plan->plan, best_path);
+	scan_plan->plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -4020,7 +4240,7 @@ create_resultscan_plan(PlannerInfo *root, Path *best_path,
  */
 static WorkTableScan *
 create_worktablescan_plan(PlannerInfo *root, Path *best_path,
-						  List *tlist, List *scan_clauses)
+						  List *tlist, List *scan_clauses, double est_calls)
 {
 	WorkTableScan *scan_plan;
 	Index		scan_relid = best_path->parent->relid;
@@ -4069,6 +4289,7 @@ create_worktablescan_plan(PlannerInfo *root, Path *best_path,
 								   cteroot->wt_param_id);
 
 	copy_generic_path_info(&scan_plan->scan.plan, best_path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	return scan_plan;
 }
@@ -4080,7 +4301,7 @@ create_worktablescan_plan(PlannerInfo *root, Path *best_path,
  */
 static ForeignScan *
 create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
-						List *tlist, List *scan_clauses)
+						List *tlist, List *scan_clauses, double est_calls)
 {
 	ForeignScan *scan_plan;
 	RelOptInfo *rel = best_path->path.parent;
@@ -4093,7 +4314,7 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	/* transform the child path if any */
 	if (best_path->fdw_outerpath)
 		outer_plan = create_plan_recurse(root, best_path->fdw_outerpath,
-										 CP_EXACT_TLIST);
+										 CP_EXACT_TLIST, est_calls);
 
 	/*
 	 * If we're scanning a base relation, fetch its OID.  (Irrelevant if
@@ -4125,10 +4346,11 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
 	scan_plan = rel->fdwroutine->GetForeignPlan(root, rel, rel_oid,
 												best_path,
 												tlist, scan_clauses,
-												outer_plan);
+												outer_plan, est_calls);
 
 	/* Copy cost data from Path to Plan; no need to make FDW do this */
 	copy_generic_path_info(&scan_plan->scan.plan, &best_path->path);
+	scan_plan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	/* Copy foreign server OID; likewise, no need to make FDW do this */
 	scan_plan->fs_server = rel->serverid;
@@ -4224,7 +4446,7 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
  */
 static CustomScan *
 create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
-					   List *tlist, List *scan_clauses)
+					   List *tlist, List *scan_clauses, double est_calls)
 {
 	CustomScan *cplan;
 	RelOptInfo *rel = best_path->path.parent;
@@ -4235,7 +4457,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 	foreach(lc, best_path->custom_paths)
 	{
 		Plan	   *plan = create_plan_recurse(root, (Path *) lfirst(lc),
-											   CP_EXACT_TLIST);
+											   CP_EXACT_TLIST, est_calls);
 
 		custom_plans = lappend(custom_plans, plan);
 	}
@@ -4263,6 +4485,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 	 * do this
 	 */
 	copy_generic_path_info(&cplan->scan.plan, &best_path->path);
+	cplan->scan.plan.est_calls = clamp_row_est(est_calls);
 
 	/* Likewise, copy the relids that are represented by this custom scan */
 	cplan->custom_relids = best_path->path.parent->relids;
@@ -4295,7 +4518,7 @@ create_customscan_plan(PlannerInfo *root, CustomPath *best_path,
 
 static NestLoop *
 create_nestloop_plan(PlannerInfo *root,
-					 NestPath *best_path)
+					 NestPath *best_path, double est_calls)
 {
 	NestLoop   *join_plan;
 	Plan	   *outer_plan;
@@ -4309,13 +4532,15 @@ create_nestloop_plan(PlannerInfo *root,
 	Relids		saveOuterRels = root->curOuterRels;
 
 	/* NestLoop can project, so no need to be picky about child tlists */
-	outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, 0);
+	outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, 0,
+									 est_calls);
 
 	/* For a nestloop, include outer relids in curOuterRels for inner side */
 	root->curOuterRels = bms_union(root->curOuterRels,
 								   best_path->jpath.outerjoinpath->parent->relids);
 
-	inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath, 0);
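+	/*
+	 * The inner side is rescanned once for each outer row, so scale up the
+	 * inner plan's est_calls by the expected number of outer rows.
+	 */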
+	inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath, 0,
+									 est_calls * outer_plan->plan_rows);
 
 	/* Restore curOuterRels */
 	bms_free(root->curOuterRels);
@@ -4365,13 +4590,14 @@ create_nestloop_plan(PlannerInfo *root,
 							  best_path->jpath.inner_unique);
 
 	copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
+	join_plan->join.plan.est_calls = clamp_row_est(est_calls);
 
 	return join_plan;
 }
 
 static MergeJoin *
 create_mergejoin_plan(PlannerInfo *root,
-					  MergePath *best_path)
+					  MergePath *best_path, double est_calls)
 {
 	MergeJoin  *join_plan;
 	Plan	   *outer_plan;
@@ -4403,10 +4629,12 @@ create_mergejoin_plan(PlannerInfo *root,
 	 * necessary.
 	 */
 	outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath,
-									 (best_path->outersortkeys != NIL) ? CP_SMALL_TLIST : 0);
+									 (best_path->outersortkeys != NIL) ? CP_SMALL_TLIST : 0,
+									 est_calls);
 
 	inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath,
-									 (best_path->innersortkeys != NIL) ? CP_SMALL_TLIST : 0);
+									 (best_path->innersortkeys != NIL) ? CP_SMALL_TLIST : 0,
+									 est_calls);
 
 	/* Sort join qual clauses into best execution order */
 	/* NB: do NOT reorder the mergeclauses */
@@ -4462,7 +4690,7 @@ create_mergejoin_plan(PlannerInfo *root,
 		Relids		outer_relids = outer_path->parent->relids;
 		Sort	   *sort = make_sort_from_pathkeys(outer_plan,
 												   best_path->outersortkeys,
-												   outer_relids);
+												   outer_relids, est_calls);
 
 		label_sort_with_costsize(root, sort, -1.0);
 		outer_plan = (Plan *) sort;
@@ -4476,7 +4704,7 @@ create_mergejoin_plan(PlannerInfo *root,
 		Relids		inner_relids = inner_path->parent->relids;
 		Sort	   *sort = make_sort_from_pathkeys(inner_plan,
 												   best_path->innersortkeys,
-												   inner_relids);
+												   inner_relids, est_calls);
 
 		label_sort_with_costsize(root, sort, -1.0);
 		inner_plan = (Plan *) sort;
@@ -4499,6 +4727,7 @@ create_mergejoin_plan(PlannerInfo *root,
 		 * sync with final_cost_mergejoin.)
 		 */
 		copy_plan_costsize(matplan, inner_plan);
+		inner_plan->est_calls = clamp_row_est(est_calls);
 		matplan->total_cost += cpu_operator_cost * matplan->plan_rows;
 
 		inner_plan = matplan;
@@ -4672,13 +4901,14 @@ create_mergejoin_plan(PlannerInfo *root,
 
 	/* Costs of sort and material steps are included in path cost already */
 	copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
+	join_plan->join.plan.est_calls = clamp_row_est(est_calls);
 
 	return join_plan;
 }
 
 static HashJoin *
 create_hashjoin_plan(PlannerInfo *root,
-					 HashPath *best_path)
+					 HashPath *best_path, double est_calls)
 {
 	HashJoin   *join_plan;
 	Hash	   *hash_plan;
@@ -4705,10 +4935,11 @@ create_hashjoin_plan(PlannerInfo *root,
 	 * that we don't put extra data in the outer batch files.
 	 */
 	outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath,
-									 (best_path->num_batches > 1) ? CP_SMALL_TLIST : 0);
+									 (best_path->num_batches > 1) ? CP_SMALL_TLIST : 0,
+									 est_calls);
 
 	inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath,
-									 CP_SMALL_TLIST);
+									 CP_SMALL_TLIST, est_calls);
 
 	/* Sort join qual clauses into best execution order */
 	joinclauses = order_qual_clauses(root, best_path->jpath.joinrestrictinfo);
@@ -4845,6 +5076,7 @@ create_hashjoin_plan(PlannerInfo *root,
 							  best_path->jpath.inner_unique);
 
 	copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
+	join_plan->join.plan.est_calls = clamp_row_est(est_calls);
 
 	return join_plan;
 }
@@ -6106,7 +6338,8 @@ prepare_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
 						   AttrNumber **p_sortColIdx,
 						   Oid **p_sortOperators,
 						   Oid **p_collations,
-						   bool **p_nullsFirst)
+						   bool **p_nullsFirst,
+						   double est_calls)
 {
 	List	   *tlist = lefttree->targetlist;
 	ListCell   *i;
@@ -6226,7 +6459,8 @@ prepare_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
 				/* copy needed so we don't modify input's tlist below */
 				tlist = copyObject(tlist);
 				lefttree = inject_projection_plan(lefttree, tlist,
-												  lefttree->parallel_safe);
+												  lefttree->parallel_safe,
+												  est_calls);
 			}
 
 			/* Don't bother testing is_projection_capable_plan again */
@@ -6283,7 +6517,8 @@ prepare_sort_from_pathkeys(Plan *lefttree, List *pathkeys,
  *	  'relids' is the set of relations required by prepare_sort_from_pathkeys()
  */
 static Sort *
-make_sort_from_pathkeys(Plan *lefttree, List *pathkeys, Relids relids)
+make_sort_from_pathkeys(Plan *lefttree, List *pathkeys, Relids relids,
+						double est_calls)
 {
 	int			numsortkeys;
 	AttrNumber *sortColIdx;
@@ -6300,7 +6535,8 @@ make_sort_from_pathkeys(Plan *lefttree, List *pathkeys, Relids relids)
 										  &sortColIdx,
 										  &sortOperators,
 										  &collations,
-										  &nullsFirst);
+										  &nullsFirst,
+										  est_calls);
 
 	/* Now build the Sort node */
 	return make_sort(lefttree, numsortkeys,
@@ -6319,7 +6555,8 @@ make_sort_from_pathkeys(Plan *lefttree, List *pathkeys, Relids relids)
  */
 static IncrementalSort *
 make_incrementalsort_from_pathkeys(Plan *lefttree, List *pathkeys,
-								   Relids relids, int nPresortedCols)
+								   Relids relids, int nPresortedCols,
+								   double est_calls)
 {
 	int			numsortkeys;
 	AttrNumber *sortColIdx;
@@ -6336,7 +6573,8 @@ make_incrementalsort_from_pathkeys(Plan *lefttree, List *pathkeys,
 										  &sortColIdx,
 										  &sortOperators,
 										  &collations,
-										  &nullsFirst);
+										  &nullsFirst,
+										  est_calls);
 
 	/* Now build the Sort node */
 	return make_incrementalsort(lefttree, numsortkeys, nPresortedCols,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9a4accb4d9..7b2552c0e4 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -314,6 +314,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	glob->lastPHId = 0;
 	glob->lastRowMarkId = 0;
 	glob->lastPlanNodeId = 0;
+	glob->jitFlags = PGJIT_NONE;
 	glob->transientPlan = false;
 	glob->dependsOnRole = false;
 
@@ -410,7 +411,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
 	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
 
-	top_plan = create_plan(root, best_path);
+	top_plan = create_plan(root, best_path, 1.0);
 
 	/*
 	 * If creating a plan for a scrollable cursor, make sure it can run
@@ -531,32 +532,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
 	result->utilityStmt = parse->utilityStmt;
 	result->stmt_location = parse->stmt_location;
 	result->stmt_len = parse->stmt_len;
-
-	result->jitFlags = PGJIT_NONE;
-	if (jit_enabled && jit_above_cost >= 0 &&
-		top_plan->total_cost > jit_above_cost)
-	{
-		result->jitFlags |= PGJIT_PERFORM;
-
-		/*
-		 * Decide how much effort should be put into generating better code.
-		 */
-		if (jit_optimize_above_cost >= 0 &&
-			top_plan->total_cost > jit_optimize_above_cost)
-			result->jitFlags |= PGJIT_OPT3;
-		if (jit_inline_above_cost >= 0 &&
-			top_plan->total_cost > jit_inline_above_cost)
-			result->jitFlags |= PGJIT_INLINE;
-
-		/*
-		 * Decide which operations should be JITed.
-		 */
-		if (jit_expressions)
-			result->jitFlags |= PGJIT_EXPR;
-		if (jit_tuple_deforming)
-			result->jitFlags |= PGJIT_DEFORM;
-	}
-
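+	/* Use the JIT flags accumulated per plan node during create_plan() */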
+	result->jitFlags = glob->jitFlags;
 	if (glob->partition_directory != NULL)
 		DestroyPartitionDirectory(glob->partition_directory);
 
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index df4ca12919..b81d4a1828 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -233,7 +233,11 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 	best_path = get_cheapest_fractional_path(final_rel, tuple_fraction);
 
-	plan = create_plan(subroot, best_path);
+	/*
+	 * XXX we can't get an accurate est_calls to pass to create_plan here as
+	 * we've not yet planned the outer query!
+	 */
+	plan = create_plan(subroot, best_path, 1.0);
 
 	/* And convert to SubPlan or InitPlan format. */
 	result = build_subplan(root, plan, subroot, plan_params,
@@ -284,7 +288,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 				AlternativeSubPlan *asplan;
 
 				/* OK, finish planning the ANY subquery */
-				plan = create_plan(subroot, best_path);
+				plan = create_plan(subroot, best_path, 1.0);
 
 				/* ... and convert to SubPlan format */
 				hashplan = castNode(SubPlan,
@@ -997,7 +1001,7 @@ SS_process_ctes(PlannerInfo *root)
 		final_rel = fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
 		best_path = final_rel->cheapest_total_path;
 
-		plan = create_plan(subroot, best_path);
+		plan = create_plan(subroot, best_path, 1.0);
 
 		/*
 		 * Make a SubPlan node for it.  This is just enough unlike
diff --git a/src/include/foreign/fdwapi.h b/src/include/foreign/fdwapi.h
index 57c02bff45..9c57896de9 100644
--- a/src/include/foreign/fdwapi.h
+++ b/src/include/foreign/fdwapi.h
@@ -38,7 +38,8 @@ typedef ForeignScan *(*GetForeignPlan_function) (PlannerInfo *root,
 												 ForeignPath *best_path,
 												 List *tlist,
 												 List *scan_clauses,
-												 Plan *outer_plan);
+												 Plan *outer_plan,
+												 double est_calls);
 
 typedef void (*BeginForeignScan_function) (ForeignScanState *node,
 										   int eflags);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 244d1e1197..6e78b06f10 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -119,6 +119,8 @@ typedef struct PlannerGlobal
 
 	int			lastPlanNodeId; /* highest plan node ID assigned */
 
+	int			jitFlags;		/* OR mask of jitFlags for each plan node */
+
 	bool		transientPlan;	/* redo plan when TransactionXmin changes? */
 
 	bool		dependsOnRole;	/* is plan specific to current role? */
@@ -1533,6 +1535,8 @@ typedef struct MemoizePath
 	uint32		est_entries;	/* The maximum number of entries that the
 								 * planner expects will fit in the cache, or 0
 								 * if unknown */
+	double		est_hitratio;	/* An estimate on the ratio of how many calls
+								 * will result in a cache hit. */
 } MemoizePath;
 
 /*
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index e43e360d9b..1ea4c78bcb 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -118,6 +118,9 @@ typedef struct Plan
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
+	double		est_calls;		/* estimated number of times this plan will be
+								 * (re)scanned */
+
 	/*
 	 * planner's estimate of result size of this plan step
 	 */
@@ -135,6 +138,11 @@ typedef struct Plan
 	 */
 	bool		async_capable;	/* engage asynchronous-capable logic? */
 
+	/*
+	 * information needed for jit
+	 */
+	bool		jit;			/* jit compile for this plan node? */
+
 	/*
 	 * Common structural data for all Plan types.
 	 */
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index c4f61c1a09..4b554d2a98 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -38,13 +38,15 @@ extern void preprocess_minmax_aggregates(PlannerInfo *root);
 /*
  * prototypes for plan/createplan.c
  */
-extern Plan *create_plan(PlannerInfo *root, Path *best_path);
+extern Plan *create_plan(PlannerInfo *root, Path *best_path,
+						 double est_calls);
 extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,
 									 Index scanrelid, List *fdw_exprs, List *fdw_private,
 									 List *fdw_scan_tlist, List *fdw_recheck_quals,
 									 Plan *outer_plan);
 extern Plan *change_plan_targetlist(Plan *subplan, List *tlist,
-									bool tlist_parallel_safe);
+									bool tlist_parallel_safe,
+									double est_calls);
 extern Plan *materialize_finished_plan(Plan *subplan);
 extern bool is_projection_capable_path(Path *path);
 extern bool is_projection_capable_plan(Plan *plan);
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index 48620edbc2..416dedee9a 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -100,6 +100,7 @@ select explain_filter('explain (analyze, buffers, format xml) select * from int8
        <Total-Cost>N.N</Total-Cost>                    +
        <Plan-Rows>N</Plan-Rows>                        +
        <Plan-Width>N</Plan-Width>                      +
+       <Plan-Calls>N</Plan-Calls>                      +
        <Actual-Startup-Time>N.N</Actual-Startup-Time>  +
        <Actual-Total-Time>N.N</Actual-Total-Time>      +
        <Actual-Rows>N</Actual-Rows>                    +
@@ -148,6 +149,7 @@ select explain_filter('explain (analyze, buffers, format yaml) select * from int
      Total Cost: N.N          +
      Plan Rows: N             +
      Plan Width: N            +
+     Plan Calls: N            +
      Actual Startup Time: N.N +
      Actual Total Time: N.N   +
      Actual Rows: N           +
@@ -199,6 +201,7 @@ select explain_filter('explain (buffers, format json) select * from int8_tbl i8'
        "Total Cost": N.N,          +
        "Plan Rows": N,             +
        "Plan Width": N,            +
+       "Plan Calls": N,            +
        "Shared Hit Blocks": N,     +
        "Shared Read Blocks": N,    +
        "Shared Dirtied Blocks": N, +
@@ -244,6 +247,7 @@ select explain_filter('explain (analyze, buffers, format json) select * from int
        "Total Cost": N.N,          +
        "Plan Rows": N,             +
        "Plan Width": N,            +
+       "Plan Calls": N,            +
        "Actual Startup Time": N.N, +
        "Actual Total Time": N.N,   +
        "Actual Rows": N,           +
@@ -363,6 +367,7 @@ select jsonb_pretty(
                              "Schema": "public",            +
                              "Node Type": "Seq Scan",       +
                              "Plan Rows": 0,                +
+                             "Plan Calls": 0,               +
                              "Plan Width": 0,               +
                              "Total Cost": 0.0,             +
                              "Actual Rows": 0,              +
@@ -409,6 +414,7 @@ select jsonb_pretty(
                      ],                                     +
                      "Node Type": "Sort",                   +
                      "Plan Rows": 0,                        +
+                     "Plan Calls": 0,                       +
                      "Plan Width": 0,                       +
                      "Total Cost": 0.0,                     +
                      "Actual Rows": 0,                      +
@@ -452,6 +458,7 @@ select jsonb_pretty(
              ],                                             +
              "Node Type": "Gather Merge",                   +
              "Plan Rows": 0,                                +
+             "Plan Calls": 0,                               +
              "Plan Width": 0,                               +
              "Total Cost": 0.0,                             +
              "Actual Rows": 0,                              +
#3Andy Fan
zhihui.fan1213@gmail.com
In reply to: David Rowley (#2)
Re: Making JIT more granular

Hi David:

Does anyone have any thoughts about this JIT costing? Is this an
improvement? Is there a better way?

I think this is an improvement. However, I'm not sure how much improvement
it brings, or how much effort we want to pay for it. I'm just sharing my
thoughts to start the discussion.

1. Ideally no GUC would be needed at all. For a given operation, such as
expression execution or tuple deforming, if we could estimate both the
extra cost of compiling the JIT code and the cost saved by executing it,
we could decide whether to JIT automatically. Right now it is hard to
estimate either, and we also don't have GUCs for the DBA such as
jit_compile_cost / jit_compile_tuple_deform_cost. It looks like we still
have a long way to go here, and costing is always a headache.

2. You calculate the cost to compare against jit_above_cost as:

   plan->total_cost * plan->est_calls

An alternative might be to account for the rescan cost, as cost_rescan
does; that should be closer to the final execution cost. However, since
it is hard to set a reasonable jit_above_cost anyway, I feel the current
way is OK as well.
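
Just to make sure I've understood the costing, this is roughly the
per-node decision I have in mind (only a sketch built from the fields and
GUCs the patch touches; the function name and where it would live are
made up, and the real code in the patch is structured differently):

static void
consider_jit_for_plan_node(PlannerGlobal *glob, Plan *plan)
{
	/* Scale the node's own total_cost by how often we expect to (re)scan it. */
	Cost		jit_cost = plan->total_cost * plan->est_calls;

	plan->jit = false;

	if (jit_enabled && jit_above_cost >= 0 && jit_cost > jit_above_cost)
	{
		/* JIT this node, and accumulate the flags for the whole plan. */
		plan->jit = true;
		glob->jitFlags |= PGJIT_PERFORM;

		if (jit_optimize_above_cost >= 0 && jit_cost > jit_optimize_above_cost)
			glob->jitFlags |= PGJIT_OPT3;
		if (jit_inline_above_cost >= 0 && jit_cost > jit_inline_above_cost)
			glob->jitFlags |= PGJIT_INLINE;
		if (jit_expressions)
			glob->jitFlags |= PGJIT_EXPR;
		if (jit_tuple_deforming)
			glob->jitFlags |= PGJIT_DEFORM;
	}
}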

3. At the implementation level, I think it would be terrible to have to
add another parameter like est_calls to every create_xxx_plan in the
future. An alternative might be something like:

typedef struct ExtPlanInfo
{
	double		est_calls;		/* matches Plan.est_calls */
} ExtPlanInfo;

static void
copyExtPlanInfo(Plan *targetPlan, ExtPlanInfo ext)
{
	targetPlan->est_calls = ext.est_calls;
}

static Plan *
create_xxx_plan(..., ExtPlanInfo extinfo)
{
	...
	copyExtPlanInfo(plan, extinfo);
	...
}

That way it would be easy to add further per-node fields like est_calls
later on. Not sure if this is over-engineered.
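
A call site would then just build the struct once and pass it along,
e.g. (hypothetical, following the pseudocode above):

	ExtPlanInfo extinfo;

	extinfo.est_calls = est_calls;
	plan = create_xxx_plan(root, best_path, ..., extinfo);

so any future per-node field only needs to be added to ExtPlanInfo and
copyExtPlanInfo, not to every create_xxx_plan signature.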

I have gone through the patches, and in general they look good to me.
Once the design is finalized, I can do a final, more careful review.

In any case, I think the patched behaviour should be better than what we
have now.

--
Best Regards
Andy Fan

#4Andy Fan
zhihui.fan1213@gmail.com
In reply to: Andy Fan (#3)
Re: Making JIT more granular

2. You calculate the cost to compare against jit_above_cost as:

   plan->total_cost * plan->est_calls

An alternative might be to account for the rescan cost, as cost_rescan
does; that should be closer to the final execution cost. However, since
it is hard to set a reasonable jit_above_cost anyway, I feel the current
way is OK as well.

There are two observations after thinking about this some more. a)
Because of the rescan cost issue, plan->total_cost * plan->est_calls can
be greater than the whole plan's total_cost. This may confuse users:
a plan that was not JITed before this change might now be JITed.

explain analyze select * from t1, t2 where t1.a = t2.a;
                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..154.25 rows=100 width=16) (actual time=0.036..2.618 rows=100 loops=1)
   Join Filter: (t1.a = t2.a)
   Rows Removed by Join Filter: 9900
   ->  Seq Scan on t1  (cost=0.00..2.00 rows=100 width=8) (actual time=0.015..0.031 rows=100 loops=1)
   ->  Materialize  (cost=0.00..2.50 rows=100 width=8) (actual time=0.000..0.010 rows=100 loops=100)
         ->  Seq Scan on t2  (cost=0.00..2.00 rows=100 width=8) (actual time=0.007..0.023 rows=100 loops=1)
 Planning Time: 0.299 ms
 Execution Time: 2.694 ms
(8 rows)

The overall plan's total_cost is 154.25, but for JIT purposes the
Materialize node's cost is 2.50 * 100 = 250.
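
As a back-of-envelope illustration (my numbers; this assumes the default
cpu_operator_cost = 0.0025 and that a Materialize rescan is charged
roughly cpu_operator_cost per tuple, which is how I read cost_rescan), a
rescan-aware estimate for the Materialize node would be more like:

   2.50 + 99 * (100 * 0.0025) = 2.50 + 24.75 = 27.25

which is well below both the 250 from total_cost * est_calls and the
whole plan's 154.25.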

b) Since a plan node's total_cost includes the cost of all of its
children, if one child node is JITed then I think all of its parent
nodes will be JITed as well. Is this by design?

         QUERY PLAN
----------------------------
 Sort
   Sort Key: (count(*))
   ->  HashAggregate
         Group Key: a
         ->  Seq Scan on t1

(If Seq Scan is JITed, both HashAggregate & Sort will be JITed.)
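
For example, with made-up numbers: if jit_above_cost = 100000 and the
Seq Scan's total_cost is 120000, then the HashAggregate's total_cost is
at least 120000 plus its own work, and the Sort's is higher still, so all
three nodes cross the threshold even when the aggregation and sort
themselves are cheap.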

--
Best Regards
Andy Fan