On disable_cost

Started by Zhenghua Lyu about 6 years ago, 164 messages
#1 Zhenghua Lyu
zlv@pivotal.io

Hi,

Postgres has a global variable `disable_cost`. It is set to the value
1.0e10.

This value is added to the cost of a path if the related GUC is set to off.
For example, if enable_nestloop is set to off, then when the planner tries
to add a nestloop join path, it still adds the path, but with the huge cost
`disable_cost` added to it.
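
For reference, the mechanism in costsize.c looks roughly like this (an
abridged sketch, not the exact source):

    /* costsize.c (abridged): each disabled node type does the same thing */
    Cost disable_cost = 1.0e10;

    if (!enable_nestloop)
        startup_cost += disable_cost;   /* path is kept, just made very expensive */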

But 1.0e10 may not be large enough. I encountered this issue in
Greenplum (based on Postgres). Heikki told me that someone also encountered
the same issue on Postgres, so I am raising it here for discussion.

My issue: I did some spikes and tests on 1 TB of TPC-DS data. For
query 104, the planner generates a nestloop join even with enable_nestloop
set to off, and the final plan's total cost is huge (about 1e24). But if I
enlarge disable_cost to 1e30, the planner generates a hash join instead.

So I guess that disable_cost is not large enough for huge amounts of
data.

It is tricky to set disable_cost to a huge number. Can we come up with a
better solution?

The following thoughts are from Heikki:

Aside from not having a large enough disable cost, there's also the
fact that the high cost might affect the rest of the plan, if we have to
use a plan type that's disabled. For example, if a table doesn't have any
indexes, but enable_seqscan is off, we might put the unavoidable Seq Scan
on a different side of a join than we would with enable_seqscan=on,
because of the high cost estimate.

I think a more robust way to disable forbidden plan types would be to
handle the disabling in add_path(). Instead of having a high disable cost
on the Path itself, the comparison add_path() would always consider
disabled paths as more expensive than others, regardless of the cost.

Any thoughts or ideas on the problem? Thanks!

Best Regards,
Zhenghua Lyu

#2 Thomas Munro
thomas.munro@gmail.com
In reply to: Zhenghua Lyu (#1)
Re: On disable_cost

On Fri, Nov 1, 2019 at 7:42 PM Zhenghua Lyu <zlv@pivotal.io> wrote:

It is tricky to set disable_cost a huge number. Can we come up with better solution?

What happens if you use DBL_MAX?

#3 Euler Taveira
euler@timbira.com.br
In reply to: Zhenghua Lyu (#1)
Re: On disable_cost

On Fri, Nov 1, 2019 at 03:42, Zhenghua Lyu <zlv@pivotal.io> wrote:

My issue: I did some spikes and tests on TPCDS 1TB Bytes data. For query 104, it generates
nestloop join even with enable_nestloop set off. And the final plan's total cost is very huge (about 1e24). But If I enlarge the disable_cost to 1e30, then, planner will generate hash join.

So I guess that disable_cost is not large enough for huge amount of data.

It is tricky to set disable_cost a huge number. Can we come up with better solution?

Isn't this a case for making disable_cost a GUC? As Thomas suggested, a
DBL_MAX upper limit should be sufficient.

The following thoughts are from Heikki:

Aside from not having a large enough disable cost, there's also the fact that the high cost might affect the rest of the plan, if we have to use a plan type that's disabled. For example, if a table doesn't have any indexes, but enable_seqscan is off, we might put the unavoidable Seq Scan on a different side of a join than we would with enable_seqscan=on, because of the high cost estimate.

I think a more robust way to disable forbidden plan types would be to handle the disabling in add_path(). Instead of having a high disable cost on the Path itself, the comparison add_path() would always consider disabled paths as more expensive than others, regardless of the cost.

I'm afraid it is not as cheap as using disable_cost as a node cost. Are
you proposing to add a new boolean variable in the Path struct to handle
those cases in compare_path_costs_fuzzily?

--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training

#4 Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#2)
Re: On disable_cost

Hi,

On 2019-11-01 19:58:04 +1300, Thomas Munro wrote:

On Fri, Nov 1, 2019 at 7:42 PM Zhenghua Lyu <zlv@pivotal.io> wrote:

It is tricky to set disable_cost a huge number. Can we come up with better solution?

What happens if you use DBL_MAX?

That seems like a bad idea - we add the cost multiple times. And we
still want to compare plans that potentially involve that cost, if
there's no other way to plan the query.

- Andres

#5 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#4)
Re: On disable_cost

On Fri, Nov 1, 2019 at 12:00 PM Andres Freund <andres@anarazel.de> wrote:

That seems like a bad idea - we add the cost multiple times. And we
still want to compare plans that potentially involve that cost, if
there's no other way to plan the query.

Yeah. I kind of wonder if we shouldn't instead (a) skip adding paths
that use methods which are disabled and then (b) if we don't end up
with any paths for that RelOptInfo, try again, ignoring the disabling
GUCs.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6 Jim Finnerty
jfinnert@amazon.com
In reply to: Robert Haas (#5)
Re: On disable_cost

re: coping with adding disable_cost more than once

Another option would be to have a 2-part Cost structure. If disable_cost is
ever added to the Cost, then you set a flag recording this. If any plans
exist that have no disable_costs added to them, then the planner chooses the
minimum cost among those; otherwise it chooses the minimum-cost path overall.

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#7 Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#5)
Re: On disable_cost

Hi,

On 2019-11-01 12:22:06 -0400, Robert Haas wrote:

On Fri, Nov 1, 2019 at 12:00 PM Andres Freund <andres@anarazel.de> wrote:

That seems like a bad idea - we add the cost multiple times. And we
still want to compare plans that potentially involve that cost, if
there's no other way to plan the query.

Yeah. I kind of wonder if we shouldn't instead (a) skip adding paths
that use methods which are disabled and then (b) if we don't end up
with any paths for that reloptinfo, try again, ignoring disabling
GUCs.

Hm. That seems complicated. Is it clear that we'd always notice that we
have no plan early enough to know which paths to reconsider? I think
there are cases where that'd only happen a few levels up.

As a first step I'd be inclined to "just" adjust disable_cost up to
something like 1.0e12. Unfortunately much higher and we're getting
into the area where the loss of precision starts to be significant
enough that I'm not sure that we're always careful enough to perform
math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000
being 1e+20). I've seen queries with costs above 1e10 where that costing
wasn't insane.
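
A quick standalone test of those two examples, assuming IEEE doubles with a
53-bit mantissa:

    #include <stdio.h>

    int main(void)
    {
        double a = 1.0e16;    /* above 2^53, so the added 1 gets rounded away */
        double b = 1.0e20;    /* here one ulp is already 16384 */
        printf("%d %d\n", a + 1.0 == a, b + 1000.0 == b);  /* prints "1 1" */
        return 0;
    }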

And then, in a larger patch, go for something like Heikki's proposal
quoted by Zhenghua Lyu upthread, where we treat 'forbidden' as a
separate factor in comparisons of path costs, rather than fudging the
cost upwards. But there's some care to be taken to make sure we don't
regress performance too much due to the additional logic in
compare_path_cost et al.

I'd also be curious to see if there's some other problem with cost
calculation here - some of the quoted final costs seem high enough to be
suspicious. I'd be curious to see a plan...

Greetings,

Andres Freund

#8 Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#7)
Re: On disable_cost

On Fri, Nov 1, 2019 at 12:43 PM Andres Freund <andres@anarazel.de> wrote:

Hm. That seems complicated. Is it clear that we'd always notice that we
have no plan early enough to know which paths to reconsider? I think
there's cases where that'd only happen a few levels up.

Yeah, there could be problems of that kind. I think if a baserel has
no paths, then we know right away that we've got a problem, but for
joinrels it might be more complicated.

As a first step I'd be inclined to "just" adjust disable_cost up to
something like 1.0e12. Unfortunately much higher and we're getting
into the area where the loss of precision starts to be significant
enough that I'm not sure that we're always careful enough to perform
math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000
being 1e+20). I've seen queries with costs above 1e10 where that costing
wasn't insane.

We've done that before and we can do it again. But we're going to need
to have something better eventually, I think, not just keep kicking
the can down the road.

Another point to consider here is that in some cases we could really
just skip generating certain paths altogether. We already do this for
hash joins: if we're planning a join and enable_hashjoin is disabled,
we just don't generate hash join paths at all, except for full joins,
where there might be no other legal method. As this example shows,
this cannot be applied in all cases, but maybe we could do it more
widely than we do today. I'm not sure how beneficial that technique
would be, though, because it doesn't seem like it's quite enough to
solve this problem by itself.
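
The existing check is roughly this (paraphrased from joinpath.c; the exact
form varies by version):

    if (enable_hashjoin || jointype == JOIN_FULL)
        hash_inner_and_outer(root, joinrel, outerrel, innerrel,
                             jointype, &extra);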

Yet another approach would be to divide the cost into two parts, a
"cost" component and a "violations" component. If two paths are
compared, the one with fewer violations always wins; if it's a tie,
they compare on cost. A path's violation count is the total of its
children, plus one for itself if it does something that's disabled.
This would be more principled than the current approach, but maybe
it's too costly.
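
A minimal sketch of what such a comparison could look like (illustrative
names, not actual PostgreSQL code):

    typedef struct
    {
        double cost;        /* ordinary estimated cost */
        int    violations;  /* disabled nodes in this path's subtree */
    } TwoPartCost;

    /* fewer violations always wins; ties fall back to plain cost */
    static int
    compare_two_part_cost(const TwoPartCost *a, const TwoPartCost *b)
    {
        if (a->violations != b->violations)
            return (a->violations < b->violations) ? -1 : +1;
        if (a->cost < b->cost)
            return -1;
        if (a->cost > b->cost)
            return +1;
        return 0;
    }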

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Jim Finnerty (#6)
Re: On disable_cost

On Fri, Nov 01, 2019 at 09:30:52AM -0700, Jim Finnerty wrote:

re: coping with adding disable_cost more than once

Another option would be to have a 2-part Cost structure. If disable_cost is
ever added to the Cost, then you set a flag recording this. If any plans
exist that have no disable_costs added to them, then the planner chooses the
minimum cost among those, otherwise you choose the minimum cost path.

Yeah, I agree that having an is_disabled flag, and treating all paths with
'true' as more expensive than paths with 'false' (and only when both paths
have the same value actually comparing the cost), is probably the way
forward.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10 Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#8)
Re: On disable_cost

On 2019-11-01 12:56:30 -0400, Robert Haas wrote:

On Fri, Nov 1, 2019 at 12:43 PM Andres Freund <andres@anarazel.de> wrote:

As a first step I'd be inclined to "just" adjust disable_cost up to
something like 1.0e12. Unfortunately much higher and we're getting
into the area where the loss of precision starts to be significant
enough that I'm not sure that we're always careful enough to perform
math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000
being 1e+20). I've seen queries with costs above 1e10 where that costing
wasn't insane.

We've done that before and we can do it again. But we're going to need
to have something better eventually, I think, not just keep kicking
the can down the road.

Yea, that's why I continued on to describe what we should do afterwards
;)

Yet another approach would be to divide the cost into two parts, a
"cost" component and a "violations" component. If two paths are
compared, the one with fewer violations always wins; if it's a tie,
they compare on cost. A path's violation count is the total of its
children, plus one for itself if it does something that's disabled.
This would be more principled than the current approach, but maybe
it's too costly.

Namely go for something like this. I think we can probably get away with the
additional comparison, especially if we were to store the violations as
an integer and did it like if (unlikely(path1->nviolations !=
path2->nviolations)) or such - that ought to be very well predicted in
nearly all cases.
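
Sketched concretely (nviolations is a hypothetical Path field, and unlikely()
stands in for the usual __builtin_expect hint), the branch would sit at the
top of the comparison, something like:

    /* inside compare_path_costs_fuzzily(), before the fuzzy cost checks */
    if (unlikely(path1->nviolations != path2->nviolations))
        return (path1->nviolations < path2->nviolations) ? COSTS_BETTER1
                                                         : COSTS_BETTER2;
    /* counts equal (almost always 0 == 0): fall through to cost comparison */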

I wonder how much we'd need to reformulate
compare_path_costs/compare_path_costs_fuzzily to allow the compiler to
auto-vectorize. Might not be worth caring...

Greetings,

Andres Freund

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zhenghua Lyu (#1)
Re: On disable_cost

Zhenghua Lyu <zlv@pivotal.io> writes:

I think a more robust way to disable forbidden plan types would be to
handle the disabling in add_path(). Instead of having a high disable cost
on the Path itself, the comparison add_path() would always consider
disabled paths as more expensive than others, regardless of the cost.

Getting rid of disable_cost would be a nice thing to do, but I would
rather not do it by adding still more complexity to add_path(), not
to mention having to bloat Paths with a separate "disabled" marker.

The idea that I've been thinking about is to not generate disabled
Paths in the first place, thus not only fixing the problem but saving
some cycles. While this seems easy enough for "optional" paths,
we have to reserve the ability to generate certain path types regardless,
if there's no other way to implement the query. This is a bit of a
stumbling block :-(. At the base relation level, we could do something
like generating seqscan last, and only if no other path has been
successfully generated. But I'm not sure how to scale that up to
joins. In particular, imagine that we consider joining A to B, and
find that the only way is a nestloop, so we generate a nestloop join
despite that being nominally disabled. The next join level would
then see that as an available path, and it might decide that
((A nestjoin B) join C) is the cheapest choice, even though there
might have been a way to do, say, ((A join C) join B) with no use of
nestloops. Users would find this surprising.

Maybe the only way to do this is a separate number-of-uses-of-
disabled-plan-types cost figure in Paths, but I still don't want
to go there. The number of cases where disable_cost's shortcomings
really matter is too small to justify that, IMHO.

regards, tom lane

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#9)
Re: On disable_cost

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

On Fri, Nov 01, 2019 at 09:30:52AM -0700, Jim Finnerty wrote:

re: coping with adding disable_cost more than once

Another option would be to have a 2-part Cost structure. If disable_cost is
ever added to the Cost, then you set a flag recording this. If any plans
exist that have no disable_costs added to them, then the planner chooses the
minimum cost among those, otherwise you choose the minimum cost path.

Yeah, I agree that having an is_disabled flag, and treating all paths with
'true' as more expensive than paths with 'false' (and only when both paths
have the same value actually comparing the cost), is probably the way forward.

It would have to be a count, not a boolean --- for example, you want to
prefer a path that uses one disabled SeqScan over a path that uses two.

I'm with Andres in being pretty worried about the extra burden imposed
on add_path comparisons.

regards, tom lane

#13 Jim Finnerty
jfinnert@amazon.com
In reply to: Tom Lane (#12)
Re: On disable_cost

As a proof of concept, I hacked around a bit today to re-purpose one of the
bits of the Cost structure to mean "is_disabled" so that we can distinguish
'disabled' from 'non-disabled' paths without making the Cost structure any
bigger. In fact, it's still a valid double. The obvious choice would have
been to re-purpose the sign bit, but I've had occasion to exploit negative
costs before so for this POC I used the high-order bit of the fractional
bits of the double. (see Wikipedia for double precision floating point for
the layout).

The idea is to set a special bit when disable_cost is added to a cost.
Dedicating multiple bits instead of just 1 would be easily done, but as it
is we can accumulate many disable_costs without overflowing, so just
comparing the cost suffices.

The patch is not fully debugged and fails on a couple of tests in the serial
test suite. It seems to fail on Cartesian products, and maybe in one other
non-CP case. I wasn't able to debug it before the day came to an end.

In one place the core code subtracts off the disable_cost. I left the
"disabled" bit set in this case, which might be wrong.

I don't see an option to attach the patch as an attachment, so here is the
patch inline (it is based on PG11). The more interesting part is in a small
number of lines in costsize.c. Other changes just add functions that assign
a disable_cost and set the bit, or that compare costs such that a
non-disabled cost always compares less than a disabled cost.

------------------

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 4e86458672..3718639330 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -123,6 +123,8 @@ double		parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
 int			effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 Cost		disable_cost = 1.0e10;
+uint64      disabled_mask = 0x8000000000000;
+#define IS_DISABLED(cost) (((uint64) cost) & disabled_mask)

int max_parallel_workers_per_gather = 2;

@@ -205,6 +207,53 @@ clamp_row_est(double nrows)
return nrows;
}

+Cost
+add_cost(Cost cost, Cost delta_cost)
+{
+	uint64 mask = (delta_cost == disable_cost) ? disabled_mask : 0;
+	Cost max_cost = disabled_mask - disable_cost;
+	
+	if (cost + delta_cost < max_cost)
+		return ((Cost) ((uint64)(cost + delta_cost) | mask));
+	else
+		return ((Cost) ((uint64)(max_cost) | mask));
+}
+
+bool
+is_lower_cost(Cost cost1, Cost cost2)
+{
+	if ((uint64)cost1 & disabled_mask && !((uint64)cost2 & disabled_mask))
+		return false;
+	
+	if (!((uint64)cost1 & disabled_mask) && (uint64)cost2 & disabled_mask)
+		return true;
+	
+	return (cost1 < cost2);
+}
+
+bool
+is_greater_cost(Cost cost1, Cost cost2)
+{
+	if ((uint64)cost1 & disabled_mask && !((uint64)cost2 & disabled_mask))
+		return true;
+	
+	if (!((uint64)cost1 & disabled_mask) && (uint64)cost2 & disabled_mask)
+		return false;
+	
+	return (cost1 > cost2);
+}
+
+bool
+is_geq_cost(Cost cost1, Cost cost2)
+{
+	if ((uint64)cost1 & disabled_mask && !((uint64)cost2 & disabled_mask))
+		return true;
+	
+	if (!((uint64)cost1 & disabled_mask) && (uint64)cost2 & disabled_mask)
+		return false;
+	
+	return (cost1 >= cost2);
+}

/*
* cost_seqscan
@@ -235,7 +284,7 @@ cost_seqscan(Path *path, PlannerInfo *root,
path->rows = baserel->rows;

 	if (!enable_seqscan)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);

/* fetch estimated page cost for tablespace containing table */
get_tablespace_page_costs(baserel->reltablespace,
@@ -424,7 +473,7 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
path->path.rows = rel->rows;

 	if (!enable_gathermerge)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);

/*
* Add one to the number of workers to account for the leader. This might
@@ -538,7 +587,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
}

 	if (!enable_indexscan)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);
 	/* we don't need to check enable_indexonlyscan; indxpath.c does that */

/*
@@ -976,7 +1025,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
path->rows = baserel->rows;

 	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
@@ -1242,10 +1291,10 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	if (isCurrentOf)
 	{
 		Assert(baserel->baserestrictcost.startup >= disable_cost);
-		startup_cost -= disable_cost;
+		startup_cost -= disable_cost;  /* but do not un-set the disabled mark */
 	}
 	else if (!enable_tidscan)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);

/*
* The TID qual expressions will be computed once, any other baserestrict
@@ -1676,7 +1725,7 @@ cost_sort(Path *path, PlannerInfo *root,
long sort_mem_bytes = sort_mem * 1024L;

 	if (!enable_sort)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);

path->rows = tuples;

@@ -2121,8 +2170,8 @@ cost_agg(Path *path, PlannerInfo *root,
 		total_cost = input_total_cost;
 		if (aggstrategy == AGG_MIXED && !enable_hashagg)
 		{
-			startup_cost += disable_cost;
-			total_cost += disable_cost;
+			startup_cost = add_cost(startup_cost, disable_cost);
+			total_cost = add_cost(total_cost, disable_cost);
 		}
 		/* calcs phrased this way to match HASHED case, see note above */
 		total_cost += aggcosts->transCost.startup;
@@ -2137,7 +2186,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		/* must be AGG_HASHED */
 		startup_cost = input_total_cost;
 		if (!enable_hashagg)
-			startup_cost += disable_cost;
+			startup_cost = add_cost(startup_cost, disable_cost);
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
@@ -2436,7 +2485,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	 * disabled, which doesn't seem like the way to bet.
 	 */
 	if (!enable_nestloop)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);

/* cost of inner-relation source data (we already dealt with outer rel) */

@@ -2882,7 +2931,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	 * disabled, which doesn't seem like the way to bet.
 	 */
 	if (!enable_mergejoin)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
@@ -3312,7 +3361,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	 * disabled, which doesn't seem like the way to bet.
 	 */
 	if (!enable_hashjoin)
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
@@ -3410,7 +3459,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	if (relation_byte_size(clamp_row_est(inner_path_rows * innermcvfreq),
 						   inner_path->pathtarget->width) >
 		(work_mem * 1024L))
-		startup_cost += disable_cost;
+		startup_cost = add_cost(startup_cost, disable_cost);
 	/*
 	 * Compute cost of the hashquals and qpquals (other restriction clauses)
@@ -3930,7 +3979,7 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	else if (IsA(node, CurrentOfExpr))
 	{
 		/* Report high cost to prevent selection of anything but TID scan */
-		context->total.startup += disable_cost;
+		context->total.startup = add_cost(context->total.startup, disable_cost);
 	}
 	else if (IsA(node, SubLink))
 	{
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 4736d84a83..fd746a06bc 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -72,33 +72,33 @@ compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
 {
 	if (criterion == STARTUP_COST)
 	{
-		if (path1->startup_cost < path2->startup_cost)
+		if (is_lower_cost(path1->startup_cost, path2->startup_cost))
 			return -1;
-		if (path1->startup_cost > path2->startup_cost)
+		if (is_greater_cost(path1->startup_cost, path2->startup_cost))
 			return +1;
 		/*
 		 * If paths have the same startup cost (not at all unlikely), order
 		 * them by total cost.
 		 */
-		if (path1->total_cost < path2->total_cost)
+		if (is_lower_cost(path1->total_cost, path2->total_cost))
 			return -1;
-		if (path1->total_cost > path2->total_cost)
+		if (is_greater_cost(path1->total_cost, path2->total_cost))
 			return +1;
 	}
 	else
 	{
-		if (path1->total_cost < path2->total_cost)
+		if (is_lower_cost(path1->total_cost, path2->total_cost))
 			return -1;
-		if (path1->total_cost > path2->total_cost)
+		if (is_greater_cost(path1->total_cost, path2->total_cost))
 			return +1;
 		/*
 		 * If paths have the same total cost, order them by startup cost.
 		 */
-		if (path1->startup_cost < path2->startup_cost)
+		if (is_lower_cost(path1->startup_cost, path2->startup_cost))
 			return -1;
-		if (path1->startup_cost > path2->startup_cost)
+		if (is_greater_cost(path1->startup_cost, path2->startup_cost))
 			return +1;
 	}
 	return 0;
@@ -126,9 +126,9 @@ compare_fractional_path_costs(Path *path1, Path *path2,
 		fraction * (path1->total_cost - path1->startup_cost);
 	cost2 = path2->startup_cost +
 		fraction * (path2->total_cost - path2->startup_cost);
-	if (cost1 < cost2)
+	if (is_lower_cost(cost1, cost2))
 		return -1;
-	if (cost1 > cost2)
+	if (is_greater_cost(cost1, cost2))
 		return +1;
 	return 0;
 }
@@ -172,11 +172,11 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 	 * Check total cost first since it's more likely to be different; many
 	 * paths have zero startup cost.
 	 */
-	if (path1->total_cost > path2->total_cost * fuzz_factor)
+	if (is_greater_cost(path1->total_cost, path2->total_cost * fuzz_factor))
 	{
 		/* path1 fuzzily worse on total cost */
 		if (CONSIDER_PATH_STARTUP_COST(path1) &&
-			path2->startup_cost > path1->startup_cost * fuzz_factor)
+			is_greater_cost(path2->startup_cost, path1->startup_cost * fuzz_factor))
 		{
 			/* ... but path2 fuzzily worse on startup, so DIFFERENT */
 			return COSTS_DIFFERENT;
@@ -184,11 +184,11 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 		/* else path2 dominates */
 		return COSTS_BETTER2;
 	}
-	if (path2->total_cost > path1->total_cost * fuzz_factor)
+	if (is_greater_cost(path2->total_cost, path1->total_cost * fuzz_factor))
 	{
 		/* path2 fuzzily worse on total cost */
 		if (CONSIDER_PATH_STARTUP_COST(path2) &&
-			path1->startup_cost > path2->startup_cost * fuzz_factor)
+			is_greater_cost(path1->startup_cost, path2->startup_cost * fuzz_factor))
 		{
 			/* ... but path1 fuzzily worse on startup, so DIFFERENT */
 			return COSTS_DIFFERENT;
@@ -197,12 +197,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 		return COSTS_BETTER1;
 	}
 	/* fuzzily the same on total cost ... */
-	if (path1->startup_cost > path2->startup_cost * fuzz_factor)
+	if (is_greater_cost(path1->startup_cost, path2->startup_cost * fuzz_factor))
 	{
 		/* ... but path1 fuzzily worse on startup, so path2 wins */
 		return COSTS_BETTER2;
 	}
-	if (path2->startup_cost > path1->startup_cost * fuzz_factor)
+	if (is_greater_cost(path2->startup_cost, path1->startup_cost * fuzz_factor))
 	{
 		/* ... but path2 fuzzily worse on startup, so path1 wins */
 		return COSTS_BETTER1;
@@ -605,7 +605,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 		else
 		{
 			/* new belongs after this old path if it has cost >= old's */
-			if (new_path->total_cost >= old_path->total_cost)
+			if (is_geq_cost(new_path->total_cost, old_path->total_cost))
 				insert_after = p1;
 			/* p1_prev advances */
 			p1_prev = p1;
@@ -681,7 +681,7 @@ add_path_precheck(RelOptInfo *parent_rel,
 		 *
 		 * Cost comparisons here should match compare_path_costs_fuzzily.
 		 */
-		if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+		if (is_greater_cost(total_cost, old_path->total_cost * STD_FUZZ_FACTOR))
 		{
 			/* new path can win on startup cost only if consider_startup */
 			if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
@@ -796,14 +796,14 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Unless pathkeys are incompable, keep just one of the two paths. */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+			if (is_greater_cost(new_path->total_cost, old_path->total_cost * STD_FUZZ_FACTOR))
 			{
 				/* New path costs more; keep it only if pathkeys are better. */
 				if (keyscmp != PATHKEYS_BETTER1)
 					accept_new = false;
 			}
-			else if (old_path->total_cost > new_path->total_cost
-					 * STD_FUZZ_FACTOR)
+			else if (is_greater_cost(old_path->total_cost, new_path->total_cost
+									 * STD_FUZZ_FACTOR))
 			{
 				/* Old path costs more; keep it only if pathkeys are better. */
 				if (keyscmp != PATHKEYS_BETTER2)
@@ -819,7 +819,7 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 				/* Costs are about the same, old path has better pathkeys. */
 				accept_new = false;
 			}
-			else if (old_path->total_cost > new_path->total_cost * 1.0000000001)
+			else if (is_greater_cost(old_path->total_cost, new_path->total_cost * 1.0000000001))
 			{
 				/* Pathkeys are the same, and the old path costs more. */
 				remove_old = true;
@@ -847,7 +847,7 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		else
 		{
 			/* new belongs after this old path if it has cost >= old's */
-			if (new_path->total_cost >= old_path->total_cost)
+			if (is_geq_cost(new_path->total_cost, old_path->total_cost))
 				insert_after = p1;
 			/* p1_prev advances */
 			p1_prev = p1;
@@ -913,10 +913,10 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 		keyscmp = compare_pathkeys(pathkeys, old_path->pathkeys);
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR &&
+			if (is_greater_cost(total_cost, old_path->total_cost * STD_FUZZ_FACTOR) &&
 				keyscmp != PATHKEYS_BETTER1)
 				return false;
-			if (old_path->total_cost > total_cost * STD_FUZZ_FACTOR &&
+			if (is_greater_cost(old_path->total_cost, total_cost * STD_FUZZ_FACTOR) &&
 				keyscmp != PATHKEYS_BETTER2)
 				return true;
 		}
@@ -1697,7 +1697,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	if (sjinfo->semi_can_btree && sjinfo->semi_can_hash)
 	{
-		if (agg_path.total_cost < sort_path.total_cost)
+		if (is_lower_cost(agg_path.total_cost, sort_path.total_cost))
 			pathnode->umethod = UNIQUE_PATH_HASH;
 		else
 			pathnode->umethod = UNIQUE_PATH_SORT;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 78f3b99a76..c261a9d790 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5076,8 +5076,8 @@ IsProjectionFunctionalIndex(Relation index)
 		 * when values differ because the expression is recalculated when
 		 * inserting a new index entry for the changed value.
 		 */
-		if ((index_expr_cost.startup + index_expr_cost.per_tuple) >
-			HEURISTIC_MAX_HOT_RECHECK_EXPR_COST)
+		if (is_greater_cost((index_expr_cost.startup + index_expr_cost.per_tuple),
+							HEURISTIC_MAX_HOT_RECHECK_EXPR_COST))
 			is_projection = false;
 		tuple = SearchSysCache1(RELOID,
ObjectIdGetDatum(RelationGetRelid(index)));
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 9159f2bab1..c01d08eae5 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -251,6 +251,12 @@ extern PathTarget *set_pathtarget_cost_width(PlannerInfo *root, PathTarget *targ
 extern double compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel,
 					 Path *bitmapqual, int loop_count, Cost *cost, double *tuple);
+extern Cost add_cost(Cost cost, Cost delta_cost);
+extern bool is_lower_cost(Cost cost1, Cost cost2);
+extern bool is_greater_cost(Cost cost1, Cost cost2);
+extern bool is_geq_cost(Cost cost1, Cost cost2);
+
+
 /*
  * prototypes for clausesel.c
  *	  routines to compute clause selectivities

-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

#14 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Jim Finnerty (#13)
Re: On disable_cost

On Tue, 2019-12-10 at 15:50 -0700, Jim Finnerty wrote:

As a proof of concept, I hacked around a bit today to re-purpose one of the
bits of the Cost structure to mean "is_disabled" so that we can distinguish
'disabled' from 'non-disabled' paths without making the Cost structure any
bigger. In fact, it's still a valid double. The obvious choice would have
been to re-purpose the sign bit, but I've had occasion to exploit negative
costs before so for this POC I used the high-order bit of the fractional
bits of the double. (see Wikipedia for double precision floating point for
the layout).

The idea is to set a special bit when disable_cost is added to a cost.
Dedicating multiple bits instead of just 1 would be easily done, but as it
is we can accumulate many disable_costs without overflowing, so just
comparing the cost suffices.

Doesn't that rely on a specific implementation of double precision (IEEE)?
I thought that we don't want to limit ourselves to platforms with IEEE floats.

Yours,
Laurenz Albe

#15 Greg Stark
stark@mit.edu
In reply to: Laurenz Albe (#14)
Re: On disable_cost

On Wed, 11 Dec 2019 at 01:24, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Tue, 2019-12-10 at 15:50 -0700, Jim Finnerty wrote:

As a proof of concept, I hacked around a bit today to re-purpose one of the
bits of the Cost structure to mean "is_disabled" so that we can distinguish
'disabled' from 'non-disabled' paths without making the Cost structure any
bigger. In fact, it's still a valid double. The obvious choice would have
been to re-purpose the sign bit, but I've had occasion to exploit negative
costs before so for this POC I used the high-order bit of the fractional
bits of the double. (see Wikipedia for double precision floating point for
the layout).

The idea is to set a special bit when disable_cost is added to a cost.
Dedicating multiple bits instead of just 1 would be easily done, but as it
is we can accumulate many disable_costs without overflowing, so just
comparing the cost suffices.

Doesn't that rely on a specific implementation of double precision (IEEE)?
I thought that we don't want to limit ourselves to platforms with IEEE floats.

We could always implement it again in another format....

However, I wouldn't have expected to be bit twiddling. I would have
expected to use standard functions like ldexp to do this. In fact I
think if you use the high bit of the exponent you could do it entirely
using ldexp and regular double comparisons (with fabs).

Ie, to set the bit you set cost = ldexp(cost, __DBL_MAX_EXP__/2). And
to check for the bit being set you compare ilogb(cost) against
__DBL_MAX_EXP__/2. Hm, that doesn't handle the case where the cost is
already < 1, in which case I guess you would have to set it to 1 first. Or
reserve the two high bits of the cost so you can represent disabled values
that had negative exponents before being disabled.
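
A minimal sketch of that idea (illustrative helper names; assumes IEEE doubles
and, per the caveat above, costs >= 1):

    #include <math.h>
    #include <float.h>
    #include <stdbool.h>

    typedef double Cost;

    #define DISABLE_SHIFT (DBL_MAX_EXP / 2)     /* 512 for IEEE doubles */

    static Cost
    mark_disabled(Cost cost)
    {
        if (cost < 1.0)
            cost = 1.0;                     /* keep the exponent non-negative */
        return ldexp(cost, DISABLE_SHIFT);  /* scale by 2^512 */
    }

    static bool
    cost_is_disabled(Cost cost)
    {
        return ilogb(cost) >= DISABLE_SHIFT;
    }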

I wonder if it wouldn't be a lot cleaner and more flexible to just go
with a plain float for Cost and use the other 32 bits for counters and
bitmasks, and still be ahead of the game. A double can store up to 2^1024,
but a float only up to 2^128, which still feels like more than enough to
store the kinds of costs plans have without the disabled costs. 2^128
milliseconds is still about 10^28 years, which is an awfully expensive
query....

--
greg

#16 Thomas Munro
thomas.munro@gmail.com
In reply to: Laurenz Albe (#14)
Re: On disable_cost

On Wed, Dec 11, 2019 at 7:24 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Doesn't that rely on a specific implementation of double precision (IEEE)?
I thought that we don't want to limit ourselves to platforms with IEEE floats.

Just by the way, you might want to read the second last paragraph of
the commit message for 02ddd499. The dream is over, we're never going
to run on Vax.

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#16)
Re: On disable_cost

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Dec 11, 2019 at 7:24 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Doesn't that rely on a specific implementation of double precision (IEEE)?
I thought that we don't want to limit ourselves to platforms with IEEE floats.

Just by the way, you might want to read the second last paragraph of
the commit message for 02ddd499. The dream is over, we're never going
to run on Vax.

Still, the proposed hack is doubling down on IEEE dependency in a way
that I quite dislike, in that (a) it doesn't just read float values
but generates new ones (and assumes that the hardware/libc will react in
a predictable way to them), (b) in a part of the code that has no damn
business having close dependencies on float format, and (c) for a gain
far smaller than what we got from the Ryu code.

We have had prior discussions about whether 02ddd499 justifies adding
more IEEE dependencies elsewhere. I don't think it does. IEEE 754
is not the last word that will ever be said on floating-point arithmetic,
any more than x86_64 is the last CPU architecture that anyone will ever
care about. We should keep our dependencies on it well circumscribed.

regards, tom lane

#18 Greg Stark
stark@mit.edu
In reply to: Tom Lane (#17)
Re: On disable_cost

I think this would be ready to abstract away behind a few functions that
could always be replaced by something else later...

However, on further thought I really think just using a 32-bit float and 32
bits of other bitmasks or counters would be a better approach.

On Sun., Dec. 15, 2019, 14:54 Tom Lane, <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Dec 11, 2019 at 7:24 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Doesn't that rely on a specific implementation of double precision (IEEE)?
I thought that we don't want to limit ourselves to platforms with IEEE floats.

Just by the way, you might want to read the second last paragraph of
the commit message for 02ddd499. The dream is over, we're never going
to run on Vax.

Still, the proposed hack is doubling down on IEEE dependency in a way
that I quite dislike, in that (a) it doesn't just read float values
but generates new ones (and assumes that the hardware/libc will react in
a predictable way to them), (b) in a part of the code that has no damn
business having close dependencies on float format, and (c) for a gain
far smaller than what we got from the Ryu code.

We have had prior discussions about whether 02ddd499 justifies adding
more IEEE dependencies elsewhere. I don't think it does. IEEE 754
is not the last word that will ever be said on floating-point arithmetic,
any more than x86_64 is the last CPU architecture that anyone will ever
care about. We should keep our dependencies on it well circumscribed.

regards, tom lane

#19 Jian Guo
gjian@vmware.com
In reply to: Euler Taveira (#3)
1 attachment(s)
Re: On disable_cost

Hi hackers,

I have written an initial patch to retire the disable_cost GUC; it sets a flag on the Path struct instead of adding up a big cost which is hard to estimate. Though it involves tons of plan changes in the regression tests, I have tested it on some simple cases, such as eagerly generating two-stage agg plans, and it worked. Could someone help to review?

regards,

Jian
________________________________
From: Euler Taveira <euler@timbira.com.br>
Sent: Friday, November 1, 2019 22:48
To: Zhenghua Lyu <zlyu@vmware.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Subject: Re: On disable_cost

On Fri, Nov 1, 2019 at 03:42, Zhenghua Lyu <zlv@pivotal.io> wrote:

My issue: I did some spikes and tests on TPCDS 1TB Bytes data. For query 104, it generates
nestloop join even with enable_nestloop set off. And the final plan's total cost is very huge (about 1e24). But If I enlarge the disable_cost to 1e30, then, planner will generate hash join.

So I guess that disable_cost is not large enough for huge amount of data.

It is tricky to set disable_cost a huge number. Can we come up with better solution?

Isn't it a case for a GUC disable_cost? As Thomas suggested, DBL_MAX
upper limit should be sufficient.

The following thoughts are from Heikki:

Aside from not having a large enough disable cost, there's also the fact that the high cost might affect the rest of the plan, if we have to use a plan type that's disabled. For example, if a table doesn't have any indexes, but enable_seqscan is off, we might put the unavoidable Seq Scan on a different side of a join than we would with enable_seqscan=on, because of the high cost estimate.

I think a more robust way to disable forbidden plan types would be to handle the disabling in add_path(). Instead of having a high disable cost on the Path itself, the comparison add_path() would always consider disabled paths as more expensive than others, regardless of the cost.

I'm afraid it is not as cheap as using disable_cost as a node cost. Are
you proposing to add a new boolean variable in Path struct to handle
those cases in compare_path_costs_fuzzily?

--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training

Attachments:

0001-Retire-disable_cost-introduce-new-flag-is_disabled.patch (text/x-patch)
From baf0143438a91c8739534c7d85b9d121a7bb560e Mon Sep 17 00:00:00 2001
From: Jian Guo <gjian@vmware.com>
Date: Thu, 3 Aug 2023 17:03:49 +0800
Subject: [PATCH] Retire disable_cost, introduce new flag is_disabled into Path
 struct.

Signed-off-by: Jian Guo <gjian@vmware.com>
---
 src/backend/optimizer/path/costsize.c | 30 +++++++-------
 src/backend/optimizer/util/pathnode.c | 58 +++++++++++++++++++++++++++
 src/include/nodes/pathnodes.h         |  1 +
 3 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ef475d95a1..0814604c49 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -274,7 +274,7 @@ cost_seqscan(Path *path, PlannerInfo *root,
 		path->rows = baserel->rows;
 
 	if (!enable_seqscan)
-		startup_cost += disable_cost;
+		path->is_disabled = true;
 
 	/* fetch estimated page cost for tablespace containing table */
 	get_tablespace_page_costs(baserel->reltablespace,
@@ -463,7 +463,7 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 		path->path.rows = rel->rows;
 
 	if (!enable_gathermerge)
-		startup_cost += disable_cost;
+		path->path.is_disabled = true;
 
 	/*
 	 * Add one to the number of workers to account for the leader.  This might
@@ -576,7 +576,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 	}
 
 	if (!enable_indexscan)
-		startup_cost += disable_cost;
+		path->path.is_disabled = true;
 	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
 
 	/*
@@ -1011,7 +1011,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 		path->rows = baserel->rows;
 
 	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
+		path->is_disabled = true;
 
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
@@ -1279,11 +1279,10 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	 */
 	if (isCurrentOf)
 	{
-		Assert(baserel->baserestrictcost.startup >= disable_cost);
-		startup_cost -= disable_cost;
+		path->is_disabled = false;
 	}
 	else if (!enable_tidscan)
-		startup_cost += disable_cost;
+		path->is_disabled = true;
 
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
@@ -1372,7 +1371,7 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	nseqpages = pages - 1.0;
 
 	if (!enable_tidscan)
-		startup_cost += disable_cost;
+		path->is_disabled = true;
 
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
@@ -2108,7 +2107,7 @@ cost_sort(Path *path, PlannerInfo *root,
 				   limit_tuples);
 
 	if (!enable_sort)
-		startup_cost += disable_cost;
+		path->is_disabled = true;
 
 	startup_cost += input_cost;
 
@@ -2679,8 +2678,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		total_cost = input_total_cost;
 		if (aggstrategy == AGG_MIXED && !enable_hashagg)
 		{
-			startup_cost += disable_cost;
-			total_cost += disable_cost;
+			path->is_disabled = true;
 		}
 		/* calcs phrased this way to match HASHED case, see note above */
 		total_cost += aggcosts->transCost.startup;
@@ -2696,7 +2694,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		/* must be AGG_HASHED */
 		startup_cost = input_total_cost;
 		if (!enable_hashagg)
-			startup_cost += disable_cost;
+			path->is_disabled = true;
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		/* cost of computing hash value */
@@ -3076,7 +3074,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	 * disabled, which doesn't seem like the way to bet.
 	 */
 	if (!enable_nestloop)
-		startup_cost += disable_cost;
+		path->jpath.path.is_disabled = true;
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3523,7 +3521,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	 * disabled, which doesn't seem like the way to bet.
 	 */
 	if (!enable_mergejoin)
-		startup_cost += disable_cost;
+		path->jpath.path.is_disabled = true;
 
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
@@ -3953,7 +3951,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	 * disabled, which doesn't seem like the way to bet.
 	 */
 	if (!enable_hashjoin)
-		startup_cost += disable_cost;
+		path->jpath.path.is_disabled = true;
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
@@ -4050,7 +4048,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	 */
 	if (relation_byte_size(clamp_row_est(inner_path_rows * innermcvfreq),
 						   inner_path->pathtarget->width) > get_hash_memory_limit())
-		startup_cost += disable_cost;
+		path->jpath.path.is_disabled = true;
 
 	/*
 	 * Compute cost of the hashquals and qpquals (other restriction clauses)
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f123fcb41e..42d72cb1d5 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -165,6 +165,19 @@ compare_fractional_path_costs(Path *path1, Path *path2,
 static PathCostComparison
 compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 {
+	if (path1->is_disabled && path2->is_disabled)
+	{
+		return COSTS_EQUAL;
+	}
+	else if (path1->is_disabled)
+	{
+		return COSTS_BETTER2;
+	}
+	else if (path2->is_disabled)
+	{
+		return COSTS_BETTER1;
+	}
+
 #define CONSIDER_PATH_STARTUP_COST(p)  \
 	((p)->param_info == NULL ? (p)->parent->consider_startup : (p)->parent->consider_param_startup)
 
@@ -940,6 +953,7 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = parallel_workers;
 	pathnode->pathkeys = NIL;	/* seqscan has unordered result */
+	pathnode->is_disabled = false;
 
 	cost_seqscan(pathnode, root, rel, pathnode->param_info);
 
@@ -964,6 +978,7 @@ create_samplescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* samplescan has unordered result */
+	pathnode->is_disabled = false;
 
 	cost_samplescan(pathnode, root, rel, pathnode->param_info);
 
@@ -1016,6 +1031,7 @@ create_index_path(PlannerInfo *root,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->indexinfo = index;
 	pathnode->indexclauses = indexclauses;
@@ -1059,6 +1075,7 @@ create_bitmap_heap_path(PlannerInfo *root,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = parallel_degree;
 	pathnode->path.pathkeys = NIL;	/* always unordered */
+	pathnode->path.is_disabled = false;
 
 	pathnode->bitmapqual = bitmapqual;
 
@@ -1112,6 +1129,7 @@ create_bitmap_and_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = 0;
 
 	pathnode->path.pathkeys = NIL;	/* always unordered */
+	pathnode->path.is_disabled = false;
 
 	pathnode->bitmapquals = bitmapquals;
 
@@ -1164,6 +1182,7 @@ create_bitmap_or_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = 0;
 
 	pathnode->path.pathkeys = NIL;	/* always unordered */
+	pathnode->path.is_disabled = false;
 
 	pathnode->bitmapquals = bitmapquals;
 
@@ -1192,6 +1211,7 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.pathkeys = NIL;	/* always unordered */
+	pathnode->path.is_disabled = false;
 
 	pathnode->tidquals = tidquals;
 
@@ -1221,6 +1241,7 @@ create_tidrangescan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.pathkeys = NIL;	/* always unordered */
+	pathnode->path.is_disabled = false;
 
 	pathnode->tidrangequals = tidrangequals;
 
@@ -1278,6 +1299,7 @@ create_append_path(PlannerInfo *root,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = parallel_workers;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	/*
 	 * For parallel append, non-partial paths are sorted by descending total
@@ -1430,6 +1452,7 @@ create_merge_append_path(PlannerInfo *root,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 	pathnode->subpaths = subpaths;
 
 	/*
@@ -1537,6 +1560,7 @@ create_group_result_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.startup_cost = target->cost.startup;
 	pathnode->path.total_cost = target->cost.startup +
 		cpu_tuple_cost + target->cost.per_tuple;
+	pathnode->path.is_disabled = false;
 
 	/*
 	 * Add cost of qual, if any --- but we ignore its selectivity, since our
@@ -1576,6 +1600,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 
@@ -1610,6 +1635,7 @@ create_memoize_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 	pathnode->hash_operators = hash_operators;
@@ -1701,6 +1727,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	 * to represent it.  (This might get overridden below.)
 	 */
 	pathnode->path.pathkeys = NIL;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 	pathnode->in_operators = sjinfo->semi_operators;
@@ -1892,6 +1919,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->path.pathtarget = target ? target : rel->reltarget;
 	pathnode->path.rows += subpath->rows;
+	pathnode->path.is_disabled = false;
 
 	if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 	{
@@ -1977,6 +2005,7 @@ create_gather_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.parallel_safe = false;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.pathkeys = NIL;	/* Gather has unordered result */
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 	pathnode->num_workers = subpath->parallel_workers;
@@ -2021,6 +2050,7 @@ create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 	pathnode->subpath = subpath;
 
 	cost_subqueryscan(pathnode, root, rel, pathnode->path.param_info,
@@ -2049,6 +2079,7 @@ create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = pathkeys;
+	pathnode->is_disabled = false;
 
 	cost_functionscan(pathnode, root, rel, pathnode->param_info);
 
@@ -2075,6 +2106,7 @@ create_tablefuncscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
+	pathnode->is_disabled = false;
 
 	cost_tablefuncscan(pathnode, root, rel, pathnode->param_info);
 
@@ -2101,6 +2133,7 @@ create_valuesscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
+	pathnode->is_disabled = false;
 
 	cost_valuesscan(pathnode, root, rel, pathnode->param_info);
 
@@ -2126,6 +2159,7 @@ create_ctescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer)
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* XXX for now, result is always unordered */
+	pathnode->is_disabled = false;
 
 	cost_ctescan(pathnode, root, rel, pathnode->param_info);
 
@@ -2152,6 +2186,7 @@ create_namedtuplestorescan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
+	pathnode->is_disabled = false;
 
 	cost_namedtuplestorescan(pathnode, root, rel, pathnode->param_info);
 
@@ -2178,6 +2213,7 @@ create_resultscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
+	pathnode->is_disabled = false;
 
 	cost_resultscan(pathnode, root, rel, pathnode->param_info);
 
@@ -2204,6 +2240,7 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parallel_safe = rel->consider_parallel;
 	pathnode->parallel_workers = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
+	pathnode->is_disabled = false;
 
 	/* Cost is the same as for a regular CTE scan */
 	cost_ctescan(pathnode, root, rel, pathnode->param_info);
@@ -2248,6 +2285,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->fdw_outerpath = fdw_outerpath;
 	pathnode->fdw_private = fdw_private;
@@ -2298,6 +2336,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->fdw_outerpath = fdw_outerpath;
 	pathnode->fdw_private = fdw_private;
@@ -2343,6 +2382,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->fdw_outerpath = fdw_outerpath;
 	pathnode->fdw_private = fdw_private;
@@ -2472,6 +2512,7 @@ create_nestloop_path(PlannerInfo *root,
 	/* This is a foolish way to estimate parallel_workers, but for now... */
 	pathnode->jpath.path.parallel_workers = outer_path->parallel_workers;
 	pathnode->jpath.path.pathkeys = pathkeys;
+	pathnode->jpath.path.is_disabled = false;
 	pathnode->jpath.jointype = jointype;
 	pathnode->jpath.inner_unique = extra->inner_unique;
 	pathnode->jpath.outerjoinpath = outer_path;
@@ -2536,6 +2577,7 @@ create_mergejoin_path(PlannerInfo *root,
 	/* This is a foolish way to estimate parallel_workers, but for now... */
 	pathnode->jpath.path.parallel_workers = outer_path->parallel_workers;
 	pathnode->jpath.path.pathkeys = pathkeys;
+	pathnode->jpath.path.is_disabled = false;
 	pathnode->jpath.jointype = jointype;
 	pathnode->jpath.inner_unique = extra->inner_unique;
 	pathnode->jpath.outerjoinpath = outer_path;
@@ -2613,6 +2655,7 @@ create_hashjoin_path(PlannerInfo *root,
 	 * outer rel than it does now.)
 	 */
 	pathnode->jpath.path.pathkeys = NIL;
+	pathnode->jpath.path.is_disabled = false;
 	pathnode->jpath.jointype = jointype;
 	pathnode->jpath.inner_unique = extra->inner_unique;
 	pathnode->jpath.outerjoinpath = outer_path;
@@ -2671,6 +2714,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	/* Projection does not change the sort order */
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 
@@ -2853,6 +2897,7 @@ create_set_projection_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	/* Projection does not change the sort order XXX? */
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 
@@ -2922,6 +2967,7 @@ create_incremental_sort_path(PlannerInfo *root,
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 
@@ -2969,6 +3015,7 @@ create_sort_path(PlannerInfo *root,
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.pathkeys = pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 
@@ -3015,6 +3062,7 @@ create_group_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	/* Group doesn't change sort ordering */
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 
@@ -3073,6 +3121,7 @@ create_upper_unique_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	/* Unique doesn't change the input ordering */
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 	pathnode->numkeys = numCols;
@@ -3151,6 +3200,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.startup_cost += target->cost.startup;
 	pathnode->path.total_cost += target->cost.startup +
 		target->cost.per_tuple * pathnode->path.rows;
+	pathnode->path.is_disabled = false;
 
 	return pathnode;
 }
@@ -3194,6 +3244,7 @@ create_groupingsets_path(PlannerInfo *root,
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
+	pathnode->path.is_disabled = false;
 	pathnode->subpath = subpath;
 
 	/*
@@ -3353,6 +3404,7 @@ create_minmaxagg_path(PlannerInfo *root,
 	/* Result is one unordered row */
 	pathnode->path.rows = 1;
 	pathnode->path.pathkeys = NIL;
+	pathnode->path.is_disabled = false;
 
 	pathnode->mmaggregates = mmaggregates;
 	pathnode->quals = quals;
@@ -3443,6 +3495,7 @@ create_windowagg_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	/* WindowAgg preserves the input sort order */
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 	pathnode->winclause = winclause;
@@ -3513,6 +3566,7 @@ create_setop_path(PlannerInfo *root,
 	/* SetOp preserves the input sort order if in sort mode */
 	pathnode->path.pathkeys =
 		(strategy == SETOP_SORTED) ? subpath->pathkeys : NIL;
+	pathnode->path.is_disabled = false;
 
 	pathnode->subpath = subpath;
 	pathnode->cmd = cmd;
@@ -3572,6 +3626,7 @@ create_recursiveunion_path(PlannerInfo *root,
 	pathnode->path.parallel_workers = leftpath->parallel_workers;
 	/* RecursiveUnion result is always unsorted */
 	pathnode->path.pathkeys = NIL;
+	pathnode->path.is_disabled = false;
 
 	pathnode->leftpath = leftpath;
 	pathnode->rightpath = rightpath;
@@ -3628,6 +3683,7 @@ create_lockrows_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_tuple_cost * subpath->rows;
+	pathnode->path.is_disabled = false;
 
 	return pathnode;
 }
@@ -3688,6 +3744,7 @@ create_modifytable_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = false;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.pathkeys = NIL;
+	pathnode->path.is_disabled = false;
 
 	/*
 	 * Compute cost & rowcount as subpath cost & rowcount (if RETURNING)
@@ -3777,6 +3834,7 @@ create_limit_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost;
 	pathnode->path.pathkeys = subpath->pathkeys;
+	pathnode->path.is_disabled = false;
 	pathnode->subpath = subpath;
 	pathnode->limitOffset = limitOffset;
 	pathnode->limitCount = limitCount;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index a1dc1d07e1..50c5b65bde 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1631,6 +1631,7 @@ typedef struct Path
 
 	/* sort ordering of path's output; a List of PathKey nodes; see above */
 	List	   *pathkeys;
+	bool		is_disabled;
 } Path;
 
 /* Macro for extracting a path's parameterization relids; beware double eval */
-- 
2.41.0

#20Robert Haas
robertmhaas@gmail.com
In reply to: Jian Guo (#19)
Re: On disable_cost

On Thu, Aug 3, 2023 at 5:22 AM Jian Guo <gjian@vmware.com> wrote:

I have written an initial patch to retire the disable_cost GUC; it sets a flag on the Path struct instead of adding up a big cost which is hard to estimate. Though it involves tons of plan changes in the regression tests, I have tested it on some simple cases, such as eagerly generating a two-stage agg plan, and it worked. Could someone help to review?

I took a look at this patch today. I believe that overall this may
well be an approach worth pursuing. However, more work is going to be
needed. Here are some comments:

1. You stated that it changes lots of plans in the regression tests,
but you haven't provided any sort of analysis of why those plans
changed. I'm kind of surprised that there would be "tons" of plan
changes. You (or someone) should look into why that's happening.

2. The change to compare_path_costs_fuzzily() seems incorrect to me.
When path1->is_disabled && path2->is_disabled, costs should be
compared just as they are when neither path is disabled. Instead, the
patch treats any two such paths as having equal cost. That seems
catastrophically bad. Maybe it accounts for some of those plan
changes, although that would only be true if those plans were created
while using some disabling GUC.

3. Instead of adding is_disabled at the end of the Path structure, I
suggest adding it between param_info and parallel_aware. I think if
you do that, the space used by the new field will use up padding bytes
that are currently included in the struct, instead of making it
longer.

4. A critical issue for any patch of this type is performance. This
concern was raised earlier on this thread, but your path doesn't
address it. There's no performance analysis or benchmarking included
in your email. One idea that I have is to write the cost-comparison
test like this:

if (unlikely(path1->is_disabled || path2->is_disabled))
{
    if (!path1->is_disabled)
        return COSTS_BETTER1;
    if (!path2->is_disabled)
        return COSTS_BETTER2;
    /* if both disabled, fall through */
}

I'm not sure that would be enough to prevent the patch from adding
noticeably to the cost of path comparison, but maybe it would help.

5. The patch changes only compare_path_costs_fuzzily() but I wonder
whether compare_path_costs() and compare_fractional_path_costs() need
similar surgery. Whether they do or don't, there should likely be some
comments explaining the situation.

6. In fact, the patch changes no comments at all, anywhere. I'm not
sure how many comment changes a patch like this needs to make, but the
answer definitely isn't "none".

7. The patch doesn't actually remove disable_cost. I guess it should.

8. When you submit a patch, it's a good idea to also add it on
commitfest.postgresql.org. It doesn't look like that was done in this
case.

--
Robert Haas
EDB: http://www.enterprisedb.com

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#20)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Aug 3, 2023 at 5:22 AM Jian Guo <gjian@vmware.com> wrote:

I have written an initial patch to retire the disable_cost GUC; it sets a flag on the Path struct instead of adding up a big cost which is hard to estimate. Though it involves tons of plan changes in the regression tests, I have tested it on some simple cases, such as eagerly generating a two-stage agg plan, and it worked. Could someone help to review?

I took a look at this patch today. I believe that overall this may
well be an approach worth pursuing. However, more work is going to be
needed. Here are some comments:

1. You stated that it changes lots of plans in the regression tests,
but you haven't provided any sort of analysis of why those plans
changed. I'm kind of surprised that there would be "tons" of plan
changes. You (or someone) should look into why that's happening.

I've not read the patch, but given this description I would expect
there to be *zero* regression changes --- I don't think we have any
test cases that depend on disable_cost being finite. If there's more
than zero changes, that very likely indicates a bug in the patch.
Even if there are places where the output legitimately changes, you
need to justify each one and make sure that you didn't invalidate the
intent of that test case.

BTW, having written that paragraph, I wonder if we couldn't get
the same end result with a nearly one-line change that consists of
making disable_cost be IEEE infinity. Years ago we didn't want
to rely on IEEE float semantics in this area, but nowadays I don't
see why we shouldn't.
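
For concreteness, a sketch of what that near one-line change could look like in costsize.c, assuming Cost remains a plain double and that relying on <math.h>'s INFINITY is acceptable here:

#include <math.h>               /* for INFINITY */

typedef double Cost;            /* as in nodes/nodes.h */

/* hypothetical: disabled path types become infinitely expensive */
Cost        disable_cost = INFINITY;    /* previously 1.0e10 */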

regards, tom lane

#22Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
Re: On disable_cost

On Tue, Mar 12, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, having written that paragraph, I wonder if we couldn't get
the same end result with a nearly one-line change that consists of
making disable_cost be IEEE infinity. Years ago we didn't want
to rely on IEEE float semantics in this area, but nowadays I don't
see why we shouldn't.

I don't think so, because I think that what will happen in that case
is that we'll pick a completely random plan if we can't pick a plan
that avoids incurring disable_cost. Every plan that contains one
disabled node anywhere in the plan tree will look like it has exactly
the same cost as any other such plan.

IMHO, this is actually one of the problems with disable_cost as it
works today. I think the semantics that we want are: if it's possible
to pick a plan where nothing is disabled, then pick the cheapest such
plan; if not, pick the cheapest plan overall. But treating
disable_cost doesn't really do that. It does the first part -- picking
the cheapest plan where nothing is disabled -- but it doesn't do the
second part, because once you add disable_cost into the cost of some
particular plan node, it screws up the rest of the planning, because
the cost estimates for the disabled nodes have no bearing in reality.
Fast-start plans, for example, will look insanely good compared to
what would be the case in normal planning (and we lean too much toward
fast-start plans even normally).
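
Read literally, those semantics amount to a two-level comparison in which disabled-ness dominates cost. A minimal sketch, written against the is_disabled flag from the patch under discussion rather than the real add_path() machinery:

/* prefer_path: true if p1 should win over p2 under the semantics above */
static bool
prefer_path(Path *p1, Path *p2)
{
    /* a path with nothing disabled always beats a disabled one */
    if (p1->is_disabled != p2->is_disabled)
        return !p1->is_disabled;

    /* otherwise fall back to ordinary cost comparison */
    return p1->total_cost < p2->total_cost;
}

As the rest of the thread goes on to show, a single Boolean turns out not to be enough once the flag has to propagate up through joins and appends, but it captures the two-tier idea.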

(I don't think we should care how MANY disabled nodes appear in a
plan, particularly. This is a more arguable point. Is a plan with 1
disabled node and 10% more cost better or worse than a plan with 2
disabled nodes and 10% less cost? I'd argue that counting the number
of disabled nodes isn't particularly meaningful.)

--
Robert Haas
EDB: http://www.enterprisedb.com

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#22)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Mar 12, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, having written that paragraph, I wonder if we couldn't get
the same end result with a nearly one-line change that consists of
making disable_cost be IEEE infinity.

I don't think so, because I think that what will happen in that case
is that we'll pick a completely random plan if we can't pick a plan
that avoids incurring disable_cost. Every plan that contains one
disabled node anywhere in the plan tree will look like it has exactly
the same cost as any other such plan.

Good point.

IMHO, this is actually one of the problems with disable_cost as it
works today.

Yeah. I keep thinking that the right solution is to not generate
disabled paths in the first place if there are any other ways to
produce the same relation. That has obvious order-of-operations
problems though, and I've not been able to make it work.

regards, tom lane

#24Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#23)
Re: On disable_cost

On Tue, Mar 12, 2024 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah. I keep thinking that the right solution is to not generate
disabled paths in the first place if there are any other ways to
produce the same relation. That has obvious order-of-operations
problems though, and I've not been able to make it work.

I've expressed the same view in the past. It would be nice not to
waste planner effort on paths that we're just going to throw away, but
I'm not entirely sure what you mean by "obvious order-of-operations
problems."

To me, it seems like what we'd need is to be able to restart the whole
planner process if we run out of steam before we get done. For
example, suppose we're planning a 2-way join where index and
index-only scans are disabled, sorts are disabled, and nested loops
and hash joins are disabled. There's no problem generating just the
non-disabled scan types at the baserel level, but when we reach the
join, we're going to find that the only non-disabled join type is a
merge join, and we're also going to find that we have no paths that
provide pre-sorted input, so we need to sort, which we're also not
allowed to do. If we could give up at that point and restart planning,
disabling all of the plan-choice constraints and now creating all
paths for each RelOptInfo, then everything would, I believe, be just
fine. We'd end up needing neither disable_cost nor the mechanism
proposed by this patch.

But in the absence of that, we need some way to privilege the
non-disabled paths over the disabled ones -- and I'd prefer to have
something more principled than disable_cost, if we can work out the
details.

--
Robert Haas
EDB: http://www.enterprisedb.com

#25David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#24)
Re: On disable_cost

On Wed, 13 Mar 2024 at 08:55, Robert Haas <robertmhaas@gmail.com> wrote:

But in the absence of that, we need some way to privilege the
non-disabled paths over the disabled ones -- and I'd prefer to have
something more principled than disable_cost, if we can work out the
details.

The primary place I see issues with disabled_cost is caused by
STD_FUZZ_FACTOR. When you add 1.0e10 to a couple of modestly costly
paths, it makes them appear fuzzily the same in cases where one could
be significantly cheaper than the other. If we were to bump up the
disable_cost it would make this problem worse.
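
To make that concrete, here is a small standalone illustration (made-up costs) of how a 1.0e10 penalty swallows a 2x cost difference under the 1.01 fuzz factor (STD_FUZZ_FACTOR) that compare_path_costs_fuzzily() applies:

#include <stdio.h>

int
main(void)
{
    double      fuzz = 1.01;        /* STD_FUZZ_FACTOR */
    double      penalty = 1.0e10;   /* disable_cost */
    double      c1 = 1000.0;
    double      c2 = 2000.0;        /* twice as expensive as c1 */

    /* without the penalty, the paths are clearly distinguishable */
    printf("%d\n", c2 > c1 * fuzz);                         /* prints 1 */

    /* with both penalized, the 2x difference hides inside the fuzz */
    printf("%d\n", (c2 + penalty) > (c1 + penalty) * fuzz); /* prints 0 */

    return 0;
}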

I think we do still need some way to pick the cheapest disabled path
when there are no other options.

One way would be to set fuzz_factor to 1.0 when either of the paths
costs in compare_path_costs_fuzzily() is >= disable_cost.
clamp_row_est() does cap row estimates at MAXIMUM_ROWCOUNT (1e100), so
I think there is some value of disable_cost that could almost
certainly ensure we don't reach it because the path is truly expensive
rather than disabled.

So maybe the fix could be to set disable_cost to something like
1.0e110 and adjust compare_path_costs_fuzzily to not apply the
fuzz_factor for paths >= disable_cost. However, I wonder if that
risks the costs going infinite after a couple of cartesian joins.
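
The shape of that adjustment inside compare_path_costs_fuzzily() might be roughly the following; this is only a sketch of the idea, assuming disable_cost has been raised far enough that no genuine cost can reach it:

    /* hypothetical: don't let the fuzz factor mask differences between
     * two paths that both carry the disable_cost penalty */
    if (path1->total_cost >= disable_cost ||
        path2->total_cost >= disable_cost)
        fuzz_factor = 1.0;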

David

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#25)
Re: On disable_cost

David Rowley <dgrowleyml@gmail.com> writes:

So maybe the fix could be to set disable_cost to something like
1.0e110 and adjust compare_path_costs_fuzzily to not apply the
fuzz_factor for paths >= disable_cost. However, I wonder if that
risks the costs going infinite after a couple of cartesian joins.

Perhaps. It still does nothing for Robert's point that once we're
forced into using a "disabled" plan type, it'd be better if the
disabled-ness didn't skew subsequent planning choices.

On the whole I agree that getting rid of disable_cost entirely
would be the way to go, if we can replace that with a separate
boolean without driving up the cost of add_path too much.

regards, tom lane

#27Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#25)
Re: On disable_cost

On Tue, Mar 12, 2024 at 4:55 PM David Rowley <dgrowleyml@gmail.com> wrote:

The primary place I see issues with disabled_cost is caused by
STD_FUZZ_FACTOR. When you add 1.0e10 to a couple of modestly costly
paths, it makes them appear fuzzily the same in cases where one could
be significantly cheaper than the other. If we were to bump up the
disable_cost it would make this problem worse.

Hmm, good point.

So maybe the fix could be to set disable_cost to something like
1.0e110 and adjust compare_path_costs_fuzzily to not apply the
fuzz_factor for paths >= disable_cost. However, I wonder if that
risks the costs going infinite after a couple of cartesian joins.

Yeah, I think the disabled flag is a better answer if we can make it
work. No matter what value we pick for disable_cost, it's bound to
skew the planning sometimes.

--
Robert Haas
EDB: http://www.enterprisedb.com

#28Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
Re: On disable_cost

On Tue, Mar 12, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

1. You stated that it changes lots of plans in the regression tests,
but you haven't provided any sort of analysis of why those plans
changed. I'm kind of surprised that there would be "tons" of plan
changes. You (or someone) should look into why that's happening.

I've not read the patch, but given this description I would expect
there to be *zero* regression changes --- I don't think we have any
test cases that depend on disable_cost being finite. If there's more
than zero changes, that very likely indicates a bug in the patch.
Even if there are places where the output legitimately changes, you
need to justify each one and make sure that you didn't invalidate the
intent of that test case.

I spent some more time poking at this patch. It's missing a ton of
important stuff and is wrong in a whole bunch of really serious ways,
and I'm not going to try to mention all of them in this email. But I
do want to talk about some of the more interesting realizations that
came to me as I was working my way through this.

One of the things I realized relatively early is that the patch does
nothing to propagate disable_cost upward through the plan tree. That
means that if you have a choice between, say,
Sort-over-Append-over-SeqScan and MergeAppend-over-IndexScan, as we do
in the regression tests, disabling IndexScan doesn't change the plan
with the patch applied, as it does in master. That's because only the
IndexScan node ends up flagged as disabled. Once we start stacking
other plan nodes on top of disabled plan nodes, the resultant plans
are at no disadvantage compared to plans containing no disabled nodes.
The IndexScan plan survives initially, despite being disabled, because
it's got a sort order. That lets us use it to build a MergeAppend
path, and that MergeAppend path is not disabled, and compares
favorably on cost.

After straining my brain over various plan changes for a long time,
and hacking on the code somewhat, I realized that just propagating the
Boolean upward is insufficient to set things right. That's basically
because I was being dumb when I said this:

I don't think we should care how MANY disabled nodes appear in a
plan, particularly.

Suppose we try to plan a Nested Loop with SeqScan disabled, but
there's no alternative to a SeqScan for the outer side of the join. If
we suppose an upward-propagating Boolean, every path for the join is
disabled because every path for the outer side is disabled. That means
that we have no reason to avoid paths where the inner side also uses a
disabled path. When we loop over the inner rel's pathlist looking for
ways to build a path for the join, we may find some disabled paths
there, and the join paths we build from those paths are disabled, but
so are the join paths where we use a non-disabled path on the inner
side. So those paths are just competing with each other on cost, and
the path type that is supposedly disabled on the outer side of the
join ends up not really being very disabled at all. More precisely, if
disabling a plan type causes paths to be discarded completely before
the join paths are constructed, then they actually do get removed from
consideration. But if those paths make it into inner rel's path list,
even way out towards the end, then paths derived from them can jump to
the front of the joinrel's path list.

The same kind of problem happens with Append or MergeAppend nodes. The
regression tests expect that we'll avoid disabled plan types whenever
possible even if we can't avoid them completely; for instance, the
matest0 table intentionally omits an index on one child table.
Disabling sequential scans is expected to disable them for all of the
other child tables even though for that particular child table there
is no other option. But that behavior is hard to achieve if every path
for the parent rel is "equally disabled". You either want the path
that uses only the one required SeqScan to be not-disabled even though
one of its children is disabled ... or you want the disabled flag to
be more than a Boolean. And while there's probably more than one way
to make it work, the easiest thing seems to be to just have a
disabled-counter in every node that gets initialized to the total
disabled-counter values of all of its children, and then you add 1 if
that node is itself doing something that is disabled, i.e. the exact
opposite of what I said in the quote above.
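
A rough sketch of that counting scheme; the disabled_nodes field and the helper below are invented for illustration and are not part of the posted patch:

/*
 * Hypothetical: a join path's disabled-node count is the sum of its
 * children's counts, plus one if the join node itself uses a disabled method.
 */
static int
count_disabled_nodes(Path *outer_path, Path *inner_path, bool node_disabled)
{
    int         disabled_nodes = 0;

    if (outer_path)
        disabled_nodes += outer_path->disabled_nodes;
    if (inner_path)
        disabled_nodes += inner_path->disabled_nodes;
    if (node_disabled)
        disabled_nodes++;

    return disabled_nodes;
}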

I haven't done enough work to know whether that would match the
current behavior, let alone whether it would have acceptable
performance, and I'm not at all convinced that's the right direction
anyway. Actually, at the moment, I don't have a very good idea at all
what the right direction is. I do have a feeling that this patch is
probably not going in the right direction, but I might be wrong about
that, too.

--
Robert Haas
EDB: http://www.enterprisedb.com

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#28)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

One of the things I realized relatively early is that the patch does
nothing to propagate disable_cost upward through the plan tree.
...
After straining my brain over various plan changes for a long time,
and hacking on the code somewhat, I realized that just propagating the
Boolean upward is insufficient to set things right. That's basically
because I was being dumb when I said this:

I don't think we should care how MANY disabled nodes appear in a
plan, particularly.

Very interesting, thanks for the summary. So the fact that
disable_cost is additive across plan nodes is actually a pretty
important property of the current setup. I think this is closely
related to one argument you made against my upthread idea of using
IEEE Infinity for disable_cost: that'd mask whether more than one
of the sub-plans had been disabled.

... And while there's probably more than one way
to make it work, the easiest thing seems to be to just have a
disabled-counter in every node that gets initialized to the total
disabled-counter values of all of its children, and then you add 1 if
that node is itself doing something that is disabled, i.e. the exact
opposite of what I said in the quote above.

Yeah, that seems like the next thing to try if anyone plans to pursue
this further. That'd essentially do what we're doing now except that
disable_cost is its own "order of infinity", entirely separate from
normal costs.
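
In that reading the counter behaves as a higher-order component of the cost: compare it first, and only consult the ordinary costs when the counts match. A sketch, reusing the hypothetical disabled_nodes counter from above:

/* hypothetical comparison where disabled-ness is its own order of infinity */
static int
compare_paths_sketch(Path *p1, Path *p2)
{
    if (p1->disabled_nodes != p2->disabled_nodes)
        return (p1->disabled_nodes < p2->disabled_nodes) ? -1 : +1;
    if (p1->total_cost != p2->total_cost)
        return (p1->total_cost < p2->total_cost) ? -1 : +1;
    return 0;
}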

regards, tom lane

#30Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#29)
Re: On disable_cost

On Mon, Apr 1, 2024 at 5:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Very interesting, thanks for the summary. So the fact that
disable_cost is additive across plan nodes is actually a pretty
important property of the current setup. I think this is closely
related to one argument you made against my upthread idea of using
IEEE Infinity for disable_cost: that'd mask whether more than one
of the sub-plans had been disabled.

Yes, exactly. I just hadn't quite put the pieces together.

Yeah, that seems like the next thing to try if anyone plans to pursue
this further. That'd essentially do what we're doing now except that
disable_cost is its own "order of infinity", entirely separate from
normal costs.

Right. I think that's actually what I had in mind in the last
paragraph of /messages/by-id/CA+TgmoY+Ltw7B=1FSFSN4yHcu2roWrz-ijBovj-99LZU=9h1dA@mail.gmail.com
but that was a while ago and I'd lost track of why it actually
mattered. But I also have questions about whether that's really the
right approach.

I think the approach of just not generating paths we don't want in the
first place merits more consideration. We do that in some cases
already, but not in others, and I'm not clear why. Like, if
index-scans, index-only scans, sorts, nested loops, and hash joins are
disabled, something is going to have to give, because the only
remaining join type is a merge join yet we've ruled out every possible
way of getting the data into some order, but I'm not sure whether
there's some reason that we need exactly the behavior that we have
right now rather than anything else. Maybe it would be OK to just
insist on at least one unparameterized, non-partial path at the
baserel level, and then if that ends up forcing us to ignore the
join-type restrictions higher up, so be it. Or maybe that's not OK and
after I try that out I'll end up writing another email about how I was
a bit clueless about all of this. I don't know. But I feel like it
merits more investigation, because I'm having trouble shaking the
theory that what we've got right now is pretty arbitrary.

And also ... looking at the regression tests, and also thinking about
the kinds of problems that I think people run into in real
deployments, I can't help feeling like we've somehow got this whole
thing backwards. enable_wunk imagines that you want to plan as normal
except with one particular plan type excluded from consideration. And
maybe that makes sense if the point of the enable_wunk GUC is that the
planner feature might be buggy and you might therefore want to turn it
off to protect yourself, or if the planner feature might be expensive
and you might want to turn it off to save cycles. But surely that's
not the case with something like enable_seqscan or enable_indexscan.
What I think we're mostly doing in the regression tests is shutting
off every relevant type of plan except one. I theorize that what we
actually want to do is tell the planner what we do want to happen,
rather than what we don't want to happen, but we've got this weird set
of GUCs that do the opposite of that and we're super-attached to them
because they've existed forever. I don't really have a concrete
proposal here, but I wonder if what we're actually talking about here
is spending time and energy polishing a mechanism that nobody likes in
the first place.

--
Robert Haas
EDB: http://www.enterprisedb.com

#31Greg Sabino Mullane
htamfids@gmail.com
In reply to: Robert Haas (#30)
Re: On disable_cost

On Mon, Apr 1, 2024 at 7:54 PM Robert Haas <robertmhaas@gmail.com> wrote:

What I think we're mostly doing in the regression tests is shutting
off every relevant type of plan except one. I theorize that what we
actually want to do is tell the planner what we do want to happen,
rather than what we don't want to happen, but we've got this weird set
of GUCs that do the opposite of that and we're super-attached to them
because they've existed forever.

So rather than listing all the things we don't want to happen, we need a
way to force (nay, highly encourage) a particular solution. As our costing
is a based on positive numbers, what if we did something like this in
costsize.c?

Cost disable_cost = 1.0e10;
Cost promotion_cost = 1.0e10; // or higher or lower, depending on how strongly we want to "beat" disable_cost's effects
...

if (!enable_seqscan)
    startup_cost += disable_cost;
else if (promote_seqscan)
    startup_cost -= promotion_cost; // or replace "promote" with "encourage"?

Cheers,
Greg

#32Robert Haas
robertmhaas@gmail.com
In reply to: Greg Sabino Mullane (#31)
Re: On disable_cost

On Tue, Apr 2, 2024 at 10:04 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:

So rather than listing all the things we don't want to happen, we need a way to force (nay, highly encourage) a particular solution. As our costing is a based on positive numbers, what if we did something like this in costsize.c?

Cost disable_cost = 1.0e10;
Cost promotion_cost = 1.0e10; // or higher or lower, depending on how strongly we want to "beat" disable_cost's effects
...

if (!enable_seqscan)
    startup_cost += disable_cost;
else if (promote_seqscan)
    startup_cost -= promotion_cost; // or replace "promote" with "encourage"?

I'm pretty sure negative costs are going to create a variety of
unpleasant planning artifacts. The large positive costs do, too, which
is where this whole discussion started. If I disable (or promote) some
particular plan, I want the rest of the plan tree to come out looking
as much as possible like what would have happened if the same
alternative had won organically on cost. I think the only reason we're
driving this off of costing today is that making add_path() more
complicated is unappealing, mostly on performance grounds, and if you
add disabled-ness (or promoted-ness) as a separate axis of value then
add_path() has to know about that on top of everything else. I think
the goal here is to come up with a more principled alternative that
isn't just based on whacking large numbers into the cost and hoping
something good happens ... but it is a whole lot easier to be unhappy
with the status quo than it is to come up with something that's
actually better.

I am planning to spend some more time thinking about it, though.

--
Robert Haas
EDB: http://www.enterprisedb.com

#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#32)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 2, 2024 at 10:04 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:

if (!enable_seqscan)
startup_cost += disable_cost;
else if (promote_seqscan)
startup_cost -= promotion_cost; // or replace "promote" with "encourage"?

I'm pretty sure negative costs are going to create a variety of
unpleasant planning artifacts.

Indeed. It might be okay to have negative values for disabled-ness
if we treat disabled-ness as a "separate order of infinity", but
I suspect that it'd behave poorly when there are both disabled and
promoted sub-paths in a tree, for pretty much the same reasons you
explained just upthread.

I think the only reason we're
driving this off of costing today is that making add_path() more
complicated is unappealing, mostly on performance grounds, and if you
add disabled-ness (or promoted-ness) as a separate axis of value then
add_path() has to know about that on top of everything else.

It doesn't seem to me that it's a separate axis of value, just a
higher-order component of the cost metric. Nonetheless, adding even
a few instructions to add_path comparisons sounds expensive. Maybe
it'd be fine, but we'd need to do some performance testing.

regards, tom lane

#34Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#33)
Re: On disable_cost

On Tue, Apr 2, 2024 at 11:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm pretty sure negative costs are going to create a variety of
unpleasant planning artifacts.

Indeed. It might be okay to have negative values for disabled-ness
if we treat disabled-ness as a "separate order of infinity", but
I suspect that it'd behave poorly when there are both disabled and
promoted sub-paths in a tree, for pretty much the same reasons you
explained just upthread.

Hmm, can you explain further? I think essentially you'd be maximizing
#(promoted nodes) - #(disabled nodes), but I have no real idea whether
that behavior will be exactly what people want or extremely
unintuitive or something in the middle. It seems like it should be
fine if there's only promoting or only disabling or if we can respect
both the promoting and the disabling, assuming we even want to have
both, but I'm suspicious that it will be weird somehow in other cases.
I can't say exactly in what way, though. Do you have more insight?

I think the only reason we're
driving this off of costing today is that making add_path() more
complicated is unappealing, mostly on performance grounds, and if you
add disabled-ness (or promoted-ness) as a separate axis of value then
add_path() has to know about that on top of everything else.

It doesn't seem to me that it's a separate axis of value, just a
higher-order component of the cost metric. Nonetheless, adding even
a few instructions to add_path comparisons sounds expensive. Maybe
it'd be fine, but we'd need to do some performance testing.

Hmm, yeah. I'm not sure how much difference there is between these
things in practice. I didn't run down everything that was happening,
but I think what I did was equivalent to making it a higher-order
component of the cost metric, and it seemed like an awful lot of paths
were surviving anyway, e.g. index scans survived
enable_indexscan=false because they had a sort order, and I think
sequential scans were surviving enable_seqscan=false too, perhaps
because they had no startup cost. At any rate there's no question that
add_path() is hot.

--
Robert Haas
EDB: http://www.enterprisedb.com

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#34)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Apr 2, 2024 at 11:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I suspect that it'd behave poorly when there are both disabled and
promoted sub-paths in a tree, for pretty much the same reasons you
explained just upthread.

Hmm, can you explain further? I think essentially you'd be maximizing
#(promoted nodes) - #(disabled nodes), but I have no real idea whether
that behavior will be exactly what people want or extremely
unintuitive or something in the middle. It seems like it should be
fine if there's only promoting or only disabling or if we can respect
both the promoting and the disabling, assuming we even want to have
both, but I'm suspicious that it will be weird somehow in other cases.
I can't say exactly in what way, though. Do you have more insight?

Not really. But if you had, say, a join of a promoted path to a
disabled path, that would be treated as on-par with a join of two
regular paths, which seems like it'd lead to odd choices. Maybe
it'd be fine, but my gut says it'd likely not act nicely. As you
say, it's a lot easier to believe that only-promoted or only-disabled
situations would behave sanely.

regards, tom lane

#36Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#35)
1 attachment(s)
Re: On disable_cost

On Tue, Apr 2, 2024 at 12:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Not really. But if you had, say, a join of a promoted path to a
disabled path, that would be treated as on-par with a join of two
regular paths, which seems like it'd lead to odd choices. Maybe
it'd be fine, but my gut says it'd likely not act nicely. As you
say, it's a lot easier to believe that only-promoted or only-disabled
situations would behave sanely.

Makes sense.

I wanted to further explore the idea of just not generating plans of
types that are currently disabled. I looked into doing this for
enable_indexscan and enable_indexonlyscan. As a first step, I
investigated how those settings work now, and was horrified. I don't
know whether I just wasn't paying attention back when the original
index-only scan work was done -- I remember discussing
enable_indexonlyscan with you at the time -- or whether it got changed
subsequently. Anyway, the current behavior is:

[A] enable_indexscan=false adds disable_cost to the cost of every
Index Scan path *and also* every Index-Only Scan path. So disabling
index-scans also in effect discourages the use of index-only scans,
which would make sense if we didn't have a separate setting called
enable_indexonlyscan, but we do. Given that, I think this is
completely and utterly wrong.

[B] enable_indexonlyscan=false causes index-only scan paths not to be
generated at all; instead, we generate index-scan paths, which we would
not otherwise have generated, to do the same thing. This is weird
because it means that disabling one plan type causes us to consider
additional plans of another type, which seems like a thing that a user
might not expect. It's more defensible than [A], though, because you
could argue that we only omit the index scan path as an optimization,
on the theory that it will always lose to the index-only scan path,
and thus if the index-only scan path is not generated, there's a point
to generating the index scan path after all, so we should. However, it
seems unlikely to me that someone reading the one line of
documentation that we have about this parameter would be able to guess
that it works this way.

Here's an example of how the current system behaves:

robert.haas=# explain select count(*) from pgbench_accounts;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=2854.29..2854.30 rows=1 width=8)
-> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts
(cost=0.29..2604.29 rows=100000 width=0)
(2 rows)

robert.haas=# set enable_indexscan=false;
SET
robert.haas=# explain select count(*) from pgbench_accounts;
QUERY PLAN
------------------------------------------------------------------------------
Aggregate (cost=2890.00..2890.01 rows=1 width=8)
-> Seq Scan on pgbench_accounts (cost=0.00..2640.00 rows=100000 width=0)
(2 rows)

robert.haas=# set enable_seqscan=false;
SET
robert.haas=# set enable_bitmapscan=false;
SET
robert.haas=# explain select count(*) from pgbench_accounts;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=10000002854.29..10000002854.30 rows=1 width=8)
-> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts
(cost=10000000000.29..10000002604.29 rows=100000 width=0)
(2 rows)

robert.haas=# set enable_indexonlyscan=false;
SET
robert.haas=# explain select count(*) from pgbench_accounts;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Aggregate (cost=10000002890.00..10000002890.01 rows=1 width=8)
-> Seq Scan on pgbench_accounts
(cost=10000000000.00..10000002640.00 rows=100000 width=0)
(2 rows)

The first time we run the query, it picks an index-only scan because
it's the cheapest. When index scans are disabled, the query now picks
a sequential scan, even though it wasn't using an index scan,
because the index-only scan that it was using is perceived to have become
very expensive. When we then shut off sequential scans and bitmap
scans, it switches back to an index-only scan, because setting
enable_indexscan=false didn't completely disable index-only scans, but
just made them expensive. But now everything looks expensive, so we go
back to the same plan we had initially, except with the cost increased
by a bazillion. Finally, when we disable index-only scans, that
removes that plan from the pool, so now we pick the second-cheapest
plan overall, which in this case is a sequential scan.

So just to see what would happen, I wrote a patch to make
enable_indexscan and enable_indexonlyscan do exactly what they say on
the tin: when you set one of them to false, paths of that type are not
generated, and nothing else changes. I found that there are a
surprisingly large number of regression tests that rely on the current
behavior, so I took a crack at fixing them to achieve their goals (or
what I believed their goals to be) in other ways. The resulting patch
is attached for your (or anyone's) possible edification.

Just to be clear, I have no immediate plans to press forward with
trying to get something committed here. It seems pretty clear to me
that we should fix [A] in some way, but maybe not in the way I did it
here. It's also pretty clear to me that the fact that enable_indexscan
and enable_indexonlyscan work completely differently from each other
is surprising at best, wrong at worst, but here again, what this patch
does about that is not above reproach. I think it may make sense to
dig through the behavior of some of the remaining enable_* GUCs before
settling on a final strategy here, but I thought that the findings above
were interesting enough and bizarre enough that it made sense to drop
an email now and see what people think of all this before going
further.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v1-0001-Rationalize-the-behavior-of-enable_indexscan-and-.patch (application/octet-stream)
From d9a81022a52c3cc2ab010aa8d2b12a164a4678b7 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 3 Apr 2024 14:27:23 -0400
Subject: [PATCH v1] Rationalize the behavior of enable_indexscan and
 enable_indexonlyscan.

Previously, setting enable_indexscan=false added disable_cost to the
cost of both index scans and index-only scans. It doesn't make sense
for enable_indexscan to affect whether index-only scans are chosen
given that we also have a GUC called enable_indexonlyscan.

Also previously, enable_indexonlyscan worked in a completely different
manner than enable_indexscan. Rather than adding disable_cost anywhere,
enable_indexonlyscan=false caused the planner to consider an index-scan
plan in each case where, without that setting, an index-only scan would
have been considered. It doesn't make sense for enable_indexonlyscan to
work in a completely different way than enable_indexscan.

Accordingly, revise the implementation so that when
enable_indexscan=false or enable_indexonlyscan=false, any paths of the
corresponding types that would have been generated are not generated,
and no new paths are considered that would not have been considered
otherwise. This fixes both of the problems described above.

A surprising number of regression tests depend on the old behavior
in various ways, especially on the fact that enable_indexonlyscan=false
caused additional index scan plans to be considered. Hence, adapt
the tests to the new behavior while preserving, as far as I'm able to
understand it, the intent of the tests.
---
 contrib/btree_gist/expected/interval.out      | 16 ++--
 contrib/btree_gist/sql/interval.sql           |  7 +-
 src/backend/optimizer/path/costsize.c         |  4 -
 src/backend/optimizer/path/indxpath.c         | 20 +++--
 src/test/regress/expected/btree_index.out     |  2 +
 src/test/regress/expected/create_index.out    |  9 ++-
 src/test/regress/expected/mvcc.out            |  7 +-
 src/test/regress/expected/partition_prune.out | 75 +++++++++++++++----
 src/test/regress/expected/select.out          |  2 +
 src/test/regress/expected/select_parallel.out |  2 +
 src/test/regress/expected/stats.out           | 23 ++++--
 src/test/regress/expected/tuplesort.out       |  2 +
 src/test/regress/expected/union.out           | 37 +++++----
 src/test/regress/sql/btree_index.sql          |  4 +
 src/test/regress/sql/create_index.sql         |  8 +-
 src/test/regress/sql/mvcc.sql                 |  7 +-
 src/test/regress/sql/partition_prune.sql      | 17 +++--
 src/test/regress/sql/select.sql               |  2 +
 src/test/regress/sql/select_parallel.sql      |  2 +
 src/test/regress/sql/stats.sql                | 17 +++--
 src/test/regress/sql/tuplesort.sql            |  2 +
 src/test/regress/sql/union.sql                | 18 ++---
 22 files changed, 184 insertions(+), 99 deletions(-)

diff --git a/contrib/btree_gist/expected/interval.out b/contrib/btree_gist/expected/interval.out
index 4c3d494e4a..a62b81e36f 100644
--- a/contrib/btree_gist/expected/interval.out
+++ b/contrib/btree_gist/expected/interval.out
@@ -89,9 +89,9 @@ SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21
  @ 220 days 19 hours 5 mins 42 secs  | @ 21 days -2 hours -15 mins -41 secs
 (3 rows)
 
-SET enable_indexonlyscan=off;
+-- prevent index-only scan
 EXPLAIN (COSTS OFF)
-SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
+SELECT a, a <-> '199 days 21:21:23', substr(xmin::text, 0, 1) FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
                                 QUERY PLAN                                 
 ---------------------------------------------------------------------------
  Limit
@@ -99,11 +99,11 @@ SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21
          Order By: (a <-> '@ 199 days 21 hours 21 mins 23 secs'::interval)
 (3 rows)
 
-SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
-                  a                  |               ?column?               
--------------------------------------+--------------------------------------
- @ 199 days 21 hours 21 mins 23 secs | @ 0
- @ 183 days 6 hours 52 mins 48 secs  | @ 16 days 14 hours 28 mins 35 secs
- @ 220 days 19 hours 5 mins 42 secs  | @ 21 days -2 hours -15 mins -41 secs
+SELECT a, a <-> '199 days 21:21:23', substr(xmin::text, 0, 1) FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
+                  a                  |               ?column?               | substr 
+-------------------------------------+--------------------------------------+--------
+ @ 199 days 21 hours 21 mins 23 secs | @ 0                                  | 
+ @ 183 days 6 hours 52 mins 48 secs  | @ 16 days 14 hours 28 mins 35 secs   | 
+ @ 220 days 19 hours 5 mins 42 secs  | @ 21 days -2 hours -15 mins -41 secs | 
 (3 rows)
 
diff --git a/contrib/btree_gist/sql/interval.sql b/contrib/btree_gist/sql/interval.sql
index 346d6adcb5..493e751959 100644
--- a/contrib/btree_gist/sql/interval.sql
+++ b/contrib/btree_gist/sql/interval.sql
@@ -36,8 +36,7 @@ EXPLAIN (COSTS OFF)
 SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
 SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
 
-SET enable_indexonlyscan=off;
-
+-- prevent index-only scan
 EXPLAIN (COSTS OFF)
-SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
-SELECT a, a <-> '199 days 21:21:23' FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
+SELECT a, a <-> '199 days 21:21:23', substr(xmin::text, 0, 1) FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
+SELECT a, a <-> '199 days 21:21:23', substr(xmin::text, 0, 1) FROM intervaltmp ORDER BY a <-> '199 days 21:21:23' LIMIT 3;
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ee23ed7835..b71f720b06 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -603,10 +603,6 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 											  path->indexclauses);
 	}
 
-	if (!enable_indexscan)
-		startup_cost += disable_cost;
-	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
-
 	/*
 	 * Call index-access-method-specific code to estimate the processing cost
 	 * for scanning the index, as well as the selectivity of the index (ie,
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 32c6a8bbdc..5ec5b69c51 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -765,7 +765,13 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		IndexPath  *ipath = (IndexPath *) lfirst(lc);
 
 		if (index->amhasgettuple)
-			add_path(rel, (Path *) ipath);
+		{
+			if (ipath->path.pathtype == T_IndexScan && enable_indexscan)
+				add_path(rel, (Path *) ipath);
+			else if (ipath->path.pathtype == T_IndexOnlyScan &&
+				enable_indexonlyscan)
+				add_path(rel, (Path *) ipath);
+		}
 
 		if (index->amhasgetbitmap &&
 			(ipath->path.pathkeys == NIL ||
@@ -862,6 +868,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		case ST_INDEXSCAN:
 			if (!index->amhasgettuple)
 				return NIL;
+			if (!enable_indexscan && !enable_indexonlyscan)
+				return NIL;
 			break;
 		case ST_BITMAPSCAN:
 			if (!index->amhasgetbitmap)
@@ -1031,7 +1039,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		 */
 		if (index->amcanparallel &&
 			rel->consider_parallel && outer_relids == NULL &&
-			scantype != ST_BITMAPSCAN)
+			scantype != ST_BITMAPSCAN &&
+			(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 		{
 			ipath = create_index_path(root, index,
 									  index_clauses,
@@ -1081,7 +1090,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 			/* If appropriate, consider parallel index scan */
 			if (index->amcanparallel &&
 				rel->consider_parallel && outer_relids == NULL &&
-				scantype != ST_BITMAPSCAN)
+				scantype != ST_BITMAPSCAN &&
+				(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 			{
 				ipath = create_index_path(root, index,
 										  index_clauses,
@@ -1789,10 +1799,6 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
 	ListCell   *lc;
 	int			i;
 
-	/* Index-only scans must be enabled */
-	if (!enable_indexonlyscan)
-		return false;
-
 	/*
 	 * Check that all needed attributes of the relation are available from the
 	 * index.
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 8311a03c3d..115f7c46f8 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -238,6 +238,7 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 (2 rows)
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -278,6 +279,7 @@ select proname from pg_proc where proname ilike '00%foo' order by 1;
 ---------
 (0 rows)
 
+reset enable_indexonlyscan;
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                            QUERY PLAN                            
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 70ab47a92f..551a8b0331 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -618,6 +618,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 EXPLAIN (COSTS OFF)
 SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
@@ -643,6 +644,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 --
 -- GIN over int[] and text[]
@@ -1952,10 +1954,10 @@ ORDER BY thousand;
         1 |     1001
 (2 rows)
 
-SET enable_indexonlyscan = OFF;
 explain (costs off)
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
+AND two IS NOT DISTINCT FROM two
 ORDER BY thousand;
                                       QUERY PLAN                                      
 --------------------------------------------------------------------------------------
@@ -1963,10 +1965,12 @@ ORDER BY thousand;
    Sort Key: thousand
    ->  Index Scan using tenk1_thous_tenthous on tenk1
          Index Cond: ((thousand < 2) AND (tenthous = ANY ('{1001,3000}'::integer[])))
-(4 rows)
+         Filter: (NOT (two IS DISTINCT FROM two))
+(5 rows)
 
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
+AND two IS NOT DISTINCT FROM two
 ORDER BY thousand;
  thousand | tenthous 
 ----------+----------
@@ -1974,7 +1978,6 @@ ORDER BY thousand;
         1 |     1001
 (2 rows)
 
-RESET enable_indexonlyscan;
 --
 -- Check elimination of constant-NULL subexpressions
 --
diff --git a/src/test/regress/expected/mvcc.out b/src/test/regress/expected/mvcc.out
index 225c39f64f..38802f89f3 100644
--- a/src/test/regress/expected/mvcc.out
+++ b/src/test/regress/expected/mvcc.out
@@ -8,7 +8,6 @@
 -- this.
 BEGIN;
 SET LOCAL enable_seqscan = false;
-SET LOCAL enable_indexonlyscan = false;
 SET LOCAL enable_bitmapscan = false;
 -- Can't easily use a unique index, since dead tuples can be found
 -- independent of the kill_prior_tuples optimization.
@@ -22,8 +21,10 @@ BEGIN
     -- iterate often enough to see index growth even on larger-than-default page sizes
     FOR i IN 1..100 LOOP
         BEGIN
-	    -- perform index scan over all the inserted keys to get them to be seen as dead
-            IF EXISTS(SELECT * FROM clean_aborted_self WHERE key > 0 AND key < 100) THEN
+	    -- perform index scan over all the inserted keys to get them to be seen
+	    -- as dead; we mention an unindexed column here so that the planner
+	    -- cannot believe that an index-only scan is possible
+            IF EXISTS(SELECT * FROM clean_aborted_self WHERE key > 0 AND key < 100 AND data IS NOT DISTINCT FROM data) THEN
 	        RAISE data_corrupted USING MESSAGE = 'these rows should not exist';
             END IF;
             INSERT INTO clean_aborted_self SELECT g.i, 'rolling back in a sec' FROM generate_series(1, 100) g(i);
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 46b78ba3c4..eeaba2ebd3 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -2480,7 +2480,7 @@ where c.relname like 'ab\_%' order by c.relname;
  ab_a3_b3_a_idx |        1 |         0 | t          |                  |                  
 (21 rows)
 
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(0, 0, 1)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(0, 0, 1) and ab.b in (1, 2, 3)');
                                         explain_parallel_append                                         
 --------------------------------------------------------------------------------------------------------
  Finalize Aggregate (actual rows=N loops=N)
@@ -2494,27 +2494,36 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
                      ->  Append (actual rows=N loops=N)
                            ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 ab_1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 ab_2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed)
                                  Index Cond: (a = a.a)
-(27 rows)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
+(36 rows)
 
 -- Ensure the same partitions are pruned when we make the nested loop
 -- parameter an Expr rather than a plain Param.
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a + 0 where a.a in(0, 0, 1)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a + 0 where a.a in(0, 0, 1) and ab.b in (1, 2, 3)');
                                         explain_parallel_append                                         
 --------------------------------------------------------------------------------------------------------
  Finalize Aggregate (actual rows=N loops=N)
@@ -2528,26 +2537,35 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
                      ->  Append (actual rows=N loops=N)
                            ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 ab_1 (actual rows=N loops=N)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 ab_2 (actual rows=N loops=N)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed)
                                  Index Cond: (a = (a.a + 0))
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed)
                                  Index Cond: (a = (a.a + 0))
-(27 rows)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
+(36 rows)
 
 insert into lprt_a values(3),(3);
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 3)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 3) and ab.b in (1, 2, 3)');
                                         explain_parallel_append                                         
 --------------------------------------------------------------------------------------------------------
  Finalize Aggregate (actual rows=N loops=N)
@@ -2561,25 +2579,34 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
                      ->  Append (actual rows=N loops=N)
                            ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 ab_1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 ab_2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
-(27 rows)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
+(36 rows)
 
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0) and ab.b in (1, 2, 3)');
                                         explain_parallel_append                                         
 --------------------------------------------------------------------------------------------------------
  Finalize Aggregate (actual rows=N loops=N)
@@ -2594,26 +2621,35 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
                      ->  Append (actual rows=N loops=N)
                            ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 ab_1 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 ab_2 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed)
                                  Index Cond: (a = a.a)
-(28 rows)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
+(37 rows)
 
 delete from lprt_a where a = 1;
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0) and ab.b in (1, 2, 3)');
                                      explain_parallel_append                                     
 -------------------------------------------------------------------------------------------------
  Finalize Aggregate (actual rows=N loops=N)
@@ -2628,23 +2664,32 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on
                      ->  Append (actual rows=N loops=N)
                            ->  Index Scan using ab_a1_b1_a_idx on ab_a1_b1 ab_1 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b2_a_idx on ab_a1_b2 ab_2 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed)
                                  Index Cond: (a = a.a)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
                            ->  Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed)
                                  Index Cond: (a = a.a)
-(28 rows)
+                                 Filter: (b = ANY ('{1,2,3}'::integer[]))
+(37 rows)
 
 reset enable_hashjoin;
 reset enable_mergejoin;
@@ -2978,7 +3023,7 @@ drop table ab, lprt_a;
 create table tbl1(col1 int);
 insert into tbl1 values (501), (505);
 -- Basic table
-create table tprt (col1 int) partition by range (col1);
+create table tprt (col1 int, col2 int) partition by range (col1);
 create table tprt_1 partition of tprt for values from (1) to (501);
 create table tprt_2 partition of tprt for values from (501) to (1001);
 create table tprt_3 partition of tprt for values from (1001) to (2001);
@@ -2991,7 +3036,8 @@ create index tprt3_idx on tprt_3 (col1);
 create index tprt4_idx on tprt_4 (col1);
 create index tprt5_idx on tprt_5 (col1);
 create index tprt6_idx on tprt_6 (col1);
-insert into tprt values (10), (20), (501), (502), (505), (1001), (4500);
+insert into tprt values (10, 0), (20, 0), (501, 0), (502, 0), (505, 0),
+	(1001, 0), (4500, 0);
 set enable_hashjoin = off;
 set enable_mergejoin = off;
 explain (analyze, costs off, summary off, timing off)
@@ -3584,7 +3630,7 @@ explain (analyze, verbose, costs off, summary off, timing off) execute mt_q2 (35
 
 deallocate mt_q2;
 -- ensure initplan params properly prune partitions
-explain (analyze, costs off, summary off, timing off) select * from ma_test where a >= (select min(b) from ma_test_p2) order by b;
+explain (analyze, costs off, summary off, timing off) select * from ma_test where a >= (select min(b) from ma_test_p2 where a >= 0) order by b;
                                           QUERY PLAN                                           
 -----------------------------------------------------------------------------------------------
  Merge Append (actual rows=20 loops=1)
@@ -3595,13 +3641,14 @@ explain (analyze, costs off, summary off, timing off) select * from ma_test wher
              ->  Limit (actual rows=1 loops=1)
                    ->  Index Scan using ma_test_p2_b_idx on ma_test_p2 (actual rows=1 loops=1)
                          Index Cond: (b IS NOT NULL)
+                         Filter: (a >= 0)
    ->  Index Scan using ma_test_p1_b_idx on ma_test_p1 ma_test_1 (never executed)
          Filter: (a >= (InitPlan 2).col1)
    ->  Index Scan using ma_test_p2_b_idx on ma_test_p2 ma_test_2 (actual rows=10 loops=1)
          Filter: (a >= (InitPlan 2).col1)
    ->  Index Scan using ma_test_p3_b_idx on ma_test_p3 ma_test_3 (actual rows=10 loops=1)
          Filter: (a >= (InitPlan 2).col1)
-(14 rows)
+(15 rows)
 
 reset enable_seqscan;
 reset enable_sort;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0e..5be4ac94a2 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -844,6 +844,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
                          QUERY PLAN                          
@@ -861,6 +862,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
 (1 row)
 
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 -- check multi-index cases too
 explain (costs off)
 select unique1, unique2 from onek2
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 4ffc5b4c56..8a5235358b 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -497,6 +497,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -597,6 +598,7 @@ select * from explain_parallel_sort_stats();
 (14 rows)
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 6e08898b18..f864878869 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -92,8 +92,9 @@ SELECT count(*) FROM tenk2;
 
 -- do an indexscan
 -- make sure it is not a bitmap scan, which might skip fetching heap tuples
+-- mention an unrelated column to forestall an index-only scan
 SET enable_bitmapscan TO off;
-SELECT count(*) FROM tenk2 WHERE unique1 = 1;
+SELECT count(*) FROM tenk2 WHERE unique1 = 1 and two is not distinct from two;
  count 
 -------
      1
@@ -593,6 +594,7 @@ SELECT seq_scan, idx_scan FROM pg_stat_all_tables WHERE relid = 'test_last_scan'
 (1 row)
 
 -- ensure we start out with exactly one index and sequential scan
+-- as above, mention another column to forestall an index-only scan
 BEGIN;
 SET LOCAL enable_seqscan TO on;
 SET LOCAL enable_indexscan TO on;
@@ -612,15 +614,18 @@ SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
 (1 row)
 
 SET LOCAL enable_seqscan TO off;
-EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
                           QUERY PLAN                          
 --------------------------------------------------------------
  Aggregate
    ->  Index Scan using test_last_scan_pkey on test_last_scan
          Index Cond: (idx_col = 1)
-(3 rows)
+         Filter: (NOT (noidx_col IS DISTINCT FROM noidx_col))
+(4 rows)
 
-SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
  count 
 -------
      1
@@ -686,19 +691,23 @@ SELECT pg_sleep(0.1);
 (1 row)
 
 -- cause one index scan
+-- as above, mention another column to forestall an index-only scan
 BEGIN;
 SET LOCAL enable_seqscan TO off;
 SET LOCAL enable_indexscan TO on;
 SET LOCAL enable_bitmapscan TO off;
-EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
                           QUERY PLAN                          
 --------------------------------------------------------------
  Aggregate
    ->  Index Scan using test_last_scan_pkey on test_last_scan
          Index Cond: (idx_col = 1)
-(3 rows)
+         Filter: (NOT (noidx_col IS DISTINCT FROM noidx_col))
+(4 rows)
 
-SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
  count 
 -------
      1
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 0e8b5bf4a3..eeea0f7926 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -349,6 +349,7 @@ ROLLBACK;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -445,6 +446,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 26b718e903..05ff08582e 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -1129,12 +1129,12 @@ INSERT INTO t1c VALUES ('v', 'w'), ('c', 'd'), ('m', 'n'), ('e', 'f');
 INSERT INTO t2c VALUES ('vw'), ('cd'), ('mn'), ('ef');
 CREATE INDEX t1c_ab_idx on t1c ((a || b));
 set enable_seqscan = on;
-set enable_indexonlyscan = off;
+-- force xmin to be fetched to avoid an index-only scan
 explain (costs off)
   SELECT * FROM
-  (SELECT a || b AS ab FROM t1
+  (SELECT a || b AS ab, substr(xmin::text, 1, 0) AS c FROM t1
    UNION ALL
-   SELECT ab FROM t2) t
+   SELECT ab, substr(xmin::text, 1, 0) FROM t2) t
   ORDER BY 1 LIMIT 8;
                      QUERY PLAN                      
 -----------------------------------------------------
@@ -1148,20 +1148,20 @@ explain (costs off)
 (7 rows)
 
   SELECT * FROM
-  (SELECT a || b AS ab FROM t1
+  (SELECT a || b AS ab, substr(xmin::text, 1, 0) AS c FROM t1
    UNION ALL
-   SELECT ab FROM t2) t
+   SELECT ab, substr(xmin::text, 1, 0) FROM t2) t
   ORDER BY 1 LIMIT 8;
- ab 
-----
- ab
- ab
- cd
- dc
- ef
- fe
- mn
- nm
+ ab | c 
+----+---
+ ab | 
+ ab | 
+ cd | 
+ dc | 
+ ef | 
+ fe | 
+ mn | 
+ nm | 
 (8 rows)
 
 reset enable_seqscan;
@@ -1173,10 +1173,10 @@ create table events (event_id int primary key);
 create table other_events (event_id int primary key);
 create table events_child () inherits (events);
 explain (costs off)
-select event_id
- from (select event_id from events
+select event_id, xmin
+ from (select event_id, xmin from events
        union all
-       select event_id from other_events) ss
+       select event_id, xmin from other_events) ss
  order by event_id;
                         QUERY PLAN                        
 ----------------------------------------------------------
@@ -1190,7 +1190,6 @@ select event_id
 (7 rows)
 
 drop table events_child, events, other_events;
-reset enable_indexonlyscan;
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
diff --git a/src/test/regress/sql/btree_index.sql b/src/test/regress/sql/btree_index.sql
index ef84354234..0dfb3e6312 100644
--- a/src/test/regress/sql/btree_index.sql
+++ b/src/test/regress/sql/btree_index.sql
@@ -153,6 +153,7 @@ explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -160,6 +161,9 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
 explain (costs off)
 select proname from pg_proc where proname ilike '00%foo' order by 1;
 select proname from pg_proc where proname ilike '00%foo' order by 1;
+
+reset enable_indexonlyscan;
+
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index d49ce9f300..490e402f4d 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -246,6 +246,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 
 EXPLAIN (COSTS OFF)
@@ -254,6 +255,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 
 --
@@ -774,19 +776,17 @@ SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
 
-SET enable_indexonlyscan = OFF;
-
 explain (costs off)
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
+AND two IS NOT DISTINCT FROM two
 ORDER BY thousand;
 
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
+AND two IS NOT DISTINCT FROM two
 ORDER BY thousand;
 
-RESET enable_indexonlyscan;
-
 --
 -- Check elimination of constant-NULL subexpressions
 --
diff --git a/src/test/regress/sql/mvcc.sql b/src/test/regress/sql/mvcc.sql
index 0a3ebc88f3..df4b197966 100644
--- a/src/test/regress/sql/mvcc.sql
+++ b/src/test/regress/sql/mvcc.sql
@@ -9,7 +9,6 @@
 BEGIN;
 
 SET LOCAL enable_seqscan = false;
-SET LOCAL enable_indexonlyscan = false;
 SET LOCAL enable_bitmapscan = false;
 
 -- Can't easily use a unique index, since dead tuples can be found
@@ -26,8 +25,10 @@ BEGIN
     -- iterate often enough to see index growth even on larger-than-default page sizes
     FOR i IN 1..100 LOOP
         BEGIN
-	    -- perform index scan over all the inserted keys to get them to be seen as dead
-            IF EXISTS(SELECT * FROM clean_aborted_self WHERE key > 0 AND key < 100) THEN
+	    -- perform index scan over all the inserted keys to get them to be seen
+	    -- as dead; we mention an unindexed column here so that the planner
+	    -- cannot believe that an index-only scan is possible
+            IF EXISTS(SELECT * FROM clean_aborted_self WHERE key > 0 AND key < 100 AND data IS NOT DISTINCT FROM data) THEN
 	        RAISE data_corrupted USING MESSAGE = 'these rows should not exist';
             END IF;
             INSERT INTO clean_aborted_self SELECT g.i, 'rolling back in a sec' FROM generate_series(1, 100) g(i);
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index dc71693861..25f5142b84 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -614,20 +614,20 @@ left join pg_stat_all_tables s on c.oid = s.relid
 left join pg_index i on c.oid = i.indexrelid
 where c.relname like 'ab\_%' order by c.relname;
 
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(0, 0, 1)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(0, 0, 1) and ab.b in (1, 2, 3)');
 
 -- Ensure the same partitions are pruned when we make the nested loop
 -- parameter an Expr rather than a plain Param.
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a + 0 where a.a in(0, 0, 1)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a + 0 where a.a in(0, 0, 1) and ab.b in (1, 2, 3)');
 
 insert into lprt_a values(3),(3);
 
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 3)');
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 3) and ab.b in (1, 2, 3)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0) and ab.b in (1, 2, 3)');
 
 delete from lprt_a where a = 1;
 
-select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)');
+select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0) and ab.b in (1, 2, 3)');
 
 reset enable_hashjoin;
 reset enable_mergejoin;
@@ -708,7 +708,7 @@ create table tbl1(col1 int);
 insert into tbl1 values (501), (505);
 
 -- Basic table
-create table tprt (col1 int) partition by range (col1);
+create table tprt (col1 int, col2 int) partition by range (col1);
 create table tprt_1 partition of tprt for values from (1) to (501);
 create table tprt_2 partition of tprt for values from (501) to (1001);
 create table tprt_3 partition of tprt for values from (1001) to (2001);
@@ -723,7 +723,8 @@ create index tprt4_idx on tprt_4 (col1);
 create index tprt5_idx on tprt_5 (col1);
 create index tprt6_idx on tprt_6 (col1);
 
-insert into tprt values (10), (20), (501), (502), (505), (1001), (4500);
+insert into tprt values (10, 0), (20, 0), (501, 0), (502, 0), (505, 0),
+	(1001, 0), (4500, 0);
 
 set enable_hashjoin = off;
 set enable_mergejoin = off;
@@ -963,7 +964,7 @@ explain (analyze, verbose, costs off, summary off, timing off) execute mt_q2 (35
 deallocate mt_q2;
 
 -- ensure initplan params properly prune partitions
-explain (analyze, costs off, summary off, timing off) select * from ma_test where a >= (select min(b) from ma_test_p2) order by b;
+explain (analyze, costs off, summary off, timing off) select * from ma_test where a >= (select min(b) from ma_test_p2 where a >= 0) order by b;
 
 reset enable_seqscan;
 reset enable_sort;
diff --git a/src/test/regress/sql/select.sql b/src/test/regress/sql/select.sql
index 019f1e7673..e5f41b649a 100644
--- a/src/test/regress/sql/select.sql
+++ b/src/test/regress/sql/select.sql
@@ -218,10 +218,12 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 -- check multi-index cases too
 explain (costs off)
 select unique1, unique2 from onek2
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index c43a5b2119..ad5c734279 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -193,6 +193,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -240,6 +241,7 @@ $$;
 select * from explain_parallel_sort_stats();
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index d8ac0d06f4..c5a8b18db5 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -94,8 +94,9 @@ ROLLBACK;
 SELECT count(*) FROM tenk2;
 -- do an indexscan
 -- make sure it is not a bitmap scan, which might skip fetching heap tuples
+-- mention an unrelated column to forestall an index-only scan
 SET enable_bitmapscan TO off;
-SELECT count(*) FROM tenk2 WHERE unique1 = 1;
+SELECT count(*) FROM tenk2 WHERE unique1 = 1 and two is not distinct from two;
 RESET enable_bitmapscan;
 
 -- ensure pending stats are flushed
@@ -310,6 +311,7 @@ SELECT pg_stat_reset_single_table_counters('test_last_scan'::regclass);
 SELECT seq_scan, idx_scan FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
 
 -- ensure we start out with exactly one index and sequential scan
+-- as above, mention another column to forestall an index-only scan
 BEGIN;
 SET LOCAL enable_seqscan TO on;
 SET LOCAL enable_indexscan TO on;
@@ -317,8 +319,10 @@ SET LOCAL enable_bitmapscan TO off;
 EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
 SELECT count(*) FROM test_last_scan WHERE noidx_col = 1;
 SET LOCAL enable_seqscan TO off;
-EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
-SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
 SELECT pg_stat_force_next_flush();
 COMMIT;
 
@@ -346,12 +350,15 @@ FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass \gset
 SELECT pg_sleep(0.1);
 
 -- cause one index scan
+-- as above, mention another column to forestall an index-only scan
 BEGIN;
 SET LOCAL enable_seqscan TO off;
 SET LOCAL enable_indexscan TO on;
 SET LOCAL enable_bitmapscan TO off;
-EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
-SELECT count(*) FROM test_last_scan WHERE idx_col = 1;
+EXPLAIN (COSTS off) SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
+SELECT count(*) FROM test_last_scan WHERE idx_col = 1
+   AND noidx_col IS NOT DISTINCT FROM noidx_col;
 SELECT pg_stat_force_next_flush();
 COMMIT;
 -- check that just index scan stats were incremented
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 658fe98dc5..ae1a5b9d0a 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -153,6 +153,7 @@ ROLLBACK;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -183,6 +184,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
diff --git a/src/test/regress/sql/union.sql b/src/test/regress/sql/union.sql
index 8afc580c63..30ab3bf941 100644
--- a/src/test/regress/sql/union.sql
+++ b/src/test/regress/sql/union.sql
@@ -401,19 +401,19 @@ INSERT INTO t2c VALUES ('vw'), ('cd'), ('mn'), ('ef');
 CREATE INDEX t1c_ab_idx on t1c ((a || b));
 
 set enable_seqscan = on;
-set enable_indexonlyscan = off;
 
+-- force xmin to be fetched to avoid an index-only scan
 explain (costs off)
   SELECT * FROM
-  (SELECT a || b AS ab FROM t1
+  (SELECT a || b AS ab, substr(xmin::text, 1, 0) AS c FROM t1
    UNION ALL
-   SELECT ab FROM t2) t
+   SELECT ab, substr(xmin::text, 1, 0) FROM t2) t
   ORDER BY 1 LIMIT 8;
 
   SELECT * FROM
-  (SELECT a || b AS ab FROM t1
+  (SELECT a || b AS ab, substr(xmin::text, 1, 0) AS c FROM t1
    UNION ALL
-   SELECT ab FROM t2) t
+   SELECT ab, substr(xmin::text, 1, 0) FROM t2) t
   ORDER BY 1 LIMIT 8;
 
 reset enable_seqscan;
@@ -428,16 +428,14 @@ create table other_events (event_id int primary key);
 create table events_child () inherits (events);
 
 explain (costs off)
-select event_id
- from (select event_id from events
+select event_id, xmin
+ from (select event_id, xmin from events
        union all
-       select event_id from other_events) ss
+       select event_id, xmin from other_events) ss
  order by event_id;
 
 drop table events_child, events, other_events;
 
-reset enable_indexonlyscan;
-
 -- Test constraint exclusion of UNION ALL subqueries
 explain (costs off)
  SELECT * FROM
-- 
2.39.3 (Apple Git-145)

#37Greg Sabino Mullane
htamfids@gmail.com
In reply to: Robert Haas (#36)
Re: On disable_cost

On Wed, Apr 3, 2024 at 3:21 PM Robert Haas <robertmhaas@gmail.com> wrote:

It's also pretty clear to me that the fact that enable_indexscan
and enable_indexonlyscan work completely differently from each other
is surprising at best, wrong at worst, but here again, what this patch
does about that is not above reproach.

Yes, that is wrong; surely there is a reason we have two vars. Thanks for
digging into this: if nothing else, the code will be better for this
discussion, even if we do nothing for now with disable_cost.

Cheers,
Greg

#38David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#36)
Re: On disable_cost

On Thu, 4 Apr 2024 at 08:21, Robert Haas <robertmhaas@gmail.com> wrote:

I wanted to further explore the idea of just not generating plans of
types that are currently disabled. I looked into doing this for
enable_indexscan and enable_indexonlyscan. As a first step, I
investigated how those settings work now, and was horrified. I don't
know whether I just wasn't paying attention back when the original
index-only scan work was done -- I remember discussing
enable_indexonlyscan with you at the time -- or whether it got changed
subsequently. Anyway, the current behavior is:

[A] enable_indexscan=false adds disable_cost to the cost of every
Index Scan path *and also* every Index-Only Scan path. So disabling
index-scans also in effect discourages the use of index-only scans,
which would make sense if we didn't have a separate setting called
enable_indexonlyscan, but we do. Given that, I think this is
completely and utterly wrong.

[b] enable_indexonlyscan=false causes index-only scan paths not to be
generated at all, but instead, we generate index-scan paths to do the
same thing that we would not have generated otherwise.

FWIW, I think changing this is a bad idea and I don't think the
behaviour that's in your patch is useful. With your patch, if I SET
enable_indexonlyscan=false, any index that *can* support an IOS for my
query will now not be considered at all!

With your patch applied, I see:

-- default enable_* GUC values.
postgres=# explain select oid from pg_class order by oid;
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Only Scan using pg_class_oid_index on pg_class
(cost=0.27..22.50 rows=415 width=4)
(1 row)

postgres=# set enable_indexonlyscan=0; -- no index scan?
SET
postgres=# explain select oid from pg_class order by oid;
QUERY PLAN
-----------------------------------------------------------------
Sort (cost=36.20..37.23 rows=415 width=4)
Sort Key: oid
-> Seq Scan on pg_class (cost=0.00..18.15 rows=415 width=4)
(3 rows)

postgres=# set enable_seqscan=0; -- still no index scan!
SET
postgres=# explain select oid from pg_class order by oid;
QUERY PLAN
------------------------------------------------------------------------------------
Sort (cost=10000000036.20..10000000037.23 rows=415 width=4)
Sort Key: oid
-> Seq Scan on pg_class (cost=10000000000.00..10000000018.15
rows=415 width=4)
(3 rows)

postgres=# explain select oid from pg_class order by oid,relname; --
now an index scan?!
QUERY PLAN
---------------------------------------------------------------------------------------------
Incremental Sort (cost=0.43..79.50 rows=415 width=68)
Sort Key: oid, relname
Presorted Key: oid
-> Index Scan using pg_class_oid_index on pg_class
(cost=0.27..60.82 rows=415 width=68)
(4 rows)

I don't think this is good as pg_class_oid_index effectively won't be
used as long as the particular query could use that index with an
index *only* scan. You can see above that as soon as I adjust the
query slightly so that IOS isn't possible, the index can be used
again. I think an Index Scan would have been a much better option for
the 2nd query than the seq scan and sort.

I think if I do SET enable_indexonlyscan=0; the index should still be
used with an Index Scan and it definitely shouldn't result in Index
Scan also being disabled if that index happens to contain all the
columns required to support an IOS.

FWIW, I'm fine with the current behaviour. It looks like we've
assumed that, when possible, IOS are always superior to Index Scan, so
there's no point in generating an Index Scan path when we can generate
an IOS path. I think this makes sense. For that not to be true,
checking the all-visible flag would have to be more costly than
visiting the heap. Perhaps that could be true if the visibility map
page had to come from disk while the heap page was cached and the disk
was slow, but I don't think that scenario justifies considering both
Index Scan and IOS path types when IOS is possible. We've no way to
accurately cost that anyway.

This all seems similar to enable_sort vs enable_incremental_sort. For
a while, we did consider both an incremental sort and a sort when an
incremental sort was possible, but it seemed to me that an incremental
sort would always be better when it was possible, so I changed that in
4a29eabd1. I've not seen anyone complain. I made it so that when an
incremental sort is possible but is disabled, we do a sort instead.
That seems fairly similar to how master handles
enable_indexonlyscan=false.

In short, I don't find it strange that disabling one node type results
in considering another type that we'd otherwise not consider in cases
where we assume that the disabled node type is always superior and
should always be used when it is possible.

I do agree that adding disable_cost to IOS when enable_indexscan=0 is
a bit weird. We don't penalise incremental sorts when sorts are
disabled, so aligning those might make sense.

David

#39David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#38)
Re: On disable_cost

On Thu, 4 Apr 2024 at 10:15, David Rowley <dgrowleyml@gmail.com> wrote:

In short, I don't find it strange that disabling one node type results
in considering another type that we'd otherwise not consider in cases
where we assume that the disabled node type is always superior and
should always be used when it is possible.

In addition to what I said earlier, I think the current
enable_indexonlyscan is implemented in a way that has the planner do
what it did before IOS was added. I think that goal makes sense with
any patch that makes the planner try something new. We want to have
some method to get the previous behaviour for the cases where the
planner makes a dumb choice or to avoid some bug in the new feature.

I think using that logic, the current scenario with enable_indexscan
and enable_indexonlyscan makes complete sense. I mean, including
enable_indexscan=0 adding disable_cost to IOS Paths.

David

#40Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#39)
Re: On disable_cost

On Wed, Apr 3, 2024 at 11:09 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 4 Apr 2024 at 10:15, David Rowley <dgrowleyml@gmail.com> wrote:

In short, I don't find it strange that disabling one node type results
in considering another type that we'd otherwise not consider in cases
where we assume that the disabled node type is always superior and
should always be used when it is possible.

In addition to what I said earlier, I think the current
enable_indexonlyscan is implemented in a way that has the planner do
what it did before IOS was added. I think that goal makes sense with
any patch that make the planner try something new. We want to have
some method to get the previous behaviour for the cases where the
planner makes a dumb choice or to avoid some bug in the new feature.

I see the logic of this, and I agree that the resulting behavior might
be more intuitive than what I posted before. I'll do some experiments.

I think using that logic, the current scenario with enable_indexscan
and enable_indexonlyscan makes complete sense. I mean, including
enable_indexscan=0 adding disable_cost to IOS Paths.

This, for me, is a bridge too far. I don't think there's a real
argument that "what the planner did before IOS was added" was add
disable_cost to the cost of index-only scan paths. There was no such
path type. Independently of that argument, I also think the behavior
of a setting needs to be something that a user can understand. Right
now, the documentation says:

Enables or disables the query planner's use of index-scan plan types.
The default is on.
Enables or disables the query planner's use of index-only-scan plan
types (see Section 11.9). The default is on.

I do not think that a user can be expected to guess from these
descriptions that the first one also affects index-only scans, or that
the two GUCs disable their respective plan types in completely
different ways. Granted, the latter inconsistency affects a whole
bunch of these settings, not just this one, but still.

--
Robert Haas
EDB: http://www.enterprisedb.com

#41Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: On disable_cost

On Sat, Nov 2, 2019 at 10:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The idea that I've been thinking about is to not generate disabled
Paths in the first place, thus not only fixing the problem but saving
some cycles. While this seems easy enough for "optional" paths,
we have to reserve the ability to generate certain path types regardless,
if there's no other way to implement the query. This is a bit of a
stumbling block :-(. At the base relation level, we could do something
like generating seqscan last, and only if no other path has been
successfully generated.

Continuing my investigation into this rather old thread, I did a
rather primitive implementation of this idea, for baserels only, and
discovered that it caused a small number of planner failures running
the regression tests. Here is a slightly simplified example:

CREATE TABLE strtest (n text, t text);
CREATE INDEX strtest_n_idx ON strtest (n);
SET enable_seqscan=false;
EXPLAIN SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;

With the patch, I get:

ERROR: could not devise a query plan for the given query

The problem here is that it's perfectly possible to generate a valid
path for s1 -- and likewise for s2, since it's the same underlying
relation -- while respecting the enable_seqscan=false constraint.
However, all such paths are parameterized by the other of the two
relations, which means that if we do that, we can't plan the join,
because we need an unparameterized path for at least one of the two
sides in order to build a nested loop join, which is the only way to
satisfy the parameterization on the other side.

Now, you could try to fix this by deciding that planning for a baserel
hasn't really succeeded unless we got at least one *unparameterized*
path for that baserel. I haven't tried that, but I presume that if you
do, it fixes the above example, because now there will be a last-ditch
sequential scan on both sides and so this example will behave as
expected. But if you do that, then in other cases, that sequential
scan is going to get picked even when it isn't strictly necessary to
do so, just because some plan that uses it looks better on cost.
Presumably that problem can in turn be fixed by deciding that we also
need to keep disable_cost around (or the separate disable-counter idea
that we were discussing recently in another branch of this thread),
but that's arguably missing the point of this exercise.
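
For concreteness, here is a rough sketch of that "require at least one
unparameterized path" variant, assuming the last-ditch seqscan is the
thing being gated; the helper below is made up and none of this is
existing code:

/*
 * Hypothetical helper, not existing code: does this rel already have
 * at least one path that is not parameterized by another relation?
 */
static bool
rel_has_unparameterized_path(RelOptInfo *rel)
{
    ListCell   *lc;

    foreach(lc, rel->pathlist)
    {
        Path       *path = (Path *) lfirst(lc);

        if (path->param_info == NULL)
            return true;
    }
    return false;
}

    /* ...and in the baserel path-generation step, with the seqscan
     * considered last, it would be suppressed only when something else
     * already guarantees an unparameterized path: */
    if (enable_seqscan || !rel_has_unparameterized_path(rel))
        add_path(rel, create_seqscan_path(root, rel,
                                          rel->lateral_relids, 0));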

Another idea is to remove the ERROR mentioned above from
set_cheapest() and just allow planning to continue even if some
relations end up with no paths. (This would necessitate finding and
fixing any code that could be confused by a pathless relation.) Then,
if you get to the top of the plan tree and you have no paths there,
redo the join search discarding the constraints (or maybe just some of
the constraints, e.g. allow sequential scans and nested loops, or
something). Conceptually, I like this idea a lot, but I think there
are a few problems. One is that I'm not quite sure how to find all the
code that would need to be adjusted to make it work, though the header
comment for standard_join_search() seems like it's got some helpful
tips. A second is that it's another version of the disable_cost =
infinity problem: once you find that you can't generate a path while
enforcing all of the restrictions, you just disregard the restrictions
completely, instead of discarding them only to the extent necessary. I
have a feeling that's not going to be very appealing.

Now, I suppose it might be that even if we can't remove disable_cost,
something along these lines is still worth doing, just to save CPU
cycles. You could for example try planning with only non-disabled
stuff and then do it over again with everything if that doesn't work
out, still keeping disable_cost around so that you avoid disabled
nodes where you can. But I'm kind of hoping that I'm missing something
and there's some approach that could both kill disable_cost and save
some cycles at the same time. If (any of) you have an idea, I'd love
to hear it!

--
Robert Haas
EDB: http://www.enterprisedb.com

#42David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#41)
Re: On disable_cost

On Sat, 4 May 2024 at 08:34, Robert Haas <robertmhaas@gmail.com> wrote:

Another idea is to remove the ERROR mentioned above from
set_cheapest() and just allow planning to continue even if some
relations end up with no paths. (This would necessitate finding and
fixing any code that could be confused by a pathless relation.) Then,
if you get to the top of the plan tree and you have no paths there,
redo the join search discarding the constraints (or maybe just some of
the constraints, e.g. allow sequential scans and nested loops, or
something).

I don't think you'd need to wait longer than where we do set_cheapest
and find no paths to find out that there's going to be a problem.

I don't think redoing planning is going to be easy or even useful. I
mean what do you change when you replan? You can't just do
enable_seqscan and enable_nestloop as if there's no index to provide
sorted input and the plan requires some sort, then you still can't
produce a plan. Adding enable_sort to the list does not give me much
confidence we'll never fail to produce a plan either. It just seems
impossible to know which of the disabled ones caused the RelOptInfo to
have no paths. Also, you might end up enabling one that caused the
planner to do something different than it would do today. For
example, a Path that today incurs 2x disable_cost vs a Path that only
receives 1x disable_cost might do something different if you just went
and enabled a bunch of enable* GUCs before replanning.

Now, I suppose it might be that even if we can't remove disable_cost,
something along these lines is still worth doing, just to save CPU
cycles. You could for example try planning with only non-disabled
stuff and then do it over again with everything if that doesn't work
out, still keeping disable_cost around so that you avoid disabled
nodes where you can. But I'm kind of hoping that I'm missing something
and there's some approach that could both kill disable_cost and save
some cycles at the same time. If (any of) you have an idea, I'd love
to hear it!

I think the int Path.disabledness idea is worth coding up to try it.
I imagine that a Path would incur the maximum of its subpaths'
disabledness values, and then add_path() just needs to prefer
lower-valued disabledness Paths.
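
A minimal sketch of the add_path() side of that, assuming a
hypothetical int "disabledness" field on Path (no such field exists
today); how the value accumulates from subpaths is a separate question:

/*
 * Hypothetical sketch only.  The comparison would run before any cost
 * comparison, so a disabled path can never win purely by having the
 * smaller cost estimate, however large the costs involved are.
 */
static int
compare_path_disabledness(const Path *path1, const Path *path2)
{
    if (path1->disabledness < path2->disabledness)
        return -1;              /* prefer path1 */
    if (path1->disabledness > path2->disabledness)
        return +1;              /* prefer path2 */
    return 0;                   /* tie: fall through to cost comparison */
}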

That doesn't get you the benefits of fewer CPU cycles, but where did
that come from as a motive to change this? There's no shortage of
other ways to make the planner faster if that's an issue.

David

#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#42)
Re: On disable_cost

David Rowley <dgrowleyml@gmail.com> writes:

I don't think you'd need to wait longer than where we do set_cheapest
and find no paths to find out that there's going to be a problem.

At a base relation, yes, but that doesn't work for joins: it may be
that a particular join cannot be formed, yet other join sequences
will work. We have that all the time from outer-join ordering
restrictions, never mind enable_xxxjoin flags. So I'm not sure
that we can usefully declare early failure for joins.

I think the int Path.disabledness idea is worth coding up to try it.
I imagine that a Path will incur the maximum of its subpath's
disabledness's then add_path() just needs to prefer lower-valued
disabledness Paths.

I would think sum not maximum, but that's a detail.

That doesn't get you the benefits of fewer CPU cycles, but where did
that come from as a motive to change this? There's no shortage of
other ways to make the planner faster if that's an issue.

The concern was to not *add* CPU cycles in order to make this area
better. But I do tend to agree that we've exhausted all the other
options.

BTW, I looked through costsize.c just now to see exactly what we are
using disable_cost for, and it seemed like a majority of the cases are
just wrong. Where possible, we should implement a plan-type-disable
flag by not generating the associated Path in the first place, not by
applying disable_cost to it. But it looks like a lot of people have
erroneously copied the wrong logic. I would say that only these plan
types should use the disable_cost method:

seqscan
nestloop join
sort

as those are the only ones where we risk not being able to make a
plan at all for lack of other alternatives.
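
To spell out the two styles in C (neither snippet is a verbatim quote
of the current sources; enable_gathermerge is just a convenient
example):

/* Style 1: what costsize.c mostly does today -- build the path anyway,
 * but make it look terrible so add_path() normally rejects it. */
if (!enable_gathermerge)
    startup_cost += disable_cost;

/* Style 2: don't build the path at all.  Safe only for path types that
 * are guaranteed to have an alternative (here, a plain Gather or no
 * parallelism at all). */
if (!enable_gathermerge)
    return;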

There is also some weirdness around needing to force use of tidscan
if we have WHERE CURRENT OF. But perhaps a different hack could be
used for that.

We also have this for hashjoin:

* If the bucket holding the inner MCV would exceed hash_mem, we don't
* want to hash unless there is really no other alternative, so apply
* disable_cost.

I'm content to leave that be, if we can't remove disable_cost
entirely.

What I'm wondering at this point is whether we need to trouble with
implementing the separate-disabledness-count method, if we trim back
the number of places using disable_cost to the absolute minimum.

regards, tom lane

#44David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#43)
Re: On disable_cost

On Sun, 5 May 2024 at 04:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

That doesn't get you the benefits of fewer CPU cycles, but where did
that come from as a motive to change this? There's no shortage of
other ways to make the planner faster if that's an issue.

The concern was to not *add* CPU cycles in order to make this area
better. But I do tend to agree that we've exhausted all the other
options.

It really looks to me like Robert was talking about not generating
paths for disabled path types. He did write "just to save CPU cycles"
in the paragraph I quoted.

I think we should concern ourselves with adding overhead to add_path()
*only* when we actually see a patch which slows it down in a way that
we can measure. I find it hard to imagine that adding a single
comparison for every Path is measurable. Each of these paths has been
palloced and costed, both of which are significantly more expensive
than adding another comparison to compare_path_costs_fuzzily(). I'm
only willing for benchmarks on an actual patch to prove me wrong on
that. Nothing else. add_path() has become a rat's nest of conditions
over the years and those seem to have made it without concerns about
performance.

BTW, I looked through costsize.c just now to see exactly what we are
using disable_cost for, and it seemed like a majority of the cases are
just wrong. Where possible, we should implement a plan-type-disable
flag by not generating the associated Path in the first place, not by
applying disable_cost to it. But it looks like a lot of people have
erroneously copied the wrong logic. I would say that only these plan
types should use the disable_cost method:

seqscan
nestloop join
sort

I think this oversimplifies the situation. I only spent 30 seconds
looking and I saw cases where this would cause issues. If
enable_hashagg is false, we could fail to produce any plan when the
grouping type is hashable but not sortable. There's also an issue with
nested
loops being unable to FULL OUTER JOIN. However, I do agree that there
are some in there that are adding disable_cost that should be done by
just not creating the Path. enable_gathermerge is one.
enable_bitmapscan is probably another.

I understand you only talked about the cases adding disable_cost in
costsize.c. But just as a reminder, there are other things we need to
be careful not to break. For example, enable_indexonlyscan=false
should defer to still making an index scan. Nobody who disables
enable_indexonlyscan without disabling enable_indexscan wants queries
that are eligible to use IOS to use seq scan instead. They'd still
want Index Scan to be considered, otherwise they'd have disabled
enable_indexscan.

David

#45Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#42)
Re: On disable_cost

On Sat, May 4, 2024 at 9:16 AM David Rowley <dgrowleyml@gmail.com> wrote:

I don't think you'd need to wait longer than where we do set_cheapest
and find no paths to find out that there's going to be a problem.

I'm confused by this response, because I thought that the main point
of my previous email was explaining why that's not true. I showed an
example where you do find paths at set_cheapest() time and yet are
unable to complete planning.

I don't think redoing planning is going to be easy or even useful. I
mean what do you change when you replan? You can't just do
enable_seqscan and enable_nestloop as if there's no index to provide
sorted input and the plan requires some sort, then you still can't
produce a plan. Adding enable_sort to the list does not give me much
confidence we'll never fail to produce a plan either. It just seems
impossible to know which of the disabled ones caused the RelOptInfo to
have no paths. Also, you might end up enabling one that caused the
planner to do something different than it would do today. For
example, a Path that today incurs 2x disable_cost vs a Path that only
receives 1x disable_cost might do something different if you just went
and enabled a bunch of enable* GUCs before replanning.

I agree that there are problems here, both in terms of implementation
complexity and also in terms of what behavior you actually get, but I
do not think that a proposal which changes some current behavior
should be considered dead on arrival. Whatever new behavior we might
want to implement needs to make sense, and there need to be good
reasons for making whatever changes are contemplated, but I don't
think we should take the position that it has to be identical to
current.

I think the int Path.disabledness idea is worth coding up to try it.
I imagine that a Path would incur the maximum of its subpaths'
disabledness, and then add_path() just needs to prefer Paths with a
lower disabledness value.

It definitely needs to be sum, not max. Otherwise you can't get the
matest example from the regression tests right, where one child lacks
the ability to comply with the GUC setting.
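
For illustration, summing rather than taking the maximum would look
roughly like this for a join path; the disabled_nodes field name is
assumed for the sketch, not lifted from a posted patch:

/* sketch for create_nestloop_path(): sum, not max */
pathnode->jpath.path.disabled_nodes =
    outer_path->disabled_nodes + inner_path->disabled_nodes +
    (enable_nestloop ? 0 : 1);  /* plus one if this node is disabled */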

That doesn't get you the benefits of fewer CPU cycles, but where did
that come from as a motive to change this? There's no shortage of
other ways to make the planner faster if that's an issue.

Well, I don't agree with that at all. If there are lots of ways to
make the planner faster, we should definitely do a bunch of that
stuff, because "will slow down the planner too much" has been a
leading cause of proposed planner patches being rejected for as long
as I've been involved with the project. My belief was that we were
rather short of good ideas in that area, actually. But even if it's
true that we have lots of other ways to speed up the planner, that
doesn't mean that it wouldn't be good to do it here, too.

Stepping back a bit, my current view of this area is: disable_cost is
highly imperfect both as an idea and as implemented in PostgreSQL.
Although I'm discovering that the current implementation gets more
things right than I had realized, it also sometimes gets things wrong.
The original poster gave an example of that, and there are others.
Furthermore, the current implementation has some weird
inconsistencies. Therefore, I would like something better. Better, to
me, could mean any combination of (a) superior behavior, (b) superior
performance, and (c) simpler, more elegant code. In a perfect world,
we'd be able to come up with something that wins in all three of those
areas, but I'm not seeing a way to achieve that, so I'm trying to
figure out what is achievable. And because we need to reach consensus
on whatever is to be done, I'm sharing raw research results rather
than just dropping a completed patch. I don't think it's at all easy
to understand what the realistic possibilities are in this area;
certainly it isn't for me. At some point I'm hoping that there will be
a patch (or a bunch of patches) that we can all agree are an
improvement over now and the best we can reasonably do, but I don't
yet know what the shape of those will be, because I'm still trying to
understand (and document on-list) what all the problems are.

--
Robert Haas
EDB: http://www.enterprisedb.com

#46Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#43)
Re: On disable_cost

On Sat, May 4, 2024 at 12:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

There is also some weirdness around needing to force use of tidscan
if we have WHERE CURRENT OF. But perhaps a different hack could be
used for that.

Yeah, figuring out what to do about this was the trickiest part of the
experimental patch that I wrote last week. The idea of the current
code is that cost_qual_eval_walker charges disable_cost for
CurrentOfExpr, but cost_tidscan then subtracts disable_cost if
tidquals contains a CurrentOfExpr, so that we effectively disable
everything except TID scan paths; I think we also still disable any TID
scan paths that don't use the CurrentOfExpr as a qual. I'm not entirely
sure whether the latter can happen, but I imagine that it might be possible
if the cursor refers to a query that itself contains some other kind
of TID qual.

It's not very clear that this mechanism is actually 100% reliable,
because we know it's possible in general for the costs of two paths to
be different by more than disable_cost. Maybe that's not possible in
this specific context, though: I'm not sure.

The approach I took for my experimental patch was pretty grotty, and
probably not quite complete, but basically I defined the case where we
currently subtract out disable_cost as a "forced TID-scan". I passed
around a Boolean called forcedTidScan which gets set to true if we
discover that some plan is a forced TID-scan path, and then we discard
any other paths and then only add other forced TID-scan paths after
that point. There can be more than one, because of parameterization.

But I think that the right thing to do is probably to pull some of the
logic up out of create_tidscan_paths() and decide ONCE whether we're
in a forced TID-scan situation or not. If we are, then
set_plain_rel_pathlist() should arrange to create only forced TID-scan
paths; otherwise, it should proceed as it does now.

Maybe if I try to do that I'll find problems, but the current approach
seems backwards to me, like going to a restaurant and ordering one of
everything on the menu, then cancelling all of the orders except the
stuff you actually want.

--
Robert Haas
EDB: http://www.enterprisedb.com

#47Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#46)
Re: On disable_cost

On Mon, May 6, 2024 at 9:39 AM Robert Haas <robertmhaas@gmail.com> wrote:

It's not very clear that this mechanism is actually 100% reliable,

It isn't. Here's a test case. As a non-superuser, do this:

create table foo (a int, b text, primary key (a));
insert into foo values (1, 'Apple');
alter table foo enable row level security;
alter table foo force row level security;
create policy p1 on foo as permissive using (ctid in ('(0,1)', '(0,2)'));
begin;
declare c cursor for select * from foo;
fetch from c;
explain update foo set b = 'Manzana' where current of c;
update foo set b = 'Manzana' where current of c;

The explain produces this output:

 Update on foo  (cost=10000000000.00..10000000008.02 rows=0 width=0)
   ->  Tid Scan on foo  (cost=10000000000.00..10000000008.02 rows=1 width=38)
         TID Cond: (ctid = ANY ('{"(0,1)","(0,2)"}'::tid[]))
         Filter: CURRENT OF c

Unless I'm quite confused, the point of the code is to force
CurrentOfExpr to be a TID Cond, and it normally succeeds in doing so,
because WHERE CURRENT OF cursor_name has to be the one and only WHERE
condition for a normal UPDATE. I tried various cases involving views
and CTEs and got nowhere. But then I wrote a patch to make the
regression tests fail if a baserel's restrictinfo list contains a
CurrentOfExpr and also some other qual, and a couple of row-level
security tests failed (and nothing else). Which then allowed me to
construct the example above, where there are two possible TID quals
and the logic in tidpath.c latches onto the wrong one. The actual
UPDATE fails like this:

ERROR: WHERE CURRENT OF is not supported for this table type

...because ExecEvalCurrentOfExpr supposes that the only way it can be
reached is for an FDW without the necessary support, but actually in
this case it's a planner error that gets us here.

Fortunately, there's no real reason for anyone to ever do something
like this, or at least I can't see one, so the fact that it doesn't
work probably doesn't really matter that much. And you can argue that
the only problem here is that the costing hack just didn't get updated
for RLS and now needs to be a bit more clever. But I think it'd be
better to find a way of making it less hacky. With the way the code is
structured right now, the chances of anyone understanding that RLS
might have an impact on its correctness were just about nil, IMHO.

--
Robert Haas
EDB: http://www.enterprisedb.com

#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#47)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, May 6, 2024 at 9:39 AM Robert Haas <robertmhaas@gmail.com> wrote:

It's not very clear that this mechanism is actually 100% reliable,

It isn't. Here's a test case.

Very interesting.

... Which then allowed me to
construct the example above, where there are two possible TID quals
and the logic in tidpath.c latches onto the wrong one.

Hmm. Without having traced through it, I'm betting that the
CurrentOfExpr qual is rejected as a tidqual because it's not
considered leakproof. It's not obvious to me why we couldn't consider
it as leakproof, though. If we don't want to do that in general,
then we need some kind of hack in TidQualFromRestrictInfo to accept
CurrentOfExpr quals anyway.

In general I think you're right that something less rickety than
the disable_cost hack would be a good idea to ensure the desired
TidPath gets chosen, but this problem is not the fault of that.
We're not making the TidPath with the correct contents in the first
place.

regards, tom lane

#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#48)
Re: On disable_cost

I wrote:

Robert Haas <robertmhaas@gmail.com> writes:

... Which then allowed me to
construct the example above, where there are two possible TID quals
and the logic in tidpath.c latches onto the wrong one.

Hmm. Without having traced through it, I'm betting that the
CurrentOfExpr qual is rejected as a tidqual because it's not
considered leakproof.

Nah, I'm wrong: we do treat it as leakproof, and the comment about
that in contain_leaked_vars_walker shows that the interaction with
RLS quals *was* thought about. What wasn't thought about was the
possibility of RLS quals that themselves could be usable as tidquals,
which breaks this assumption in TidQualFromRestrictInfoList:

* Stop as soon as we find any usable CTID condition. In theory we
* could get CTID equality conditions from different AND'ed clauses,
* in which case we could try to pick the most efficient one. In
* practice, such usage seems very unlikely, so we don't bother; we
* just exit as soon as we find the first candidate.

The executor doesn't seem to be prepared to cope with multiple AND'ed
TID clauses (only OR'ed ones). So we need to fix this at least to the
extent of looking for a CurrentOfExpr qual, and preferring that over
anything else.

I'm also now wondering about this assumption in the executor:

/* CurrentOfExpr could never appear OR'd with something else */
Assert(list_length(tidstate->tss_tidexprs) == 1 ||
!tidstate->tss_isCurrentOf);

It still seems OK, because anything that might come in from RLS quals
would be AND'ed not OR'ed with the CurrentOfExpr.

In general I think you're right that something less rickety than
the disable_cost hack would be a good idea to ensure the desired
TidPath gets chosen, but this problem is not the fault of that.
We're not making the TidPath with the correct contents in the first
place.

Still true.

regards, tom lane

#50Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#49)
Re: On disable_cost

On Mon, May 6, 2024 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Nah, I'm wrong: we do treat it as leakproof, and the comment about
that in contain_leaked_vars_walker shows that the interaction with
RLS quals *was* thought about. What wasn't thought about was the
possibility of RLS quals that themselves could be usable as tidquals,
which breaks this assumption in TidQualFromRestrictInfoList:

* Stop as soon as we find any usable CTID condition. In theory we
* could get CTID equality conditions from different AND'ed clauses,
* in which case we could try to pick the most efficient one. In
* practice, such usage seems very unlikely, so we don't bother; we
* just exit as soon as we find the first candidate.

Right. I had noticed this but didn't spell it out.

The executor doesn't seem to be prepared to cope with multiple AND'ed
TID clauses (only OR'ed ones). So we need to fix this at least to the
extent of looking for a CurrentOfExpr qual, and preferring that over
anything else.

I'm also now wondering about this assumption in the executor:

/* CurrentOfExpr could never appear OR'd with something else */
Assert(list_length(tidstate->tss_tidexprs) == 1 ||
!tidstate->tss_isCurrentOf);

It still seems OK, because anything that might come in from RLS quals
would be AND'ed not OR'ed with the CurrentOfExpr.

This stuff I had not noticed.

In general I think you're right that something less rickety than
the disable_cost hack would be a good idea to ensure the desired
TidPath gets chosen, but this problem is not the fault of that.
We're not making the TidPath with the correct contents in the first
place.

Still true.

I'll look into this, unless you want to do it.

Incidentally, another thing I just noticed is that
IsCurrentOfClause()'s test for (node->cvarno == rel->relid) is
possibly dead code. At least, there are no examples in our test suite
where it fails to hold. Which seems like it makes sense, because if it
didn't, then how did the clause end up in baserestrictinfo? Maybe this
is worth keeping as defensive coding, or maybe it should be changed to
an Assert or something.

--
Robert Haas
EDB: http://www.enterprisedb.com

#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#50)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

I'll look into this, unless you want to do it.

I have a draft patch already. Need to add a test case.

Incidentally, another thing I just noticed is that
IsCurrentOfClause()'s test for (node->cvarno == rel->relid) is
possibly dead code. At least, there are no examples in our test suite
where it fails to hold. Which seems like it makes sense, because if it
didn't, then how did the clause end up in baserestrictinfo? Maybe this
is worth keeping as defensive coding, or maybe it should be changed to
an Assert or something.

I wouldn't remove it, but maybe an Assert is good enough. The tests
on Vars' varno should be equally pointless, no?
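
For what it's worth, the change being floated is just this (illustrative
only, not from a posted patch):

/* in IsCurrentOfClause(), keep the check but as an assertion */
Assert(node->cvarno == rel->relid);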

regards, tom lane

#52Peter Geoghegan
pg@bowt.ie
In reply to: Robert Haas (#45)
Re: On disable_cost

On Mon, May 6, 2024 at 8:27 AM Robert Haas <robertmhaas@gmail.com> wrote:

Stepping back a bit, my current view of this area is: disable_cost is
highly imperfect both as an idea and as implemented in PostgreSQL.
Although I'm discovering that the current implementation gets more
things right than I had realized, it also sometimes gets things wrong.
The original poster gave an example of that, and there are others.
Furthermore, the current implementation has some weird
inconsistencies. Therefore, I would like something better.

FWIW I always found those weird inconsistencies to be annoying at
best, and confusing at worst. I speak as somebody that uses
disable_cost a lot.

I certainly wouldn't ask anybody to make it a priority for that reason
alone -- it's not *that* bad. I've given my opinion on this because
it's already under discussion.

--
Peter Geoghegan

#53Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#52)
4 attachment(s)
Re: On disable_cost

On Mon, May 6, 2024 at 4:30 PM Peter Geoghegan <pg@bowt.ie> wrote:

FWIW I always found those weird inconsistencies to be annoying at
best, and confusing at worst. I speak as somebody that uses
disable_cost a lot.

I certainly wouldn't ask anybody to make it a priority for that reason
alone -- it's not *that* bad. I've given my opinion on this because
it's already under discussion.

Thanks, it's good to have other perspectives.

Here are some patches for discussion.

0001 gets rid of disable_cost as a mechanism for forcing a TID scan
plan to be chosen when CurrentOfExpr is present. Instead, it arranges
to generate only the valid path when that case occurs, and skip
everything else. I think this is a good cleanup, and it doesn't seem
totally impossible that it actually prevents a failure in some extreme
case.

0002 cleans up the behavior of enable_indexscan and
enable_indexonlyscan. Currently, setting enable_indexscan=false adds
disable_cost to both the cost of index scans and the cost of
index-only scans. I think that's indefensible and, in fact, a bug,
although I believe David Rowley disagrees. With this patch, we simply
don't generate index scans if enable_indexscan=false, and we don't
generate index-only scans if enable_indexonlyscan=false, which seems a
lot more consistent to me. However, I did revise one major thing from
the patch I posted before, per feedback from David Rowley and also per
my own observations: in this version, if enable_indexscan=true and
enable_indexonlyscan=false, we'll generate index-scan paths for any
cases where, with both set to true, we would have only generated
index-only scan paths. That change makes the behavior of this patch a
lot more comprehensible and intuitive: the only regression test
changes are places where somebody expected that they could disable
both index scans and index-only scans by setting
enable_indexscan=false.
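
To illustrate the intended behavior, here's the kind of check one could
run against the standard regression-test table tenk1 (a sketch; the
plan shape noted below is my expectation, not output copied from the
patch):

set enable_seqscan = off;
set enable_bitmapscan = off;
set enable_indexonlyscan = off;  -- enable_indexscan stays on
explain (costs off)
select unique1 from tenk1 where unique1 < 10;
-- with the patch, a plain Index Scan should still be considered here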

0003 and 0004 extend the approach of "just don't generate the disabled
path" to bitmap scans and gather merge, respectively. I think these
are more debatable, mostly because it's not clear how far we can
really take this approach. Neither breaks any test cases, and 0003 is
closely related to the work done in 0002, which seems like a point in
its favor. 0004 was simply the only other case where it was obvious to
me that this kind of approach made sense. In my view, it makes most
sense to use this kind of approach for planner behaviors that seem
like they're sort of optional: like if you don't use gather merge, you
can still use gather, and if you don't use index scans, you can still
use sequential scans. With all these patches applied, the remaining
cases where we rely on disable_cost are:

sequential scans
sorts
hash aggregation
all 3 join types
hash joins where a bucket holding the inner MCV would exceed hash_mem

Sequential scans are clearly a last-ditch method. I find it a bit hard
to decide whether hashing or sorting is the default, especially given
the asymmetry between enable_sort, which presumptively applies
anywhere, and enable_hashagg, which is specific to aggregation. As for
the join types, it's
tempting to consider nested-loop the default type -- it's the only way
to satisfy parameterizations, for instance -- but the fact that it's
the only method that can't do a full join undermines that position in
my book. But, I don't want to pretend like I have all the answers
here, either; I'm just sharing some thoughts.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

0004-When-enable_gathermerge-false-don-t-generate-gather-.patch (application/octet-stream)
From 0aac0943362080b29657a7539644a4716023b2e0 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 7 May 2024 13:32:47 -0400
Subject: [PATCH 4/4] When enable_gathermerge=false, don't generate gather
 merge paths.

Previously, we generated them, and then added disable_cost to the
startup cost of each one. Not generating them at all is cheaper.
---
 src/backend/optimizer/path/allpaths.c | 25 ++++++++++++++++---------
 src/backend/optimizer/path/costsize.c |  4 ++--
 src/backend/optimizer/plan/planner.c  |  6 +++++-
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index aa78c0af0c..a1d1fefa05 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3090,18 +3090,21 @@ generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows)
 	 * For each useful ordering, we can consider an order-preserving Gather
 	 * Merge.
 	 */
-	foreach(lc, rel->partial_pathlist)
+	if (enable_gathermerge)
 	{
-		Path	   *subpath = (Path *) lfirst(lc);
-		GatherMergePath *path;
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *subpath = (Path *) lfirst(lc);
+			GatherMergePath *path;
 
-		if (subpath->pathkeys == NIL)
-			continue;
+			if (subpath->pathkeys == NIL)
+				continue;
 
-		rows = subpath->rows * subpath->parallel_workers;
-		path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
-										subpath->pathkeys, NULL, rowsp);
-		add_path(rel, &path->path);
+			rows = subpath->rows * subpath->parallel_workers;
+			path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
+											subpath->pathkeys, NULL, rowsp);
+			add_path(rel, &path->path);
+		}
 	}
 }
 
@@ -3214,6 +3217,10 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	/* generate the regular gather (merge) paths */
 	generate_gather_paths(root, rel, override_rows);
 
+	/* beyond this point, we only create gather merge paths */
+	if (!enable_gathermerge)
+		return;
+
 	/* consider incremental sort for interesting orderings */
 	useful_pathkeys_list = get_useful_pathkeys_for_relation(root, rel, true);
 
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f56b6efe34..8b58e71fc7 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -490,8 +490,8 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	else
 		path->path.rows = rel->rows;
 
-	if (!enable_gathermerge)
-		startup_cost += disable_cost;
+	/* shouldn't reach here if enable_gathermerge = false */
+	Assert(enable_gathermerge);
 
 	/*
 	 * Add one to the number of workers to account for the leader.  This might
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 032818423f..5c4157afba 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -5231,7 +5231,7 @@ create_ordered_paths(PlannerInfo *root,
 	 * to the required output order and then use Gather Merge.
 	 */
 	if (ordered_rel->consider_parallel && root->sort_pathkeys != NIL &&
-		input_rel->partial_pathlist != NIL)
+		input_rel->partial_pathlist != NIL && enable_gathermerge)
 	{
 		Path	   *cheapest_partial_path;
 
@@ -7426,6 +7426,10 @@ gather_grouping_paths(PlannerInfo *root, RelOptInfo *rel)
 	/* Try Gather for unordered paths and Gather Merge for ordered ones. */
 	generate_useful_gather_paths(root, rel, true);
 
+	/* beyond this point, we only create gather merge paths */
+	if (!enable_gathermerge)
+		return;
+
 	cheapest_partial_path = linitial(rel->partial_pathlist);
 
 	/* XXX Shouldn't this also consider the group-key-reordering? */
-- 
2.39.3 (Apple Git-145)

0002-Don-t-generate-index-scan-paths-when-enable_indexsca.patch (application/octet-stream)
From 840e5fa3e9ba505a772296bb42feda5429c88690 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 2 May 2024 11:18:44 -0400
Subject: [PATCH 2/4] Don't generate index-scan paths when
 enable_indexscan=false.

Previously, index-scan paths were still generated even when
enable_indexscan=false, but we added disable-cost to the cost of
both index scan plans and index-only scan plans. It doesn't make sense
for enable_indexscan to affect whether index-only scans are chosen
given that we also have a GUC called enable_indexonlyscan.

With this commit, enable_indexscan and enable_indexonlyscan work
the same way: each one prevents consideration of paths of the
appropriate type, and neither has any effect on the cost of the
generated paths. This requires some updates to the regression tests,
which previously relied on enable_indexscan=false to also disable
index-only scans.

Note that when enable_indexonlyscan=false and enable_indexscan=true,
we will generate index-scan paths that would not have been
generated if both had been set to true. That's because generating
both an index-scan path and an index-only path would be a waste
of cycles, since the index-only path should always win. In effect,
the index-scan plan shape was still being considered; we just
rejected it before actually constructing a path.
---
 src/backend/optimizer/path/costsize.c         |  4 ---
 src/backend/optimizer/path/indxpath.c         | 26 ++++++++++++++++---
 src/test/regress/expected/btree_index.out     |  3 +++
 src/test/regress/expected/create_index.out    |  2 ++
 src/test/regress/expected/select.out          |  1 +
 src/test/regress/expected/select_parallel.out |  2 ++
 src/test/regress/expected/tuplesort.out       |  2 ++
 src/test/regress/sql/btree_index.sql          |  5 ++++
 src/test/regress/sql/create_index.sql         |  2 ++
 src/test/regress/sql/select.sql               |  1 +
 src/test/regress/sql/select_parallel.sql      |  2 ++
 src/test/regress/sql/tuplesort.sql            |  2 ++
 12 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 2021c481b4..74fc5aab56 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -603,10 +603,6 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 											  path->indexclauses);
 	}
 
-	if (!enable_indexscan)
-		startup_cost += disable_cost;
-	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
-
 	/*
 	 * Call index-access-method-specific code to estimate the processing cost
 	 * for scanning the index, as well as the selectivity of the index (ie,
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index c0fcc7d78d..423099d725 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -742,7 +742,13 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		IndexPath  *ipath = (IndexPath *) lfirst(lc);
 
 		if (index->amhasgettuple)
-			add_path(rel, (Path *) ipath);
+		{
+			if (ipath->path.pathtype == T_IndexScan && enable_indexscan)
+				add_path(rel, (Path *) ipath);
+			else if (ipath->path.pathtype == T_IndexOnlyScan &&
+				enable_indexonlyscan)
+				add_path(rel, (Path *) ipath);
+		}
 
 		if (index->amhasgetbitmap &&
 			(ipath->path.pathkeys == NIL ||
@@ -831,6 +837,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		case ST_INDEXSCAN:
 			if (!index->amhasgettuple)
 				return NIL;
+			if (!enable_indexscan && !enable_indexonlyscan)
+				return NIL;
 			break;
 		case ST_BITMAPSCAN:
 			if (!index->amhasgetbitmap)
@@ -978,7 +986,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		 */
 		if (index->amcanparallel &&
 			rel->consider_parallel && outer_relids == NULL &&
-			scantype != ST_BITMAPSCAN)
+			scantype != ST_BITMAPSCAN &&
+			(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 		{
 			ipath = create_index_path(root, index,
 									  index_clauses,
@@ -1028,7 +1037,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 			/* If appropriate, consider parallel index scan */
 			if (index->amcanparallel &&
 				rel->consider_parallel && outer_relids == NULL &&
-				scantype != ST_BITMAPSCAN)
+				scantype != ST_BITMAPSCAN &&
+				(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 			{
 				ipath = create_index_path(root, index,
 										  index_clauses,
@@ -1735,7 +1745,15 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
 	ListCell   *lc;
 	int			i;
 
-	/* Index-only scans must be enabled */
+	/*
+	 * Index-only scans must be enabled.
+	 *
+	 * NB: Returning false here means that an index scan will be considered
+	 * instead, so setting enable_indexonlyscan=false causes us to consider
+	 * paths that we wouldn't have considered otherwise. That seems OK; our
+	 * only reason for not generating the index-scan paths is that we expect
+	 * them to lose on cost.
+	 */
 	if (!enable_indexonlyscan)
 		return false;
 
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 510646cbce..f15db99771 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -247,6 +247,7 @@ select thousand from tenk1 where thousand in (364, 366,380) and tenthous = 20000
 --
 set enable_seqscan to false;
 set enable_indexscan to true;
+set enable_indexonlyscan to true;
 set enable_bitmapscan to false;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -290,6 +291,7 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 (2 rows)
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -330,6 +332,7 @@ select proname from pg_proc where proname ilike '00%foo' order by 1;
 ---------
 (0 rows)
 
+reset enable_indexonlyscan;
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                            QUERY PLAN                            
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index cf6eac5734..ec69bafd40 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -618,6 +618,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 EXPLAIN (COSTS OFF)
 SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
@@ -643,6 +644,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 --
 -- GIN over int[] and text[]
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0e..6445815741 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -844,6 +844,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
                          QUERY PLAN                          
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 87273fa635..f79eda79f6 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -522,6 +522,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -622,6 +623,7 @@ select * from explain_parallel_sort_stats();
 (14 rows)
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..87b05a22cb 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -362,6 +362,7 @@ ORDER BY v.a DESC;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -458,6 +459,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
diff --git a/src/test/regress/sql/btree_index.sql b/src/test/regress/sql/btree_index.sql
index 0d2a33f370..bc99f44dda 100644
--- a/src/test/regress/sql/btree_index.sql
+++ b/src/test/regress/sql/btree_index.sql
@@ -157,6 +157,7 @@ select thousand from tenk1 where thousand in (364, 366,380) and tenthous = 20000
 
 set enable_seqscan to false;
 set enable_indexscan to true;
+set enable_indexonlyscan to true;
 set enable_bitmapscan to false;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -168,6 +169,7 @@ explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -175,6 +177,9 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
 explain (costs off)
 select proname from pg_proc where proname ilike '00%foo' order by 1;
 select proname from pg_proc where proname ilike '00%foo' order by 1;
+
+reset enable_indexonlyscan;
+
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index e296891cab..04dea5225e 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -246,6 +246,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 
 EXPLAIN (COSTS OFF)
@@ -254,6 +255,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 
 --
diff --git a/src/test/regress/sql/select.sql b/src/test/regress/sql/select.sql
index 019f1e7673..a0c7417dec 100644
--- a/src/test/regress/sql/select.sql
+++ b/src/test/regress/sql/select.sql
@@ -218,6 +218,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 20376c03fa..3f003e2e71 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -201,6 +201,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -248,6 +249,7 @@ $$;
 select * from explain_parallel_sort_stats();
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..95ac8ec04c 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -162,6 +162,7 @@ ORDER BY v.a DESC;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -192,6 +193,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
-- 
2.39.3 (Apple Git-145)

0003-When-enable_bitmapscan-false-just-don-t-generate-bit.patch (application/octet-stream)
From c955ec08fe4feaab48b9ff63be57ecae0ca7f7bd Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 2 May 2024 11:30:26 -0400
Subject: [PATCH 3/4] When enable_bitmapscan=false, just don't generate bitmap
 scan paths.

Previously, we generated them, and then added disable_cost to the
startup cost of each one. Not generating them at all is cheaper.
---
 src/backend/optimizer/path/costsize.c | 4 ++--
 src/backend/optimizer/path/indxpath.c | 6 ++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 74fc5aab56..f56b6efe34 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1034,8 +1034,8 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
+	/* shouldn't reach here if enable_bitmapscan = false */
+	Assert(enable_bitmapscan);
 
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 423099d725..0f2f0c268f 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -332,7 +332,7 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
 	 * should be sufficient since there's basically only one figure of merit
 	 * (total cost) for such a path.
 	 */
-	if (bitindexpaths != NIL)
+	if (enable_bitmapscan && bitindexpaths != NIL)
 	{
 		Path	   *bitmapqual;
 		BitmapHeapPath *bpath;
@@ -357,7 +357,7 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
 	 * consider_index_join_clauses, but we're working with whole paths not
 	 * individual clauses.)
 	 */
-	if (bitjoinpaths != NIL)
+	if (enable_bitmapscan && bitjoinpaths != NIL)
 	{
 		List	   *all_path_outers;
 
@@ -843,6 +843,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		case ST_BITMAPSCAN:
 			if (!index->amhasgetbitmap)
 				return NIL;
+			if (!enable_bitmapscan)
+				return NIL;
 			break;
 		case ST_ANYSCAN:
 			/* either or both are OK */
-- 
2.39.3 (Apple Git-145)

0001-Remove-grotty-use-of-disable_cost-for-TID-scan-plans.patch (application/octet-stream)
From fc60b1a5077c0af496695b3e5eb0b8fec2d58955 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 7 May 2024 11:51:26 -0400
Subject: [PATCH 1/4] Remove grotty use of disable_cost for TID scan plans.

Previously, the code charged disable_cost for CurrentOfExpr, and then
subtracted disable_cost from the cost of a TID path that used
CurrentOfExpr as the TID qual, effectively disabling all paths except
that one. Now, we instead suppress generation of the disabled paths
entirely, and generate only the one that the executor will actually
understand.

With this approach, we do not need to rely on disable_cost being
large enough to prevent the wrong path from being chosen, and we
save some CPU cycles by avoiding generating paths that we can't
actually use. In my opinion, the code is also easier to understand
like this.
---
 src/backend/optimizer/path/allpaths.c | 14 +++++++--
 src/backend/optimizer/path/costsize.c | 26 -----------------
 src/backend/optimizer/path/tidpath.c  | 41 +++++++++++++++++++++++----
 src/include/optimizer/paths.h         |  2 +-
 4 files changed, 48 insertions(+), 35 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 4895cee994..aa78c0af0c 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -772,6 +772,17 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 	 */
 	required_outer = rel->lateral_relids;
 
+	/*
+	 * Consider TID scans.
+	 *
+	 * If create_tidscan_paths returns true, then a TID scan path is forced.
+	 * This happens when rel->baserestrictinfo contains CurrentOfExpr, because
+	 * the executor can't handle any other type of path for such queries.
+	 * Hence, we return without adding any other paths.
+	 */
+	if (create_tidscan_paths(root, rel))
+		return;
+
 	/* Consider sequential scan */
 	add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
 
@@ -781,9 +792,6 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 
 	/* Consider index scans */
 	create_index_paths(root, rel);
-
-	/* Consider TID scans */
-	create_tidscan_paths(root, rel);
 }
 
 /*
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ee23ed7835..2021c481b4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1251,7 +1251,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 {
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
-	bool		isCurrentOf = false;
 	QualCost	qpqual_cost;
 	Cost		cpu_per_tuple;
 	QualCost	tid_qual_cost;
@@ -1287,7 +1286,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		else if (IsA(qual, CurrentOfExpr))
 		{
 			/* CURRENT OF yields 1 tuple */
-			isCurrentOf = true;
 			ntuples++;
 		}
 		else
@@ -1297,22 +1295,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		}
 	}
 
-	/*
-	 * We must force TID scan for WHERE CURRENT OF, because only nodeTidscan.c
-	 * understands how to do it correctly.  Therefore, honor enable_tidscan
-	 * only when CURRENT OF isn't present.  Also note that cost_qual_eval
-	 * counts a CurrentOfExpr as having startup cost disable_cost, which we
-	 * subtract off here; that's to prevent other plan types such as seqscan
-	 * from winning.
-	 */
-	if (isCurrentOf)
-	{
-		Assert(baserel->baserestrictcost.startup >= disable_cost);
-		startup_cost -= disable_cost;
-	}
-	else if (!enable_tidscan)
-		startup_cost += disable_cost;
-
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
@@ -1399,9 +1381,6 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	ntuples = selectivity * baserel->tuples;
 	nseqpages = pages - 1.0;
 
-	if (!enable_tidscan)
-		startup_cost += disable_cost;
-
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
@@ -4884,11 +4863,6 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		/* Treat all these as having cost 1 */
 		context->total.per_tuple += cpu_operator_cost;
 	}
-	else if (IsA(node, CurrentOfExpr))
-	{
-		/* Report high cost to prevent selection of anything but TID scan */
-		context->total.startup += disable_cost;
-	}
 	else if (IsA(node, SubLink))
 	{
 		/* This routine should not be applied to un-planned expressions */
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index eb11bc79c7..b0323b26ec 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "nodes/nodeFuncs.h"
+#include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -277,12 +278,15 @@ RestrictInfoIsTidQual(PlannerInfo *root, RestrictInfo *rinfo, RelOptInfo *rel)
  * that there's more than one choice.
  */
 static List *
-TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
+TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel,
+							bool *isCurrentOf)
 {
 	RestrictInfo *tidclause = NULL; /* best simple CTID qual so far */
 	List	   *orlist = NIL;	/* best OR'ed CTID qual so far */
 	ListCell   *l;
 
+	*isCurrentOf = false;
+
 	foreach(l, rlist)
 	{
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
@@ -305,9 +309,13 @@ TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
 				if (is_andclause(orarg))
 				{
 					List	   *andargs = ((BoolExpr *) orarg)->args;
+					bool		sublistIsCurrentOf;
 
 					/* Recurse in case there are sub-ORs */
-					sublist = TidQualFromRestrictInfoList(root, andargs, rel);
+					sublist = TidQualFromRestrictInfoList(root, andargs, rel,
+														  &sublistIsCurrentOf);
+					if (sublistIsCurrentOf)
+						elog(ERROR, "IS CURRENT OF within OR clause");
 				}
 				else
 				{
@@ -353,7 +361,10 @@ TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
 			{
 				/* We can stop immediately if it's a CurrentOfExpr */
 				if (IsCurrentOfClause(rinfo, rel))
+				{
+					*isCurrentOf = true;
 					return list_make1(rinfo);
+				}
 
 				/*
 				 * Otherwise, remember the first non-OR CTID qual.  We could
@@ -483,19 +494,24 @@ ec_member_matches_ctid(PlannerInfo *root, RelOptInfo *rel,
  *
  *	  Candidate paths are added to the rel's pathlist (using add_path).
  */
-void
+bool
 create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 {
 	List	   *tidquals;
 	List	   *tidrangequals;
+	bool		isCurrentOf;
 
 	/*
 	 * If any suitable quals exist in the rel's baserestrict list, generate a
 	 * plain (unparameterized) TidPath with them.
+	 *
+	 * We skip this when enable_tidscan = false, except when the qual is
+	 * CurrentOfExpr. In that case, a TID scan is the only correct path.
 	 */
-	tidquals = TidQualFromRestrictInfoList(root, rel->baserestrictinfo, rel);
+	tidquals = TidQualFromRestrictInfoList(root, rel->baserestrictinfo, rel,
+										   &isCurrentOf);
 
-	if (tidquals != NIL)
+	if (tidquals != NIL && (enable_tidscan || isCurrentOf))
 	{
 		/*
 		 * This path uses no join clauses, but it could still have required
@@ -505,8 +521,21 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 
 		add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
 												   required_outer));
+
+		/*
+		 * When the qual is CurrentOfExpr, the path that we just added is the
+		 * only one the executor can handle, so we should return before adding
+		 * any others. Returning true lets the caller know not to add any
+		 * others, either.
+		 */
+		if (isCurrentOf)
+			return true;
 	}
 
+	/* Skip the rest if TID scans are disabled. */
+	if (!enable_tidscan)
+		return false;
+
 	/*
 	 * If there are range quals in the baserestrict list, generate a
 	 * TidRangePath.
@@ -553,4 +582,6 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 	 * join quals, for example.
 	 */
 	BuildParameterizedTidPaths(root, rel, rel->joininfo);
+
+	return false;
 }
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 914d9bdef5..17e307f3d5 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -83,7 +83,7 @@ extern void check_index_predicates(PlannerInfo *root, RelOptInfo *rel);
  * tidpath.c
  *	  routines to generate tid paths
  */
-extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
+extern bool create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
 
 /*
  * joinpath.c
-- 
2.39.3 (Apple Git-145)

#54Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#53)
4 attachment(s)
Re: On disable_cost

On Tue, May 7, 2024 at 4:19 PM Robert Haas <robertmhaas@gmail.com> wrote:

Here are some patches for discussion.

Well, that didn't generate much discussion, but here I am trying
again. Here I've got patches 0001 and 0002 from my previous posting;
I've dropped 0003 and 0004 from the previous set for now so as not to
distract from the main event, but they may still be a good idea.
Instead I've got an 0003 and an 0004 that implement the "count of
disabled nodes" approach that we have discussed previously. This seems
to work fine, unlike the approaches I tried earlier. I think this is
the right direction to go, but I'd like to know what concerns people
might have.

This doesn't completely remove disable_cost, because hash joins still
add it to the cost when it's impossible to fit the MCV value into
work_mem. I'm not sure what to do with that. Continuing to use
disable_cost in that one scenario seems OK to me. We could
alternatively make that scenario bump disabled_nodes, but I don't
really want to confuse the planner not wanting to do something with
the user telling the planner not to do something, so I don't think
that's a good idea. Or we could rejigger things so that in that case
we don't generate the plan at all. I'm not sure why we don't do that
already, actually.
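
For reference, the check in question has roughly this shape (a
paraphrase of final_cost_hashjoin from memory, with illustrative
variable names, not a verbatim quote):

/*
 * Rough shape of the existing check: if a single hash bucket holding
 * the inner side's most common value would not fit within the hash
 * memory limit, the current code adds disable_cost to the path rather
 * than refusing to generate it.
 */
if (inner_mcv_bucket_bytes > hash_mem_limit_bytes)
    startup_cost += disable_cost;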

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v2-0004-Show-the-of-disabled-nodes-in-EXPLAIN-ANALYZE-out.patch (application/octet-stream)
From e3774956151c61679e4e00c0b4f0aac490506eb9 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 12 Jun 2024 11:01:23 -0400
Subject: [PATCH v2 4/4] Show the # of disabled nodes in EXPLAIN ANALYZE
 output.

Now that disable_cost is not included in the cost estimate, there's
no visible sign in EXPLAIN output of which plan nodes are disabled.
Fix that by propagating the number of disabled nodes from Path to
Plan, and then showing it in the EXPLAIN output.
---
 src/backend/commands/explain.c                |  4 ++++
 src/backend/optimizer/plan/createplan.c       |  2 ++
 src/include/nodes/plannodes.h                 |  1 +
 src/test/regress/expected/aggregates.out      | 21 ++++++++++++++++---
 .../regress/expected/collate.icu.utf8.out     |  6 ++++--
 .../regress/expected/incremental_sort.out     |  5 ++++-
 src/test/regress/expected/inherit.out         |  4 +++-
 src/test/regress/expected/join.out            |  4 +++-
 src/test/regress/expected/memoize.out         |  8 +++++--
 src/test/regress/expected/select_parallel.out |  6 +++++-
 src/test/regress/expected/union.out           |  3 ++-
 11 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 94511a5a02..9147cac1a6 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1889,6 +1889,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
+	if (plan->disabled_nodes != 0)
+		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
+							   es);
+
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
 	{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ce8a37bb58..3ffe6d96ad 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -5409,6 +5409,7 @@ order_qual_clauses(PlannerInfo *root, List *clauses)
 static void
 copy_generic_path_info(Plan *dest, Path *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->rows;
@@ -5424,6 +5425,7 @@ copy_generic_path_info(Plan *dest, Path *src)
 static void
 copy_plan_costsize(Plan *dest, Plan *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->plan_rows;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1aeeaec95e..6d22d86bbb 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -125,6 +125,7 @@ typedef struct Plan
 	/*
 	 * estimated execution costs for plan (see costsize.c for more info)
 	 */
+	int			disabled_nodes;	/* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 1c1ca7573a..ab1de1bfd8 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2895,18 +2895,23 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
+   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
+         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(12 rows)
+                           Disabled Nodes: 1
+(17 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2928,19 +2933,24 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -2950,19 +2960,24 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7d59fb4431..31345295c1 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -989,8 +989,9 @@ select * from collate_test1 where b ilike 'abc';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'abc'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'abc';
  a |  b  
@@ -1004,8 +1005,9 @@ select * from collate_test1 where b ilike 'ABC';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'ABC'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'ABC';
  a |  b  
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..79f0d37a87 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -701,16 +701,19 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
+   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
+         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
+               Disabled Nodes: 1
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(10 rows)
+(13 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index ad73213414..dbb748a2d2 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,6 +1614,7 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
+   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1623,10 +1624,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
+               Disabled Nodes: 1
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(13 rows)
+(15 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 6b16c3a676..8840fb4e3e 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -7945,13 +7945,15 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
+   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
+         Disabled Nodes: 1
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(7 rows)
+(9 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 0fd103c06b..3b1fd3d95d 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -240,14 +240,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(8 rows)
+(10 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -255,14 +257,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(8 rows)
+(10 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 20c651aadb..08ef0df9a3 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -538,10 +538,14 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
+   Disabled Nodes: 1
    ->  Nested Loop
+         Disabled Nodes: 1
          ->  Gather
+               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
+                     Disabled Nodes: 1
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -549,7 +553,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(12 rows)
+(16 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0fd0e1c38b..0456d48c93 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,11 +822,12 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
+   Disabled Nodes: 1
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
          ->  Result
-(5 rows)
+(6 rows)
 
 reset enable_hashagg;
 --
-- 
2.39.3 (Apple Git-145)

Attachment: v2-0001-Remove-grotty-use-of-disable_cost-for-TID-scan-pl.patch
From d3ac176733aa142b0bcdcbb0c079162a4c65d454 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 7 May 2024 11:51:26 -0400
Subject: [PATCH v2 1/4] Remove grotty use of disable_cost for TID scan plans.

Previously, the code charged disable_cost for CurrentOfExpr, and then
subtracted disable_cost from the cost of a TID path that used
CurrentOfExpr as the TID qual, effectively disabling all paths except
that one. Now, we instead suppress generation of the disabled paths
entirely, and generate only the one that the executor will actually
understand.

With this approach, we do not need to rely on disable_cost being
large enough to prevent the wrong path from being chosen, and we
save some CPU cycles by not generating paths that we can't
actually use. In my opinion, the code is also easier to understand
like this.
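
As a quick illustration (a hypothetical example, not taken from the patch
or its tests; the table and cursor names are made up, and the plans noted
in the comments are only what one would expect), WHERE CURRENT OF should
still come out as a TID scan even with enable_tidscan turned off:

    BEGIN;
    CREATE TEMP TABLE curof_demo (id int, val text);
    INSERT INTO curof_demo VALUES (1, 'a'), (2, 'b');
    SET LOCAL enable_tidscan = off;
    DECLARE c CURSOR FOR SELECT * FROM curof_demo;
    FETCH 1 FROM c;
    -- CURRENT OF can only be executed by nodeTidscan.c, so a Tid Scan is
    -- expected here despite enable_tidscan = off.
    EXPLAIN (COSTS OFF) UPDATE curof_demo SET val = val WHERE CURRENT OF c;
    ROLLBACK;

Previously the same plan was chosen, but only because cost_qual_eval charged
disable_cost for the CurrentOfExpr and cost_tidscan subtracted it again.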
---
 src/backend/optimizer/path/allpaths.c | 14 +++++++--
 src/backend/optimizer/path/costsize.c | 26 -----------------
 src/backend/optimizer/path/tidpath.c  | 41 +++++++++++++++++++++++----
 src/include/optimizer/paths.h         |  2 +-
 4 files changed, 48 insertions(+), 35 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 4895cee994..aa78c0af0c 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -772,6 +772,17 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 	 */
 	required_outer = rel->lateral_relids;
 
+	/*
+	 * Consider TID scans.
+	 *
+	 * If create_tidscan_paths returns true, then a TID scan path is forced.
+	 * This happens when rel->baserestrictinfo contains CurrentOfExpr, because
+	 * the executor can't handle any other type of path for such queries.
+	 * Hence, we return without adding any other paths.
+	 */
+	if (create_tidscan_paths(root, rel))
+		return;
+
 	/* Consider sequential scan */
 	add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
 
@@ -781,9 +792,6 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 
 	/* Consider index scans */
 	create_index_paths(root, rel);
-
-	/* Consider TID scans */
-	create_tidscan_paths(root, rel);
 }
 
 /*
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ee23ed7835..2021c481b4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1251,7 +1251,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 {
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
-	bool		isCurrentOf = false;
 	QualCost	qpqual_cost;
 	Cost		cpu_per_tuple;
 	QualCost	tid_qual_cost;
@@ -1287,7 +1286,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		else if (IsA(qual, CurrentOfExpr))
 		{
 			/* CURRENT OF yields 1 tuple */
-			isCurrentOf = true;
 			ntuples++;
 		}
 		else
@@ -1297,22 +1295,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		}
 	}
 
-	/*
-	 * We must force TID scan for WHERE CURRENT OF, because only nodeTidscan.c
-	 * understands how to do it correctly.  Therefore, honor enable_tidscan
-	 * only when CURRENT OF isn't present.  Also note that cost_qual_eval
-	 * counts a CurrentOfExpr as having startup cost disable_cost, which we
-	 * subtract off here; that's to prevent other plan types such as seqscan
-	 * from winning.
-	 */
-	if (isCurrentOf)
-	{
-		Assert(baserel->baserestrictcost.startup >= disable_cost);
-		startup_cost -= disable_cost;
-	}
-	else if (!enable_tidscan)
-		startup_cost += disable_cost;
-
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
@@ -1399,9 +1381,6 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	ntuples = selectivity * baserel->tuples;
 	nseqpages = pages - 1.0;
 
-	if (!enable_tidscan)
-		startup_cost += disable_cost;
-
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
@@ -4884,11 +4863,6 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		/* Treat all these as having cost 1 */
 		context->total.per_tuple += cpu_operator_cost;
 	}
-	else if (IsA(node, CurrentOfExpr))
-	{
-		/* Report high cost to prevent selection of anything but TID scan */
-		context->total.startup += disable_cost;
-	}
 	else if (IsA(node, SubLink))
 	{
 		/* This routine should not be applied to un-planned expressions */
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index eb11bc79c7..b0323b26ec 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "nodes/nodeFuncs.h"
+#include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -277,12 +278,15 @@ RestrictInfoIsTidQual(PlannerInfo *root, RestrictInfo *rinfo, RelOptInfo *rel)
  * that there's more than one choice.
  */
 static List *
-TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
+TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel,
+							bool *isCurrentOf)
 {
 	RestrictInfo *tidclause = NULL; /* best simple CTID qual so far */
 	List	   *orlist = NIL;	/* best OR'ed CTID qual so far */
 	ListCell   *l;
 
+	*isCurrentOf = false;
+
 	foreach(l, rlist)
 	{
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
@@ -305,9 +309,13 @@ TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
 				if (is_andclause(orarg))
 				{
 					List	   *andargs = ((BoolExpr *) orarg)->args;
+					bool		sublistIsCurrentOf;
 
 					/* Recurse in case there are sub-ORs */
-					sublist = TidQualFromRestrictInfoList(root, andargs, rel);
+					sublist = TidQualFromRestrictInfoList(root, andargs, rel,
+														  &sublistIsCurrentOf);
+					if (sublistIsCurrentOf)
+						elog(ERROR, "IS CURRENT OF within OR clause");
 				}
 				else
 				{
@@ -353,7 +361,10 @@ TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
 			{
 				/* We can stop immediately if it's a CurrentOfExpr */
 				if (IsCurrentOfClause(rinfo, rel))
+				{
+					*isCurrentOf = true;
 					return list_make1(rinfo);
+				}
 
 				/*
 				 * Otherwise, remember the first non-OR CTID qual.  We could
@@ -483,19 +494,24 @@ ec_member_matches_ctid(PlannerInfo *root, RelOptInfo *rel,
  *
  *	  Candidate paths are added to the rel's pathlist (using add_path).
  */
-void
+bool
 create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 {
 	List	   *tidquals;
 	List	   *tidrangequals;
+	bool		isCurrentOf;
 
 	/*
 	 * If any suitable quals exist in the rel's baserestrict list, generate a
 	 * plain (unparameterized) TidPath with them.
+	 *
+	 * We skip this when enable_tidscan = false, except when the qual is
+	 * CurrentOfExpr. In that case, a TID scan is the only correct path.
 	 */
-	tidquals = TidQualFromRestrictInfoList(root, rel->baserestrictinfo, rel);
+	tidquals = TidQualFromRestrictInfoList(root, rel->baserestrictinfo, rel,
+										   &isCurrentOf);
 
-	if (tidquals != NIL)
+	if (tidquals != NIL && (enable_tidscan || isCurrentOf))
 	{
 		/*
 		 * This path uses no join clauses, but it could still have required
@@ -505,8 +521,21 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 
 		add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
 												   required_outer));
+
+		/*
+		 * When the qual is CurrentOfExpr, the path that we just added is the
+		 * only one the executor can handle, so we should return before adding
+		 * any others. Returning true lets the caller know not to add any
+		 * others, either.
+		 */
+		if (isCurrentOf)
+			return true;
 	}
 
+	/* Skip the rest if TID scans are disabled. */
+	if (!enable_tidscan)
+		return false;
+
 	/*
 	 * If there are range quals in the baserestrict list, generate a
 	 * TidRangePath.
@@ -553,4 +582,6 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 	 * join quals, for example.
 	 */
 	BuildParameterizedTidPaths(root, rel, rel->joininfo);
+
+	return false;
 }
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 5e88c0224a..5c029b6b62 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -83,7 +83,7 @@ extern void check_index_predicates(PlannerInfo *root, RelOptInfo *rel);
  * tidpath.c
  *	  routines to generate tid paths
  */
-extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
+extern bool create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
 
 /*
  * joinpath.c
-- 
2.39.3 (Apple Git-145)

Attachment: v2-0002-Rationalize-behavior-of-enable_indexscan-and-enab.patch
From 35823d581c0f74f1bec14f6c5c2287e4bb0f9309 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 2 May 2024 11:18:44 -0400
Subject: [PATCH v2 2/4] Rationalize behavior of enable_indexscan and
 enable_indexonlyscan.

Previously, index-scan paths were still generated even when
enable_indexscan=false, but we added disable_cost to the cost of
both index scan plans and index-only scan plans. It doesn't make sense
for enable_indexscan to affect whether index-only scans are chosen,
given that we also have a GUC called enable_indexonlyscan.

With this commit, enable_indexscan and enable_indexonlyscan work
the same way: each one prevents consideration of paths of the
appropriate type, and neither has any effect on the cost of the
generated paths. This requires some updates to the regression tests,
which previously relied on enable_indexscan=false to also disable
index-only scans.

Note that when enable_indexonlyscan=false and enable_indexscan=true,
we will generate index-scan paths that would not have been
generated if both had been set to true. That's because generating
both an index-scan path and an index-only path would be a waste
of cycles, since the index-only path should always win. In effect,
the index-scan plan shape was still being considered; we just
rejected it before actually constructing a path.
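
As a sketch of the intended behavior (a hypothetical example, not taken
from the patch or its tests; the table name is made up and the plans noted
in the comments are only what one would expect given these settings), the
two GUCs now act independently:

    CREATE TEMP TABLE idx_demo AS
        SELECT g AS a FROM generate_series(1, 10000) g;
    CREATE INDEX ON idx_demo (a);
    ANALYZE idx_demo;
    SET enable_seqscan = off;
    SET enable_bitmapscan = off;
    -- Only index-only scans disabled: expect a plain Index Scan.
    SET enable_indexonlyscan = off;
    EXPLAIN (COSTS OFF) SELECT a FROM idx_demo WHERE a < 100;
    -- Only index scans disabled: expect an Index Only Scan.
    RESET enable_indexonlyscan;
    SET enable_indexscan = off;
    EXPLAIN (COSTS OFF) SELECT a FROM idx_demo WHERE a < 100;
    RESET enable_indexscan;
    RESET enable_seqscan;
    RESET enable_bitmapscan;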
---
 src/backend/optimizer/path/costsize.c         |  4 ---
 src/backend/optimizer/path/indxpath.c         | 26 ++++++++++++++++---
 src/test/regress/expected/btree_index.out     |  3 +++
 src/test/regress/expected/create_index.out    |  2 ++
 src/test/regress/expected/select.out          |  1 +
 src/test/regress/expected/select_parallel.out |  2 ++
 src/test/regress/expected/tuplesort.out       |  2 ++
 src/test/regress/sql/btree_index.sql          |  5 ++++
 src/test/regress/sql/create_index.sql         |  2 ++
 src/test/regress/sql/select.sql               |  1 +
 src/test/regress/sql/select_parallel.sql      |  2 ++
 src/test/regress/sql/tuplesort.sql            |  2 ++
 12 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 2021c481b4..74fc5aab56 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -603,10 +603,6 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 											  path->indexclauses);
 	}
 
-	if (!enable_indexscan)
-		startup_cost += disable_cost;
-	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
-
 	/*
 	 * Call index-access-method-specific code to estimate the processing cost
 	 * for scanning the index, as well as the selectivity of the index (ie,
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index c0fcc7d78d..423099d725 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -742,7 +742,13 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		IndexPath  *ipath = (IndexPath *) lfirst(lc);
 
 		if (index->amhasgettuple)
-			add_path(rel, (Path *) ipath);
+		{
+			if (ipath->path.pathtype == T_IndexScan && enable_indexscan)
+				add_path(rel, (Path *) ipath);
+			else if (ipath->path.pathtype == T_IndexOnlyScan &&
+				enable_indexonlyscan)
+				add_path(rel, (Path *) ipath);
+		}
 
 		if (index->amhasgetbitmap &&
 			(ipath->path.pathkeys == NIL ||
@@ -831,6 +837,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		case ST_INDEXSCAN:
 			if (!index->amhasgettuple)
 				return NIL;
+			if (!enable_indexscan && !enable_indexonlyscan)
+				return NIL;
 			break;
 		case ST_BITMAPSCAN:
 			if (!index->amhasgetbitmap)
@@ -978,7 +986,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		 */
 		if (index->amcanparallel &&
 			rel->consider_parallel && outer_relids == NULL &&
-			scantype != ST_BITMAPSCAN)
+			scantype != ST_BITMAPSCAN &&
+			(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 		{
 			ipath = create_index_path(root, index,
 									  index_clauses,
@@ -1028,7 +1037,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 			/* If appropriate, consider parallel index scan */
 			if (index->amcanparallel &&
 				rel->consider_parallel && outer_relids == NULL &&
-				scantype != ST_BITMAPSCAN)
+				scantype != ST_BITMAPSCAN &&
+				(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 			{
 				ipath = create_index_path(root, index,
 										  index_clauses,
@@ -1735,7 +1745,15 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
 	ListCell   *lc;
 	int			i;
 
-	/* Index-only scans must be enabled */
+	/*
+	 * Index-only scans must be enabled.
+	 *
+	 * NB: Returning false here means that an index scan will be considered
+	 * instead, so setting enable_indexonlyscan=false causes us to consider paths
+	 * that we wouldn't have considered otherwise. That seems OK, because our
+	 * only reason for not generating the index-scan paths is that we expect
+	 * them to lose on cost.
+	 */
 	if (!enable_indexonlyscan)
 		return false;
 
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 510646cbce..f15db99771 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -247,6 +247,7 @@ select thousand from tenk1 where thousand in (364, 366,380) and tenthous = 20000
 --
 set enable_seqscan to false;
 set enable_indexscan to true;
+set enable_indexonlyscan to true;
 set enable_bitmapscan to false;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -290,6 +291,7 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 (2 rows)
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -330,6 +332,7 @@ select proname from pg_proc where proname ilike '00%foo' order by 1;
 ---------
 (0 rows)
 
+reset enable_indexonlyscan;
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                            QUERY PLAN                            
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index cf6eac5734..ec69bafd40 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -618,6 +618,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 EXPLAIN (COSTS OFF)
 SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
@@ -643,6 +644,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 --
 -- GIN over int[] and text[]
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0e..6445815741 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -844,6 +844,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
                          QUERY PLAN                          
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 87273fa635..f79eda79f6 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -522,6 +522,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -622,6 +623,7 @@ select * from explain_parallel_sort_stats();
 (14 rows)
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..87b05a22cb 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -362,6 +362,7 @@ ORDER BY v.a DESC;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -458,6 +459,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
diff --git a/src/test/regress/sql/btree_index.sql b/src/test/regress/sql/btree_index.sql
index 0d2a33f370..bc99f44dda 100644
--- a/src/test/regress/sql/btree_index.sql
+++ b/src/test/regress/sql/btree_index.sql
@@ -157,6 +157,7 @@ select thousand from tenk1 where thousand in (364, 366,380) and tenthous = 20000
 
 set enable_seqscan to false;
 set enable_indexscan to true;
+set enable_indexonlyscan to true;
 set enable_bitmapscan to false;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -168,6 +169,7 @@ explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -175,6 +177,9 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
 explain (costs off)
 select proname from pg_proc where proname ilike '00%foo' order by 1;
 select proname from pg_proc where proname ilike '00%foo' order by 1;
+
+reset enable_indexonlyscan;
+
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index e296891cab..04dea5225e 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -246,6 +246,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 
 EXPLAIN (COSTS OFF)
@@ -254,6 +255,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 
 --
diff --git a/src/test/regress/sql/select.sql b/src/test/regress/sql/select.sql
index 019f1e7673..a0c7417dec 100644
--- a/src/test/regress/sql/select.sql
+++ b/src/test/regress/sql/select.sql
@@ -218,6 +218,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 20376c03fa..3f003e2e71 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -201,6 +201,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -248,6 +249,7 @@ $$;
 select * from explain_parallel_sort_stats();
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..95ac8ec04c 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -162,6 +162,7 @@ ORDER BY v.a DESC;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -192,6 +193,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
-- 
2.39.3 (Apple Git-145)

Attachment: v2-0003-Treat-the-of-disabled-nodes-in-a-path-as-a-separa.patch
From ce9e2f0d121b4c60fc9b001a94c19ebb7ac5e19d Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 10 Jun 2024 16:51:39 -0400
Subject: [PATCH v2 3/4] Treat the # of disabled nodes in a path as a separate
 cost metric.

Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally have distinguishable costs seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup and total
cost. Hence, the
path list is now sorted by disabled_nodes and then by total_cost,
instead of just by the latter, and likewise for the partial path list.
It is important that this number is a count and not simply a Boolean;
else, as soon as we're unable to respect disabled path types in all
portions of the path, we would stop trying to avoid them where we could.

Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.

Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since we do not postpone
path selection across subquery boundaries (see make_subplan).
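
As a rough illustration (a hypothetical example, not taken from the patch
or its tests; the table name is made up), a disabled node that the planner
cannot avoid is now tracked as a count on the path instead of being priced
at disable_cost:

    CREATE TEMP TABLE noidx_demo AS
        SELECT g AS a FROM generate_series(1, 1000) g;
    ANALYZE noidx_demo;
    -- No index exists, so the sequential scan is unavoidable even though
    -- it is disabled; the path simply carries disabled_nodes = 1, and its
    -- displayed costs are no longer inflated by disable_cost.
    SET enable_seqscan = off;
    EXPLAIN SELECT * FROM noidx_demo WHERE a = 1;
    RESET enable_seqscan;

Once EXPLAIN is taught to display the count (as in the "Disabled Nodes: 1"
lines of the regression diffs above), such plans report it directly rather
than a disable_cost-inflated estimate.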
---
 contrib/file_fdw/file_fdw.c                   |   1 +
 contrib/postgres_fdw/postgres_fdw.c           |  44 +++-
 contrib/postgres_fdw/postgres_fdw.h           |   1 +
 src/backend/optimizer/path/costsize.c         | 169 ++++++++++----
 src/backend/optimizer/path/joinpath.c         |  15 +-
 src/backend/optimizer/plan/createplan.c       |   3 +
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/prep/prepunion.c        |   6 +-
 src/backend/optimizer/util/pathnode.c         | 206 +++++++++++++-----
 src/include/nodes/pathnodes.h                 |   2 +
 src/include/optimizer/cost.h                  |  10 +-
 src/include/optimizer/pathnode.h              |  12 +-
 src/test/regress/expected/select_parallel.out |   8 +-
 13 files changed, 362 insertions(+), 116 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 249d82d3a0..d16821f8e1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -576,6 +576,7 @@ fileGetForeignPaths(PlannerInfo *root,
 			 create_foreignscan_path(root, baserel,
 									 NULL,	/* default pathtarget */
 									 baserel->rows,
+									 0,
 									 startup_cost,
 									 total_cost,
 									 NIL,	/* no pathkeys */
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0bb9a5ae8f..f0308798a0 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -430,6 +430,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
 									List *pathkeys,
 									PgFdwPathExtraData *fpextra,
 									double *p_rows, int *p_width,
+									int *p_disabled_nodes,
 									Cost *p_startup_cost, Cost *p_total_cost);
 static void get_remote_estimate(const char *sql,
 								PGconn *conn,
@@ -442,6 +443,7 @@ static void adjust_foreign_grouping_path_cost(PlannerInfo *root,
 											  double retrieved_rows,
 											  double width,
 											  double limit_tuples,
+											  int *disabled_nodes,
 											  Cost *p_startup_cost,
 											  Cost *p_run_cost);
 static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
@@ -735,6 +737,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		 */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 
 		/* Report estimated baserel size to planner. */
@@ -765,6 +768,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		/* Fill in basically-bogus cost estimates for use later. */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 	}
 
@@ -1030,6 +1034,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 	path = create_foreignscan_path(root, baserel,
 								   NULL,	/* default pathtarget */
 								   fpinfo->rows,
+								   fpinfo->disabled_nodes,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
@@ -1184,13 +1189,14 @@ postgresGetForeignPaths(PlannerInfo *root,
 		ParamPathInfo *param_info = (ParamPathInfo *) lfirst(lc);
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 
 		/* Get a cost estimate from the remote */
 		estimate_path_cost_size(root, baserel,
 								param_info->ppi_clauses, NIL, NULL,
-								&rows, &width,
+								&rows, &width, &disabled_nodes,
 								&startup_cost, &total_cost);
 
 		/*
@@ -1203,6 +1209,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 		path = create_foreignscan_path(root, baserel,
 									   NULL,	/* default pathtarget */
 									   rows,
+									   disabled_nodes,
 									   startup_cost,
 									   total_cost,
 									   NIL, /* no pathkeys */
@@ -3078,12 +3085,14 @@ estimate_path_cost_size(PlannerInfo *root,
 						List *pathkeys,
 						PgFdwPathExtraData *fpextra,
 						double *p_rows, int *p_width,
+						int *p_disabled_nodes,
 						Cost *p_startup_cost, Cost *p_total_cost)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
 	double		rows;
 	double		retrieved_rows;
 	int			width;
+	int			disabled_nodes = 0;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -3473,6 +3482,7 @@ estimate_path_cost_size(PlannerInfo *root,
 				adjust_foreign_grouping_path_cost(root, pathkeys,
 												  retrieved_rows, width,
 												  fpextra->limit_tuples,
+												  &disabled_nodes,
 												  &startup_cost, &run_cost);
 			}
 			else
@@ -3567,6 +3577,7 @@ estimate_path_cost_size(PlannerInfo *root,
 	/* Return results. */
 	*p_rows = rows;
 	*p_width = width;
+	*p_disabled_nodes = disabled_nodes;
 	*p_startup_cost = startup_cost;
 	*p_total_cost = total_cost;
 }
@@ -3627,6 +3638,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 								  double retrieved_rows,
 								  double width,
 								  double limit_tuples,
+								  int *p_disabled_nodes,
 								  Cost *p_startup_cost,
 								  Cost *p_run_cost)
 {
@@ -3646,6 +3658,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 		cost_sort(&sort_path,
 				  root,
 				  pathkeys,
+				  0,
 				  *p_startup_cost + *p_run_cost,
 				  retrieved_rows,
 				  width,
@@ -6137,13 +6150,15 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 	{
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 		List	   *useful_pathkeys = lfirst(lc);
 		Path	   *sorted_epq_path;
 
 		estimate_path_cost_size(root, rel, NIL, useful_pathkeys, NULL,
-								&rows, &width, &startup_cost, &total_cost);
+								&rows, &width, &disabled_nodes,
+								&startup_cost, &total_cost);
 
 		/*
 		 * The EPQ path must be at least as well sorted as the path itself, in
@@ -6165,6 +6180,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreignscan_path(root, rel,
 											 NULL,
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 useful_pathkeys,
@@ -6178,6 +6194,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreign_join_path(root, rel,
 											  NULL,
 											  rows,
+											  disabled_nodes,
 											  startup_cost,
 											  total_cost,
 											  useful_pathkeys,
@@ -6325,6 +6342,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 	ForeignPath *joinpath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	Path	   *epq_path;		/* Path to create plan to be executed when
@@ -6414,12 +6432,14 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 
 	/* Estimate costs for bare join relation */
 	estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	/* Now update this information in the joinrel */
 	joinrel->rows = rows;
 	joinrel->reltarget->width = width;
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6431,6 +6451,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 										joinrel,
 										NULL,	/* default pathtarget */
 										rows,
+										disabled_nodes,
 										startup_cost,
 										total_cost,
 										NIL,	/* no pathkeys */
@@ -6758,6 +6779,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	ForeignPath *grouppath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -6808,11 +6830,13 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the cost of push down */
 	estimate_path_cost_size(root, grouped_rel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/* Now update this information in the fpinfo */
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6821,6 +6845,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										  grouped_rel,
 										  grouped_rel->reltarget,
 										  rows,
+										  disabled_nodes,
 										  startup_cost,
 										  total_cost,
 										  NIL,	/* no pathkeys */
@@ -6849,6 +6874,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	PgFdwPathExtraData *fpextra;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -6942,7 +6968,8 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the costs of performing the final sort remotely */
 	estimate_path_cost_size(root, input_rel, NIL, root->sort_pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/*
 	 * Build the fdw_private list that will be used by postgresGetForeignPlan.
@@ -6955,6 +6982,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 											 input_rel,
 											 root->upper_targets[UPPERREL_ORDERED],
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 root->sort_pathkeys,
@@ -6988,6 +7016,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	bool		save_use_remote_estimate = false;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -7072,6 +7101,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 													   path->parent,
 													   path->pathtarget,
 													   path->rows,
+													   path->disabled_nodes,
 													   path->startup_cost,
 													   path->total_cost,
 													   path->pathkeys,
@@ -7189,7 +7219,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 		ifpinfo->use_remote_estimate = false;
 	}
 	estimate_path_cost_size(root, input_rel, NIL, pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	if (!fpextra->has_final_sort)
 		ifpinfo->use_remote_estimate = save_use_remote_estimate;
 
@@ -7208,6 +7239,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										   input_rel,
 										   root->upper_targets[UPPERREL_FINAL],
 										   rows,
+										   disabled_nodes,
 										   startup_cost,
 										   total_cost,
 										   pathkeys,
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 37c1575af6..9e501660d1 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -62,6 +62,7 @@ typedef struct PgFdwRelationInfo
 	/* Estimated size and cost for a scan, join, or grouping/aggregation. */
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 74fc5aab56..858e533f76 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -50,6 +50,17 @@
  * so beware of division-by-zero.)	The LIMIT is applied as a top-level
  * plan node.
  *
+ * Each path stores the total number of disabled nodes that exist at or
+ * below that point in the plan tree. This is regarded as a component of
+ * the cost, and paths with fewer disabled nodes should be regarded as
+ * cheaper than those with more. Disabled nodes occur when the user sets
+ * a GUC like enable_seqscan=false. We can't necessarily respect such a
+ * setting in every part of the plan tree, but we want to respect in as many
+ * parts of the plan tree as possible. Simpler schemes like storing a Boolean
+ * here rather than a count fail to do that. We used to disable nodes by
+ * adding a large constant to the startup cost, but that distorted planning
+ * in other ways.
+ *
  * For largely historical reasons, most of the routines in this module use
  * the passed result Path only to store their results (rows, startup_cost and
  * total_cost) into.  All the input data they need is passed as separate
@@ -301,9 +312,6 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_seqscan)
-		startup_cost += disable_cost;
-
 	/* fetch estimated page cost for tablespace containing table */
 	get_tablespace_page_costs(baserel->reltablespace,
 							  NULL,
@@ -346,6 +354,7 @@ cost_seqscan(Path *path, PlannerInfo *root,
 		path->rows = clamp_row_est(path->rows / parallel_divisor);
 	}
 
+	path->disabled_nodes = enable_seqscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + cpu_run_cost + disk_run_cost;
 }
@@ -418,6 +427,7 @@ cost_samplescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -456,6 +466,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows;
 
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = startup_cost;
 	path->path.total_cost = (startup_cost + run_cost);
 }
@@ -473,6 +484,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 void
 cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 				  RelOptInfo *rel, ParamPathInfo *param_info,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double *rows)
 {
@@ -490,9 +502,6 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	else
 		path->path.rows = rel->rows;
 
-	if (!enable_gathermerge)
-		startup_cost += disable_cost;
-
 	/*
 	 * Add one to the number of workers to account for the leader.  This might
 	 * be overgenerous since the leader will do less work than other workers
@@ -523,6 +532,8 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows * 1.05;
 
+	path->path.disabled_nodes = input_disabled_nodes
+		+ (enable_gathermerge ? 0 : 1);
 	path->path.startup_cost = startup_cost + input_startup_cost;
 	path->path.total_cost = (startup_cost + run_cost + input_total_cost);
 }
@@ -812,6 +823,11 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 
 	run_cost += cpu_run_cost;
 
+	/*
+	 * enable_indexscan and enable_indexonlyscan are handled by skipping
+	 * path generation, so we need no logic for those cases here.
+	 */
+	path->path.disabled_nodes = 0;
 	path->path.startup_cost = startup_cost;
 	path->path.total_cost = startup_cost + run_cost;
 }
@@ -1034,9 +1050,6 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
-
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
 										 &tuples_fetched);
@@ -1098,6 +1111,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = enable_bitmapscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1183,6 +1197,7 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
 	}
 	path->bitmapselectivity = selec;
 	path->path.rows = 0;		/* per above, not used */
+	path->path.disabled_nodes = 0;
 	path->path.startup_cost = totalCost;
 	path->path.total_cost = totalCost;
 }
@@ -1257,6 +1272,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	/* Should only be applied to base relations */
 	Assert(baserel->relid > 0);
 	Assert(baserel->rtekind == RTE_RELATION);
+	Assert(tidquals != NIL);
 
 	/* Mark the path with the correct row estimate */
 	if (param_info)
@@ -1271,6 +1287,14 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
 		Expr	   *qual = rinfo->clause;
 
+		/*
+		 * We must use a TID scan for CurrentOfExpr; in any other case,
+		 * we should be generating a TID scan only if enable_tidscan=true.
+		 * Also, if CurrentOfExpr is the qual, there should be only one.
+		 */
+		Assert(enable_tidscan || IsA(qual, CurrentOfExpr));
+		Assert(list_length(tidquals) == 1 || !IsA(qual, CurrentOfExpr));
+
 		if (IsA(qual, ScalarArrayOpExpr))
 		{
 			/* Each element of the array yields 1 tuple */
@@ -1318,6 +1342,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/*
+	 * There are assertions above verifying that we only reach this function
+	 * either when enable_tidscan=true or when the TID scan is the only legal
+	 * path, so it's safe to set disabled_nodes to zero here.
+	 */
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1410,6 +1440,9 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/* we should not generate this path type when enable_tidscan=false */
+	Assert(enable_tidscan);
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1462,6 +1495,7 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
 	 * SubqueryScan node, plus cpu_tuple_cost to account for selection and
 	 * projection overhead.
 	 */
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = path->subpath->startup_cost;
 	path->path.total_cost = path->subpath->total_cost;
 
@@ -1552,6 +1586,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1608,6 +1643,7 @@ cost_tablefuncscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1655,6 +1691,7 @@ cost_valuesscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1702,6 +1739,7 @@ cost_ctescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1739,6 +1777,7 @@ cost_namedtuplestorescan(Path *path, PlannerInfo *root,
 	cpu_per_tuple += cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1773,6 +1812,7 @@ cost_resultscan(Path *path, PlannerInfo *root,
 	cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1812,6 +1852,7 @@ cost_recursive_union(Path *runion, Path *nrterm, Path *rterm)
 	 */
 	total_cost += cpu_tuple_cost * total_rows;
 
+	runion->disabled_nodes = nrterm->disabled_nodes + rterm->disabled_nodes;
 	runion->startup_cost = startup_cost;
 	runion->total_cost = total_cost;
 	runion->rows = total_rows;
@@ -1960,6 +2001,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
 void
 cost_incremental_sort(Path *path,
 					  PlannerInfo *root, List *pathkeys, int presorted_keys,
+					  int input_disabled_nodes,
 					  Cost input_startup_cost, Cost input_total_cost,
 					  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 					  double limit_tuples)
@@ -2079,6 +2121,11 @@ cost_incremental_sort(Path *path,
 	run_cost += 2.0 * cpu_tuple_cost * input_groups;
 
 	path->rows = input_tuples;
+
+	/* should not generate these paths when enable_incremental_sort=false */
+	Assert(enable_incremental_sort);
+	path->disabled_nodes = input_disabled_nodes;
+
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2097,7 +2144,8 @@ cost_incremental_sort(Path *path,
  */
 void
 cost_sort(Path *path, PlannerInfo *root,
-		  List *pathkeys, Cost input_cost, double tuples, int width,
+		  List *pathkeys, int input_disabled_nodes,
+		  Cost input_cost, double tuples, int width,
 		  Cost comparison_cost, int sort_mem,
 		  double limit_tuples)
 
@@ -2110,12 +2158,10 @@ cost_sort(Path *path, PlannerInfo *root,
 				   comparison_cost, sort_mem,
 				   limit_tuples);
 
-	if (!enable_sort)
-		startup_cost += disable_cost;
-
 	startup_cost += input_cost;
 
 	path->rows = tuples;
+	path->disabled_nodes = input_disabled_nodes + (enable_sort ? 0 : 1);
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2207,6 +2253,7 @@ cost_append(AppendPath *apath)
 {
 	ListCell   *l;
 
+	apath->path.disabled_nodes = 0;
 	apath->path.startup_cost = 0;
 	apath->path.total_cost = 0;
 	apath->path.rows = 0;
@@ -2228,12 +2275,16 @@ cost_append(AppendPath *apath)
 			 */
 			apath->path.startup_cost = firstsubpath->startup_cost;
 
-			/* Compute rows and costs as sums of subplan rows and costs. */
+			/*
+			 * Compute rows, number of disabled nodes, and total cost as sums
+			 * of underlying subplan values.
+			 */
 			foreach(l, apath->subpaths)
 			{
 				Path	   *subpath = (Path *) lfirst(l);
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.total_cost += subpath->total_cost;
 			}
 		}
@@ -2273,6 +2324,7 @@ cost_append(AppendPath *apath)
 					cost_sort(&sort_path,
 							  NULL, /* doesn't currently need root */
 							  pathkeys,
+							  subpath->disabled_nodes,
 							  subpath->total_cost,
 							  subpath->rows,
 							  subpath->pathtarget->width,
@@ -2283,6 +2335,7 @@ cost_append(AppendPath *apath)
 				}
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.startup_cost += subpath->startup_cost;
 				apath->path.total_cost += subpath->total_cost;
 			}
@@ -2331,6 +2384,7 @@ cost_append(AppendPath *apath)
 				apath->path.total_cost += subpath->total_cost;
 			}
 
+			apath->path.disabled_nodes += subpath->disabled_nodes;
 			apath->path.rows = clamp_row_est(apath->path.rows);
 
 			i++;
@@ -2371,6 +2425,7 @@ cost_append(AppendPath *apath)
  *
  * 'pathkeys' is a list of sort keys
  * 'n_streams' is the number of input streams
+ * 'input_disabled_nodes' is the sum of the input streams' disabled node counts
  * 'input_startup_cost' is the sum of the input streams' startup costs
  * 'input_total_cost' is the sum of the input streams' total costs
  * 'tuples' is the number of tuples in all the streams
@@ -2378,6 +2433,7 @@ cost_append(AppendPath *apath)
 void
 cost_merge_append(Path *path, PlannerInfo *root,
 				  List *pathkeys, int n_streams,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double tuples)
 {
@@ -2408,6 +2464,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
 	 */
 	run_cost += cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER * tuples;
 
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost + input_startup_cost;
 	path->total_cost = startup_cost + run_cost + input_total_cost;
 }
@@ -2426,6 +2483,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
  */
 void
 cost_material(Path *path,
+			  int input_disabled_nodes,
 			  Cost input_startup_cost, Cost input_total_cost,
 			  double tuples, int width)
 {
@@ -2463,6 +2521,13 @@ cost_material(Path *path,
 		run_cost += seq_page_cost * npages;
 	}
 
+	/*
+	 * There are some situations where we add a Materialize node even with
+	 * enable_material=false, but those are done when converting the Path to
+	 * a Plan; hence, enable_material should be true here.
+	 */
+	Assert(enable_material);
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2626,6 +2691,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 		 int numGroupCols, double numGroups,
 		 List *quals,
+		 int disabled_nodes,
 		 Cost input_startup_cost, Cost input_total_cost,
 		 double input_tuples, double input_width)
 {
@@ -2681,10 +2747,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		startup_cost = input_startup_cost;
 		total_cost = input_total_cost;
 		if (aggstrategy == AGG_MIXED && !enable_hashagg)
-		{
-			startup_cost += disable_cost;
-			total_cost += disable_cost;
-		}
+			++disabled_nodes;
 		/* calcs phrased this way to match HASHED case, see note above */
 		total_cost += aggcosts->transCost.startup;
 		total_cost += aggcosts->transCost.per_tuple * input_tuples;
@@ -2699,7 +2762,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		/* must be AGG_HASHED */
 		startup_cost = input_total_cost;
 		if (!enable_hashagg)
-			startup_cost += disable_cost;
+			++disabled_nodes;
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		/* cost of computing hash value */
@@ -2808,6 +2871,7 @@ cost_agg(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3042,6 +3106,7 @@ get_windowclause_startup_tuples(PlannerInfo *root, WindowClause *wc,
 void
 cost_windowagg(Path *path, PlannerInfo *root,
 			   List *windowFuncs, WindowClause *winclause,
+			   int input_disabled_nodes,
 			   Cost input_startup_cost, Cost input_total_cost,
 			   double input_tuples)
 {
@@ -3107,6 +3172,7 @@ cost_windowagg(Path *path, PlannerInfo *root,
 	total_cost += cpu_tuple_cost * input_tuples;
 
 	path->rows = input_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 
@@ -3138,6 +3204,7 @@ void
 cost_group(Path *path, PlannerInfo *root,
 		   int numGroupCols, double numGroups,
 		   List *quals,
+		   int input_disabled_nodes,
 		   Cost input_startup_cost, Cost input_total_cost,
 		   double input_tuples)
 {
@@ -3176,6 +3243,7 @@ cost_group(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3210,6 +3278,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  Path *outer_path, Path *inner_path,
 					  JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3218,6 +3287,11 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Cost		inner_run_cost;
 	Cost		inner_rescan_run_cost;
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_nestloop ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* estimate costs to rescan the inner relation */
 	cost_rescan(root, inner_path,
 				&inner_rescan_start_cost,
@@ -3265,6 +3339,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_nestloop */
@@ -3294,6 +3369,9 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	QualCost	restrict_qual_cost;
 	double		ntuples;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (outer_path_rows <= 0)
 		outer_path_rows = 1;
@@ -3314,13 +3392,10 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_nestloop)
-		startup_cost += disable_cost;
+	/* Count up disabled nodes. */
+	path->jpath.path.disabled_nodes = enable_nestloop ? 0 : 1;
+	path->jpath.path.disabled_nodes += inner_path->disabled_nodes;
+	path->jpath.path.disabled_nodes += outer_path->disabled_nodes;
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3493,6 +3568,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					   List *outersortkeys, List *innersortkeys,
 					   JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3613,6 +3689,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Assert(outerstartsel <= outerendsel);
 	Assert(innerstartsel <= innerendsel);
 
+	disabled_nodes = enable_mergejoin ? 0 : 1;
+
 	/* cost of source data */
 
 	if (outersortkeys)			/* do we need to sort outer? */
@@ -3620,12 +3698,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  outersortkeys,
+				  outer_path->disabled_nodes,
 				  outer_path->total_cost,
 				  outer_path_rows,
 				  outer_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* outerstartsel;
@@ -3634,6 +3714,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += outer_path->disabled_nodes;
 		startup_cost += outer_path->startup_cost;
 		startup_cost += (outer_path->total_cost - outer_path->startup_cost)
 			* outerstartsel;
@@ -3646,12 +3727,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  innersortkeys,
+				  inner_path->disabled_nodes,
 				  inner_path->total_cost,
 				  inner_path_rows,
 				  inner_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* innerstartsel;
@@ -3660,6 +3743,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += inner_path->disabled_nodes;
 		startup_cost += inner_path->startup_cost;
 		startup_cost += (inner_path->total_cost - inner_path->startup_cost)
 			* innerstartsel;
@@ -3678,6 +3762,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost + inner_run_cost;
 	/* Save private data for final_cost_mergejoin */
@@ -3742,6 +3827,9 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 				rescannedtuples;
 	double		rescanratio;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
@@ -3761,14 +3849,6 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_mergejoin)
-		startup_cost += disable_cost;
-
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
 	 * separately.
@@ -4052,6 +4132,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  JoinPathExtraData *extra,
 					  bool parallel_hash)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -4063,6 +4144,11 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	int			num_skew_mcvs;
 	size_t		space_allowed;	/* unused */
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_hashjoin ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* cost of source data */
 	startup_cost += outer_path->startup_cost;
 	run_cost += outer_path->total_cost - outer_path->startup_cost;
@@ -4132,6 +4218,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_hashjoin */
@@ -4176,6 +4263,9 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	Selectivity innermcvfreq;
 	ListCell   *hcl;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Mark the path with the correct row estimate */
 	if (path->jpath.path.param_info)
 		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
@@ -4191,13 +4281,10 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_hashjoin)
-		startup_cost += disable_cost;
+	/* Count up disabled nodes. */
+	path->jpath.path.disabled_nodes = enable_hashjoin ? 0 : 1;
+	path->jpath.path.disabled_nodes += inner_path->disabled_nodes;
+	path->jpath.path.disabled_nodes += outer_path->disabled_nodes;
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 5be8da9e09..c1ac4f1fdc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -811,7 +811,7 @@ try_nestloop_path(PlannerInfo *root,
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -894,7 +894,8 @@ try_partial_nestloop_path(PlannerInfo *root,
 	 */
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -991,7 +992,7 @@ try_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -1067,7 +1068,8 @@ try_partial_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -1136,7 +1138,7 @@ try_hashjoin_path(PlannerInfo *root,
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, false);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  NIL, required_outer))
 	{
@@ -1202,7 +1204,8 @@ try_partial_hashjoin_path(PlannerInfo *root,
 	 */
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, parallel_hash);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, NIL))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 6b64c4a362..ce8a37bb58 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -25,6 +25,7 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/print.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
@@ -5456,6 +5457,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
+			  0,		/* a Plan contains no count of disabled nodes */
 			  lefttree->total_cost,
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6550,6 +6552,7 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
+				  0,		/* a Plan contains no count of disabled nodes */
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4711f91239..e3d9fa9e81 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6749,6 +6749,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	/* Estimate the cost of seq scan + sort */
 	seqScanPath = create_seqscan_path(root, rel, NULL, 0);
 	cost_sort(&seqScanAndSortPath, root, NIL,
+			  seqScanPath->disabled_nodes,
 			  seqScanPath->total_cost, rel->tuples, rel->reltarget->width,
 			  comparisonCost, maintenance_work_mem, -1.0);
 
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 1c69c6e97e..a0baf6d4a1 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1346,6 +1346,7 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	cost_agg(&hashed_p, root, AGG_HASHED, NULL,
 			 numGroupCols, dNumGroups,
 			 NIL,
+			 input_path->disabled_nodes,
 			 input_path->startup_cost, input_path->total_cost,
 			 input_path->rows, input_path->pathtarget->width);
 
@@ -1353,14 +1354,17 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	 * Now for the sorted case.  Note that the input is *always* unsorted,
 	 * since it was made by appending unrelated sub-relations together.
 	 */
+	sorted_p.disabled_nodes = input_path->disabled_nodes;
 	sorted_p.startup_cost = input_path->startup_cost;
 	sorted_p.total_cost = input_path->total_cost;
 	/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
-	cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
+	cost_sort(&sorted_p, root, NIL, sorted_p.disabled_nodes,
+			  sorted_p.total_cost,
 			  input_path->rows, input_path->pathtarget->width,
 			  0.0, work_mem, -1.0);
 	cost_group(&sorted_p, root, numGroupCols, dNumGroups,
 			   NIL,
+			   sorted_p.disabled_nodes,
 			   sorted_p.startup_cost, sorted_p.total_cost,
 			   input_path->rows);
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 3491c3af1c..015c916edb 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -68,6 +68,15 @@ static bool pathlist_is_reparameterizable_by_child(List *pathlist,
 int
 compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
 {
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (criterion == STARTUP_COST)
 	{
 		if (path1->startup_cost < path2->startup_cost)
@@ -118,6 +127,15 @@ compare_fractional_path_costs(Path *path1, Path *path2,
 	Cost		cost1,
 				cost2;
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (fraction <= 0.0 || fraction >= 1.0)
 		return compare_path_costs(path1, path2, TOTAL_COST);
 	cost1 = path1->startup_cost +
@@ -166,6 +184,15 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 #define CONSIDER_PATH_STARTUP_COST(p)  \
 	((p)->param_info == NULL ? (p)->parent->consider_startup : (p)->parent->consider_param_startup)
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return COSTS_BETTER1;
+		else
+			return COSTS_BETTER2;
+	}
+
 	/*
 	 * Check total cost first since it's more likely to be different; many
 	 * paths have zero startup cost.
@@ -362,15 +389,29 @@ set_cheapest(RelOptInfo *parent_rel)
  * add_path
  *	  Consider a potential implementation path for the specified parent rel,
  *	  and add it to the rel's pathlist if it is worthy of consideration.
+ *
  *	  A path is worthy if it has a better sort order (better pathkeys) or
- *	  cheaper cost (on either dimension), or generates fewer rows, than any
- *	  existing path that has the same or superset parameterization rels.
- *	  We also consider parallel-safe paths more worthy than others.
+ *	  cheaper cost (as defined below), or generates fewer rows, than any
+ *    existing path that has the same or superset parameterization rels.  We
+ *    also consider parallel-safe paths more worthy than others.
+ *
+ *    Cheaper cost can mean either a cheaper total cost or a cheaper startup
+ *    cost; if one path is cheaper in one of these aspects and another is
+ *    cheaper in the other, we keep both. However, when some path type is
+ *    disabled (e.g. due to enable_seqscan=false), the number of times that
+ *    a disabled path type is used is considered to be a higher-order
+ *    component of the cost. Hence, if path A uses no disabled path type,
+ *    and path B uses 1 or more disabled path types, A is cheaper, no matter
+ *    what we estimate for the startup and total costs. The startup and total
+ *    cost essentially act as a tiebreak when comparing paths that use equal
+ *    numbers of disabled path nodes; but in practice this tiebreak is almost
+ *    always used, since normally no path types are disabled.
  *
- *	  We also remove from the rel's pathlist any old paths that are dominated
- *	  by new_path --- that is, new_path is cheaper, at least as well ordered,
- *	  generates no more rows, requires no outer rels not required by the old
- *	  path, and is no less parallel-safe.
+ *	  In addition to possibly adding new_path, we also remove from the rel's
+ *    pathlist any old paths that are dominated by new_path --- that is,
+ *    new_path is cheaper, at least as well ordered, generates no more rows,
+ *    requires no outer rels not required by the old path, and is no less
+ *    parallel-safe.
  *
  *	  In most cases, a path with a superset parameterization will generate
  *	  fewer rows (since it has more join clauses to apply), so that those two
@@ -389,10 +430,10 @@ set_cheapest(RelOptInfo *parent_rel)
  *	  parent_rel->consider_param_startup is true for a parameterized one.
  *	  Again, this allows discarding useless paths sooner.
  *
- *	  The pathlist is kept sorted by total_cost, with cheaper paths
- *	  at the front.  Within this routine, that's simply a speed hack:
- *	  doing it that way makes it more likely that we will reject an inferior
- *	  path after a few comparisons, rather than many comparisons.
+ *	  The pathlist is kept sorted by disabled_nodes and then by total_cost,
+ *    with cheaper paths at the front.  Within this routine, that's simply a
+ *    speed hack: doing it that way makes it more likely that we will reject
+ *    an inferior path after a few comparisons, rather than many comparisons.
  *	  However, add_path_precheck relies on this ordering to exit early
  *	  when possible.
  *
@@ -593,8 +634,13 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 		}
 		else
 		{
-			/* new belongs after this old path if it has cost >= old's */
-			if (new_path->total_cost >= old_path->total_cost)
+			/*
+			 * new belongs after this old path if it has more disabled nodes
+			 * or if it has the same number of nodes but a greater total cost
+			 */
+			if (new_path->disabled_nodes > old_path->disabled_nodes ||
+				(new_path->disabled_nodes == old_path->disabled_nodes &&
+				 new_path->total_cost >= old_path->total_cost))
 				insert_at = foreach_current_index(p1) + 1;
 		}
 
@@ -639,7 +685,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
  * so the required information has to be passed piecemeal.
  */
 bool
-add_path_precheck(RelOptInfo *parent_rel,
+add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 				  Cost startup_cost, Cost total_cost,
 				  List *pathkeys, Relids required_outer)
 {
@@ -658,6 +704,20 @@ add_path_precheck(RelOptInfo *parent_rel,
 		Path	   *old_path = (Path *) lfirst(p1);
 		PathKeysComparison keyscmp;
 
+		/*
+		 * Since the pathlist is sorted by disabled_nodes and then by
+		 * total_cost, we can stop looking once we reach a path with more
+		 * disabled nodes, or the same number of disabled nodes plus a
+		 * total_cost larger than the new path's.
+		 */
+		if (unlikely(old_path->disabled_nodes != disabled_nodes))
+		{
+			if (disabled_nodes < old_path->disabled_nodes)
+				break;
+		}
+		else if (total_cost <= old_path->total_cost * STD_FUZZ_FACTOR)
+			break;
+
 		/*
 		 * We are looking for an old_path with the same parameterization (and
 		 * by assumption the same rowcount) that dominates the new path on
@@ -666,39 +726,27 @@ add_path_precheck(RelOptInfo *parent_rel,
 		 *
 		 * Cost comparisons here should match compare_path_costs_fuzzily.
 		 */
-		if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+		/* new path can win on startup cost only if consider_startup */
+		if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
+			!consider_startup)
 		{
-			/* new path can win on startup cost only if consider_startup */
-			if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
-				!consider_startup)
+			/* new path loses on cost, so check pathkeys... */
+			List	   *old_path_pathkeys;
+
+			old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
+			keyscmp = compare_pathkeys(new_path_pathkeys,
+									   old_path_pathkeys);
+			if (keyscmp == PATHKEYS_EQUAL ||
+				keyscmp == PATHKEYS_BETTER2)
 			{
-				/* new path loses on cost, so check pathkeys... */
-				List	   *old_path_pathkeys;
-
-				old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
-				keyscmp = compare_pathkeys(new_path_pathkeys,
-										   old_path_pathkeys);
-				if (keyscmp == PATHKEYS_EQUAL ||
-					keyscmp == PATHKEYS_BETTER2)
+				/* new path does not win on pathkeys... */
+				if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
 				{
-					/* new path does not win on pathkeys... */
-					if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
-					{
-						/* Found an old path that dominates the new one */
-						return false;
-					}
+					/* Found an old path that dominates the new one */
+					return false;
 				}
 			}
 		}
-		else
-		{
-			/*
-			 * Since the pathlist is sorted by total_cost, we can stop looking
-			 * once we reach a path with a total_cost larger than the new
-			 * path's.
-			 */
-			break;
-		}
 	}
 
 	return true;
@@ -734,7 +782,7 @@ add_path_precheck(RelOptInfo *parent_rel,
  *	  produce the same number of rows.  Neither do we need to consider startup
  *	  costs: parallelism is only used for plans that will be run to completion.
  *	  Therefore, this routine is much simpler than add_path: it needs to
- *	  consider only pathkeys and total cost.
+ *	  consider only disabled nodes, pathkeys and total cost.
  *
  *	  As with add_path, we pfree paths that are found to be dominated by
  *	  another partial path; this requires that there be no other references to
@@ -775,7 +823,15 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Unless pathkeys are incompatible, keep just one of the two paths. */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+			if (unlikely(new_path->disabled_nodes != old_path->disabled_nodes))
+			{
+				if (new_path->disabled_nodes > old_path->disabled_nodes)
+					accept_new = false;
+				else
+					remove_old = true;
+			}
+			else if (new_path->total_cost > old_path->total_cost
+					 * STD_FUZZ_FACTOR)
 			{
 				/* New path costs more; keep it only if pathkeys are better. */
 				if (keyscmp != PATHKEYS_BETTER1)
@@ -862,8 +918,8 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
  * is surely a loser.
  */
 bool
-add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
-						  List *pathkeys)
+add_partial_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
+						  Cost total_cost, List *pathkeys)
 {
 	ListCell   *p1;
 
@@ -906,8 +962,8 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 	 * partial path; the resulting plans, if run in parallel, will be run to
 	 * completion.
 	 */
-	if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
-						   NULL))
+	if (!add_path_precheck(parent_rel, disabled_nodes, total_cost, total_cost,
+						   pathkeys, NULL))
 		return false;
 
 	return true;
@@ -1419,6 +1475,7 @@ create_merge_append_path(PlannerInfo *root,
 						 Relids required_outer)
 {
 	MergeAppendPath *pathnode = makeNode(MergeAppendPath);
+	int			input_disabled_nodes;
 	Cost		input_startup_cost;
 	Cost		input_total_cost;
 	ListCell   *l;
@@ -1447,6 +1504,7 @@ create_merge_append_path(PlannerInfo *root,
 	 * Add up the sizes and costs of the input paths.
 	 */
 	pathnode->path.rows = 0;
+	input_disabled_nodes = 0;
 	input_startup_cost = 0;
 	input_total_cost = 0;
 	foreach(l, subpaths)
@@ -1460,6 +1518,7 @@ create_merge_append_path(PlannerInfo *root,
 		if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 		{
 			/* Subpath is adequately ordered, we won't need to sort it */
+			input_disabled_nodes += subpath->disabled_nodes;
 			input_startup_cost += subpath->startup_cost;
 			input_total_cost += subpath->total_cost;
 		}
@@ -1471,12 +1530,14 @@ create_merge_append_path(PlannerInfo *root,
 			cost_sort(&sort_path,
 					  root,
 					  pathkeys,
+					  subpath->disabled_nodes,
 					  subpath->total_cost,
 					  subpath->rows,
 					  subpath->pathtarget->width,
 					  0.0,
 					  work_mem,
 					  pathnode->limit_tuples);
+			input_disabled_nodes += sort_path.disabled_nodes;
 			input_startup_cost += sort_path.startup_cost;
 			input_total_cost += sort_path.total_cost;
 		}
@@ -1495,12 +1556,14 @@ create_merge_append_path(PlannerInfo *root,
 		((Path *) linitial(subpaths))->parallel_aware ==
 		pathnode->path.parallel_aware)
 	{
+		pathnode->path.disabled_nodes = input_disabled_nodes;
 		pathnode->path.startup_cost = input_startup_cost;
 		pathnode->path.total_cost = input_total_cost;
 	}
 	else
 		cost_merge_append(&pathnode->path, root,
 						  pathkeys, list_length(subpaths),
+						  input_disabled_nodes,
 						  input_startup_cost, input_total_cost,
 						  pathnode->path.rows);
 
@@ -1582,6 +1645,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
 	pathnode->subpath = subpath;
 
 	cost_material(&pathnode->path,
+				  subpath->disabled_nodes,
 				  subpath->startup_cost,
 				  subpath->total_cost,
 				  subpath->rows,
@@ -1628,6 +1692,10 @@ create_memoize_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	 */
 	pathnode->est_entries = 0;
 
+	/* we should not generate this path type when enable_memoize=false */
+	Assert(enable_memoize);
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
+
 	/*
 	 * Add a small additional charge for caching the first entry.  All the
 	 * harder calculations for rescans are performed in cost_memoize_rescan().
@@ -1727,6 +1795,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	{
 		pathnode->umethod = UNIQUE_PATH_NOOP;
 		pathnode->path.rows = rel->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost;
 		pathnode->path.total_cost = subpath->total_cost;
 		pathnode->path.pathkeys = subpath->pathkeys;
@@ -1765,6 +1834,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 			{
 				pathnode->umethod = UNIQUE_PATH_NOOP;
 				pathnode->path.rows = rel->rows;
+				pathnode->path.disabled_nodes = subpath->disabled_nodes;
 				pathnode->path.startup_cost = subpath->startup_cost;
 				pathnode->path.total_cost = subpath->total_cost;
 				pathnode->path.pathkeys = subpath->pathkeys;
@@ -1792,6 +1862,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		 * Estimate cost for sort+unique implementation
 		 */
 		cost_sort(&sort_path, root, NIL,
+				  subpath->disabled_nodes,
 				  subpath->total_cost,
 				  rel->rows,
 				  subpath->pathtarget->width,
@@ -1829,6 +1900,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 					 AGG_HASHED, NULL,
 					 numCols, pathnode->path.rows,
 					 NIL,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 rel->rows,
@@ -1837,7 +1909,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (sjinfo->semi_can_btree && sjinfo->semi_can_hash)
 	{
-		if (agg_path.total_cost < sort_path.total_cost)
+		if (agg_path.disabled_nodes < sort_path.disabled_nodes ||
+			(agg_path.disabled_nodes == sort_path.disabled_nodes &&
+			 agg_path.total_cost < sort_path.total_cost))
 			pathnode->umethod = UNIQUE_PATH_HASH;
 		else
 			pathnode->umethod = UNIQUE_PATH_SORT;
@@ -1855,11 +1929,13 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (pathnode->umethod == UNIQUE_PATH_HASH)
 	{
+		pathnode->path.disabled_nodes = agg_path.disabled_nodes;
 		pathnode->path.startup_cost = agg_path.startup_cost;
 		pathnode->path.total_cost = agg_path.total_cost;
 	}
 	else
 	{
+		pathnode->path.disabled_nodes = sort_path.disabled_nodes;
 		pathnode->path.startup_cost = sort_path.startup_cost;
 		pathnode->path.total_cost = sort_path.total_cost;
 	}
@@ -1883,6 +1959,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 						 Relids required_outer, double *rows)
 {
 	GatherMergePath *pathnode = makeNode(GatherMergePath);
+	int			input_disabled_nodes = 0;
 	Cost		input_startup_cost = 0;
 	Cost		input_total_cost = 0;
 
@@ -1904,6 +1981,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 	{
 		/* Subpath is adequately ordered, we won't need to sort it */
+		input_disabled_nodes += subpath->disabled_nodes;
 		input_startup_cost += subpath->startup_cost;
 		input_total_cost += subpath->total_cost;
 	}
@@ -1915,18 +1993,21 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		cost_sort(&sort_path,
 				  root,
 				  pathkeys,
+				  subpath->disabled_nodes,
 				  subpath->total_cost,
 				  subpath->rows,
 				  subpath->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1);
+		input_disabled_nodes += sort_path.disabled_nodes;
 		input_startup_cost += sort_path.startup_cost;
 		input_total_cost += sort_path.total_cost;
 	}
 
 	cost_gather_merge(pathnode, root, rel, pathnode->path.param_info,
-					  input_startup_cost, input_total_cost, rows);
+					  input_disabled_nodes, input_startup_cost,
+					  input_total_cost, rows);
 
 	return pathnode;
 }
@@ -2234,7 +2315,8 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 						PathTarget *target,
-						double rows, Cost startup_cost, Cost total_cost,
+						double rows, int disabled_nodes,
+						Cost startup_cost, Cost total_cost,
 						List *pathkeys,
 						Relids required_outer,
 						Path *fdw_outerpath,
@@ -2255,6 +2337,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2280,7 +2363,8 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 						 PathTarget *target,
-						 double rows, Cost startup_cost, Cost total_cost,
+						 double rows, int disabled_nodes,
+						 Cost startup_cost, Cost total_cost,
 						 List *pathkeys,
 						 Relids required_outer,
 						 Path *fdw_outerpath,
@@ -2307,6 +2391,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2332,7 +2417,8 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 						  PathTarget *target,
-						  double rows, Cost startup_cost, Cost total_cost,
+						  double rows, int disabled_nodes,
+						  Cost startup_cost, Cost total_cost,
 						  List *pathkeys,
 						  Path *fdw_outerpath,
 						  List *fdw_restrictinfo,
@@ -2354,6 +2440,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2741,6 +2828,7 @@ create_projection_path(PlannerInfo *root,
 		 * Set cost of plan as subpath's cost, adjusted for tlist replacement.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			(target->cost.startup - oldtarget->cost.startup);
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2757,6 +2845,7 @@ create_projection_path(PlannerInfo *root,
 		 * evaluating the tlist.  There is no qual to worry about.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			target->cost.startup;
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2974,6 +3063,7 @@ create_incremental_sort_path(PlannerInfo *root,
 
 	cost_incremental_sort(&pathnode->path,
 						  root, pathkeys, presorted_keys,
+						  subpath->disabled_nodes,
 						  subpath->startup_cost,
 						  subpath->total_cost,
 						  subpath->rows,
@@ -3020,6 +3110,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->subpath = subpath;
 
 	cost_sort(&pathnode->path, root, pathkeys,
+			  subpath->disabled_nodes,
 			  subpath->total_cost,
 			  subpath->rows,
 			  subpath->pathtarget->width,
@@ -3072,6 +3163,7 @@ create_group_path(PlannerInfo *root,
 			   list_length(groupClause),
 			   numGroups,
 			   qual,
+			   subpath->disabled_nodes,
 			   subpath->startup_cost, subpath->total_cost,
 			   subpath->rows);
 
@@ -3129,6 +3221,7 @@ create_upper_unique_path(PlannerInfo *root,
 	 * all columns get compared at most of the tuples.  (XXX probably this is
 	 * an overestimate.)
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_operator_cost * subpath->rows * numCols;
@@ -3207,6 +3300,7 @@ create_agg_path(PlannerInfo *root,
 			 aggstrategy, aggcosts,
 			 list_length(groupClause), numGroups,
 			 qual,
+			 subpath->disabled_nodes,
 			 subpath->startup_cost, subpath->total_cost,
 			 subpath->rows, subpath->pathtarget->width);
 
@@ -3315,6 +3409,7 @@ create_groupingsets_path(PlannerInfo *root,
 					 numGroupCols,
 					 rollup->numGroups,
 					 having_qual,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 subpath->rows,
@@ -3340,7 +3435,7 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
-						 0.0, 0.0,
+						 0, 0.0, 0.0,
 						 subpath->rows,
 						 subpath->pathtarget->width);
 				if (!rollup->is_hashed)
@@ -3349,7 +3444,7 @@ create_groupingsets_path(PlannerInfo *root,
 			else
 			{
 				/* Account for cost of sort, but don't charge input cost again */
-				cost_sort(&sort_path, root, NIL,
+				cost_sort(&sort_path, root, NIL, 0,
 						  0.0,
 						  subpath->rows,
 						  subpath->pathtarget->width,
@@ -3365,12 +3460,14 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
+						 sort_path.disabled_nodes,
 						 sort_path.startup_cost,
 						 sort_path.total_cost,
 						 sort_path.rows,
 						 subpath->pathtarget->width);
 			}
 
+			pathnode->path.disabled_nodes += agg_path.disabled_nodes;
 			pathnode->path.total_cost += agg_path.total_cost;
 			pathnode->path.rows += agg_path.rows;
 		}
@@ -3524,6 +3621,7 @@ create_windowagg_path(PlannerInfo *root,
 	cost_windowagg(&pathnode->path, root,
 				   windowFuncs,
 				   winclause,
+				   subpath->disabled_nodes,
 				   subpath->startup_cost,
 				   subpath->total_cost,
 				   subpath->rows);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 2ba297c117..4f4d767971 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1658,6 +1658,7 @@ typedef struct Path
 
 	/* estimated size/costs for path (see costsize.c for more info) */
 	Cardinality rows;			/* estimated number of result tuples */
+	int			disabled_nodes;	/* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
@@ -3333,6 +3334,7 @@ typedef struct
 typedef struct JoinCostWorkspace
 {
 	/* Preliminary cost estimates --- must not be larger than final ones! */
+	int			disabled_nodes;
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index b1c51a4e70..731e8dc641 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -108,35 +108,42 @@ extern void cost_resultscan(Path *path, PlannerInfo *root,
 							RelOptInfo *baserel, ParamPathInfo *param_info);
 extern void cost_recursive_union(Path *runion, Path *nrterm, Path *rterm);
 extern void cost_sort(Path *path, PlannerInfo *root,
-					  List *pathkeys, Cost input_cost, double tuples, int width,
+					  List *pathkeys, int disabled_nodes,
+					  Cost input_cost, double tuples, int width,
 					  Cost comparison_cost, int sort_mem,
 					  double limit_tuples);
 extern void cost_incremental_sort(Path *path,
 								  PlannerInfo *root, List *pathkeys, int presorted_keys,
+								  int input_disabled_nodes,
 								  Cost input_startup_cost, Cost input_total_cost,
 								  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 								  double limit_tuples);
 extern void cost_append(AppendPath *apath);
 extern void cost_merge_append(Path *path, PlannerInfo *root,
 							  List *pathkeys, int n_streams,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double tuples);
 extern void cost_material(Path *path,
+						  int input_disabled_nodes,
 						  Cost input_startup_cost, Cost input_total_cost,
 						  double tuples, int width);
 extern void cost_agg(Path *path, PlannerInfo *root,
 					 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 					 int numGroupCols, double numGroups,
 					 List *quals,
+					 int input_disabled_nodes,
 					 Cost input_startup_cost, Cost input_total_cost,
 					 double input_tuples, double input_width);
 extern void cost_windowagg(Path *path, PlannerInfo *root,
 						   List *windowFuncs, WindowClause *winclause,
+						   int input_disabled_nodes,
 						   Cost input_startup_cost, Cost input_total_cost,
 						   double input_tuples);
 extern void cost_group(Path *path, PlannerInfo *root,
 					   int numGroupCols, double numGroups,
 					   List *quals,
+					   int input_disabled_nodes,
 					   Cost input_startup_cost, Cost input_total_cost,
 					   double input_tuples);
 extern void initial_cost_nestloop(PlannerInfo *root,
@@ -171,6 +178,7 @@ extern void cost_gather(GatherPath *path, PlannerInfo *root,
 						RelOptInfo *rel, ParamPathInfo *param_info, double *rows);
 extern void cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 							  RelOptInfo *rel, ParamPathInfo *param_info,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double *rows);
 extern void cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 112e7c23d4..36e1c24a56 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -27,11 +27,12 @@ extern int	compare_fractional_path_costs(Path *path1, Path *path2,
 										  double fraction);
 extern void set_cheapest(RelOptInfo *parent_rel);
 extern void add_path(RelOptInfo *parent_rel, Path *new_path);
-extern bool add_path_precheck(RelOptInfo *parent_rel,
+extern bool add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 							  Cost startup_cost, Cost total_cost,
 							  List *pathkeys, Relids required_outer);
 extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
 extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
+									  int disabled_nodes,
 									  Cost total_cost, List *pathkeys);
 
 extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
@@ -124,7 +125,8 @@ extern Path *create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 									   Relids required_outer);
 extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											PathTarget *target,
-											double rows, Cost startup_cost, Cost total_cost,
+											double rows, int disabled_nodes,
+											Cost startup_cost, Cost total_cost,
 											List *pathkeys,
 											Relids required_outer,
 											Path *fdw_outerpath,
@@ -132,7 +134,8 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											List *fdw_private);
 extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 PathTarget *target,
-											 double rows, Cost startup_cost, Cost total_cost,
+											 double rows, int disabled_nodes,
+											 Cost startup_cost, Cost total_cost,
 											 List *pathkeys,
 											 Relids required_outer,
 											 Path *fdw_outerpath,
@@ -140,7 +143,8 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 List *fdw_private);
 extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 											  PathTarget *target,
-											  double rows, Cost startup_cost, Cost total_cost,
+											  double rows, int disabled_nodes,
+											  Cost startup_cost, Cost total_cost,
 											  List *pathkeys,
 											  Path *fdw_outerpath,
 											  List *fdw_restrictinfo,
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index f79eda79f6..20c651aadb 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -539,15 +539,17 @@ explain (costs off)
 ------------------------------------------------------------
  Aggregate
    ->  Nested Loop
-         ->  Seq Scan on tenk2
-               Filter: (thousand = 0)
+         ->  Gather
+               Workers Planned: 4
+               ->  Parallel Seq Scan on tenk2
+                     Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
                ->  Parallel Bitmap Heap Scan on tenk1
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(10 rows)
+(12 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
-- 
2.39.3 (Apple Git-145)

#55Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#54)
Re: On disable_cost

Hi,

On 2024-06-12 11:35:48 -0400, Robert Haas wrote:

Subject: [PATCH v2 3/4] Treat the # of disabled nodes in a path as a separate
cost metric.

Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishable in cost seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup and total cost. Hence, the
path list is now sorted by disabled_nodes and then by total_cost,
instead of just by the latter, and likewise for the partial path list.
It is important that this number is a count and not simply a Boolean;
else, as soon as we're unable to respect disabled path types in all
portions of the path, we stop trying to avoid them where we can.

if (criterion == STARTUP_COST)
{
if (path1->startup_cost < path2->startup_cost)
@@ -118,6 +127,15 @@ compare_fractional_path_costs(Path *path1, Path *path2,
Cost cost1,
cost2;

+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}

I suspect it's going to be ok, because the branch is going to be very
predictable in normal workloads, but I still worry a bit about making
compare_path_costs_fuzzily() more expensive. For more join-heavy queries it
can really show up, and there are plenty of ORM-generated join-heavy query
workloads.

If costs were 32 bit integers, I'd have suggested just stashing the disabled
counts in the upper 32 bits of a 64bit integer. But ...
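
A minimal sketch of that bit-packing idea, purely for illustration: it assumes a
hypothetical 32-bit integer cost, whereas Cost is a double, which is exactly why
it doesn't apply as-is. None of these names exist in the tree.

#include <stdint.h>

/*
 * Hypothetical packed sort key: disabled-node count in the high 32 bits,
 * a 32-bit cost in the low bits.  One integer comparison then orders by
 * disabled count first, with cost as the tiebreak.
 */
static inline uint64_t
pack_cost_key(uint32_t disabled_nodes, uint32_t cost)
{
	return ((uint64_t) disabled_nodes << 32) | cost;
}

static inline int
compare_packed_keys(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);	/* -1, 0, or +1 */
}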

<can't resist trying if I see overhead>

In an extreme case I can see a tiny bit of overhead, but not enough to be
worth worrying about. Mostly because we're so profligate in doing
bms_overlap() that cost comparisons don't end up mattering as much - I seem to
recall that being different in the not distant past though.

Aside: I'm somewhat confused by add_paths_to_joinrel()'s handling of
mergejoins_allowed. If mergejoins are disabled we end up reaching
match_unsorted_outer() in more cases than with mergejoins enabled. E.g. we
only set mergejoin_enabled for right joins inside select_mergejoin_clauses(),
but we don't call select_mergejoin_clauses() if !enable_mergejoin and jointype
!= FULL. I, what?

Greetings,

Andres Freund

#56Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#55)
Re: On disable_cost

On Wed, Jun 12, 2024 at 2:11 PM Andres Freund <andres@anarazel.de> wrote:

<can't resist trying if I see overhead>

In an extreme case I can see a tiny bit of overhead, but not enough to be
worth worrying about. Mostly because we're so profligate in doing
bms_overlap() that cost comparisons don't end up mattering as much - I seem to
recall that being different in the not distant past though.

There are very few things I love more than when you can't resist
trying to break my patches and yet fail to find a problem. Granted the
latter part only happens once a century or so, but I'll take it.

Aside: I'm somewhat confused by add_paths_to_joinrel()'s handling of
mergejoins_allowed. If mergejoins are disabled we end up reaching
match_unsorted_outer() in more cases than with mergejoins enabled. E.g. we
only set mergejoin_enabled for right joins inside select_mergejoin_clauses(),
but we don't call select_mergejoin_clauses() if !enable_mergejoin and jointype
!= FULL. I, what?

I agree this logic is extremely confusing, but "we only set
mergejoin_enabled for right joins inside select_mergejoin_clauses()"
doesn't seem to be true. It starts out true, and always stays true
except for right, right-anti, and full joins, where
select_mergejoin_clauses() can set it to false. Since the call to
match_unsorted_outer() is gated by mergejoin_enabled, you might think
that we'd skip considering nested loops on the strength of not being
able to do a merge join, but comment "2." in add_paths_to_joinrel
explains that the join types for which mergejoin_enabled can end up
false aren't supported by nested loops anyway. Still, this logic is
really tortured.
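
To make that control flow easier to follow, here is a compilable toy model of the
gating as described in this thread; it is not the planner code, just the stated
logic (the enable_mergejoin/FULL gate, the possible veto for right, right-anti,
and full joins, and match_unsorted_outer() being where nestloops get considered):

#include <stdbool.h>
#include <stdio.h>

typedef enum { SK_INNER, SK_LEFT, SK_RIGHT, SK_RIGHT_ANTI, SK_FULL } SketchJoinType;

/*
 * Toy model: does planning reach match_unsorted_outer() (and thus consider
 * nestloop paths), given the GUC and join type?  "fully_mergejoinable"
 * stands in for select_mergejoin_clauses() finding usable merge clauses.
 */
static bool
reaches_match_unsorted_outer(bool enable_mergejoin, SketchJoinType jointype,
							 bool fully_mergejoinable)
{
	bool		mergejoin_allowed = true;

	/* select_mergejoin_clauses() is consulted only in these cases ... */
	if (enable_mergejoin || jointype == SK_FULL)
	{
		/* ... and can veto right/right-anti/full joins lacking clauses */
		if ((jointype == SK_RIGHT || jointype == SK_RIGHT_ANTI ||
			 jointype == SK_FULL) && !fully_mergejoinable)
			mergejoin_allowed = false;
	}

	return mergejoin_allowed;
}

int
main(void)
{
	/* The surprising case: right join with enable_mergejoin=false -> reached */
	printf("%d\n", reaches_match_unsorted_outer(false, SK_RIGHT, false));
	/* Same join with enable_mergejoin=true -> vetoed */
	printf("%d\n", reaches_match_unsorted_outer(true, SK_RIGHT, false));
	return 0;
}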

--
Robert Haas
EDB: http://www.enterprisedb.com

#57Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#56)
Re: On disable_cost

Hi,

On 2024-06-12 14:33:31 -0400, Robert Haas wrote:

On Wed, Jun 12, 2024 at 2:11 PM Andres Freund <andres@anarazel.de> wrote:

<can't resist trying if I see overhead>

In an extreme case I can see a tiny bit of overhead, but not enough to be
worth worrying about. Mostly because we're so profligate in doing
bms_overlap() that cost comparisons don't end up mattering as much - I seem to
recall that being different in the not distant past though.

There are very few things I love more than when you can't resist
trying to break my patches and yet fail to find a problem. Granted the
latter part only happens once a century or so, but I'll take it.

:)

Too high cost in path cost comparison is what made me look at the PG code for
the first time, IIRC :)

Aside: I'm somewhat confused by add_paths_to_joinrel()'s handling of
mergejoins_allowed. If mergejoins are disabled we end up reaching
match_unsorted_outer() in more cases than with mergejoins enabled. E.g. we
only set mergejoin_enabled for right joins inside select_mergejoin_clauses(),
but we don't call select_mergejoin_clauses() if !enable_mergejoin and jointype
!= FULL. I, what?

I agree this logic is extremely confusing, but "we only set
mergejoin_enabled for right joins inside select_mergejoin_clauses()"
doesn't seem to be true.

Sorry, should have been more precise. With "set" I didn't mean set to true,
but that it's only modified within select_mergejoin_clauses().

It starts out true, and always stays true except for right, right-anti, and
full joins, where select_mergejoin_clauses() can set it to false. Since the
call to match_unsorted_outer() is gated by mergejoin_enabled, you might
think that we'd skip considering nested loops on the strength of not being
able to do a merge join, but comment "2." in add_paths_to_joinrel explains
that the join types for which mergejoin_enabled can end up false aren't
supported by nested loops anyway. Still, this logic is really tortured.

Agree that that's the logic - but doesn't that mean we'll consider nestloops
for e.g. right joins iff enable_mergejoin=false?

Greetings,

Andres Freund

#58Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#57)
Re: On disable_cost

On Wed, Jun 12, 2024 at 2:48 PM Andres Freund <andres@anarazel.de> wrote:

Sorry, should have been more precise. With "set" I didn't mean set to true,
but that it's only modified within select_mergejoin_clauses().

Oh. "set" has more than one relevant meaning here.

It starts out true, and always stays true except for right, right-anti, and
full joins, where select_mergejoin_clauses() can set it to false. Since the
call to match_unsorted_outer() is gated by mergejoin_enabled, you might
think that we'd skip considering nested loops on the strength of not being
able to do a merge join, but comment "2." in add_paths_to_joinrel explains
that the join types for which mergejoin_enabled can end up false aren't
supported by nested loops anyway. Still, this logic is really tortured.

Agree that that's the logic - but doesn't that mean we'll consider nestloops
for e.g. right joins iff enable_mergejoin=false?

No, because that function has its own internal guards. See nestjoinOK.
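
For readers without the source handy, a paraphrased sketch of the effect of that
guard (an illustration of the point being made, not a quote of
match_unsorted_outer()):

#include <stdbool.h>

typedef enum { NJ_INNER, NJ_LEFT, NJ_SEMI, NJ_ANTI,
			   NJ_RIGHT, NJ_RIGHT_ANTI, NJ_FULL } NlJoinType;

/*
 * The join types for which mergejoin_allowed can be cleared are exactly the
 * ones a nestloop cannot execute anyway, because the executor's nestloop has
 * no way to identify unmatched inner-side rows (needed for right, right-anti,
 * and full joins); nestjoinOK turns nestloop generation off for them.
 */
static bool
nestloop_join_supported(NlJoinType jointype)
{
	switch (jointype)
	{
		case NJ_RIGHT:
		case NJ_RIGHT_ANTI:
		case NJ_FULL:
			return false;
		default:
			return true;
	}
}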

But don't misunderstand me: I'm not defending the status quo. The
whole thing seems like a Rube Goldberg machine to me.

--
Robert Haas
EDB: http://www.enterprisedb.com

#59Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#54)
4 attachment(s)
Re: On disable_cost

On Wed, Jun 12, 2024 at 11:35 AM Robert Haas <robertmhaas@gmail.com> wrote:

Well, that didn't generate much discussion, but here I am trying
again. Here I've got patches 0001 and 0002 from my previous posting;
I've dropped 0003 and 0004 from the previous set for now so as not to
distract from the main event, but they may still be a good idea.
Instead I've got an 0003 and an 0004 that implement the "count of
disabled nodes" approach that we have discussed previously. This seems
to work fine, unlike the approaches I tried earlier. I think this is
the right direction to go, but I'd like to know what concerns people
might have.

Here is a rebased patch set, where I also fixed pgindent damage and a
couple of small oversights in 0004.

I am hoping to get these committed some time in July. So if somebody
thinks that's too soon or thinks it shouldn't happen at all, please
don't wait too long to let me know about that.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v3-0001-Remove-grotty-use-of-disable_cost-for-TID-scan-pl.patch (application/octet-stream)
From 066c346c30f52f0042efb51d26c60a6f4e5b5207 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 7 May 2024 11:51:26 -0400
Subject: [PATCH v3 1/4] Remove grotty use of disable_cost for TID scan plans.

Previously, the code charged disable_cost for CurrentOfExpr, and then
subtracted disable_cost from the cost of a TID path that used
CurrentOfExpr as the TID qual, effectively disabling all paths except
that one. Now, we instead suppress generation of the disabled paths
entirely, and generate only the one that the executor will actually
understand.

With this approach, we do not need to rely on disable_cost being
large enough to prevent the wrong path from being chosen, and we
save some CPU cycle by avoiding generating paths that we can't
actually use. In my opinion, the code is also easier to understand
like this.
---
 src/backend/optimizer/path/allpaths.c | 14 +++++++--
 src/backend/optimizer/path/costsize.c | 26 -----------------
 src/backend/optimizer/path/tidpath.c  | 41 +++++++++++++++++++++++----
 src/include/optimizer/paths.h         |  2 +-
 4 files changed, 48 insertions(+), 35 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 4895cee994..aa78c0af0c 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -772,6 +772,17 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 	 */
 	required_outer = rel->lateral_relids;
 
+	/*
+	 * Consider TID scans.
+	 *
+	 * If create_tidscan_paths returns true, then a TID scan path is forced.
+	 * This happens when rel->baserestrictinfo contains CurrentOfExpr, because
+	 * the executor can't handle any other type of path for such queries.
+	 * Hence, we return without adding any other paths.
+	 */
+	if (create_tidscan_paths(root, rel))
+		return;
+
 	/* Consider sequential scan */
 	add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
 
@@ -781,9 +792,6 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 
 	/* Consider index scans */
 	create_index_paths(root, rel);
-
-	/* Consider TID scans */
-	create_tidscan_paths(root, rel);
 }
 
 /*
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ee23ed7835..2021c481b4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1251,7 +1251,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 {
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
-	bool		isCurrentOf = false;
 	QualCost	qpqual_cost;
 	Cost		cpu_per_tuple;
 	QualCost	tid_qual_cost;
@@ -1287,7 +1286,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		else if (IsA(qual, CurrentOfExpr))
 		{
 			/* CURRENT OF yields 1 tuple */
-			isCurrentOf = true;
 			ntuples++;
 		}
 		else
@@ -1297,22 +1295,6 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		}
 	}
 
-	/*
-	 * We must force TID scan for WHERE CURRENT OF, because only nodeTidscan.c
-	 * understands how to do it correctly.  Therefore, honor enable_tidscan
-	 * only when CURRENT OF isn't present.  Also note that cost_qual_eval
-	 * counts a CurrentOfExpr as having startup cost disable_cost, which we
-	 * subtract off here; that's to prevent other plan types such as seqscan
-	 * from winning.
-	 */
-	if (isCurrentOf)
-	{
-		Assert(baserel->baserestrictcost.startup >= disable_cost);
-		startup_cost -= disable_cost;
-	}
-	else if (!enable_tidscan)
-		startup_cost += disable_cost;
-
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
@@ -1399,9 +1381,6 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	ntuples = selectivity * baserel->tuples;
 	nseqpages = pages - 1.0;
 
-	if (!enable_tidscan)
-		startup_cost += disable_cost;
-
 	/*
 	 * The TID qual expressions will be computed once, any other baserestrict
 	 * quals once per retrieved tuple.
@@ -4884,11 +4863,6 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		/* Treat all these as having cost 1 */
 		context->total.per_tuple += cpu_operator_cost;
 	}
-	else if (IsA(node, CurrentOfExpr))
-	{
-		/* Report high cost to prevent selection of anything but TID scan */
-		context->total.startup += disable_cost;
-	}
 	else if (IsA(node, SubLink))
 	{
 		/* This routine should not be applied to un-planned expressions */
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index eb11bc79c7..b0323b26ec 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "nodes/nodeFuncs.h"
+#include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -277,12 +278,15 @@ RestrictInfoIsTidQual(PlannerInfo *root, RestrictInfo *rinfo, RelOptInfo *rel)
  * that there's more than one choice.
  */
 static List *
-TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
+TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel,
+							bool *isCurrentOf)
 {
 	RestrictInfo *tidclause = NULL; /* best simple CTID qual so far */
 	List	   *orlist = NIL;	/* best OR'ed CTID qual so far */
 	ListCell   *l;
 
+	*isCurrentOf = false;
+
 	foreach(l, rlist)
 	{
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
@@ -305,9 +309,13 @@ TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
 				if (is_andclause(orarg))
 				{
 					List	   *andargs = ((BoolExpr *) orarg)->args;
+					bool		sublistIsCurrentOf;
 
 					/* Recurse in case there are sub-ORs */
-					sublist = TidQualFromRestrictInfoList(root, andargs, rel);
+					sublist = TidQualFromRestrictInfoList(root, andargs, rel,
+														  &sublistIsCurrentOf);
+					if (sublistIsCurrentOf)
+						elog(ERROR, "IS CURRENT OF within OR clause");
 				}
 				else
 				{
@@ -353,7 +361,10 @@ TidQualFromRestrictInfoList(PlannerInfo *root, List *rlist, RelOptInfo *rel)
 			{
 				/* We can stop immediately if it's a CurrentOfExpr */
 				if (IsCurrentOfClause(rinfo, rel))
+				{
+					*isCurrentOf = true;
 					return list_make1(rinfo);
+				}
 
 				/*
 				 * Otherwise, remember the first non-OR CTID qual.  We could
@@ -483,19 +494,24 @@ ec_member_matches_ctid(PlannerInfo *root, RelOptInfo *rel,
  *
  *	  Candidate paths are added to the rel's pathlist (using add_path).
  */
-void
+bool
 create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 {
 	List	   *tidquals;
 	List	   *tidrangequals;
+	bool		isCurrentOf;
 
 	/*
 	 * If any suitable quals exist in the rel's baserestrict list, generate a
 	 * plain (unparameterized) TidPath with them.
+	 *
+	 * We skip this when enable_tidscan = false, except when the qual is
+	 * CurrentOfExpr. In that case, a TID scan is the only correct path.
 	 */
-	tidquals = TidQualFromRestrictInfoList(root, rel->baserestrictinfo, rel);
+	tidquals = TidQualFromRestrictInfoList(root, rel->baserestrictinfo, rel,
+										   &isCurrentOf);
 
-	if (tidquals != NIL)
+	if (tidquals != NIL && (enable_tidscan || isCurrentOf))
 	{
 		/*
 		 * This path uses no join clauses, but it could still have required
@@ -505,8 +521,21 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 
 		add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
 												   required_outer));
+
+		/*
+		 * When the qual is CurrentOfExpr, the path that we just added is the
+		 * only one the executor can handle, so we should return before adding
+		 * any others. Returning true lets the caller know not to add any
+		 * others, either.
+		 */
+		if (isCurrentOf)
+			return true;
 	}
 
+	/* Skip the rest if TID scans are disabled. */
+	if (!enable_tidscan)
+		return false;
+
 	/*
 	 * If there are range quals in the baserestrict list, generate a
 	 * TidRangePath.
@@ -553,4 +582,6 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
 	 * join quals, for example.
 	 */
 	BuildParameterizedTidPaths(root, rel, rel->joininfo);
+
+	return false;
 }
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 5e88c0224a..5c029b6b62 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -83,7 +83,7 @@ extern void check_index_predicates(PlannerInfo *root, RelOptInfo *rel);
  * tidpath.c
  *	  routines to generate tid paths
  */
-extern void create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
+extern bool create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel);
 
 /*
  * joinpath.c
-- 
2.39.3 (Apple Git-145)

Attachment: v3-0004-Show-number-of-disabled-nodes-in-EXPLAIN-ANALYZE-.patch
From 0565c32b066fd03cfdc07336455bf9ac7b3aacde Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 27 Jun 2024 13:21:58 -0400
Subject: [PATCH v3 4/4] Show number of disabled nodes in EXPLAIN ANALYZE
 output.

Now that disable_cost is not included in the cost estimate, there's
no visible sign in EXPLAIN output of which plan nodes are disabled.
Fix that by propagating the number of disabled nodes from Path to
Plan, and then showing it in the EXPLAIN output.
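
For example (hypothetical table with no usable index; the output below is
sketched from the new format rather than captured from a run):

    SET enable_seqscan = off;
    EXPLAIN (COSTS OFF) SELECT * FROM widget WHERE val = 42;
    --  Seq Scan on widget
    --    Disabled Nodes: 1
    --    Filter: (val = 42)

The sequential scan is unavoidable here, and the new "Disabled Nodes" line is
the only hint in the plan that a disabled node had to be used.
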
---
 src/backend/commands/explain.c                |  4 ++++
 src/backend/optimizer/plan/createplan.c       |  6 ++++--
 src/include/nodes/plannodes.h                 |  1 +
 src/test/regress/expected/aggregates.out      | 21 ++++++++++++++++---
 .../regress/expected/collate.icu.utf8.out     |  6 ++++--
 .../regress/expected/incremental_sort.out     |  5 ++++-
 src/test/regress/expected/inherit.out         |  4 +++-
 src/test/regress/expected/join.out            |  4 +++-
 src/test/regress/expected/memoize.out         |  8 +++++--
 src/test/regress/expected/select_parallel.out |  6 +++++-
 src/test/regress/expected/union.out           |  3 ++-
 11 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 94511a5a02..9147cac1a6 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1889,6 +1889,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
+	if (plan->disabled_nodes != 0)
+		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
+							   es);
+
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
 	{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 20236e8c4d..1904eea873 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -5409,6 +5409,7 @@ order_qual_clauses(PlannerInfo *root, List *clauses)
 static void
 copy_generic_path_info(Plan *dest, Path *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->rows;
@@ -5424,6 +5425,7 @@ copy_generic_path_info(Plan *dest, Path *src)
 static void
 copy_plan_costsize(Plan *dest, Plan *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->plan_rows;
@@ -5457,7 +5459,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
 			  lefttree->total_cost,
-			  0,				/* a Plan contains no count of disabled nodes */
+			  plan->plan.disabled_nodes,
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6552,7 +6554,7 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
-				  0,			/* a Plan contains no count of disabled nodes */
+				  subplan->disabled_nodes,
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1aeeaec95e..62cd6a6666 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -125,6 +125,7 @@ typedef struct Plan
 	/*
 	 * estimated execution costs for plan (see costsize.c for more info)
 	 */
+	int			disabled_nodes; /* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 1c1ca7573a..ab1de1bfd8 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2895,18 +2895,23 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
+   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
+         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(12 rows)
+                           Disabled Nodes: 1
+(17 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2928,19 +2933,24 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -2950,19 +2960,24 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7d59fb4431..31345295c1 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -989,8 +989,9 @@ select * from collate_test1 where b ilike 'abc';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'abc'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'abc';
  a |  b  
@@ -1004,8 +1005,9 @@ select * from collate_test1 where b ilike 'ABC';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'ABC'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'ABC';
  a |  b  
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..79f0d37a87 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -701,16 +701,19 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
+   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
+         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
+               Disabled Nodes: 1
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(10 rows)
+(13 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index ad73213414..dbb748a2d2 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,6 +1614,7 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
+   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1623,10 +1624,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
+               Disabled Nodes: 1
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(13 rows)
+(15 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 6b16c3a676..8840fb4e3e 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -7945,13 +7945,15 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
+   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
+         Disabled Nodes: 1
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(7 rows)
+(9 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 0fd103c06b..3b1fd3d95d 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -240,14 +240,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(8 rows)
+(10 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -255,14 +257,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(8 rows)
+(10 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 20c651aadb..08ef0df9a3 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -538,10 +538,14 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
+   Disabled Nodes: 1
    ->  Nested Loop
+         Disabled Nodes: 1
          ->  Gather
+               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
+                     Disabled Nodes: 1
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -549,7 +553,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(12 rows)
+(16 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0fd0e1c38b..0456d48c93 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,11 +822,12 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
+   Disabled Nodes: 1
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
          ->  Result
-(5 rows)
+(6 rows)
 
 reset enable_hashagg;
 --
-- 
2.39.3 (Apple Git-145)

Attachment: v3-0002-Rationalize-behavior-of-enable_indexscan-and-enab.patch
From 4d194ee9aa8cacdc9f8f80ec0e2c716c9a2e2349 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Thu, 2 May 2024 11:18:44 -0400
Subject: [PATCH v3 2/4] Rationalize behavior of enable_indexscan and
 enable_indexonlyscan.

Previously, index-scan paths were still generated even when
enable_indexscan=false, but we added disable-cost to the cost of
both index scan plans and index-only scan plans. It doesn't make sense
for enable_indexscan to affect whether index-only scans are chosen
given that we also have a GUC called enable_indexonlyscan.

With this commit, enable_indexscan and enable_indexonlyscan work
the same way: each one prevents consideration of paths of the
appropriate type, and neither has any effect on the cost of the
generated paths. This requires some updates to the regression tests,
which previously relied on enable_indexscan=false to also disable
index-only scans.

Note that when enable_indexonlyscan=false and enable_indexscan=true,
we will generate index-scan paths that would not have been
generated if both had been set to true. That's because generating
both an index-scan path and an index-only path would be a waste
of cycles, since the index-only path should always win. In effect,
the index-scan plan shape was still being considered; we just
rejected it before actually constructing a path.
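
A hedged illustration of the new behavior (hypothetical table widget with a
btree index on id, and a query that qualifies for an index-only scan):

    SET enable_indexscan = off;
    SET enable_indexonlyscan = on;
    EXPLAIN (COSTS OFF) SELECT id FROM widget WHERE id < 10;
    -- With this patch, an Index Only Scan can still be chosen here.
    -- Previously, enable_indexscan=off added disable_cost to index-only
    -- scan paths as well, so this query would typically have fallen back
    -- to a bitmap or sequential scan.
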
---
 src/backend/optimizer/path/costsize.c         |  4 ---
 src/backend/optimizer/path/indxpath.c         | 26 ++++++++++++++++---
 src/test/regress/expected/btree_index.out     |  3 +++
 src/test/regress/expected/create_index.out    |  2 ++
 src/test/regress/expected/select.out          |  1 +
 src/test/regress/expected/select_parallel.out |  2 ++
 src/test/regress/expected/tuplesort.out       |  2 ++
 src/test/regress/sql/btree_index.sql          |  5 ++++
 src/test/regress/sql/create_index.sql         |  2 ++
 src/test/regress/sql/select.sql               |  1 +
 src/test/regress/sql/select_parallel.sql      |  2 ++
 src/test/regress/sql/tuplesort.sql            |  2 ++
 12 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 2021c481b4..74fc5aab56 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -603,10 +603,6 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 											  path->indexclauses);
 	}
 
-	if (!enable_indexscan)
-		startup_cost += disable_cost;
-	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
-
 	/*
 	 * Call index-access-method-specific code to estimate the processing cost
 	 * for scanning the index, as well as the selectivity of the index (ie,
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index c0fcc7d78d..8887231cf9 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -742,7 +742,13 @@ get_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		IndexPath  *ipath = (IndexPath *) lfirst(lc);
 
 		if (index->amhasgettuple)
-			add_path(rel, (Path *) ipath);
+		{
+			if (ipath->path.pathtype == T_IndexScan && enable_indexscan)
+				add_path(rel, (Path *) ipath);
+			else if (ipath->path.pathtype == T_IndexOnlyScan &&
+					 enable_indexonlyscan)
+				add_path(rel, (Path *) ipath);
+		}
 
 		if (index->amhasgetbitmap &&
 			(ipath->path.pathkeys == NIL ||
@@ -831,6 +837,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		case ST_INDEXSCAN:
 			if (!index->amhasgettuple)
 				return NIL;
+			if (!enable_indexscan && !enable_indexonlyscan)
+				return NIL;
 			break;
 		case ST_BITMAPSCAN:
 			if (!index->amhasgetbitmap)
@@ -978,7 +986,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 		 */
 		if (index->amcanparallel &&
 			rel->consider_parallel && outer_relids == NULL &&
-			scantype != ST_BITMAPSCAN)
+			scantype != ST_BITMAPSCAN &&
+			(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 		{
 			ipath = create_index_path(root, index,
 									  index_clauses,
@@ -1028,7 +1037,8 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 			/* If appropriate, consider parallel index scan */
 			if (index->amcanparallel &&
 				rel->consider_parallel && outer_relids == NULL &&
-				scantype != ST_BITMAPSCAN)
+				scantype != ST_BITMAPSCAN &&
+				(index_only_scan ? enable_indexonlyscan : enable_indexscan))
 			{
 				ipath = create_index_path(root, index,
 										  index_clauses,
@@ -1735,7 +1745,15 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
 	ListCell   *lc;
 	int			i;
 
-	/* Index-only scans must be enabled */
+	/*
+	 * Index-only scans must be enabled.
+	 *
+	 * NB: Returning false here means that an index scan will be considered
+	 * instead, so setting enable_indexonlyscan=false causes us to consider
+	 * paths that we wouldn't have considered otherwise. That seems OK: our
+	 * only reason for not generating the index-scan paths is that we expect
+	 * them to lose on cost.
+	 */
 	if (!enable_indexonlyscan)
 		return false;
 
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 510646cbce..f15db99771 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -247,6 +247,7 @@ select thousand from tenk1 where thousand in (364, 366,380) and tenthous = 20000
 --
 set enable_seqscan to false;
 set enable_indexscan to true;
+set enable_indexonlyscan to true;
 set enable_bitmapscan to false;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -290,6 +291,7 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 (2 rows)
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -330,6 +332,7 @@ select proname from pg_proc where proname ilike '00%foo' order by 1;
 ---------
 (0 rows)
 
+reset enable_indexonlyscan;
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                            QUERY PLAN                            
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index cf6eac5734..ec69bafd40 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -618,6 +618,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 EXPLAIN (COSTS OFF)
 SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0,1';
@@ -643,6 +644,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 --
 -- GIN over int[] and text[]
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index 33a6dceb0e..6445815741 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -844,6 +844,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
                          QUERY PLAN                          
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 87273fa635..f79eda79f6 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -522,6 +522,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -622,6 +623,7 @@ select * from explain_parallel_sort_stats();
 (14 rows)
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/expected/tuplesort.out b/src/test/regress/expected/tuplesort.out
index 6dd97e7427..87b05a22cb 100644
--- a/src/test/regress/expected/tuplesort.out
+++ b/src/test/regress/expected/tuplesort.out
@@ -362,6 +362,7 @@ ORDER BY v.a DESC;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -458,6 +459,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
diff --git a/src/test/regress/sql/btree_index.sql b/src/test/regress/sql/btree_index.sql
index 0d2a33f370..bc99f44dda 100644
--- a/src/test/regress/sql/btree_index.sql
+++ b/src/test/regress/sql/btree_index.sql
@@ -157,6 +157,7 @@ select thousand from tenk1 where thousand in (364, 366,380) and tenthous = 20000
 
 set enable_seqscan to false;
 set enable_indexscan to true;
+set enable_indexonlyscan to true;
 set enable_bitmapscan to false;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -168,6 +169,7 @@ explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
 set enable_indexscan to false;
+set enable_indexonlyscan to false;
 set enable_bitmapscan to true;
 explain (costs off)
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
@@ -175,6 +177,9 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
 explain (costs off)
 select proname from pg_proc where proname ilike '00%foo' order by 1;
 select proname from pg_proc where proname ilike '00%foo' order by 1;
+
+reset enable_indexonlyscan;
+
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
 
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index e296891cab..04dea5225e 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -246,6 +246,7 @@ SELECT point(x,x), (SELECT f1 FROM gpolygon_tbl ORDER BY f1 <-> point(x,x) LIMIT
 -- Now check the results from bitmap indexscan
 SET enable_seqscan = OFF;
 SET enable_indexscan = OFF;
+SET enable_indexonlyscan = OFF;
 SET enable_bitmapscan = ON;
 
 EXPLAIN (COSTS OFF)
@@ -254,6 +255,7 @@ SELECT * FROM point_tbl WHERE f1 <@ '(-10,-10),(10,10)':: box ORDER BY f1 <-> '0
 
 RESET enable_seqscan;
 RESET enable_indexscan;
+RESET enable_indexonlyscan;
 RESET enable_bitmapscan;
 
 --
diff --git a/src/test/regress/sql/select.sql b/src/test/regress/sql/select.sql
index 019f1e7673..a0c7417dec 100644
--- a/src/test/regress/sql/select.sql
+++ b/src/test/regress/sql/select.sql
@@ -218,6 +218,7 @@ select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'C';
 -- partial index implies clause, but bitmap scan must recheck predicate anyway
 SET enable_indexscan TO off;
+SET enable_indexonlyscan TO off;
 explain (costs off)
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
 select unique2 from onek2 where unique2 = 11 and stringu1 < 'B';
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 20376c03fa..3f003e2e71 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -201,6 +201,7 @@ reset enable_indexscan;
 -- test parallel bitmap heap scan.
 set enable_seqscan to off;
 set enable_indexscan to off;
+set enable_indexonlyscan to off;
 set enable_hashjoin to off;
 set enable_mergejoin to off;
 set enable_material to off;
@@ -248,6 +249,7 @@ $$;
 select * from explain_parallel_sort_stats();
 
 reset enable_indexscan;
+reset enable_indexonlyscan;
 reset enable_hashjoin;
 reset enable_mergejoin;
 reset enable_material;
diff --git a/src/test/regress/sql/tuplesort.sql b/src/test/regress/sql/tuplesort.sql
index 8476e594e6..95ac8ec04c 100644
--- a/src/test/regress/sql/tuplesort.sql
+++ b/src/test/regress/sql/tuplesort.sql
@@ -162,6 +162,7 @@ ORDER BY v.a DESC;
 -- in-memory
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
 EXPLAIN (COSTS OFF) DECLARE c SCROLL CURSOR FOR SELECT noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_decreasing;
@@ -192,6 +193,7 @@ COMMIT;
 -- disk based
 BEGIN;
 SET LOCAL enable_indexscan = false;
+SET LOCAL enable_indexonlyscan = false;
 SET LOCAL work_mem = '100kB';
 -- unfortunately can't show analyze output confirming sort method,
 -- the memory used output wouldn't be stable
-- 
2.39.3 (Apple Git-145)

Attachment: v3-0003-Treat-number-of-disabled-nodes-in-a-path-as-a-sep.patch
From 339d27ec06f448595223e7ad73dbd992e967b9e8 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Mon, 10 Jun 2024 16:51:39 -0400
Subject: [PATCH v3 3/4] Treat number of disabled nodes in a path as a separate
 cost metric.

Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishable in cost seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that a high-order component of the cost. Hence, the
path list is now sorted by disabled_nodes and then by total_cost,
instead of just by the latter, and likewise for the partial path list.
It is important that this number is a count and not simply a Boolean;
else, as soon as we were unable to respect disabled path types in every
part of the plan, we would stop trying to avoid them even where we could.

Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.

Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since we do not postpone
path selection across subquery boundaries (see make_subplan).
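
As a rough illustration of why a count, rather than a Boolean, is the right
shape for this (table names are hypothetical): suppose noidx has no indexes,
widget has a btree index on id, and sequential scans are disabled.

    SET enable_seqscan = off;
    EXPLAIN (COSTS OFF)
    SELECT * FROM noidx n JOIN widget w ON w.id = n.widget_id;
    -- A Seq Scan on noidx is unavoidable, so every possible plan carries at
    -- least one disabled node; but because paths are compared on their
    -- disabled-node count before their cost, the planner should still prefer
    -- an index scan on widget instead of giving up on the setting for the
    -- rest of the plan.
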
---
 contrib/file_fdw/file_fdw.c                   |   1 +
 contrib/postgres_fdw/postgres_fdw.c           |  44 +++-
 contrib/postgres_fdw/postgres_fdw.h           |   1 +
 src/backend/optimizer/path/costsize.c         | 169 ++++++++++----
 src/backend/optimizer/path/joinpath.c         |  15 +-
 src/backend/optimizer/plan/createplan.c       |   3 +
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/prep/prepunion.c        |   6 +-
 src/backend/optimizer/util/pathnode.c         | 206 +++++++++++++-----
 src/include/nodes/pathnodes.h                 |   2 +
 src/include/optimizer/cost.h                  |  10 +-
 src/include/optimizer/pathnode.h              |  12 +-
 src/test/regress/expected/select_parallel.out |   8 +-
 13 files changed, 362 insertions(+), 116 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 249d82d3a0..d16821f8e1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -576,6 +576,7 @@ fileGetForeignPaths(PlannerInfo *root,
 			 create_foreignscan_path(root, baserel,
 									 NULL,	/* default pathtarget */
 									 baserel->rows,
+									 0,
 									 startup_cost,
 									 total_cost,
 									 NIL,	/* no pathkeys */
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 0bb9a5ae8f..f0308798a0 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -430,6 +430,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
 									List *pathkeys,
 									PgFdwPathExtraData *fpextra,
 									double *p_rows, int *p_width,
+									int *p_disabled_nodes,
 									Cost *p_startup_cost, Cost *p_total_cost);
 static void get_remote_estimate(const char *sql,
 								PGconn *conn,
@@ -442,6 +443,7 @@ static void adjust_foreign_grouping_path_cost(PlannerInfo *root,
 											  double retrieved_rows,
 											  double width,
 											  double limit_tuples,
+											  int *disabled_nodes,
 											  Cost *p_startup_cost,
 											  Cost *p_run_cost);
 static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
@@ -735,6 +737,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		 */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 
 		/* Report estimated baserel size to planner. */
@@ -765,6 +768,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		/* Fill in basically-bogus cost estimates for use later. */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 	}
 
@@ -1030,6 +1034,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 	path = create_foreignscan_path(root, baserel,
 								   NULL,	/* default pathtarget */
 								   fpinfo->rows,
+								   fpinfo->disabled_nodes,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
@@ -1184,13 +1189,14 @@ postgresGetForeignPaths(PlannerInfo *root,
 		ParamPathInfo *param_info = (ParamPathInfo *) lfirst(lc);
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 
 		/* Get a cost estimate from the remote */
 		estimate_path_cost_size(root, baserel,
 								param_info->ppi_clauses, NIL, NULL,
-								&rows, &width,
+								&rows, &width, &disabled_nodes,
 								&startup_cost, &total_cost);
 
 		/*
@@ -1203,6 +1209,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 		path = create_foreignscan_path(root, baserel,
 									   NULL,	/* default pathtarget */
 									   rows,
+									   disabled_nodes,
 									   startup_cost,
 									   total_cost,
 									   NIL, /* no pathkeys */
@@ -3078,12 +3085,14 @@ estimate_path_cost_size(PlannerInfo *root,
 						List *pathkeys,
 						PgFdwPathExtraData *fpextra,
 						double *p_rows, int *p_width,
+						int *p_disabled_nodes,
 						Cost *p_startup_cost, Cost *p_total_cost)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
 	double		rows;
 	double		retrieved_rows;
 	int			width;
+	int			disabled_nodes = 0;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -3473,6 +3482,7 @@ estimate_path_cost_size(PlannerInfo *root,
 				adjust_foreign_grouping_path_cost(root, pathkeys,
 												  retrieved_rows, width,
 												  fpextra->limit_tuples,
+												  &disabled_nodes,
 												  &startup_cost, &run_cost);
 			}
 			else
@@ -3567,6 +3577,7 @@ estimate_path_cost_size(PlannerInfo *root,
 	/* Return results. */
 	*p_rows = rows;
 	*p_width = width;
+	*p_disabled_nodes = disabled_nodes;
 	*p_startup_cost = startup_cost;
 	*p_total_cost = total_cost;
 }
@@ -3627,6 +3638,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 								  double retrieved_rows,
 								  double width,
 								  double limit_tuples,
+								  int *p_disabled_nodes,
 								  Cost *p_startup_cost,
 								  Cost *p_run_cost)
 {
@@ -3646,6 +3658,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 		cost_sort(&sort_path,
 				  root,
 				  pathkeys,
+				  0,
 				  *p_startup_cost + *p_run_cost,
 				  retrieved_rows,
 				  width,
@@ -6137,13 +6150,15 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 	{
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 		List	   *useful_pathkeys = lfirst(lc);
 		Path	   *sorted_epq_path;
 
 		estimate_path_cost_size(root, rel, NIL, useful_pathkeys, NULL,
-								&rows, &width, &startup_cost, &total_cost);
+								&rows, &width, &disabled_nodes,
+								&startup_cost, &total_cost);
 
 		/*
 		 * The EPQ path must be at least as well sorted as the path itself, in
@@ -6165,6 +6180,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreignscan_path(root, rel,
 											 NULL,
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 useful_pathkeys,
@@ -6178,6 +6194,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreign_join_path(root, rel,
 											  NULL,
 											  rows,
+											  disabled_nodes,
 											  startup_cost,
 											  total_cost,
 											  useful_pathkeys,
@@ -6325,6 +6342,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 	ForeignPath *joinpath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	Path	   *epq_path;		/* Path to create plan to be executed when
@@ -6414,12 +6432,14 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 
 	/* Estimate costs for bare join relation */
 	estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	/* Now update this information in the joinrel */
 	joinrel->rows = rows;
 	joinrel->reltarget->width = width;
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6431,6 +6451,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 										joinrel,
 										NULL,	/* default pathtarget */
 										rows,
+										disabled_nodes,
 										startup_cost,
 										total_cost,
 										NIL,	/* no pathkeys */
@@ -6758,6 +6779,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	ForeignPath *grouppath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -6808,11 +6830,13 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the cost of push down */
 	estimate_path_cost_size(root, grouped_rel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/* Now update this information in the fpinfo */
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6821,6 +6845,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										  grouped_rel,
 										  grouped_rel->reltarget,
 										  rows,
+										  disabled_nodes,
 										  startup_cost,
 										  total_cost,
 										  NIL,	/* no pathkeys */
@@ -6849,6 +6874,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	PgFdwPathExtraData *fpextra;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -6942,7 +6968,8 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the costs of performing the final sort remotely */
 	estimate_path_cost_size(root, input_rel, NIL, root->sort_pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/*
 	 * Build the fdw_private list that will be used by postgresGetForeignPlan.
@@ -6955,6 +6982,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 											 input_rel,
 											 root->upper_targets[UPPERREL_ORDERED],
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 root->sort_pathkeys,
@@ -6988,6 +7016,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	bool		save_use_remote_estimate = false;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -7072,6 +7101,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 													   path->parent,
 													   path->pathtarget,
 													   path->rows,
+													   path->disabled_nodes,
 													   path->startup_cost,
 													   path->total_cost,
 													   path->pathkeys,
@@ -7189,7 +7219,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 		ifpinfo->use_remote_estimate = false;
 	}
 	estimate_path_cost_size(root, input_rel, NIL, pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	if (!fpextra->has_final_sort)
 		ifpinfo->use_remote_estimate = save_use_remote_estimate;
 
@@ -7208,6 +7239,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										   input_rel,
 										   root->upper_targets[UPPERREL_FINAL],
 										   rows,
+										   disabled_nodes,
 										   startup_cost,
 										   total_cost,
 										   pathkeys,
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 37c1575af6..9e501660d1 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -62,6 +62,7 @@ typedef struct PgFdwRelationInfo
 	/* Estimated size and cost for a scan, join, or grouping/aggregation. */
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 74fc5aab56..4a86cb963f 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -50,6 +50,17 @@
  * so beware of division-by-zero.)	The LIMIT is applied as a top-level
  * plan node.
  *
+ * Each path stores the total number of disabled nodes that exist at or
+ * below that point in the plan tree. This is regarded as a component of
+ * the cost, and paths with fewer disabled nodes should be regarded as
+ * cheaper than those with more. Disabled nodes occur when the user sets
+ * a GUC like enable_seqscan=false. We can't necessarily respect such a
+ * setting in every part of the plan tree, but we want to respect in as many
+ * parts of the plan tree as possible. Simpler schemes like storing a Boolean
+ * here rather than a count fail to do that. We used to disable nodes by
+ * adding a large constant to the startup cost, but that distorted planning
+ * in other ways.
+ *
  * For largely historical reasons, most of the routines in this module use
  * the passed result Path only to store their results (rows, startup_cost and
  * total_cost) into.  All the input data they need is passed as separate
@@ -301,9 +312,6 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_seqscan)
-		startup_cost += disable_cost;
-
 	/* fetch estimated page cost for tablespace containing table */
 	get_tablespace_page_costs(baserel->reltablespace,
 							  NULL,
@@ -346,6 +354,7 @@ cost_seqscan(Path *path, PlannerInfo *root,
 		path->rows = clamp_row_est(path->rows / parallel_divisor);
 	}
 
+	path->disabled_nodes = enable_seqscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + cpu_run_cost + disk_run_cost;
 }
@@ -418,6 +427,7 @@ cost_samplescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -456,6 +466,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows;
 
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = startup_cost;
 	path->path.total_cost = (startup_cost + run_cost);
 }
@@ -473,6 +484,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 void
 cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 				  RelOptInfo *rel, ParamPathInfo *param_info,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double *rows)
 {
@@ -490,9 +502,6 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	else
 		path->path.rows = rel->rows;
 
-	if (!enable_gathermerge)
-		startup_cost += disable_cost;
-
 	/*
 	 * Add one to the number of workers to account for the leader.  This might
 	 * be overgenerous since the leader will do less work than other workers
@@ -523,6 +532,8 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows * 1.05;
 
+	path->path.disabled_nodes = input_disabled_nodes
+		+ (enable_gathermerge ? 0 : 1);
 	path->path.startup_cost = startup_cost + input_startup_cost;
 	path->path.total_cost = (startup_cost + run_cost + input_total_cost);
 }
@@ -812,6 +823,11 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 
 	run_cost += cpu_run_cost;
 
+	/*
+	 * enable_indexscan and enable_indexonlyscan are handled by skipping path
+	 * generation, so we need no logic for those cases here.
+	 */
+	path->path.disabled_nodes = 0;
 	path->path.startup_cost = startup_cost;
 	path->path.total_cost = startup_cost + run_cost;
 }
@@ -1034,9 +1050,6 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
-
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
 										 &tuples_fetched);
@@ -1098,6 +1111,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = enable_bitmapscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1183,6 +1197,7 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
 	}
 	path->bitmapselectivity = selec;
 	path->path.rows = 0;		/* per above, not used */
+	path->path.disabled_nodes = 0;
 	path->path.startup_cost = totalCost;
 	path->path.total_cost = totalCost;
 }
@@ -1257,6 +1272,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	/* Should only be applied to base relations */
 	Assert(baserel->relid > 0);
 	Assert(baserel->rtekind == RTE_RELATION);
+	Assert(tidquals != NIL);
 
 	/* Mark the path with the correct row estimate */
 	if (param_info)
@@ -1271,6 +1287,14 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
 		Expr	   *qual = rinfo->clause;
 
+		/*
+		 * We must use a TID scan for CurrentOfExpr; in any other case, we
+		 * should be generating a TID scan only if enable_tidscan=true. Also,
+		 * if CurrentOfExpr is the qual, there should be only one.
+		 */
+		Assert(enable_tidscan || IsA(qual, CurrentOfExpr));
+		Assert(list_length(tidquals) == 1 || !IsA(qual, CurrentOfExpr));
+
 		if (IsA(qual, ScalarArrayOpExpr))
 		{
 			/* Each element of the array yields 1 tuple */
@@ -1318,6 +1342,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/*
+	 * There are assertions above verifying that we only reach this function
+	 * either when enable_tidscan=true or when the TID scan is the only legal
+	 * path, so it's safe to set disabled_nodes to zero here.
+	 */
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1410,6 +1440,9 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/* we should not generate this path type when enable_tidscan=false */
+	Assert(enable_tidscan);
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1462,6 +1495,7 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
 	 * SubqueryScan node, plus cpu_tuple_cost to account for selection and
 	 * projection overhead.
 	 */
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = path->subpath->startup_cost;
 	path->path.total_cost = path->subpath->total_cost;
 
@@ -1552,6 +1586,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1608,6 +1643,7 @@ cost_tablefuncscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1655,6 +1691,7 @@ cost_valuesscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1702,6 +1739,7 @@ cost_ctescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1739,6 +1777,7 @@ cost_namedtuplestorescan(Path *path, PlannerInfo *root,
 	cpu_per_tuple += cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1773,6 +1812,7 @@ cost_resultscan(Path *path, PlannerInfo *root,
 	cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1812,6 +1852,7 @@ cost_recursive_union(Path *runion, Path *nrterm, Path *rterm)
 	 */
 	total_cost += cpu_tuple_cost * total_rows;
 
+	runion->disabled_nodes = nrterm->disabled_nodes + rterm->disabled_nodes;
 	runion->startup_cost = startup_cost;
 	runion->total_cost = total_cost;
 	runion->rows = total_rows;
@@ -1960,6 +2001,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
 void
 cost_incremental_sort(Path *path,
 					  PlannerInfo *root, List *pathkeys, int presorted_keys,
+					  int input_disabled_nodes,
 					  Cost input_startup_cost, Cost input_total_cost,
 					  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 					  double limit_tuples)
@@ -2079,6 +2121,11 @@ cost_incremental_sort(Path *path,
 	run_cost += 2.0 * cpu_tuple_cost * input_groups;
 
 	path->rows = input_tuples;
+
+	/* should not generate these paths when enable_incremental_sort=false */
+	Assert(enable_incremental_sort);
+	path->disabled_nodes = input_disabled_nodes;
+
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2097,7 +2144,8 @@ cost_incremental_sort(Path *path,
  */
 void
 cost_sort(Path *path, PlannerInfo *root,
-		  List *pathkeys, Cost input_cost, double tuples, int width,
+		  List *pathkeys, int input_disabled_nodes,
+		  Cost input_cost, double tuples, int width,
 		  Cost comparison_cost, int sort_mem,
 		  double limit_tuples)
 
@@ -2110,12 +2158,10 @@ cost_sort(Path *path, PlannerInfo *root,
 				   comparison_cost, sort_mem,
 				   limit_tuples);
 
-	if (!enable_sort)
-		startup_cost += disable_cost;
-
 	startup_cost += input_cost;
 
 	path->rows = tuples;
+	path->disabled_nodes = input_disabled_nodes + (enable_sort ? 0 : 1);
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2207,6 +2253,7 @@ cost_append(AppendPath *apath)
 {
 	ListCell   *l;
 
+	apath->path.disabled_nodes = 0;
 	apath->path.startup_cost = 0;
 	apath->path.total_cost = 0;
 	apath->path.rows = 0;
@@ -2228,12 +2275,16 @@ cost_append(AppendPath *apath)
 			 */
 			apath->path.startup_cost = firstsubpath->startup_cost;
 
-			/* Compute rows and costs as sums of subplan rows and costs. */
+			/*
+			 * Compute rows, number of disabled nodes, and total cost as sums
+			 * of underlying subplan values.
+			 */
 			foreach(l, apath->subpaths)
 			{
 				Path	   *subpath = (Path *) lfirst(l);
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.total_cost += subpath->total_cost;
 			}
 		}
@@ -2273,6 +2324,7 @@ cost_append(AppendPath *apath)
 					cost_sort(&sort_path,
 							  NULL, /* doesn't currently need root */
 							  pathkeys,
+							  subpath->disabled_nodes,
 							  subpath->total_cost,
 							  subpath->rows,
 							  subpath->pathtarget->width,
@@ -2283,6 +2335,7 @@ cost_append(AppendPath *apath)
 				}
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.startup_cost += subpath->startup_cost;
 				apath->path.total_cost += subpath->total_cost;
 			}
@@ -2331,6 +2384,7 @@ cost_append(AppendPath *apath)
 				apath->path.total_cost += subpath->total_cost;
 			}
 
+			apath->path.disabled_nodes += subpath->disabled_nodes;
 			apath->path.rows = clamp_row_est(apath->path.rows);
 
 			i++;
@@ -2371,6 +2425,7 @@ cost_append(AppendPath *apath)
  *
  * 'pathkeys' is a list of sort keys
  * 'n_streams' is the number of input streams
+ * 'input_disabled_nodes' is the sum of the input streams' disabled node counts
  * 'input_startup_cost' is the sum of the input streams' startup costs
  * 'input_total_cost' is the sum of the input streams' total costs
  * 'tuples' is the number of tuples in all the streams
@@ -2378,6 +2433,7 @@ cost_append(AppendPath *apath)
 void
 cost_merge_append(Path *path, PlannerInfo *root,
 				  List *pathkeys, int n_streams,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double tuples)
 {
@@ -2408,6 +2464,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
 	 */
 	run_cost += cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER * tuples;
 
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost + input_startup_cost;
 	path->total_cost = startup_cost + run_cost + input_total_cost;
 }
@@ -2426,6 +2483,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
  */
 void
 cost_material(Path *path,
+			  int input_disabled_nodes,
 			  Cost input_startup_cost, Cost input_total_cost,
 			  double tuples, int width)
 {
@@ -2463,6 +2521,13 @@ cost_material(Path *path,
 		run_cost += seq_page_cost * npages;
 	}
 
+	/*
+	 * There are some situations where we add Materialize node even with
+	 * enable_material=false, but those are done when converting the Path to a
+	 * Plan; hence, enable_material should be true here.
+	 */
+	Assert(enable_material);
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2626,6 +2691,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 		 int numGroupCols, double numGroups,
 		 List *quals,
+		 int disabled_nodes,
 		 Cost input_startup_cost, Cost input_total_cost,
 		 double input_tuples, double input_width)
 {
@@ -2681,10 +2747,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		startup_cost = input_startup_cost;
 		total_cost = input_total_cost;
 		if (aggstrategy == AGG_MIXED && !enable_hashagg)
-		{
-			startup_cost += disable_cost;
-			total_cost += disable_cost;
-		}
+			++disabled_nodes;
 		/* calcs phrased this way to match HASHED case, see note above */
 		total_cost += aggcosts->transCost.startup;
 		total_cost += aggcosts->transCost.per_tuple * input_tuples;
@@ -2699,7 +2762,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		/* must be AGG_HASHED */
 		startup_cost = input_total_cost;
 		if (!enable_hashagg)
-			startup_cost += disable_cost;
+			++disabled_nodes;
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		/* cost of computing hash value */
@@ -2808,6 +2871,7 @@ cost_agg(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3042,6 +3106,7 @@ get_windowclause_startup_tuples(PlannerInfo *root, WindowClause *wc,
 void
 cost_windowagg(Path *path, PlannerInfo *root,
 			   List *windowFuncs, WindowClause *winclause,
+			   int input_disabled_nodes,
 			   Cost input_startup_cost, Cost input_total_cost,
 			   double input_tuples)
 {
@@ -3107,6 +3172,7 @@ cost_windowagg(Path *path, PlannerInfo *root,
 	total_cost += cpu_tuple_cost * input_tuples;
 
 	path->rows = input_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 
@@ -3138,6 +3204,7 @@ void
 cost_group(Path *path, PlannerInfo *root,
 		   int numGroupCols, double numGroups,
 		   List *quals,
+		   int input_disabled_nodes,
 		   Cost input_startup_cost, Cost input_total_cost,
 		   double input_tuples)
 {
@@ -3176,6 +3243,7 @@ cost_group(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3210,6 +3278,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  Path *outer_path, Path *inner_path,
 					  JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3218,6 +3287,11 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Cost		inner_run_cost;
 	Cost		inner_rescan_run_cost;
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_nestloop ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* estimate costs to rescan the inner relation */
 	cost_rescan(root, inner_path,
 				&inner_rescan_start_cost,
@@ -3265,6 +3339,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_nestloop */
@@ -3294,6 +3369,9 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	QualCost	restrict_qual_cost;
 	double		ntuples;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (outer_path_rows <= 0)
 		outer_path_rows = 1;
@@ -3314,13 +3392,10 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_nestloop)
-		startup_cost += disable_cost;
+	/* Count up disabled nodes. */
+	path->jpath.path.disabled_nodes = enable_nestloop ? 0 : 1;
+	path->jpath.path.disabled_nodes += inner_path->disabled_nodes;
+	path->jpath.path.disabled_nodes += outer_path->disabled_nodes;
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3493,6 +3568,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					   List *outersortkeys, List *innersortkeys,
 					   JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3613,6 +3689,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Assert(outerstartsel <= outerendsel);
 	Assert(innerstartsel <= innerendsel);
 
+	disabled_nodes = enable_mergejoin ? 0 : 1;
+
 	/* cost of source data */
 
 	if (outersortkeys)			/* do we need to sort outer? */
@@ -3620,12 +3698,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  outersortkeys,
+				  outer_path->disabled_nodes,
 				  outer_path->total_cost,
 				  outer_path_rows,
 				  outer_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* outerstartsel;
@@ -3634,6 +3714,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += outer_path->disabled_nodes;
 		startup_cost += outer_path->startup_cost;
 		startup_cost += (outer_path->total_cost - outer_path->startup_cost)
 			* outerstartsel;
@@ -3646,12 +3727,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  innersortkeys,
+				  inner_path->disabled_nodes,
 				  inner_path->total_cost,
 				  inner_path_rows,
 				  inner_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* innerstartsel;
@@ -3660,6 +3743,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += inner_path->disabled_nodes;
 		startup_cost += inner_path->startup_cost;
 		startup_cost += (inner_path->total_cost - inner_path->startup_cost)
 			* innerstartsel;
@@ -3678,6 +3762,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost + inner_run_cost;
 	/* Save private data for final_cost_mergejoin */
@@ -3742,6 +3827,9 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 				rescannedtuples;
 	double		rescanratio;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
@@ -3761,14 +3849,6 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_mergejoin)
-		startup_cost += disable_cost;
-
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
 	 * separately.
@@ -4052,6 +4132,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  JoinPathExtraData *extra,
 					  bool parallel_hash)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -4063,6 +4144,11 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	int			num_skew_mcvs;
 	size_t		space_allowed;	/* unused */
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_hashjoin ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* cost of source data */
 	startup_cost += outer_path->startup_cost;
 	run_cost += outer_path->total_cost - outer_path->startup_cost;
@@ -4132,6 +4218,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_hashjoin */
@@ -4176,6 +4263,9 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	Selectivity innermcvfreq;
 	ListCell   *hcl;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Mark the path with the correct row estimate */
 	if (path->jpath.path.param_info)
 		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
@@ -4191,13 +4281,10 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_hashjoin)
-		startup_cost += disable_cost;
+	/* Count up disabled nodes. */
+	path->jpath.path.disabled_nodes = enable_hashjoin ? 0 : 1;
+	path->jpath.path.disabled_nodes += inner_path->disabled_nodes;
+	path->jpath.path.disabled_nodes += outer_path->disabled_nodes;
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 5be8da9e09..c1ac4f1fdc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -811,7 +811,7 @@ try_nestloop_path(PlannerInfo *root,
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -894,7 +894,8 @@ try_partial_nestloop_path(PlannerInfo *root,
 	 */
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -991,7 +992,7 @@ try_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -1067,7 +1068,8 @@ try_partial_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -1136,7 +1138,7 @@ try_hashjoin_path(PlannerInfo *root,
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, false);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  NIL, required_outer))
 	{
@@ -1202,7 +1204,8 @@ try_partial_hashjoin_path(PlannerInfo *root,
 	 */
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, parallel_hash);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, NIL))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 6b64c4a362..20236e8c4d 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -25,6 +25,7 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/print.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
@@ -5456,6 +5457,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
+			  0,				/* a Plan contains no count of disabled nodes */
 			  lefttree->total_cost,
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6550,6 +6552,7 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
+				  0,			/* a Plan contains no count of disabled nodes */
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 4711f91239..e3d9fa9e81 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6749,6 +6749,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	/* Estimate the cost of seq scan + sort */
 	seqScanPath = create_seqscan_path(root, rel, NULL, 0);
 	cost_sort(&seqScanAndSortPath, root, NIL,
+			  seqScanPath->disabled_nodes,
 			  seqScanPath->total_cost, rel->tuples, rel->reltarget->width,
 			  comparisonCost, maintenance_work_mem, -1.0);
 
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 1c69c6e97e..a0baf6d4a1 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1346,6 +1346,7 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	cost_agg(&hashed_p, root, AGG_HASHED, NULL,
 			 numGroupCols, dNumGroups,
 			 NIL,
+			 input_path->disabled_nodes,
 			 input_path->startup_cost, input_path->total_cost,
 			 input_path->rows, input_path->pathtarget->width);
 
@@ -1353,14 +1354,17 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	 * Now for the sorted case.  Note that the input is *always* unsorted,
 	 * since it was made by appending unrelated sub-relations together.
 	 */
+	sorted_p.disabled_nodes = input_path->disabled_nodes;
 	sorted_p.startup_cost = input_path->startup_cost;
 	sorted_p.total_cost = input_path->total_cost;
 	/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
-	cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
+	cost_sort(&sorted_p, root, NIL, sorted_p.disabled_nodes,
+			  sorted_p.total_cost,
 			  input_path->rows, input_path->pathtarget->width,
 			  0.0, work_mem, -1.0);
 	cost_group(&sorted_p, root, numGroupCols, dNumGroups,
 			   NIL,
+			   sorted_p.disabled_nodes,
 			   sorted_p.startup_cost, sorted_p.total_cost,
 			   input_path->rows);
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index c42742d2c7..bb15fed134 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -68,6 +68,15 @@ static bool pathlist_is_reparameterizable_by_child(List *pathlist,
 int
 compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
 {
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (criterion == STARTUP_COST)
 	{
 		if (path1->startup_cost < path2->startup_cost)
@@ -118,6 +127,15 @@ compare_fractional_path_costs(Path *path1, Path *path2,
 	Cost		cost1,
 				cost2;
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (fraction <= 0.0 || fraction >= 1.0)
 		return compare_path_costs(path1, path2, TOTAL_COST);
 	cost1 = path1->startup_cost +
@@ -166,6 +184,15 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 #define CONSIDER_PATH_STARTUP_COST(p)  \
 	((p)->param_info == NULL ? (p)->parent->consider_startup : (p)->parent->consider_param_startup)
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return COSTS_BETTER1;
+		else
+			return COSTS_BETTER2;
+	}
+
 	/*
 	 * Check total cost first since it's more likely to be different; many
 	 * paths have zero startup cost.
@@ -362,15 +389,29 @@ set_cheapest(RelOptInfo *parent_rel)
  * add_path
  *	  Consider a potential implementation path for the specified parent rel,
  *	  and add it to the rel's pathlist if it is worthy of consideration.
+ *
  *	  A path is worthy if it has a better sort order (better pathkeys) or
- *	  cheaper cost (on either dimension), or generates fewer rows, than any
- *	  existing path that has the same or superset parameterization rels.
- *	  We also consider parallel-safe paths more worthy than others.
+ *	  cheaper cost (as defined below), or generates fewer rows, than any
+ *    existing path that has the same or superset parameterization rels.  We
+ *    also consider parallel-safe paths more worthy than others.
+ *
+ *    Cheaper cost can mean either a cheaper total cost or a cheaper startup
+ *    cost; if one path is cheaper in one of these aspects and another is
+ *    cheaper in the other, we keep both. However, when some path type is
+ *    disabled (e.g. due to enable_seqscan=false), the number of times that
+ *    a disabled path type is used is considered to be a higher-order
+ *    component of the cost. Hence, if path A uses no disabled path type,
+ *    and path B uses 1 or more disabled path types, A is cheaper, no matter
+ *    what we estimate for the startup and total costs. The startup and total
+ *    cost essentially act as a tiebreak when comparing paths that use equal
+ *    numbers of disabled path nodes; but in practice this tiebreak is almost
+ *    always used, since normally no path types are disabled.
  *
- *	  We also remove from the rel's pathlist any old paths that are dominated
- *	  by new_path --- that is, new_path is cheaper, at least as well ordered,
- *	  generates no more rows, requires no outer rels not required by the old
- *	  path, and is no less parallel-safe.
+ *	  In addition to possibly adding new_path, we also remove from the rel's
+ *    pathlist any old paths that are dominated by new_path --- that is,
+ *    new_path is cheaper, at least as well ordered, generates no more rows,
+ *    requires no outer rels not required by the old path, and is no less
+ *    parallel-safe.
  *
  *	  In most cases, a path with a superset parameterization will generate
  *	  fewer rows (since it has more join clauses to apply), so that those two
@@ -389,10 +430,10 @@ set_cheapest(RelOptInfo *parent_rel)
  *	  parent_rel->consider_param_startup is true for a parameterized one.
  *	  Again, this allows discarding useless paths sooner.
  *
- *	  The pathlist is kept sorted by total_cost, with cheaper paths
- *	  at the front.  Within this routine, that's simply a speed hack:
- *	  doing it that way makes it more likely that we will reject an inferior
- *	  path after a few comparisons, rather than many comparisons.
+ *	  The pathlist is kept sorted by disabled_nodes and then by total_cost,
+ *    with cheaper paths at the front.  Within this routine, that's simply a
+ *    speed hack: doing it that way makes it more likely that we will reject
+ *    an inferior path after a few comparisons, rather than many comparisons.
  *	  However, add_path_precheck relies on this ordering to exit early
  *	  when possible.
  *
@@ -593,8 +634,13 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 		}
 		else
 		{
-			/* new belongs after this old path if it has cost >= old's */
-			if (new_path->total_cost >= old_path->total_cost)
+			/*
+			 * new belongs after this old path if it has more disabled nodes
+			 * or if it has the same number of nodes but a greater total cost
+			 */
+			if (new_path->disabled_nodes > old_path->disabled_nodes ||
+				(new_path->disabled_nodes == old_path->disabled_nodes &&
+				 new_path->total_cost >= old_path->total_cost))
 				insert_at = foreach_current_index(p1) + 1;
 		}
 
@@ -639,7 +685,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
  * so the required information has to be passed piecemeal.
  */
 bool
-add_path_precheck(RelOptInfo *parent_rel,
+add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 				  Cost startup_cost, Cost total_cost,
 				  List *pathkeys, Relids required_outer)
 {
@@ -658,6 +704,20 @@ add_path_precheck(RelOptInfo *parent_rel,
 		Path	   *old_path = (Path *) lfirst(p1);
 		PathKeysComparison keyscmp;
 
+		/*
+		 * Since the pathlist is sorted by disabled_nodes and then by
+		 * total_cost, we can stop looking once we reach a path with more
+		 * disabled nodes, or the same number of disabled nodes plus a
+		 * total_cost larger than the new path's.
+		 */
+		if (unlikely(old_path->disabled_nodes != disabled_nodes))
+		{
+			if (disabled_nodes < old_path->disabled_nodes)
+				break;
+		}
+		else if (total_cost <= old_path->total_cost * STD_FUZZ_FACTOR)
+			break;
+
 		/*
 		 * We are looking for an old_path with the same parameterization (and
 		 * by assumption the same rowcount) that dominates the new path on
@@ -666,39 +726,27 @@ add_path_precheck(RelOptInfo *parent_rel,
 		 *
 		 * Cost comparisons here should match compare_path_costs_fuzzily.
 		 */
-		if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+		/* new path can win on startup cost only if consider_startup */
+		if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
+			!consider_startup)
 		{
-			/* new path can win on startup cost only if consider_startup */
-			if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
-				!consider_startup)
+			/* new path loses on cost, so check pathkeys... */
+			List	   *old_path_pathkeys;
+
+			old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
+			keyscmp = compare_pathkeys(new_path_pathkeys,
+									   old_path_pathkeys);
+			if (keyscmp == PATHKEYS_EQUAL ||
+				keyscmp == PATHKEYS_BETTER2)
 			{
-				/* new path loses on cost, so check pathkeys... */
-				List	   *old_path_pathkeys;
-
-				old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
-				keyscmp = compare_pathkeys(new_path_pathkeys,
-										   old_path_pathkeys);
-				if (keyscmp == PATHKEYS_EQUAL ||
-					keyscmp == PATHKEYS_BETTER2)
+				/* new path does not win on pathkeys... */
+				if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
 				{
-					/* new path does not win on pathkeys... */
-					if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
-					{
-						/* Found an old path that dominates the new one */
-						return false;
-					}
+					/* Found an old path that dominates the new one */
+					return false;
 				}
 			}
 		}
-		else
-		{
-			/*
-			 * Since the pathlist is sorted by total_cost, we can stop looking
-			 * once we reach a path with a total_cost larger than the new
-			 * path's.
-			 */
-			break;
-		}
 	}
 
 	return true;
@@ -734,7 +782,7 @@ add_path_precheck(RelOptInfo *parent_rel,
  *	  produce the same number of rows.  Neither do we need to consider startup
  *	  costs: parallelism is only used for plans that will be run to completion.
  *	  Therefore, this routine is much simpler than add_path: it needs to
- *	  consider only pathkeys and total cost.
+ *	  consider only disabled nodes, pathkeys and total cost.
  *
  *	  As with add_path, we pfree paths that are found to be dominated by
  *	  another partial path; this requires that there be no other references to
@@ -775,7 +823,15 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Unless pathkeys are incompatible, keep just one of the two paths. */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+			if (unlikely(new_path->disabled_nodes != old_path->disabled_nodes))
+			{
+				if (new_path->disabled_nodes > old_path->disabled_nodes)
+					accept_new = false;
+				else
+					remove_old = true;
+			}
+			else if (new_path->total_cost > old_path->total_cost
+					 * STD_FUZZ_FACTOR)
 			{
 				/* New path costs more; keep it only if pathkeys are better. */
 				if (keyscmp != PATHKEYS_BETTER1)
@@ -862,8 +918,8 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
  * is surely a loser.
  */
 bool
-add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
-						  List *pathkeys)
+add_partial_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
+						  Cost total_cost, List *pathkeys)
 {
 	ListCell   *p1;
 
@@ -906,8 +962,8 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 	 * partial path; the resulting plans, if run in parallel, will be run to
 	 * completion.
 	 */
-	if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
-						   NULL))
+	if (!add_path_precheck(parent_rel, disabled_nodes, total_cost, total_cost,
+						   pathkeys, NULL))
 		return false;
 
 	return true;
@@ -1419,6 +1475,7 @@ create_merge_append_path(PlannerInfo *root,
 						 Relids required_outer)
 {
 	MergeAppendPath *pathnode = makeNode(MergeAppendPath);
+	int			input_disabled_nodes;
 	Cost		input_startup_cost;
 	Cost		input_total_cost;
 	ListCell   *l;
@@ -1447,6 +1504,7 @@ create_merge_append_path(PlannerInfo *root,
 	 * Add up the sizes and costs of the input paths.
 	 */
 	pathnode->path.rows = 0;
+	input_disabled_nodes = 0;
 	input_startup_cost = 0;
 	input_total_cost = 0;
 	foreach(l, subpaths)
@@ -1460,6 +1518,7 @@ create_merge_append_path(PlannerInfo *root,
 		if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 		{
 			/* Subpath is adequately ordered, we won't need to sort it */
+			input_disabled_nodes += subpath->disabled_nodes;
 			input_startup_cost += subpath->startup_cost;
 			input_total_cost += subpath->total_cost;
 		}
@@ -1471,12 +1530,14 @@ create_merge_append_path(PlannerInfo *root,
 			cost_sort(&sort_path,
 					  root,
 					  pathkeys,
+					  subpath->disabled_nodes,
 					  subpath->total_cost,
 					  subpath->rows,
 					  subpath->pathtarget->width,
 					  0.0,
 					  work_mem,
 					  pathnode->limit_tuples);
+			input_disabled_nodes += sort_path.disabled_nodes;
 			input_startup_cost += sort_path.startup_cost;
 			input_total_cost += sort_path.total_cost;
 		}
@@ -1495,12 +1556,14 @@ create_merge_append_path(PlannerInfo *root,
 		((Path *) linitial(subpaths))->parallel_aware ==
 		pathnode->path.parallel_aware)
 	{
+		pathnode->path.disabled_nodes = input_disabled_nodes;
 		pathnode->path.startup_cost = input_startup_cost;
 		pathnode->path.total_cost = input_total_cost;
 	}
 	else
 		cost_merge_append(&pathnode->path, root,
 						  pathkeys, list_length(subpaths),
+						  input_disabled_nodes,
 						  input_startup_cost, input_total_cost,
 						  pathnode->path.rows);
 
@@ -1582,6 +1645,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
 	pathnode->subpath = subpath;
 
 	cost_material(&pathnode->path,
+				  subpath->disabled_nodes,
 				  subpath->startup_cost,
 				  subpath->total_cost,
 				  subpath->rows,
@@ -1628,6 +1692,10 @@ create_memoize_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	 */
 	pathnode->est_entries = 0;
 
+	/* we should not generate this path type when enable_memoize=false */
+	Assert(enable_memoize);
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
+
 	/*
 	 * Add a small additional charge for caching the first entry.  All the
 	 * harder calculations for rescans are performed in cost_memoize_rescan().
@@ -1727,6 +1795,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	{
 		pathnode->umethod = UNIQUE_PATH_NOOP;
 		pathnode->path.rows = rel->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost;
 		pathnode->path.total_cost = subpath->total_cost;
 		pathnode->path.pathkeys = subpath->pathkeys;
@@ -1765,6 +1834,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 			{
 				pathnode->umethod = UNIQUE_PATH_NOOP;
 				pathnode->path.rows = rel->rows;
+				pathnode->path.disabled_nodes = subpath->disabled_nodes;
 				pathnode->path.startup_cost = subpath->startup_cost;
 				pathnode->path.total_cost = subpath->total_cost;
 				pathnode->path.pathkeys = subpath->pathkeys;
@@ -1792,6 +1862,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		 * Estimate cost for sort+unique implementation
 		 */
 		cost_sort(&sort_path, root, NIL,
+				  subpath->disabled_nodes,
 				  subpath->total_cost,
 				  rel->rows,
 				  subpath->pathtarget->width,
@@ -1829,6 +1900,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 					 AGG_HASHED, NULL,
 					 numCols, pathnode->path.rows,
 					 NIL,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 rel->rows,
@@ -1837,7 +1909,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (sjinfo->semi_can_btree && sjinfo->semi_can_hash)
 	{
-		if (agg_path.total_cost < sort_path.total_cost)
+		if (agg_path.disabled_nodes < sort_path.disabled_nodes ||
+			(agg_path.disabled_nodes == sort_path.disabled_nodes &&
+			 agg_path.total_cost < sort_path.total_cost))
 			pathnode->umethod = UNIQUE_PATH_HASH;
 		else
 			pathnode->umethod = UNIQUE_PATH_SORT;
@@ -1855,11 +1929,13 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (pathnode->umethod == UNIQUE_PATH_HASH)
 	{
+		pathnode->path.disabled_nodes = agg_path.disabled_nodes;
 		pathnode->path.startup_cost = agg_path.startup_cost;
 		pathnode->path.total_cost = agg_path.total_cost;
 	}
 	else
 	{
+		pathnode->path.disabled_nodes = sort_path.disabled_nodes;
 		pathnode->path.startup_cost = sort_path.startup_cost;
 		pathnode->path.total_cost = sort_path.total_cost;
 	}
@@ -1883,6 +1959,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 						 Relids required_outer, double *rows)
 {
 	GatherMergePath *pathnode = makeNode(GatherMergePath);
+	int			input_disabled_nodes = 0;
 	Cost		input_startup_cost = 0;
 	Cost		input_total_cost = 0;
 
@@ -1904,6 +1981,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 	{
 		/* Subpath is adequately ordered, we won't need to sort it */
+		input_disabled_nodes += subpath->disabled_nodes;
 		input_startup_cost += subpath->startup_cost;
 		input_total_cost += subpath->total_cost;
 	}
@@ -1915,18 +1993,21 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		cost_sort(&sort_path,
 				  root,
 				  pathkeys,
+				  subpath->disabled_nodes,
 				  subpath->total_cost,
 				  subpath->rows,
 				  subpath->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1);
+		input_disabled_nodes += sort_path.disabled_nodes;
 		input_startup_cost += sort_path.startup_cost;
 		input_total_cost += sort_path.total_cost;
 	}
 
 	cost_gather_merge(pathnode, root, rel, pathnode->path.param_info,
-					  input_startup_cost, input_total_cost, rows);
+					  input_disabled_nodes, input_startup_cost,
+					  input_total_cost, rows);
 
 	return pathnode;
 }
@@ -2234,7 +2315,8 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 						PathTarget *target,
-						double rows, Cost startup_cost, Cost total_cost,
+						double rows, int disabled_nodes,
+						Cost startup_cost, Cost total_cost,
 						List *pathkeys,
 						Relids required_outer,
 						Path *fdw_outerpath,
@@ -2255,6 +2337,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2280,7 +2363,8 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 						 PathTarget *target,
-						 double rows, Cost startup_cost, Cost total_cost,
+						 double rows, int disabled_nodes,
+						 Cost startup_cost, Cost total_cost,
 						 List *pathkeys,
 						 Relids required_outer,
 						 Path *fdw_outerpath,
@@ -2307,6 +2391,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2332,7 +2417,8 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 						  PathTarget *target,
-						  double rows, Cost startup_cost, Cost total_cost,
+						  double rows, int disabled_nodes,
+						  Cost startup_cost, Cost total_cost,
 						  List *pathkeys,
 						  Path *fdw_outerpath,
 						  List *fdw_restrictinfo,
@@ -2354,6 +2440,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2741,6 +2828,7 @@ create_projection_path(PlannerInfo *root,
 		 * Set cost of plan as subpath's cost, adjusted for tlist replacement.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			(target->cost.startup - oldtarget->cost.startup);
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2757,6 +2845,7 @@ create_projection_path(PlannerInfo *root,
 		 * evaluating the tlist.  There is no qual to worry about.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			target->cost.startup;
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2974,6 +3063,7 @@ create_incremental_sort_path(PlannerInfo *root,
 
 	cost_incremental_sort(&pathnode->path,
 						  root, pathkeys, presorted_keys,
+						  subpath->disabled_nodes,
 						  subpath->startup_cost,
 						  subpath->total_cost,
 						  subpath->rows,
@@ -3020,6 +3110,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->subpath = subpath;
 
 	cost_sort(&pathnode->path, root, pathkeys,
+			  subpath->disabled_nodes,
 			  subpath->total_cost,
 			  subpath->rows,
 			  subpath->pathtarget->width,
@@ -3072,6 +3163,7 @@ create_group_path(PlannerInfo *root,
 			   list_length(groupClause),
 			   numGroups,
 			   qual,
+			   subpath->disabled_nodes,
 			   subpath->startup_cost, subpath->total_cost,
 			   subpath->rows);
 
@@ -3129,6 +3221,7 @@ create_upper_unique_path(PlannerInfo *root,
 	 * all columns get compared at most of the tuples.  (XXX probably this is
 	 * an overestimate.)
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_operator_cost * subpath->rows * numCols;
@@ -3207,6 +3300,7 @@ create_agg_path(PlannerInfo *root,
 			 aggstrategy, aggcosts,
 			 list_length(groupClause), numGroups,
 			 qual,
+			 subpath->disabled_nodes,
 			 subpath->startup_cost, subpath->total_cost,
 			 subpath->rows, subpath->pathtarget->width);
 
@@ -3315,6 +3409,7 @@ create_groupingsets_path(PlannerInfo *root,
 					 numGroupCols,
 					 rollup->numGroups,
 					 having_qual,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 subpath->rows,
@@ -3340,7 +3435,7 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
-						 0.0, 0.0,
+						 0, 0.0, 0.0,
 						 subpath->rows,
 						 subpath->pathtarget->width);
 				if (!rollup->is_hashed)
@@ -3349,7 +3444,7 @@ create_groupingsets_path(PlannerInfo *root,
 			else
 			{
 				/* Account for cost of sort, but don't charge input cost again */
-				cost_sort(&sort_path, root, NIL,
+				cost_sort(&sort_path, root, NIL, 0,
 						  0.0,
 						  subpath->rows,
 						  subpath->pathtarget->width,
@@ -3365,12 +3460,14 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
+						 sort_path.disabled_nodes,
 						 sort_path.startup_cost,
 						 sort_path.total_cost,
 						 sort_path.rows,
 						 subpath->pathtarget->width);
 			}
 
+			pathnode->path.disabled_nodes += agg_path.disabled_nodes;
 			pathnode->path.total_cost += agg_path.total_cost;
 			pathnode->path.rows += agg_path.rows;
 		}
@@ -3524,6 +3621,7 @@ create_windowagg_path(PlannerInfo *root,
 	cost_windowagg(&pathnode->path, root,
 				   windowFuncs,
 				   winclause,
+				   subpath->disabled_nodes,
 				   subpath->startup_cost,
 				   subpath->total_cost,
 				   subpath->rows);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 2ba297c117..9bd0b3d86d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1658,6 +1658,7 @@ typedef struct Path
 
 	/* estimated size/costs for path (see costsize.c for more info) */
 	Cardinality rows;			/* estimated number of result tuples */
+	int			disabled_nodes; /* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
@@ -3333,6 +3334,7 @@ typedef struct
 typedef struct JoinCostWorkspace
 {
 	/* Preliminary cost estimates --- must not be larger than final ones! */
+	int			disabled_nodes;
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index b1c51a4e70..731e8dc641 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -108,35 +108,42 @@ extern void cost_resultscan(Path *path, PlannerInfo *root,
 							RelOptInfo *baserel, ParamPathInfo *param_info);
 extern void cost_recursive_union(Path *runion, Path *nrterm, Path *rterm);
 extern void cost_sort(Path *path, PlannerInfo *root,
-					  List *pathkeys, Cost input_cost, double tuples, int width,
+					  List *pathkeys, int disabled_nodes,
+					  Cost input_cost, double tuples, int width,
 					  Cost comparison_cost, int sort_mem,
 					  double limit_tuples);
 extern void cost_incremental_sort(Path *path,
 								  PlannerInfo *root, List *pathkeys, int presorted_keys,
+								  int input_disabled_nodes,
 								  Cost input_startup_cost, Cost input_total_cost,
 								  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 								  double limit_tuples);
 extern void cost_append(AppendPath *apath);
 extern void cost_merge_append(Path *path, PlannerInfo *root,
 							  List *pathkeys, int n_streams,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double tuples);
 extern void cost_material(Path *path,
+						  int input_disabled_nodes,
 						  Cost input_startup_cost, Cost input_total_cost,
 						  double tuples, int width);
 extern void cost_agg(Path *path, PlannerInfo *root,
 					 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 					 int numGroupCols, double numGroups,
 					 List *quals,
+					 int input_disabled_nodes,
 					 Cost input_startup_cost, Cost input_total_cost,
 					 double input_tuples, double input_width);
 extern void cost_windowagg(Path *path, PlannerInfo *root,
 						   List *windowFuncs, WindowClause *winclause,
+						   int input_disabled_nodes,
 						   Cost input_startup_cost, Cost input_total_cost,
 						   double input_tuples);
 extern void cost_group(Path *path, PlannerInfo *root,
 					   int numGroupCols, double numGroups,
 					   List *quals,
+					   int input_disabled_nodes,
 					   Cost input_startup_cost, Cost input_total_cost,
 					   double input_tuples);
 extern void initial_cost_nestloop(PlannerInfo *root,
@@ -171,6 +178,7 @@ extern void cost_gather(GatherPath *path, PlannerInfo *root,
 						RelOptInfo *rel, ParamPathInfo *param_info, double *rows);
 extern void cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 							  RelOptInfo *rel, ParamPathInfo *param_info,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double *rows);
 extern void cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 112e7c23d4..36e1c24a56 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -27,11 +27,12 @@ extern int	compare_fractional_path_costs(Path *path1, Path *path2,
 										  double fraction);
 extern void set_cheapest(RelOptInfo *parent_rel);
 extern void add_path(RelOptInfo *parent_rel, Path *new_path);
-extern bool add_path_precheck(RelOptInfo *parent_rel,
+extern bool add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 							  Cost startup_cost, Cost total_cost,
 							  List *pathkeys, Relids required_outer);
 extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
 extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
+									  int disabled_nodes,
 									  Cost total_cost, List *pathkeys);
 
 extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
@@ -124,7 +125,8 @@ extern Path *create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 									   Relids required_outer);
 extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											PathTarget *target,
-											double rows, Cost startup_cost, Cost total_cost,
+											double rows, int disabled_nodes,
+											Cost startup_cost, Cost total_cost,
 											List *pathkeys,
 											Relids required_outer,
 											Path *fdw_outerpath,
@@ -132,7 +134,8 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											List *fdw_private);
 extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 PathTarget *target,
-											 double rows, Cost startup_cost, Cost total_cost,
+											 double rows, int disabled_nodes,
+											 Cost startup_cost, Cost total_cost,
 											 List *pathkeys,
 											 Relids required_outer,
 											 Path *fdw_outerpath,
@@ -140,7 +143,8 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 List *fdw_private);
 extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 											  PathTarget *target,
-											  double rows, Cost startup_cost, Cost total_cost,
+											  double rows, int disabled_nodes,
+											  Cost startup_cost, Cost total_cost,
 											  List *pathkeys,
 											  Path *fdw_outerpath,
 											  List *fdw_restrictinfo,
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index f79eda79f6..20c651aadb 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -539,15 +539,17 @@ explain (costs off)
 ------------------------------------------------------------
  Aggregate
    ->  Nested Loop
-         ->  Seq Scan on tenk2
-               Filter: (thousand = 0)
+         ->  Gather
+               Workers Planned: 4
+               ->  Parallel Seq Scan on tenk2
+                     Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
                ->  Parallel Bitmap Heap Scan on tenk1
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(10 rows)
+(12 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
-- 
2.39.3 (Apple Git-145)

#60Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Robert Haas (#59)
Re: On disable_cost

On 28/06/2024 18:46, Robert Haas wrote:

On Wed, Jun 12, 2024 at 11:35 AM Robert Haas <robertmhaas@gmail.com> wrote:

Well, that didn't generate much discussion, but here I am trying
again. Here I've got patches 0001 and 0002 from my previous posting;
I've dropped 0003 and 0004 from the previous set for now so as not to
distract from the main event, but they may still be a good idea.
Instead I've got an 0003 and an 0004 that implement the "count of
disabled nodes" approach that we have discussed previously. This seems
to work fine, unlike the approaches I tried earlier. I think this is
the right direction to go, but I'd like to know what concerns people
might have.

Here is a rebased patch set, where I also fixed pgindent damage and a
couple of small oversights in 0004.

I am hoping to get these committed some time in July. So if somebody
thinks that's too soon or thinks it shouldn't happen at all, please
don't wait too long to let me know about that.

v3-0001-Remove-grotty-use-of-disable_cost-for-TID-scan-pl.patch:

+1, this seems ready for commit

v3-0002-Rationalize-behavior-of-enable_indexscan-and-enab.patch:

I fear this will break people's applications, if they are currently
forcing a sequential scan with "set enable_indexscan=off". Now they will
need to do "set enable_indexscan=off; set enable_indexonlyscan=off" for
the same effect. Maybe it's acceptable, disabling sequential scans to
force an index scan is much more common than the other way round.

v3-0003-Treat-number-of-disabled-nodes-in-a-path-as-a-sep.patch:

@@ -1318,6 +1342,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;

+	/*
+	 * There are assertions above verifying that we only reach this function
+	 * either when enable_tidscan=true or when the TID scan is the only legal
+	 * path, so it's safe to set disabled_nodes to zero here.
+	 */
+	path->disabled_nodes = 0;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}

So if you have enable_tidscan=off, and have a query with "WHERE CURRENT
OF foo" that is planned with a TID scan, we set disabled_nodes = 0? That
sounds wrong, shouldn't disabled_nodes be 1 in that case? It probably
cannot affect the rest of the plan, given that "WHERE CURRENT OF" is
only valid in an UPDATE or DELETE, but still. At least it deserves a
better explanation in the comment.

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 6b64c4a362..20236e8c4d 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -25,6 +25,7 @@
#include "nodes/extensible.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
+#include "nodes/print.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/optimizer.h"

left over from debugging?

@@ -68,6 +68,15 @@ static bool pathlist_is_reparameterizable_by_child(List *pathlist,
int
compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
{
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
if (criterion == STARTUP_COST)
{
if (path1->startup_cost < path2->startup_cost)

Is "unlikely()" really appropriate here (and elsewhere in the patch)? If
you run with enable_seqscan=off but have no indexes, you could take that
path pretty often.

If this function needs optimizing, I'd suggest splitting it into two
functions, one for comparing the startup cost and another for the total
cost. Almost all callers pass a constant for that argument, so they
might as well call the correct function directly and avoid the branch
for that.
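
For concreteness, the total-cost half of such a split might look roughly
like the sketch below (illustrative only, not code from any posted patch;
a startup-cost variant would mirror it with the roles of the two cost
fields swapped):

static int
compare_path_total_costs(Path *path1, Path *path2)
{
	/* Number of disabled nodes, if different, trumps all else. */
	if (path1->disabled_nodes != path2->disabled_nodes)
		return (path1->disabled_nodes < path2->disabled_nodes) ? -1 : +1;

	if (path1->total_cost != path2->total_cost)
		return (path1->total_cost < path2->total_cost) ? -1 : +1;

	/* Total costs are equal, so break the tie on startup cost. */
	if (path1->startup_cost != path2->startup_cost)
		return (path1->startup_cost < path2->startup_cost) ? -1 : +1;

	return 0;
}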

@@ -658,6 +704,20 @@ add_path_precheck(RelOptInfo *parent_rel,
Path *old_path = (Path *) lfirst(p1);
PathKeysComparison keyscmp;

+		/*
+		 * Since the pathlist is sorted by disabled_nodes and then by
+		 * total_cost, we can stop looking once we reach a path with more
+		 * disabled nodes, or the same number of disabled nodes plus a
+		 * total_cost larger than the new path's.
+		 */
+		if (unlikely(old_path->disabled_nodes != disabled_nodes))
+		{
+			if (disabled_nodes < old_path->disabled_nodes)
+				break;
+		}
+		else if (total_cost <= old_path->total_cost * STD_FUZZ_FACTOR)
+			break;
+
/*
* We are looking for an old_path with the same parameterization (and
* by assumption the same rowcount) that dominates the new path on
@@ -666,39 +726,27 @@ add_path_precheck(RelOptInfo *parent_rel,
*
* Cost comparisons here should match compare_path_costs_fuzzily.
*/
-		if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+		/* new path can win on startup cost only if consider_startup */
+		if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
+			!consider_startup)
{

The "Cost comparisons here should match compare_path_costs_fuzzily"
comment also applies to the check on total_cost that you moved up. Maybe
move up the comment to the beginning of the loop.

v3-0004-Show-number-of-disabled-nodes-in-EXPLAIN-ANALYZE-.patch:

It's surprising that the "Disable Nodes" is printed even with the COSTS
OFF option. It's handy for our regression tests, and it's good to print
them there, but it feels wrong.

Could we cram it into the "cost=... rows=..." part? And perhaps a marker
that a node was disabled would be more user friendly than showing the
cumulative count? Something like:

postgres=# set enable_material=off;
SET
postgres=# set enable_seqscan=off;
SET
postgres=# set enable_bitmapscan=off;
SET
postgres=# explain select * from foo, bar;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Nested Loop  (cost=0.15..155632.40 rows=6502500 width=8)
   ->  Index Only Scan using foo_i_idx on foo  (cost=0.15..82.41 rows=2550 width=4)
   ->  Seq Scan on bar  (cost=0.00..35.50 (disabled) rows=2550 width=4)
(5 rows)

--
Heikki Linnakangas
Neon (https://neon.tech)

#61Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#60)
Re: On disable_cost

Thanks for the review!

On Tue, Jul 2, 2024 at 10:57 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

v3-0001-Remove-grotty-use-of-disable_cost-for-TID-scan-pl.patch:

+1, this seems ready for commit

Cool.

v3-0002-Rationalize-behavior-of-enable_indexscan-and-enab.patch:

I fear this will break people's applications, if they are currently
forcing a sequential scan with "set enable_indexscan=off". Now they will
need to do "set enable_indexscan=off; set enable_indexonlyscan=off" for
the same effect. Maybe it's acceptable, disabling sequential scans to
force an index scan is much more common than the other way round.

Well, I think it's pretty important that the GUC does what the name
and documentation say it does. One could of course argue that we ought
not to have two different GUCs -- or perhaps even that we ought not to
have two different plan nodes -- and I think those arguments might be
quite defensible. One could also argue for another interface, like a
GUC enable_indexscan and a value that is a comma-separated list
consisting of plain, bitmap, and index-only, or
none/0/false/any/1/true -- and that might also be quite defensible.
But I don't think one can have a GUC called enable_indexscan and
another GUC called enable_indexonlyscan and argue that it's OK for the
former one to affect both kinds of scans. That's extremely confusing
and, well, just plain wrong. I think this is a bug, and I'm not going
to back-patch the fix precisely because of the considerations you
note, but I really don't think we can leave it like this. The current
behavior is so nonsensical that the code is essentially unmaintainable,
or at least I think it is.

v3-0003-Treat-number-of-disabled-nodes-in-a-path-as-a-sep.patch:

@@ -1318,6 +1342,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;

+     /*
+      * There are assertions above verifying that we only reach this function
+      * either when enable_tidscan=true or when the TID scan is the only legal
+      * path, so it's safe to set disabled_nodes to zero here.
+      */
+     path->disabled_nodes = 0;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}

So if you have enable_tidscan=off, and have a query with "WHERE CURRENT
OF foo" that is planned with a TID scan, we set disable_nodes = 0? That
sounds wrong, shouldn't disable_nodes be 1 in that case? It probably
cannot affect the rest of the plan, given that "WHERE CURRENT OF" is
only valid in an UPDATE or DELETE, but still. At least it deserves a
better explanation in the comment.

So, right now, when the planner disregards enable_WHATEVER because it
thinks it's the only way to implement something, it doesn't add
disable_cost. So, I made the patch not increment disabled_nodes in
that case. Maybe we want to rethink that choice at some point, but it
doesn't seem like a good idea to do it right now. I've found while
working on this stuff that it's super-easy to have seemingly innocuous
changes disturb regression test results, and I don't really want to
have a bunch of extra regression test changes that are due to
rethinking things other than disable_cost -> disabled_nodes. So for
now I'd like to increment disabled_nodes in just the cases where we
currently add disable_cost.
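(The WHERE CURRENT OF TID scan discussed above is the example at hand:
that scan is chosen regardless of enable_tidscan, so no disable_cost is
charged for it today, and with the patch its disabled_nodes stays zero.)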

left over from debugging?

Yeah, will fix.

Is "unlikely()" really appropriate here (and elsewhere in the patch)? If
you run with enable_seqscan=off but have no indexes, you could take that
path pretty often.

That's true, but I think it's right to assume that's the uncommon
case. If we speed up planning for people who disabled sequential scans
and slow it down for people running with a normal planner
configuration, no one will thank us.

If this function needs optimizing, I'd suggest splitting it into two
functions, one for comparing the startup cost and another for the total
cost. Almost all callers pass a constant for that argument, so they
might as well call the correct function directly and avoid the branch
for that.

That's not a bad idea but seems like a separate patch.

The "Cost comparisons here should match compare_path_costs_fuzzily"
comment also applies to the check on total_cost that you moved up. Maybe
move up the comment to the beginning of the loop.

Will have a look.

v3-0004-Show-number-of-disabled-nodes-in-EXPLAIN-ANALYZE-.patch:

It's surprising that the "Disable Nodes" is printed even with the COSTS
OFF option. It's handy for our regression tests, it's good to print them
there, but it feels wrong.

I'm open to doing what people think is best here. Although we're
regarding them as part of the cost for purposes of how to compare
paths, they're not unpredictable in the way that costs are, so I think
the current handling is defensible and, as you say, it's useful for
the regression tests. However, I'm not going to fight tooth and nail
if people really want it the other way.

Could we cram it into the "cost=... rows=..." part? And perhaps a marker
that a node was disabled would be more user friendly than showing the
cumulative count? Something like:

The problem is that we'd have to derive that. What we actually know is
the disable count; to figure out whether the node itself was disabled,
we'd have to subtract the value for the underlying nodes back out.
That seems like it might be buggy or confusing.
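(Roughly, a node would count as itself disabled only if its disabled_nodes
exceeded the sum of its children's counts; for example, a Nested Loop
showing a count of 1 above a Seq Scan that also shows 1 was not itself
disabled, it just inherited the child's count.)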

--
Robert Haas
EDB: http://www.enterprisedb.com

#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#61)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jul 2, 2024 at 10:57 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I fear this will break people's applications, if they are currently
forcing a sequential scan with "set enable_indexscan=off". Now they will
need to do "set enable_indexscan=off; set enable_indexonlyscan=off" for
the same effect. Maybe it's acceptable, disabling sequential scans to
force an index scan is much more common than the other way round.

But I don't think one can have a GUC called enable_indexscan and
another GUC called enable_indexonlyscan and argue that it's OK for the
former one to affect both kinds of scans. That's extremely confusing
and, well, just plain wrong.

FWIW, I disagree completely. I think it's entirely natural to
consider bitmap index scans to be a subset of index scans, so that
enable_indexscan should affect both. I admit that the current set
of GUCs doesn't let you force a bitmap scan over a plain one, but
I can't recall many people complaining about that. I don't follow
the argument that this definition is somehow unmaintainable, either.

Could we cram it into the "cost=... rows=..." part? And perhaps a marker
that a node was disabled would be more user friendly than showing the
cumulative count? Something like:

The problem is that we'd have to derive that.

The other problem is it'd break an awful lot of client code that knows
the format of those lines. (Sure, by now all such code should have
been rewritten to look at JSON or other more machine-friendly output
formats ... but since we haven't even done that in our own regression
tests, we should know better than to assume other people have done it.)

I'm not really convinced that we need to show anything about this.

regards, tom lane

#63Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#62)
Re: On disable_cost

On Tue, Jul 2, 2024 at 1:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

FWIW, I disagree completely. I think it's entirely natural to
consider bitmap index scans to be a subset of index scans, so that
enable_indexscan should affect both. I admit that the current set
of GUCs doesn't let you force a bitmap scan over a plain one, but
I can't recall many people complaining about that. I don't follow
the argument that this definition is somehow unmaintainable, either.

Well... but that's not what the GUC does either. Not now, and not with
the patch.

What happens right now is:

- If you set enable_indexscan=false, then disable_cost is added to the
cost of index scan paths and the cost of index-only scan paths.

- If you set enable_indexonlyscan=false, then index-only scan paths
are not generated at all.

Bitmap scans are controlled by enable_bitmapscan.

--
Robert Haas
EDB: http://www.enterprisedb.com

#64Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#63)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

What happens right now is:

- If you set enable_indexscan=false, then disable_cost is added to the
cost of index scan paths and the cost of index-only scan paths.

- If you set enable_indexonlyscan=false, then index-only scan paths
are not generated at all.

Hm. The first part of that seems pretty weird to me --- why don't
we simply not generate the paths at all? There is no case AFAIR
where that would prevent us from generating a valid plan.

(I do seem to recall that index-only paths are built on top of regular
index paths, so that there might be implementation issues with trying
to build the former and not the latter. But you've probably looked
at that far more recently than I.)

regards, tom lane

#65Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#64)
Re: On disable_cost

On Tue, Jul 2, 2024 at 2:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

What happens right now is:

- If you set enable_indexscan=false, then disable_cost is added to the
cost of index scan paths and the cost of index-only scan paths.

- If you set enable_indexonlyscan=false, then index-only scan paths
are not generated at all.

Hm. The first part of that seems pretty weird to me --- why don't
we simply not generate the paths at all? There is no case AFAIR
where that would prevent us from generating a valid plan.

Well, yeah.

What the patch does is: if you set either enable_indexscan=false or
enable_indexonlyscan=false, then the corresponding path type is not
generated, and the other is unaffected. To me, that seems like the
logical way to clean this up.

One could argue for other things, of course. And maybe those other
things are fine, if they're properly justified and documented.

--
Robert Haas
EDB: http://www.enterprisedb.com

#66Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#65)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

What the patch does is: if you set either enable_indexscan=false or
enable_indexonlyscan=false, then the corresponding path type is not
generated, and the other is unaffected. To me, that seems like the
logical way to clean this up.

One could argue for other things, of course. And maybe those other
things are fine, if they're properly justified and documented.

[ shrug... ] This isn't a hill that I'm prepared to die on.
But I see no good reason to change the very long-standing
behaviors of these GUCs.

regards, tom lane

#67Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#66)
Re: On disable_cost

On Tue, Jul 2, 2024 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

One could argue for other things, of course. And maybe those other
things are fine, if they're properly justified and documented.

[ shrug... ] This isn't a hill that I'm prepared to die on.
But I see no good reason to change the very long-standing
behaviors of these GUCs.

Well, I don't really know where to go from here. I mean, I think that
three committers (David, Heikki, yourself) have expressed some
concerns about changing the behavior. So maybe we shouldn't. But I
don't understand how it's reasonable to have two very similarly named
GUCs behave (1) inconsistently with each other and (2) in a way that
cannot be guessed from the documentation.

I feel like we're just clinging to legacy behavior on the theory that
somebody, somewhere might be relying on it in some way, which they
certainly might be. But that doesn't seem like a great reason, either.

--
Robert Haas
EDB: http://www.enterprisedb.com

#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#67)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

Well, I don't really know where to go from here. I mean, I think that
three committers (David, Heikki, yourself) have expressed some
concerns about changing the behavior. So maybe we shouldn't. But I
don't understand how it's reasonable to have two very similarly named
GUCs behave (1) inconsistently with each other and (2) in a way that
cannot be guessed from the documentation.

If the documentation isn't adequate, that's certainly an improvable
situation. It doesn't seem hard:

-        Enables or disables the query planner's use of index-scan plan
-        types. The default is <literal>on</literal>.
+        Enables or disables the query planner's use of index-scan plan
+        types (including index-only scans).
+        The default is <literal>on</literal>.

More to the point, if we do change the longstanding meaning of this
GUC, that will *also* require documentation work IMO.

regards, tom lane

#69Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Robert Haas (#67)
Re: On disable_cost

On 02/07/2024 22:54, Robert Haas wrote:

On Tue, Jul 2, 2024 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

One could argue for other things, of course. And maybe those other
things are fine, if they're properly justified and documented.

[ shrug... ] This isn't a hill that I'm prepared to die on.
But I see no good reason to change the very long-standing
behaviors of these GUCs.

Well, I don't really know where to go from here. I mean, I think that
three committers (David, Heikki, yourself) have expressed some
concerns about changing the behavior. So maybe we shouldn't. But I
don't understand how it's reasonable to have two very similarly named
GUCs behave (1) inconsistently with each other and (2) in a way that
cannot be guessed from the documentation.

I feel like we're just clinging to legacy behavior on the theory that
somebody, somewhere might be relying on it in some way, which they
certainly might be. But that doesn't seem like a great reason, either.

I agree the status quo is weird too. I'd be OK to break
backwards-compatibility if we can make it better.

Tom mentioned enable_bitmapscan, and it reminded me that the current
behavior with that is actually a bit annoying. I go through this pattern
very often when I'm investigating query plans:

1. Hmm, let's see what this query plan looks like:

postgres=# explain analyze select * from foo where i=10;
                                                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using foo_i_idx on foo  (cost=0.29..8.31 rows=1 width=36) (actual time=0.079..0.090 rows=2 loops=1)
   Index Cond: (i = 10)
 Planning Time: 2.220 ms
 Execution Time: 0.337 ms
(4 rows)

2. Ok, and how long would it take with a seq scan? Let's see:

postgres=# set enable_indexscan=off;
SET
postgres=# explain analyze select * from foo where i=10;
                                                     QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on foo  (cost=4.30..8.31 rows=1 width=36) (actual time=0.102..0.113 rows=2 loops=1)
   Recheck Cond: (i = 10)
   Heap Blocks: exact=2
   ->  Bitmap Index Scan on foo_i_idx  (cost=0.00..4.30 rows=1 width=0) (actual time=0.067..0.068 rows=2 loops=1)
         Index Cond: (i = 10)
 Planning Time: 0.211 ms
 Execution Time: 0.215 ms
(7 rows)

3. Oh right, bitmap scan, I forgot about that one. Let's disable that too:

postgres=# set enable_bitmapscan=off;
SET
postgres=# explain analyze select * from foo where i=10;
                                             QUERY PLAN
--------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1862.00 rows=1 width=36) (actual time=0.042..39.226 rows=2 loops=1)
   Filter: (i = 10)
   Rows Removed by Filter: 109998
 Planning Time: 0.118 ms
 Execution Time: 39.272 ms
(5 rows)

I would be somewhat annoyed if we add another step to that, to also
disable index-only scans separately. It would be nice if
enable_indexscan=off would also disable bitmap scans, that would
eliminate one step from the above. Almost always when I want to disable
index scans, I really want to disable the use of the index altogether.
The problem then of course is, how do you force a bitmap scan without
allowing other index scans, when you want to test them both?

It almost feels like we should have yet another GUC to disable index
scans, index-only scans and bitmap index scans. "enable_indexes=off" or
something.

--
Heikki Linnakangas
Neon (https://neon.tech)

#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#69)
Re: On disable_cost

Heikki Linnakangas <hlinnaka@iki.fi> writes:

3. Oh right, bitmap scan, I forgot about that one. Let's disable that too:

Yeah, I've hit that too, although more often (for me) it's the first
choice of plan. In any case, it usually takes more than one change
to get to a seqscan.

It almost feels like we should have yet another GUC to disable index
scans, index-only scans and bitmap index scans. "enable_indexes=off" or
something.

There's something to be said for that idea. Breaking compatibility is
a little easier to stomach if there's a clear convenience win, and
this'd offer that.

regards, tom lane

#71Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#69)
Re: On disable_cost

On Tue, Jul 2, 2024 at 5:39 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I would be somewhat annoyed if we add another step to that, to also
disable index-only scans separately. It would be nice if
enable_indexscan=off would also disable bitmap scans, that would
eliminate one step from the above. Almost always when I want to disable
index scans, I really want to disable the use of the index altogether.
The problem then of course is, how do you force a bitmap scan without
allowing other index scans, when you want to test them both?

It almost feels like we should have yet another GUC to disable index
scans, index-only scans and bitmap index scans. "enable_indexes=off" or
something.

This is an interesting idea, and it seems like it could be convenient.
However, the fact that it's so non-orthogonal is definitely not great.
One problem I've had with going through regression tests that rely on
the enable_* GUCs is that it's often not quite clear what values all
of those GUCs have at a certain point in the test file, because the
statements that set them may be quite a bit higher up in the file and
some changes may also have been rolled back. I've found recently that
the addition of EXPLAIN (SETTINGS) helps with this quite a bit,
because you can adjust the .sql file to use that option and then see
what shows up in the output file. Still, it's annoying, and the same
issue could occur in any other situation where you're using these
GUCs. It's just more confusing when there are multiple ways of turning
something off.

Would we consider merging enable_indexscan, enable_indexonlyscan, and
enable_bitmapscan into something like:

enable_indexes = on | off | { plain | indexonly | bitmap } [, ...]

I feel like that would solve the usability concern that you raise here
while also (1) preserving orthogonality and (2) reducing the number of
GUCs rather than first increasing it. When I first joined the project
there were a decent number of enable_* GUCs, but there's way more now.
Some of them are a little random (which is a conversation for another
day) but just cutting down on the number seems like it might not be
such a bad idea.

--
Robert Haas
EDB: http://www.enterprisedb.com

#72David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#70)
Re: On disable_cost

On Wed, 3 Jul 2024 at 09:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

3. Oh right, bitmap scan, I forgot about that one. Let's disable that too:

Yeah, I've hit that too, although more often (for me) it's the first
choice of plan. In any case, it usually takes more than one change
to get to a seqscan.

I commonly hit this too.

I think the current behaviour is born out of the fact that we don't
produce both an Index Scan and an Index Only Scan for the same index.
We'll just make the IndexPath an index only scan, if possible based
on:

    index_only_scan = (scantype != ST_BITMAPSCAN &&
                       check_index_only(rel, index));

The same isn't true for Bitmap Index Scans. We'll create both
IndexPaths and BitmapHeapPaths and let them battle it out in
add_path().

I suspect this is why it's been coded that enable_indexscan also
disables Index Only Scans. Now, of course, it could work another way,
but I also still think that doing so is changing well-established
behaviour that I don't recall anyone ever complaining about besides
Robert. Robert's complaint seems to have originated from something he
noticed while hacking on code rather than actually using the database
for something. I think the argument for changing it should have less
weight due to that.

I understand that we do have inconsistencies around this stuff. For
example, enable_sort has no influence on Incremental Sorts like
enable_indexscan has over Index Only Scan. That might come from the
fact that we used to, up until a couple of releases ago, produce both
sort path types and let them compete in add_path(). That's no longer
the case, we now just do incremental sort when we can, just like we do
Index Only Scans when we can. Despite those inconsistencies, I
wouldn't vote for changing either of them to align with the other. It
just feels like too long-established behaviour to be messing with.

I feel it might be best to move this patch to the back of the series
or just drop it for now as it seems to be holding up the other stuff
from moving forward, and that stuff looks useful and worth changing.

David

#73Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#72)
2 attachment(s)
Re: On disable_cost

OK, here's a new patch version. I earlier committed the refactoring to
avoid using disable_cost to force WHERE CURRENT OF to be implemented
by a TID scan. In this version, I've dropped everything related to
reworking enable_indexscan or any other enable_* GUC. Hence, this
version of the patch set just focuses on adding the count of disabled
nodes and removing the use of disable_cost. In addition to dropping
the controversial patches, I've also found and squashed a few bugs in
this version.

Behavior: With the patch, whenever an enable_* GUC would cause
disable_cost to be added, disabled_nodes is incremented instead. There
is one remaining use of disable_cost which is not triggered by an
enable_* GUC but by the desire to avoid plans that we think will
overflow work_mem. I welcome thoughts on what to do about that case;
for now, I do nothing. As before, 0001 adds the disabled_nodes field
to paths and 0002 adds it to plans. I think we could plausibly commit
only 0001, both patches separately, or both patches squashed.
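
To make the ordering rule concrete: disabled_nodes acts as a higher-order
sort key than the cost fields, so a path with fewer disabled nodes always
wins no matter how the costs compare. A minimal sketch of that comparison
(not the actual patch code; it assumes only the Path fields added by 0001,
and the helper name is made up):

static int
compare_disabled_then_cost(const Path *p1, const Path *p2)
{
    /* Fewer disabled nodes always wins, regardless of cost. */
    if (p1->disabled_nodes != p2->disabled_nodes)
        return (p1->disabled_nodes < p2->disabled_nodes) ? -1 : +1;

    /* Only when the counts tie do we fall back to comparing cost. */
    if (p1->total_cost != p2->total_cost)
        return (p1->total_cost < p2->total_cost) ? -1 : +1;

    return 0;
}

The real decisions still happen in add_path() and
compare_path_costs_fuzzily(), which also weigh startup cost, pathkeys, and
parameterization; the sketch only shows where the new field slots into the
ordering.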

Notes:

- I favor committing both patches. Tom stated that he didn't think
that we needed to show anything related to disabled nodes, and that
could be true. However, today, you can tell which nodes are disabled
as long as you print out the costs; if we don't propagate disabled
nodes into the plan and print them out, that will no longer be
possible. I found working on the patches that it was really hard to
debug the patch set without this, so my guess is that we'll find not
having it pretty annoying, but we can also just commit 0001 for
starters and see how long it takes for the lack of 0002 to become
annoying. If the answer is "infinite time," that's cool; if it isn't,
we can reconsider committing 0002.

- If we do commit 0002, I think it's a good idea to have the number of
disabled nodes displayed even with COSTS OFF, because it's stable, and
it's pretty useful to be able to see this in the regression output. I
have found while working on this that I often need to adjust the .sql
files to say EXPLAIN (COSTS ON) instead of EXPLAIN (COSTS OFF) in
order to understand what's happening. Right now, there's no real
alternative because costs aren't stable, but disabled-node counts
should be stable, so I feel this would be a step forward. Apart from
that, I also think it's good for features to have regression test
coverage, and since we use COSTS OFF everywhere or at least nearly
everywhere in the regression test, if we don't print out the disabled
node counts when COSTS OFF is used, then we don't cover that case in
our tests. Bummer.

Regression test changes in 0001:

- btree_index.sql executes a query "select proname from pg_proc where
proname ilike 'ri%foo' order by 1" with everything but bitmap scans
disabled. Currently, that produces an index-only scan; with the patch,
it produces a sort over a sequential scan. That's a little odd,
because the test seems to be aimed at demonstrating that we can use a
bitmap scan, and it doesn't, because we apparently can't. But, why
does the patch change the plan?
At least on my machine, the index-only scan is significantly more
costly than the sequential scan. I think what's happening here is that
when you add disable_cost to the cost of both paths, they compare
fuzzily the same; without that, the cheaper one wins.
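(Concretely, assuming the usual 1% fuzz factor used by
compare_path_costs_fuzzily: costs like 400 and 900 are clearly
distinguishable, but 1.0e10 + 400 and 1.0e10 + 900 differ by far less than
1%, so once disable_cost has been added to both paths they look fuzzily
identical and the choice falls to secondary criteria.)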

- select_parallel.out executes a query with sequential scans disabled
but tenk2 must nevertheless be sequential-scanned. With the patch,
that changes to a parallel sequential scan. I think the explanation
here is the same as in the preceding case.

- horizons.spec currently sets enable_seqscan=false,
enable_indexscan=false, and enable_bitmapscan=false. I suspect that
Andres thought that this would force the use of an index-only scan,
since nothing sets enable_indexonlyscan=false. But as discussed
upthread, that is not true. Instead everything is disabled. For the
same reasons as in the previous two examples, this caused an
assortment of plan changes which in turn caused the test to fail to
test what it was intended to test. So I removed enable_indexscan=false
from the spec file, and now it gets index-only scans everywhere again,
as desired.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v4-0002-Show-number-of-disabled-nodes-in-EXPLAIN-ANALYZE-.patch (application/octet-stream)
From a3e049962767c15de1e4dd2756d8a303145e6c6b Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 31 Jul 2024 11:35:53 -0400
Subject: [PATCH v4 2/2] Show number of disabled nodes in EXPLAIN ANALYZE
 output.

Now that disable_cost is not included in the cost estimate, there's
no visible sign in EXPLAIN output of which plan nodes are disabled.
Fix that by propagating the number of disabled nodes from Path to
Plan, and then showing it in the EXPLAIN output.
---
 src/backend/commands/explain.c                |  4 ++++
 src/backend/optimizer/plan/createplan.c       |  8 +++++--
 src/include/nodes/plannodes.h                 |  1 +
 src/test/regress/expected/aggregates.out      | 21 ++++++++++++++++---
 src/test/regress/expected/btree_index.out     |  4 +++-
 .../regress/expected/collate.icu.utf8.out     |  6 ++++--
 .../regress/expected/incremental_sort.out     |  5 ++++-
 src/test/regress/expected/inherit.out         |  4 +++-
 src/test/regress/expected/join.out            |  4 +++-
 src/test/regress/expected/memoize.out         |  8 +++++--
 src/test/regress/expected/select_parallel.out |  6 +++++-
 src/test/regress/expected/union.out           |  3 ++-
 12 files changed, 59 insertions(+), 15 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5771aabf40..11df4a04d4 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1894,6 +1894,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
+	if (plan->disabled_nodes != 0)
+		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
+							   es);
+
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
 	{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index b19b46159c..8428e93d2d 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -2572,6 +2572,7 @@ create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
 								   0, NULL, NULL, NULL);
 
 		/* Must apply correct cost/width data to Limit node */
+		plan->disabled_nodes = mminfo->path->disabled_nodes;
 		plan->startup_cost = mminfo->path->startup_cost;
 		plan->total_cost = mminfo->pathcost;
 		plan->plan_rows = 1;
@@ -5404,6 +5405,7 @@ order_qual_clauses(PlannerInfo *root, List *clauses)
 static void
 copy_generic_path_info(Plan *dest, Path *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->rows;
@@ -5419,6 +5421,7 @@ copy_generic_path_info(Plan *dest, Path *src)
 static void
 copy_plan_costsize(Plan *dest, Plan *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->plan_rows;
@@ -5452,7 +5455,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
 			  lefttree->total_cost,
-			  0,				/* a Plan contains no count of disabled nodes */
+			  plan->plan.disabled_nodes,
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6547,11 +6550,12 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
-				  0,			/* a Plan contains no count of disabled nodes */
+				  subplan->disabled_nodes,
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
 				  subplan->plan_width);
+	matplan->disabled_nodes = subplan->disabled_nodes;
 	matplan->startup_cost = matpath.startup_cost + initplan_cost;
 	matplan->total_cost = matpath.total_cost + initplan_cost;
 	matplan->plan_rows = subplan->plan_rows;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1aeeaec95e..62cd6a6666 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -125,6 +125,7 @@ typedef struct Plan
 	/*
 	 * estimated execution costs for plan (see costsize.c for more info)
 	 */
+	int			disabled_nodes; /* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index a5596ab210..8ac13b562c 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2920,18 +2920,23 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
+   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
+         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(12 rows)
+                           Disabled Nodes: 1
+(17 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2953,19 +2958,24 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -2975,19 +2985,24 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 092233cc9d..b350efe128 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -335,10 +335,12 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                   QUERY PLAN                  
 ----------------------------------------------
  Sort
+   Disabled Nodes: 1
    Sort Key: proname
    ->  Seq Scan on pg_proc
+         Disabled Nodes: 1
          Filter: (proname ~~* 'ri%foo'::text)
-(4 rows)
+(6 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7d59fb4431..31345295c1 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -989,8 +989,9 @@ select * from collate_test1 where b ilike 'abc';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'abc'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'abc';
  a |  b  
@@ -1004,8 +1005,9 @@ select * from collate_test1 where b ilike 'ABC';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'ABC'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'ABC';
  a |  b  
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..79f0d37a87 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -701,16 +701,19 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
+   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
+         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
+               Disabled Nodes: 1
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(10 rows)
+(13 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index ad73213414..dbb748a2d2 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,6 +1614,7 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
+   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1623,10 +1624,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
+               Disabled Nodes: 1
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(13 rows)
+(15 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 53f70d72ed..31fb7d142e 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8000,13 +8000,15 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
+   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
+         Disabled Nodes: 1
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(7 rows)
+(9 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 96906104d7..df2ca5ba4e 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -333,14 +333,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(8 rows)
+(10 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -348,14 +350,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(8 rows)
+(10 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 9bad3fc464..c2e9458c35 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -537,10 +537,14 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
+   Disabled Nodes: 1
    ->  Nested Loop
+         Disabled Nodes: 1
          ->  Gather
+               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
+                     Disabled Nodes: 1
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -548,7 +552,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(12 rows)
+(16 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0fd0e1c38b..0456d48c93 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,11 +822,12 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
+   Disabled Nodes: 1
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
          ->  Result
-(5 rows)
+(6 rows)
 
 reset enable_hashagg;
 --
-- 
2.39.3 (Apple Git-145)

v4-0001-Treat-number-of-disabled-nodes-in-a-path-as-a-sep.patch (application/octet-stream)
From ac121b972c6c3cf01793ea975f08fe8894259fd9 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 31 Jul 2024 11:17:25 -0400
Subject: [PATCH v4 1/2] Treat number of disabled nodes in a path as a separate
 cost metric.

Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishable in cost seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup and total cost. Hence, the
path list is now sorted by disabled_nodes and then by total_cost,
instead of just by the latter, and likewise for the partial path list.
It is important that this number is a count and not simply a Boolean;
else, as soon as we're unable to respect disabled path types in all
portions of the path, we stop trying to avoid them where we can.

Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.

Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since we do not postpone
path selection across subquery boundaries (see make_subplan).
---
 contrib/file_fdw/file_fdw.c                   |   1 +
 contrib/postgres_fdw/postgres_fdw.c           |  44 +++-
 contrib/postgres_fdw/postgres_fdw.h           |   1 +
 src/backend/optimizer/path/costsize.c         | 161 ++++++++++----
 src/backend/optimizer/path/joinpath.c         |  15 +-
 src/backend/optimizer/plan/createplan.c       |   3 +
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/prep/prepunion.c        |   6 +-
 src/backend/optimizer/util/pathnode.c         | 208 +++++++++++++-----
 src/include/nodes/pathnodes.h                 |   2 +
 src/include/optimizer/cost.h                  |  10 +-
 src/include/optimizer/pathnode.h              |  12 +-
 src/test/isolation/specs/horizons.spec        |   1 -
 src/test/regress/expected/btree_index.out     |  12 +-
 src/test/regress/expected/select_parallel.out |   8 +-
 15 files changed, 361 insertions(+), 124 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 249d82d3a0..d16821f8e1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -576,6 +576,7 @@ fileGetForeignPaths(PlannerInfo *root,
 			 create_foreignscan_path(root, baserel,
 									 NULL,	/* default pathtarget */
 									 baserel->rows,
+									 0,
 									 startup_cost,
 									 total_cost,
 									 NIL,	/* no pathkeys */
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index fc65d81e21..cc0fb7e122 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -430,6 +430,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
 									List *pathkeys,
 									PgFdwPathExtraData *fpextra,
 									double *p_rows, int *p_width,
+									int *p_disabled_nodes,
 									Cost *p_startup_cost, Cost *p_total_cost);
 static void get_remote_estimate(const char *sql,
 								PGconn *conn,
@@ -442,6 +443,7 @@ static void adjust_foreign_grouping_path_cost(PlannerInfo *root,
 											  double retrieved_rows,
 											  double width,
 											  double limit_tuples,
+											  int *disabled_nodes,
 											  Cost *p_startup_cost,
 											  Cost *p_run_cost);
 static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
@@ -735,6 +737,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		 */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 
 		/* Report estimated baserel size to planner. */
@@ -765,6 +768,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		/* Fill in basically-bogus cost estimates for use later. */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 	}
 
@@ -1030,6 +1034,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 	path = create_foreignscan_path(root, baserel,
 								   NULL,	/* default pathtarget */
 								   fpinfo->rows,
+								   fpinfo->disabled_nodes,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
@@ -1184,13 +1189,14 @@ postgresGetForeignPaths(PlannerInfo *root,
 		ParamPathInfo *param_info = (ParamPathInfo *) lfirst(lc);
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 
 		/* Get a cost estimate from the remote */
 		estimate_path_cost_size(root, baserel,
 								param_info->ppi_clauses, NIL, NULL,
-								&rows, &width,
+								&rows, &width, &disabled_nodes,
 								&startup_cost, &total_cost);
 
 		/*
@@ -1203,6 +1209,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 		path = create_foreignscan_path(root, baserel,
 									   NULL,	/* default pathtarget */
 									   rows,
+									   disabled_nodes,
 									   startup_cost,
 									   total_cost,
 									   NIL, /* no pathkeys */
@@ -3088,12 +3095,14 @@ estimate_path_cost_size(PlannerInfo *root,
 						List *pathkeys,
 						PgFdwPathExtraData *fpextra,
 						double *p_rows, int *p_width,
+						int *p_disabled_nodes,
 						Cost *p_startup_cost, Cost *p_total_cost)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
 	double		rows;
 	double		retrieved_rows;
 	int			width;
+	int			disabled_nodes = 0;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -3483,6 +3492,7 @@ estimate_path_cost_size(PlannerInfo *root,
 				adjust_foreign_grouping_path_cost(root, pathkeys,
 												  retrieved_rows, width,
 												  fpextra->limit_tuples,
+												  &disabled_nodes,
 												  &startup_cost, &run_cost);
 			}
 			else
@@ -3577,6 +3587,7 @@ estimate_path_cost_size(PlannerInfo *root,
 	/* Return results. */
 	*p_rows = rows;
 	*p_width = width;
+	*p_disabled_nodes = disabled_nodes;
 	*p_startup_cost = startup_cost;
 	*p_total_cost = total_cost;
 }
@@ -3637,6 +3648,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 								  double retrieved_rows,
 								  double width,
 								  double limit_tuples,
+								  int *p_disabled_nodes,
 								  Cost *p_startup_cost,
 								  Cost *p_run_cost)
 {
@@ -3656,6 +3668,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 		cost_sort(&sort_path,
 				  root,
 				  pathkeys,
+				  0,
 				  *p_startup_cost + *p_run_cost,
 				  retrieved_rows,
 				  width,
@@ -6147,13 +6160,15 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 	{
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 		List	   *useful_pathkeys = lfirst(lc);
 		Path	   *sorted_epq_path;
 
 		estimate_path_cost_size(root, rel, NIL, useful_pathkeys, NULL,
-								&rows, &width, &startup_cost, &total_cost);
+								&rows, &width, &disabled_nodes,
+								&startup_cost, &total_cost);
 
 		/*
 		 * The EPQ path must be at least as well sorted as the path itself, in
@@ -6175,6 +6190,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreignscan_path(root, rel,
 											 NULL,
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 useful_pathkeys,
@@ -6188,6 +6204,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreign_join_path(root, rel,
 											  NULL,
 											  rows,
+											  disabled_nodes,
 											  startup_cost,
 											  total_cost,
 											  useful_pathkeys,
@@ -6335,6 +6352,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 	ForeignPath *joinpath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	Path	   *epq_path;		/* Path to create plan to be executed when
@@ -6424,12 +6442,14 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 
 	/* Estimate costs for bare join relation */
 	estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	/* Now update this information in the joinrel */
 	joinrel->rows = rows;
 	joinrel->reltarget->width = width;
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6441,6 +6461,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 										joinrel,
 										NULL,	/* default pathtarget */
 										rows,
+										disabled_nodes,
 										startup_cost,
 										total_cost,
 										NIL,	/* no pathkeys */
@@ -6768,6 +6789,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	ForeignPath *grouppath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -6818,11 +6840,13 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the cost of push down */
 	estimate_path_cost_size(root, grouped_rel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/* Now update this information in the fpinfo */
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6831,6 +6855,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										  grouped_rel,
 										  grouped_rel->reltarget,
 										  rows,
+										  disabled_nodes,
 										  startup_cost,
 										  total_cost,
 										  NIL,	/* no pathkeys */
@@ -6859,6 +6884,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	PgFdwPathExtraData *fpextra;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -6952,7 +6978,8 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the costs of performing the final sort remotely */
 	estimate_path_cost_size(root, input_rel, NIL, root->sort_pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/*
 	 * Build the fdw_private list that will be used by postgresGetForeignPlan.
@@ -6965,6 +6992,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 											 input_rel,
 											 root->upper_targets[UPPERREL_ORDERED],
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 root->sort_pathkeys,
@@ -6998,6 +7026,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	bool		save_use_remote_estimate = false;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -7082,6 +7111,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 													   path->parent,
 													   path->pathtarget,
 													   path->rows,
+													   path->disabled_nodes,
 													   path->startup_cost,
 													   path->total_cost,
 													   path->pathkeys,
@@ -7199,7 +7229,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 		ifpinfo->use_remote_estimate = false;
 	}
 	estimate_path_cost_size(root, input_rel, NIL, pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	if (!fpextra->has_final_sort)
 		ifpinfo->use_remote_estimate = save_use_remote_estimate;
 
@@ -7218,6 +7249,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										   input_rel,
 										   root->upper_targets[UPPERREL_FINAL],
 										   rows,
+										   disabled_nodes,
 										   startup_cost,
 										   total_cost,
 										   pathkeys,
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 37c1575af6..9e501660d1 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -62,6 +62,7 @@ typedef struct PgFdwRelationInfo
 	/* Estimated size and cost for a scan, join, or grouping/aggregation. */
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 79991b1980..e79623d687 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -50,6 +50,17 @@
  * so beware of division-by-zero.)	The LIMIT is applied as a top-level
  * plan node.
  *
+ * Each path stores the total number of disabled nodes that exist at or
+ * below that point in the plan tree. This is regarded as a component of
+ * the cost, and paths with fewer disabled nodes should be regarded as
+ * cheaper than those with more. Disabled nodes occur when the user sets
+ * a GUC like enable_seqscan=false. We can't necessarily respect such a
+ * setting in every part of the plan tree, but we want to respect it in as many
+ * parts of the plan tree as possible. Simpler schemes like storing a Boolean
+ * here rather than a count fail to do that. We used to disable nodes by
+ * adding a large constant to the startup cost, but that distorted planning
+ * in other ways.
+ *
  * For largely historical reasons, most of the routines in this module use
  * the passed result Path only to store their results (rows, startup_cost and
  * total_cost) into.  All the input data they need is passed as separate
@@ -301,9 +312,6 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_seqscan)
-		startup_cost += disable_cost;
-
 	/* fetch estimated page cost for tablespace containing table */
 	get_tablespace_page_costs(baserel->reltablespace,
 							  NULL,
@@ -346,6 +354,7 @@ cost_seqscan(Path *path, PlannerInfo *root,
 		path->rows = clamp_row_est(path->rows / parallel_divisor);
 	}
 
+	path->disabled_nodes = enable_seqscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + cpu_run_cost + disk_run_cost;
 }
@@ -418,6 +427,7 @@ cost_samplescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -456,6 +466,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows;
 
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = startup_cost;
 	path->path.total_cost = (startup_cost + run_cost);
 }
@@ -473,6 +484,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 void
 cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 				  RelOptInfo *rel, ParamPathInfo *param_info,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double *rows)
 {
@@ -490,9 +502,6 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	else
 		path->path.rows = rel->rows;
 
-	if (!enable_gathermerge)
-		startup_cost += disable_cost;
-
 	/*
 	 * Add one to the number of workers to account for the leader.  This might
 	 * be overgenerous since the leader will do less work than other workers
@@ -523,6 +532,8 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows * 1.05;
 
+	path->path.disabled_nodes = input_disabled_nodes
+		+ (enable_gathermerge ? 0 : 1);
 	path->path.startup_cost = startup_cost + input_startup_cost;
 	path->path.total_cost = (startup_cost + run_cost + input_total_cost);
 }
@@ -603,9 +614,8 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 											  path->indexclauses);
 	}
 
-	if (!enable_indexscan)
-		startup_cost += disable_cost;
 	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
+	path->path.disabled_nodes = enable_indexscan ? 0 : 1;
 
 	/*
 	 * Call index-access-method-specific code to estimate the processing cost
@@ -1038,9 +1048,6 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
-
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
 										 &tuples_fetched);
@@ -1102,6 +1109,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = enable_bitmapscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1187,6 +1195,7 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
 	}
 	path->bitmapselectivity = selec;
 	path->path.rows = 0;		/* per above, not used */
+	path->path.disabled_nodes = 0;
 	path->path.startup_cost = totalCost;
 	path->path.total_cost = totalCost;
 }
@@ -1261,6 +1270,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	/* Should only be applied to base relations */
 	Assert(baserel->relid > 0);
 	Assert(baserel->rtekind == RTE_RELATION);
+	Assert(tidquals != NIL);
 
 	/* Mark the path with the correct row estimate */
 	if (param_info)
@@ -1275,6 +1285,14 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
 		Expr	   *qual = rinfo->clause;
 
+		/*
+		 * We must use a TID scan for CurrentOfExpr; in any other case, we
+		 * should be generating a TID scan only if enable_tidscan=true. Also,
+		 * if CurrentOfExpr is the qual, there should be only one.
+		 */
+		Assert(enable_tidscan || IsA(qual, CurrentOfExpr));
+		Assert(list_length(tidquals) == 1 || !IsA(qual, CurrentOfExpr));
+
 		if (IsA(qual, ScalarArrayOpExpr))
 		{
 			/* Each element of the array yields 1 tuple */
@@ -1322,6 +1340,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/*
+	 * There are assertions above verifying that we only reach this function
+	 * either when enable_tidscan=true or when the TID scan is the only legal
+	 * path, so it's safe to set disabled_nodes to zero here.
+	 */
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1414,6 +1438,9 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/* we should not generate this path type when enable_tidscan=false */
+	Assert(enable_tidscan);
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1466,6 +1493,7 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
 	 * SubqueryScan node, plus cpu_tuple_cost to account for selection and
 	 * projection overhead.
 	 */
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = path->subpath->startup_cost;
 	path->path.total_cost = path->subpath->total_cost;
 
@@ -1556,6 +1584,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1612,6 +1641,7 @@ cost_tablefuncscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1659,6 +1689,7 @@ cost_valuesscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1706,6 +1737,7 @@ cost_ctescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1743,6 +1775,7 @@ cost_namedtuplestorescan(Path *path, PlannerInfo *root,
 	cpu_per_tuple += cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1777,6 +1810,7 @@ cost_resultscan(Path *path, PlannerInfo *root,
 	cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1816,6 +1850,7 @@ cost_recursive_union(Path *runion, Path *nrterm, Path *rterm)
 	 */
 	total_cost += cpu_tuple_cost * total_rows;
 
+	runion->disabled_nodes = nrterm->disabled_nodes + rterm->disabled_nodes;
 	runion->startup_cost = startup_cost;
 	runion->total_cost = total_cost;
 	runion->rows = total_rows;
@@ -1964,6 +1999,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
 void
 cost_incremental_sort(Path *path,
 					  PlannerInfo *root, List *pathkeys, int presorted_keys,
+					  int input_disabled_nodes,
 					  Cost input_startup_cost, Cost input_total_cost,
 					  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 					  double limit_tuples)
@@ -2083,6 +2119,11 @@ cost_incremental_sort(Path *path,
 	run_cost += 2.0 * cpu_tuple_cost * input_groups;
 
 	path->rows = input_tuples;
+
+	/* should not generate these paths when enable_incremental_sort=false */
+	Assert(enable_incremental_sort);
+	path->disabled_nodes = input_disabled_nodes;
+
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2101,7 +2142,8 @@ cost_incremental_sort(Path *path,
  */
 void
 cost_sort(Path *path, PlannerInfo *root,
-		  List *pathkeys, Cost input_cost, double tuples, int width,
+		  List *pathkeys, int input_disabled_nodes,
+		  Cost input_cost, double tuples, int width,
 		  Cost comparison_cost, int sort_mem,
 		  double limit_tuples)
 {
@@ -2114,12 +2156,10 @@ cost_sort(Path *path, PlannerInfo *root,
 				   comparison_cost, sort_mem,
 				   limit_tuples);
 
-	if (!enable_sort)
-		startup_cost += disable_cost;
-
 	startup_cost += input_cost;
 
 	path->rows = tuples;
+	path->disabled_nodes = input_disabled_nodes + (enable_sort ? 0 : 1);
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2211,6 +2251,7 @@ cost_append(AppendPath *apath)
 {
 	ListCell   *l;
 
+	apath->path.disabled_nodes = 0;
 	apath->path.startup_cost = 0;
 	apath->path.total_cost = 0;
 	apath->path.rows = 0;
@@ -2232,12 +2273,16 @@ cost_append(AppendPath *apath)
 			 */
 			apath->path.startup_cost = firstsubpath->startup_cost;
 
-			/* Compute rows and costs as sums of subplan rows and costs. */
+			/*
+			 * Compute rows, number of disabled nodes, and total cost as sums
+			 * of underlying subplan values.
+			 */
 			foreach(l, apath->subpaths)
 			{
 				Path	   *subpath = (Path *) lfirst(l);
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.total_cost += subpath->total_cost;
 			}
 		}
@@ -2277,6 +2322,7 @@ cost_append(AppendPath *apath)
 					cost_sort(&sort_path,
 							  NULL, /* doesn't currently need root */
 							  pathkeys,
+							  subpath->disabled_nodes,
 							  subpath->total_cost,
 							  subpath->rows,
 							  subpath->pathtarget->width,
@@ -2287,6 +2333,7 @@ cost_append(AppendPath *apath)
 				}
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.startup_cost += subpath->startup_cost;
 				apath->path.total_cost += subpath->total_cost;
 			}
@@ -2335,6 +2382,7 @@ cost_append(AppendPath *apath)
 				apath->path.total_cost += subpath->total_cost;
 			}
 
+			apath->path.disabled_nodes += subpath->disabled_nodes;
 			apath->path.rows = clamp_row_est(apath->path.rows);
 
 			i++;
@@ -2375,6 +2423,7 @@ cost_append(AppendPath *apath)
  *
  * 'pathkeys' is a list of sort keys
  * 'n_streams' is the number of input streams
+ * 'input_disabled_nodes' is the sum of the input streams' disabled node counts
  * 'input_startup_cost' is the sum of the input streams' startup costs
  * 'input_total_cost' is the sum of the input streams' total costs
  * 'tuples' is the number of tuples in all the streams
@@ -2382,6 +2431,7 @@ cost_append(AppendPath *apath)
 void
 cost_merge_append(Path *path, PlannerInfo *root,
 				  List *pathkeys, int n_streams,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double tuples)
 {
@@ -2412,6 +2462,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
 	 */
 	run_cost += cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER * tuples;
 
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost + input_startup_cost;
 	path->total_cost = startup_cost + run_cost + input_total_cost;
 }
@@ -2430,6 +2481,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
  */
 void
 cost_material(Path *path,
+			  int input_disabled_nodes,
 			  Cost input_startup_cost, Cost input_total_cost,
 			  double tuples, int width)
 {
@@ -2467,6 +2519,7 @@ cost_material(Path *path,
 		run_cost += seq_page_cost * npages;
 	}
 
+	path->disabled_nodes = input_disabled_nodes + (enable_material ? 0 : 1);
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2630,6 +2683,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 		 int numGroupCols, double numGroups,
 		 List *quals,
+		 int disabled_nodes,
 		 Cost input_startup_cost, Cost input_total_cost,
 		 double input_tuples, double input_width)
 {
@@ -2685,10 +2739,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		startup_cost = input_startup_cost;
 		total_cost = input_total_cost;
 		if (aggstrategy == AGG_MIXED && !enable_hashagg)
-		{
-			startup_cost += disable_cost;
-			total_cost += disable_cost;
-		}
+			++disabled_nodes;
 		/* calcs phrased this way to match HASHED case, see note above */
 		total_cost += aggcosts->transCost.startup;
 		total_cost += aggcosts->transCost.per_tuple * input_tuples;
@@ -2703,7 +2754,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		/* must be AGG_HASHED */
 		startup_cost = input_total_cost;
 		if (!enable_hashagg)
-			startup_cost += disable_cost;
+			++disabled_nodes;
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		/* cost of computing hash value */
@@ -2812,6 +2863,7 @@ cost_agg(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3046,6 +3098,7 @@ get_windowclause_startup_tuples(PlannerInfo *root, WindowClause *wc,
 void
 cost_windowagg(Path *path, PlannerInfo *root,
 			   List *windowFuncs, WindowClause *winclause,
+			   int input_disabled_nodes,
 			   Cost input_startup_cost, Cost input_total_cost,
 			   double input_tuples)
 {
@@ -3111,6 +3164,7 @@ cost_windowagg(Path *path, PlannerInfo *root,
 	total_cost += cpu_tuple_cost * input_tuples;
 
 	path->rows = input_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 
@@ -3142,6 +3196,7 @@ void
 cost_group(Path *path, PlannerInfo *root,
 		   int numGroupCols, double numGroups,
 		   List *quals,
+		   int input_disabled_nodes,
 		   Cost input_startup_cost, Cost input_total_cost,
 		   double input_tuples)
 {
@@ -3180,6 +3235,7 @@ cost_group(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3214,6 +3270,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  Path *outer_path, Path *inner_path,
 					  JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3222,6 +3279,11 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Cost		inner_run_cost;
 	Cost		inner_rescan_run_cost;
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_nestloop ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* estimate costs to rescan the inner relation */
 	cost_rescan(root, inner_path,
 				&inner_rescan_start_cost,
@@ -3269,6 +3331,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_nestloop */
@@ -3298,6 +3361,9 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	QualCost	restrict_qual_cost;
 	double		ntuples;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (outer_path_rows <= 0)
 		outer_path_rows = 1;
@@ -3318,13 +3384,10 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_nestloop)
-		startup_cost += disable_cost;
+	/* Count up disabled nodes. */
+	path->jpath.path.disabled_nodes = enable_nestloop ? 0 : 1;
+	path->jpath.path.disabled_nodes += inner_path->disabled_nodes;
+	path->jpath.path.disabled_nodes += outer_path->disabled_nodes;
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3497,6 +3560,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					   List *outersortkeys, List *innersortkeys,
 					   JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3617,6 +3681,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Assert(outerstartsel <= outerendsel);
 	Assert(innerstartsel <= innerendsel);
 
+	disabled_nodes = enable_mergejoin ? 0 : 1;
+
 	/* cost of source data */
 
 	if (outersortkeys)			/* do we need to sort outer? */
@@ -3624,12 +3690,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  outersortkeys,
+				  outer_path->disabled_nodes,
 				  outer_path->total_cost,
 				  outer_path_rows,
 				  outer_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* outerstartsel;
@@ -3638,6 +3706,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += outer_path->disabled_nodes;
 		startup_cost += outer_path->startup_cost;
 		startup_cost += (outer_path->total_cost - outer_path->startup_cost)
 			* outerstartsel;
@@ -3650,12 +3719,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  innersortkeys,
+				  inner_path->disabled_nodes,
 				  inner_path->total_cost,
 				  inner_path_rows,
 				  inner_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* innerstartsel;
@@ -3664,6 +3735,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += inner_path->disabled_nodes;
 		startup_cost += inner_path->startup_cost;
 		startup_cost += (inner_path->total_cost - inner_path->startup_cost)
 			* innerstartsel;
@@ -3682,6 +3754,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost + inner_run_cost;
 	/* Save private data for final_cost_mergejoin */
@@ -3746,6 +3819,9 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 				rescannedtuples;
 	double		rescanratio;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
@@ -3765,14 +3841,6 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_mergejoin)
-		startup_cost += disable_cost;
-
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
 	 * separately.
@@ -4056,6 +4124,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  JoinPathExtraData *extra,
 					  bool parallel_hash)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -4067,6 +4136,11 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	int			num_skew_mcvs;
 	size_t		space_allowed;	/* unused */
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_hashjoin ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* cost of source data */
 	startup_cost += outer_path->startup_cost;
 	run_cost += outer_path->total_cost - outer_path->startup_cost;
@@ -4136,6 +4210,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_hashjoin */
@@ -4180,6 +4255,9 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	Selectivity innermcvfreq;
 	ListCell   *hcl;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Mark the path with the correct row estimate */
 	if (path->jpath.path.param_info)
 		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
@@ -4195,13 +4273,10 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_hashjoin)
-		startup_cost += disable_cost;
+	/* Count up disabled nodes. */
+	path->jpath.path.disabled_nodes = enable_hashjoin ? 0 : 1;
+	path->jpath.path.disabled_nodes += inner_path->disabled_nodes;
+	path->jpath.path.disabled_nodes += outer_path->disabled_nodes;
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index e858f59600..b0e8c94dfc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -915,7 +915,7 @@ try_nestloop_path(PlannerInfo *root,
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -999,7 +999,8 @@ try_partial_nestloop_path(PlannerInfo *root,
 	 */
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -1096,7 +1097,7 @@ try_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -1168,7 +1169,8 @@ try_partial_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -1237,7 +1239,7 @@ try_hashjoin_path(PlannerInfo *root,
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, false);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  NIL, required_outer))
 	{
@@ -1298,7 +1300,8 @@ try_partial_hashjoin_path(PlannerInfo *root,
 	 */
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, parallel_hash);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, NIL))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index fe5a323cfd..b19b46159c 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -25,6 +25,7 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/print.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
@@ -5451,6 +5452,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
+			  0,				/* a Plan contains no count of disabled nodes */
 			  lefttree->total_cost,
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6545,6 +6547,7 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
+				  0,			/* a Plan contains no count of disabled nodes */
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 948afd9094..b5827d3980 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6748,6 +6748,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	/* Estimate the cost of seq scan + sort */
 	seqScanPath = create_seqscan_path(root, rel, NULL, 0);
 	cost_sort(&seqScanAndSortPath, root, NIL,
+			  seqScanPath->disabled_nodes,
 			  seqScanPath->total_cost, rel->tuples, rel->reltarget->width,
 			  comparisonCost, maintenance_work_mem, -1.0);
 
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 1c69c6e97e..a0baf6d4a1 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1346,6 +1346,7 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	cost_agg(&hashed_p, root, AGG_HASHED, NULL,
 			 numGroupCols, dNumGroups,
 			 NIL,
+			 input_path->disabled_nodes,
 			 input_path->startup_cost, input_path->total_cost,
 			 input_path->rows, input_path->pathtarget->width);
 
@@ -1353,14 +1354,17 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	 * Now for the sorted case.  Note that the input is *always* unsorted,
 	 * since it was made by appending unrelated sub-relations together.
 	 */
+	sorted_p.disabled_nodes = input_path->disabled_nodes;
 	sorted_p.startup_cost = input_path->startup_cost;
 	sorted_p.total_cost = input_path->total_cost;
 	/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
-	cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
+	cost_sort(&sorted_p, root, NIL, sorted_p.disabled_nodes,
+			  sorted_p.total_cost,
 			  input_path->rows, input_path->pathtarget->width,
 			  0.0, work_mem, -1.0);
 	cost_group(&sorted_p, root, numGroupCols, dNumGroups,
 			   NIL,
+			   sorted_p.disabled_nodes,
 			   sorted_p.startup_cost, sorted_p.total_cost,
 			   input_path->rows);
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 54e042a8a5..73c5e83aff 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -68,6 +68,15 @@ static bool pathlist_is_reparameterizable_by_child(List *pathlist,
 int
 compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
 {
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (criterion == STARTUP_COST)
 	{
 		if (path1->startup_cost < path2->startup_cost)
@@ -118,6 +127,15 @@ compare_fractional_path_costs(Path *path1, Path *path2,
 	Cost		cost1,
 				cost2;
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (fraction <= 0.0 || fraction >= 1.0)
 		return compare_path_costs(path1, path2, TOTAL_COST);
 	cost1 = path1->startup_cost +
@@ -166,6 +184,15 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 #define CONSIDER_PATH_STARTUP_COST(p)  \
 	((p)->param_info == NULL ? (p)->parent->consider_startup : (p)->parent->consider_param_startup)
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return COSTS_BETTER1;
+		else
+			return COSTS_BETTER2;
+	}
+
 	/*
 	 * Check total cost first since it's more likely to be different; many
 	 * paths have zero startup cost.
@@ -362,15 +389,29 @@ set_cheapest(RelOptInfo *parent_rel)
  * add_path
  *	  Consider a potential implementation path for the specified parent rel,
  *	  and add it to the rel's pathlist if it is worthy of consideration.
+ *
  *	  A path is worthy if it has a better sort order (better pathkeys) or
- *	  cheaper cost (on either dimension), or generates fewer rows, than any
- *	  existing path that has the same or superset parameterization rels.
- *	  We also consider parallel-safe paths more worthy than others.
+ *	  cheaper cost (as defined below), or generates fewer rows, than any
+ *    existing path that has the same or superset parameterization rels.  We
+ *    also consider parallel-safe paths more worthy than others.
+ *
+ *    Cheaper cost can mean either a cheaper total cost or a cheaper startup
+ *    cost; if one path is cheaper in one of these aspects and another is
+ *    cheaper in the other, we keep both. However, when some path type is
+ *    disabled (e.g. due to enable_seqscan=false), the number of times that
+ *    a disabled path type is used is considered to be a higher-order
+ *    component of the cost. Hence, if path A uses no disabled path type,
+ *    and path B uses 1 or more disabled path types, A is cheaper, no matter
+ *    what we estimate for the startup and total costs. The startup and total
+ *    cost essentially act as a tiebreak when comparing paths that use equal
+ *    numbers of disabled path nodes; but in practice this tiebreak is almost
+ *    always used, since normally no path types are disabled.
  *
- *	  We also remove from the rel's pathlist any old paths that are dominated
- *	  by new_path --- that is, new_path is cheaper, at least as well ordered,
- *	  generates no more rows, requires no outer rels not required by the old
- *	  path, and is no less parallel-safe.
+ *	  In addition to possibly adding new_path, we also remove from the rel's
+ *    pathlist any old paths that are dominated by new_path --- that is,
+ *    new_path is cheaper, at least as well ordered, generates no more rows,
+ *    requires no outer rels not required by the old path, and is no less
+ *    parallel-safe.
  *
  *	  In most cases, a path with a superset parameterization will generate
  *	  fewer rows (since it has more join clauses to apply), so that those two
@@ -389,10 +430,10 @@ set_cheapest(RelOptInfo *parent_rel)
  *	  parent_rel->consider_param_startup is true for a parameterized one.
  *	  Again, this allows discarding useless paths sooner.
  *
- *	  The pathlist is kept sorted by total_cost, with cheaper paths
- *	  at the front.  Within this routine, that's simply a speed hack:
- *	  doing it that way makes it more likely that we will reject an inferior
- *	  path after a few comparisons, rather than many comparisons.
+ *	  The pathlist is kept sorted by disabled_nodes and then by total_cost,
+ *    with cheaper paths at the front.  Within this routine, that's simply a
+ *    speed hack: doing it that way makes it more likely that we will reject
+ *    an inferior path after a few comparisons, rather than many comparisons.
  *	  However, add_path_precheck relies on this ordering to exit early
  *	  when possible.
  *
@@ -593,8 +634,13 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 		}
 		else
 		{
-			/* new belongs after this old path if it has cost >= old's */
-			if (new_path->total_cost >= old_path->total_cost)
+			/*
+			 * new belongs after this old path if it has more disabled nodes
+			 * or if it has the same number of nodes but a greater total cost
+			 */
+			if (new_path->disabled_nodes > old_path->disabled_nodes ||
+				(new_path->disabled_nodes == old_path->disabled_nodes &&
+				 new_path->total_cost >= old_path->total_cost))
 				insert_at = foreach_current_index(p1) + 1;
 		}
 
@@ -639,7 +685,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
  * so the required information has to be passed piecemeal.
  */
 bool
-add_path_precheck(RelOptInfo *parent_rel,
+add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 				  Cost startup_cost, Cost total_cost,
 				  List *pathkeys, Relids required_outer)
 {
@@ -658,6 +704,20 @@ add_path_precheck(RelOptInfo *parent_rel,
 		Path	   *old_path = (Path *) lfirst(p1);
 		PathKeysComparison keyscmp;
 
+		/*
+		 * Since the pathlist is sorted by disabled_nodes and then by
+		 * total_cost, we can stop looking once we reach a path with more
+		 * disabled nodes, or the same number of disabled nodes plus a
+		 * total_cost larger than the new path's.
+		 */
+		if (unlikely(old_path->disabled_nodes != disabled_nodes))
+		{
+			if (disabled_nodes < old_path->disabled_nodes)
+				break;
+		}
+		else if (total_cost <= old_path->total_cost * STD_FUZZ_FACTOR)
+			break;
+
 		/*
 		 * We are looking for an old_path with the same parameterization (and
 		 * by assumption the same rowcount) that dominates the new path on
@@ -666,39 +726,27 @@ add_path_precheck(RelOptInfo *parent_rel,
 		 *
 		 * Cost comparisons here should match compare_path_costs_fuzzily.
 		 */
-		if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+		/* new path can win on startup cost only if consider_startup */
+		if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
+			!consider_startup)
 		{
-			/* new path can win on startup cost only if consider_startup */
-			if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
-				!consider_startup)
+			/* new path loses on cost, so check pathkeys... */
+			List	   *old_path_pathkeys;
+
+			old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
+			keyscmp = compare_pathkeys(new_path_pathkeys,
+									   old_path_pathkeys);
+			if (keyscmp == PATHKEYS_EQUAL ||
+				keyscmp == PATHKEYS_BETTER2)
 			{
-				/* new path loses on cost, so check pathkeys... */
-				List	   *old_path_pathkeys;
-
-				old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
-				keyscmp = compare_pathkeys(new_path_pathkeys,
-										   old_path_pathkeys);
-				if (keyscmp == PATHKEYS_EQUAL ||
-					keyscmp == PATHKEYS_BETTER2)
+				/* new path does not win on pathkeys... */
+				if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
 				{
-					/* new path does not win on pathkeys... */
-					if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
-					{
-						/* Found an old path that dominates the new one */
-						return false;
-					}
+					/* Found an old path that dominates the new one */
+					return false;
 				}
 			}
 		}
-		else
-		{
-			/*
-			 * Since the pathlist is sorted by total_cost, we can stop looking
-			 * once we reach a path with a total_cost larger than the new
-			 * path's.
-			 */
-			break;
-		}
 	}
 
 	return true;
@@ -734,7 +782,7 @@ add_path_precheck(RelOptInfo *parent_rel,
  *	  produce the same number of rows.  Neither do we need to consider startup
  *	  costs: parallelism is only used for plans that will be run to completion.
  *	  Therefore, this routine is much simpler than add_path: it needs to
- *	  consider only pathkeys and total cost.
+ *	  consider only disabled nodes, pathkeys and total cost.
  *
  *	  As with add_path, we pfree paths that are found to be dominated by
  *	  another partial path; this requires that there be no other references to
@@ -775,7 +823,15 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Unless pathkeys are incompatible, keep just one of the two paths. */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+			if (unlikely(new_path->disabled_nodes != old_path->disabled_nodes))
+			{
+				if (new_path->disabled_nodes > old_path->disabled_nodes)
+					accept_new = false;
+				else
+					remove_old = true;
+			}
+			else if (new_path->total_cost > old_path->total_cost
+					 * STD_FUZZ_FACTOR)
 			{
 				/* New path costs more; keep it only if pathkeys are better. */
 				if (keyscmp != PATHKEYS_BETTER1)
@@ -862,8 +918,8 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
  * is surely a loser.
  */
 bool
-add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
-						  List *pathkeys)
+add_partial_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
+						  Cost total_cost, List *pathkeys)
 {
 	ListCell   *p1;
 
@@ -906,8 +962,8 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 	 * partial path; the resulting plans, if run in parallel, will be run to
 	 * completion.
 	 */
-	if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
-						   NULL))
+	if (!add_path_precheck(parent_rel, disabled_nodes, total_cost, total_cost,
+						   pathkeys, NULL))
 		return false;
 
 	return true;
@@ -1419,6 +1475,7 @@ create_merge_append_path(PlannerInfo *root,
 						 Relids required_outer)
 {
 	MergeAppendPath *pathnode = makeNode(MergeAppendPath);
+	int			input_disabled_nodes;
 	Cost		input_startup_cost;
 	Cost		input_total_cost;
 	ListCell   *l;
@@ -1452,6 +1509,7 @@ create_merge_append_path(PlannerInfo *root,
 	 * Add up the sizes and costs of the input paths.
 	 */
 	pathnode->path.rows = 0;
+	input_disabled_nodes = 0;
 	input_startup_cost = 0;
 	input_total_cost = 0;
 	foreach(l, subpaths)
@@ -1468,6 +1526,7 @@ create_merge_append_path(PlannerInfo *root,
 		if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 		{
 			/* Subpath is adequately ordered, we won't need to sort it */
+			input_disabled_nodes += subpath->disabled_nodes;
 			input_startup_cost += subpath->startup_cost;
 			input_total_cost += subpath->total_cost;
 		}
@@ -1479,12 +1538,14 @@ create_merge_append_path(PlannerInfo *root,
 			cost_sort(&sort_path,
 					  root,
 					  pathkeys,
+					  subpath->disabled_nodes,
 					  subpath->total_cost,
 					  subpath->rows,
 					  subpath->pathtarget->width,
 					  0.0,
 					  work_mem,
 					  pathnode->limit_tuples);
+			input_disabled_nodes += sort_path.disabled_nodes;
 			input_startup_cost += sort_path.startup_cost;
 			input_total_cost += sort_path.total_cost;
 		}
@@ -1500,12 +1561,14 @@ create_merge_append_path(PlannerInfo *root,
 		((Path *) linitial(subpaths))->parallel_aware ==
 		pathnode->path.parallel_aware)
 	{
+		pathnode->path.disabled_nodes = input_disabled_nodes;
 		pathnode->path.startup_cost = input_startup_cost;
 		pathnode->path.total_cost = input_total_cost;
 	}
 	else
 		cost_merge_append(&pathnode->path, root,
 						  pathkeys, list_length(subpaths),
+						  input_disabled_nodes,
 						  input_startup_cost, input_total_cost,
 						  pathnode->path.rows);
 
@@ -1587,6 +1650,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
 	pathnode->subpath = subpath;
 
 	cost_material(&pathnode->path,
+				  subpath->disabled_nodes,
 				  subpath->startup_cost,
 				  subpath->total_cost,
 				  subpath->rows,
@@ -1633,6 +1697,10 @@ create_memoize_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	 */
 	pathnode->est_entries = 0;
 
+	/* we should not generate this path type when enable_memoize=false */
+	Assert(enable_memoize);
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
+
 	/*
 	 * Add a small additional charge for caching the first entry.  All the
 	 * harder calculations for rescans are performed in cost_memoize_rescan().
@@ -1732,6 +1800,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	{
 		pathnode->umethod = UNIQUE_PATH_NOOP;
 		pathnode->path.rows = rel->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost;
 		pathnode->path.total_cost = subpath->total_cost;
 		pathnode->path.pathkeys = subpath->pathkeys;
@@ -1770,6 +1839,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 			{
 				pathnode->umethod = UNIQUE_PATH_NOOP;
 				pathnode->path.rows = rel->rows;
+				pathnode->path.disabled_nodes = subpath->disabled_nodes;
 				pathnode->path.startup_cost = subpath->startup_cost;
 				pathnode->path.total_cost = subpath->total_cost;
 				pathnode->path.pathkeys = subpath->pathkeys;
@@ -1797,6 +1867,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		 * Estimate cost for sort+unique implementation
 		 */
 		cost_sort(&sort_path, root, NIL,
+				  subpath->disabled_nodes,
 				  subpath->total_cost,
 				  rel->rows,
 				  subpath->pathtarget->width,
@@ -1834,6 +1905,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 					 AGG_HASHED, NULL,
 					 numCols, pathnode->path.rows,
 					 NIL,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 rel->rows,
@@ -1842,7 +1914,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (sjinfo->semi_can_btree && sjinfo->semi_can_hash)
 	{
-		if (agg_path.total_cost < sort_path.total_cost)
+		if (agg_path.disabled_nodes < sort_path.disabled_nodes ||
+			(agg_path.disabled_nodes == sort_path.disabled_nodes &&
+			 agg_path.total_cost < sort_path.total_cost))
 			pathnode->umethod = UNIQUE_PATH_HASH;
 		else
 			pathnode->umethod = UNIQUE_PATH_SORT;
@@ -1860,11 +1934,13 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (pathnode->umethod == UNIQUE_PATH_HASH)
 	{
+		pathnode->path.disabled_nodes = agg_path.disabled_nodes;
 		pathnode->path.startup_cost = agg_path.startup_cost;
 		pathnode->path.total_cost = agg_path.total_cost;
 	}
 	else
 	{
+		pathnode->path.disabled_nodes = sort_path.disabled_nodes;
 		pathnode->path.startup_cost = sort_path.startup_cost;
 		pathnode->path.total_cost = sort_path.total_cost;
 	}
@@ -1888,6 +1964,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 						 Relids required_outer, double *rows)
 {
 	GatherMergePath *pathnode = makeNode(GatherMergePath);
+	int			input_disabled_nodes = 0;
 	Cost		input_startup_cost = 0;
 	Cost		input_total_cost = 0;
 
@@ -1915,11 +1992,13 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->path.pathtarget = target ? target : rel->reltarget;
 
+	input_disabled_nodes += subpath->disabled_nodes;
 	input_startup_cost += subpath->startup_cost;
 	input_total_cost += subpath->total_cost;
 
 	cost_gather_merge(pathnode, root, rel, pathnode->path.param_info,
-					  input_startup_cost, input_total_cost, rows);
+					  input_disabled_nodes, input_startup_cost,
+					  input_total_cost, rows);
 
 	return pathnode;
 }
@@ -2227,7 +2306,8 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 						PathTarget *target,
-						double rows, Cost startup_cost, Cost total_cost,
+						double rows, int disabled_nodes,
+						Cost startup_cost, Cost total_cost,
 						List *pathkeys,
 						Relids required_outer,
 						Path *fdw_outerpath,
@@ -2248,6 +2328,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2273,7 +2354,8 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 						 PathTarget *target,
-						 double rows, Cost startup_cost, Cost total_cost,
+						 double rows, int disabled_nodes,
+						 Cost startup_cost, Cost total_cost,
 						 List *pathkeys,
 						 Relids required_outer,
 						 Path *fdw_outerpath,
@@ -2300,6 +2382,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2325,7 +2408,8 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 						  PathTarget *target,
-						  double rows, Cost startup_cost, Cost total_cost,
+						  double rows, int disabled_nodes,
+						  Cost startup_cost, Cost total_cost,
 						  List *pathkeys,
 						  Path *fdw_outerpath,
 						  List *fdw_restrictinfo,
@@ -2347,6 +2431,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2734,6 +2819,7 @@ create_projection_path(PlannerInfo *root,
 		 * Set cost of plan as subpath's cost, adjusted for tlist replacement.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			(target->cost.startup - oldtarget->cost.startup);
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2750,6 +2836,7 @@ create_projection_path(PlannerInfo *root,
 		 * evaluating the tlist.  There is no qual to worry about.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			target->cost.startup;
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2967,6 +3054,7 @@ create_incremental_sort_path(PlannerInfo *root,
 
 	cost_incremental_sort(&pathnode->path,
 						  root, pathkeys, presorted_keys,
+						  subpath->disabled_nodes,
 						  subpath->startup_cost,
 						  subpath->total_cost,
 						  subpath->rows,
@@ -3013,6 +3101,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->subpath = subpath;
 
 	cost_sort(&pathnode->path, root, pathkeys,
+			  subpath->disabled_nodes,
 			  subpath->total_cost,
 			  subpath->rows,
 			  subpath->pathtarget->width,
@@ -3065,6 +3154,7 @@ create_group_path(PlannerInfo *root,
 			   list_length(groupClause),
 			   numGroups,
 			   qual,
+			   subpath->disabled_nodes,
 			   subpath->startup_cost, subpath->total_cost,
 			   subpath->rows);
 
@@ -3122,6 +3212,7 @@ create_upper_unique_path(PlannerInfo *root,
 	 * all columns get compared at most of the tuples.  (XXX probably this is
 	 * an overestimate.)
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_operator_cost * subpath->rows * numCols;
@@ -3200,6 +3291,7 @@ create_agg_path(PlannerInfo *root,
 			 aggstrategy, aggcosts,
 			 list_length(groupClause), numGroups,
 			 qual,
+			 subpath->disabled_nodes,
 			 subpath->startup_cost, subpath->total_cost,
 			 subpath->rows, subpath->pathtarget->width);
 
@@ -3308,6 +3400,7 @@ create_groupingsets_path(PlannerInfo *root,
 					 numGroupCols,
 					 rollup->numGroups,
 					 having_qual,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 subpath->rows,
@@ -3333,7 +3426,7 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
-						 0.0, 0.0,
+						 0, 0.0, 0.0,
 						 subpath->rows,
 						 subpath->pathtarget->width);
 				if (!rollup->is_hashed)
@@ -3342,7 +3435,7 @@ create_groupingsets_path(PlannerInfo *root,
 			else
 			{
 				/* Account for cost of sort, but don't charge input cost again */
-				cost_sort(&sort_path, root, NIL,
+				cost_sort(&sort_path, root, NIL, 0,
 						  0.0,
 						  subpath->rows,
 						  subpath->pathtarget->width,
@@ -3358,12 +3451,14 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
+						 sort_path.disabled_nodes,
 						 sort_path.startup_cost,
 						 sort_path.total_cost,
 						 sort_path.rows,
 						 subpath->pathtarget->width);
 			}
 
+			pathnode->path.disabled_nodes += agg_path.disabled_nodes;
 			pathnode->path.total_cost += agg_path.total_cost;
 			pathnode->path.rows += agg_path.rows;
 		}
@@ -3395,6 +3490,7 @@ create_minmaxagg_path(PlannerInfo *root,
 {
 	MinMaxAggPath *pathnode = makeNode(MinMaxAggPath);
 	Cost		initplan_cost;
+	int			initplan_disabled_nodes = 0;
 	ListCell   *lc;
 
 	/* The topmost generated Plan node will be a Result */
@@ -3419,12 +3515,14 @@ create_minmaxagg_path(PlannerInfo *root,
 	{
 		MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);
 
+		initplan_disabled_nodes += mminfo->path->disabled_nodes;
 		initplan_cost += mminfo->pathcost;
 		if (!mminfo->path->parallel_safe)
 			pathnode->path.parallel_safe = false;
 	}
 
 	/* add tlist eval cost for each output row, plus cpu_tuple_cost */
+	pathnode->path.disabled_nodes = initplan_disabled_nodes;
 	pathnode->path.startup_cost = initplan_cost + target->cost.startup;
 	pathnode->path.total_cost = initplan_cost + target->cost.startup +
 		target->cost.per_tuple + cpu_tuple_cost;
@@ -3517,6 +3615,7 @@ create_windowagg_path(PlannerInfo *root,
 	cost_windowagg(&pathnode->path, root,
 				   windowFuncs,
 				   winclause,
+				   subpath->disabled_nodes,
 				   subpath->startup_cost,
 				   subpath->total_cost,
 				   subpath->rows);
@@ -3835,6 +3934,7 @@ create_limit_path(PlannerInfo *root, RelOptInfo *rel,
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.rows = subpath->rows;
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost;
 	pathnode->path.pathkeys = subpath->pathkeys;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 14ccfc1ac1..540d021592 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1658,6 +1658,7 @@ typedef struct Path
 
 	/* estimated size/costs for path (see costsize.c for more info) */
 	Cardinality rows;			/* estimated number of result tuples */
+	int			disabled_nodes; /* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
@@ -3333,6 +3334,7 @@ typedef struct
 typedef struct JoinCostWorkspace
 {
 	/* Preliminary cost estimates --- must not be larger than final ones! */
+	int			disabled_nodes;
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 57861bfb44..854a782944 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -108,35 +108,42 @@ extern void cost_resultscan(Path *path, PlannerInfo *root,
 							RelOptInfo *baserel, ParamPathInfo *param_info);
 extern void cost_recursive_union(Path *runion, Path *nrterm, Path *rterm);
 extern void cost_sort(Path *path, PlannerInfo *root,
-					  List *pathkeys, Cost input_cost, double tuples, int width,
+					  List *pathkeys, int disabled_nodes,
+					  Cost input_cost, double tuples, int width,
 					  Cost comparison_cost, int sort_mem,
 					  double limit_tuples);
 extern void cost_incremental_sort(Path *path,
 								  PlannerInfo *root, List *pathkeys, int presorted_keys,
+								  int input_disabled_nodes,
 								  Cost input_startup_cost, Cost input_total_cost,
 								  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 								  double limit_tuples);
 extern void cost_append(AppendPath *apath);
 extern void cost_merge_append(Path *path, PlannerInfo *root,
 							  List *pathkeys, int n_streams,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double tuples);
 extern void cost_material(Path *path,
+						  int input_disabled_nodes,
 						  Cost input_startup_cost, Cost input_total_cost,
 						  double tuples, int width);
 extern void cost_agg(Path *path, PlannerInfo *root,
 					 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 					 int numGroupCols, double numGroups,
 					 List *quals,
+					 int input_disabled_nodes,
 					 Cost input_startup_cost, Cost input_total_cost,
 					 double input_tuples, double input_width);
 extern void cost_windowagg(Path *path, PlannerInfo *root,
 						   List *windowFuncs, WindowClause *winclause,
+						   int input_disabled_nodes,
 						   Cost input_startup_cost, Cost input_total_cost,
 						   double input_tuples);
 extern void cost_group(Path *path, PlannerInfo *root,
 					   int numGroupCols, double numGroups,
 					   List *quals,
+					   int input_disabled_nodes,
 					   Cost input_startup_cost, Cost input_total_cost,
 					   double input_tuples);
 extern void initial_cost_nestloop(PlannerInfo *root,
@@ -171,6 +178,7 @@ extern void cost_gather(GatherPath *path, PlannerInfo *root,
 						RelOptInfo *rel, ParamPathInfo *param_info, double *rows);
 extern void cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 							  RelOptInfo *rel, ParamPathInfo *param_info,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double *rows);
 extern void cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index f00bd55f39..1035e6560c 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -27,11 +27,12 @@ extern int	compare_fractional_path_costs(Path *path1, Path *path2,
 										  double fraction);
 extern void set_cheapest(RelOptInfo *parent_rel);
 extern void add_path(RelOptInfo *parent_rel, Path *new_path);
-extern bool add_path_precheck(RelOptInfo *parent_rel,
+extern bool add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 							  Cost startup_cost, Cost total_cost,
 							  List *pathkeys, Relids required_outer);
 extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
 extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
+									  int disabled_nodes,
 									  Cost total_cost, List *pathkeys);
 
 extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
@@ -124,7 +125,8 @@ extern Path *create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 									   Relids required_outer);
 extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											PathTarget *target,
-											double rows, Cost startup_cost, Cost total_cost,
+											double rows, int disabled_nodes,
+											Cost startup_cost, Cost total_cost,
 											List *pathkeys,
 											Relids required_outer,
 											Path *fdw_outerpath,
@@ -132,7 +134,8 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											List *fdw_private);
 extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 PathTarget *target,
-											 double rows, Cost startup_cost, Cost total_cost,
+											 double rows, int disabled_nodes,
+											 Cost startup_cost, Cost total_cost,
 											 List *pathkeys,
 											 Relids required_outer,
 											 Path *fdw_outerpath,
@@ -140,7 +143,8 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 List *fdw_private);
 extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 											  PathTarget *target,
-											  double rows, Cost startup_cost, Cost total_cost,
+											  double rows, int disabled_nodes,
+											  Cost startup_cost, Cost total_cost,
 											  List *pathkeys,
 											  Path *fdw_outerpath,
 											  List *fdw_restrictinfo,
diff --git a/src/test/isolation/specs/horizons.spec b/src/test/isolation/specs/horizons.spec
index d5239ff228..3f987f943d 100644
--- a/src/test/isolation/specs/horizons.spec
+++ b/src/test/isolation/specs/horizons.spec
@@ -40,7 +40,6 @@ session pruner
 setup
 {
     SET enable_seqscan = false;
-    SET enable_indexscan = false;
     SET enable_bitmapscan = false;
 }
 
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 510646cbce..092233cc9d 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -332,11 +332,13 @@ select proname from pg_proc where proname ilike '00%foo' order by 1;
 
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
-                           QUERY PLAN                            
------------------------------------------------------------------
- Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc
-   Filter: (proname ~~* 'ri%foo'::text)
-(2 rows)
+                  QUERY PLAN                  
+----------------------------------------------
+ Sort
+   Sort Key: proname
+   ->  Seq Scan on pg_proc
+         Filter: (proname ~~* 'ri%foo'::text)
+(4 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 5a603f86b7..9bad3fc464 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -538,15 +538,17 @@ explain (costs off)
 ------------------------------------------------------------
  Aggregate
    ->  Nested Loop
-         ->  Seq Scan on tenk2
-               Filter: (thousand = 0)
+         ->  Gather
+               Workers Planned: 4
+               ->  Parallel Seq Scan on tenk2
+                     Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
                ->  Parallel Bitmap Heap Scan on tenk1
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(10 rows)
+(12 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
-- 
2.39.3 (Apple Git-145)

#74David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#73)
Re: On disable_cost

On Thu, 1 Aug 2024 at 04:23, Robert Haas <robertmhaas@gmail.com> wrote:

OK, here's a new patch version.

I think we're going down the right path here.

I've reviewed both patches; here's what I noted down during my review:

0. I've not seen any mention so far of postgres_fdw's
use_remote_estimate. Maybe changing the costs is fixing an issue that
existed before. I'm just not 100% sure on that.

Consider:

CREATE EXTENSION postgres_fdw;

DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (use_remote_estimate 'true',
dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
END;
$d$;
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;

create table t (a int);
create foreign table ft (a int) server loopback OPTIONS (table_name 't');

alter system set enable_seqscan=0;
select pg_Reload_conf();
set enable_seqscan=1;
explain select * from ft;

patched:
Foreign Scan on ft (cost=100.00..671.00 rows=2550 width=4)

master:
Foreign Scan on ft (cost=10000000100.00..10000000671.00 rows=2550 width=4)

I kinda think that might be fixing an issue that I don't recall being
reported before. I think we shouldn't really care that much about which
nodes are disabled on the remote server, and not having disable_cost
applied there gives us that.

1. The final sentence of the function header comment needs to be
updated in estimate_path_cost_size().

2. Does cost_tidscan() need to update the header comment to say
tidquals must not be empty?

3. final_cost_nestloop() seems to initially use the disabled_nodes
from initial_cost_nestloop() but then it goes off and calculates it
again itself. One of these seems redundant. The "We could include
disable_cost in the preliminary estimate" comment explains why it was
originally left to final_cost_nestloop(), so maybe worth sticking to
that? I don't quite know the full implications, but it does not seem
worth risking a behaviour change here.

4. I wonder if it's worth doing a quick refactor of the code in
initial_cost_mergejoin() to get rid of the duplicate code in the "if
(outersortkeys)" and "if (innersortkeys)" branches. It seems ok to do
outer_path = &sort_path. Likewise for inner_path.

5. final_cost_hashjoin() does the same thing as #3

6. createplan.c adds #include "nodes/print.h" but doesn't seem to add
any code that might use anything in there.

7. create_lockrows_path() needs to propagate disabled_nodes.

create table a (a int);
set enable_seqscan=0;

explain select * from a for update limit 1;

 Limit  (cost=0.00..0.02 rows=1 width=10)
   ->  LockRows  (cost=0.00..61.00 rows=2550 width=10)
         ->  Seq Scan on a  (cost=0.00..35.50 rows=2550 width=10)
               Disabled Nodes: 1
(4 rows)

explain select * from a limit 1;

 Limit  (cost=0.00..0.01 rows=1 width=4)
   Disabled Nodes: 1
   ->  Seq Scan on a  (cost=0.00..35.50 rows=2550 width=4)
         Disabled Nodes: 1
(4 rows)

8. There's something weird with CTEs too.

create table b(a int);
set enable_sort=0;

Patched:

explain with cte as materialized (select * from b order by a) select *
from cte order by a desc;

 Sort  (cost=381.44..387.82 rows=2550 width=4)
   Disabled Nodes: 1
   Sort Key: cte.a DESC
   CTE cte
     ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
           Disabled Nodes: 1
           Sort Key: b.a
           ->  Seq Scan on b  (cost=0.00..35.50 rows=2550 width=4)
   ->  CTE Scan on cte  (cost=0.00..51.00 rows=2550 width=4)
(9 rows)

master:

explain with cte as materialized (select * from a order by a) select *
from cte order by a desc;

 Sort  (cost=20000000381.44..20000000387.82 rows=2550 width=4)
   Sort Key: cte.a DESC
   CTE cte
     ->  Sort  (cost=10000000179.78..10000000186.16 rows=2550 width=4)
           Sort Key: a.a
           ->  Seq Scan on a  (cost=0.00..35.50 rows=2550 width=4)
   ->  CTE Scan on cte  (cost=0.00..51.00 rows=2550 width=4)
(7 rows)

I'd expect the final sort to have disabled_nodes == 2 since
disabled_cost has been added twice in master.

9. create_set_projection_path() needs to propagate disabled_nodes too:

explain select b from (select a,generate_series(1,2) as b from b) a limit 1;

 Limit  (cost=0.00..0.03 rows=1 width=4)
   ->  Subquery Scan on a  (cost=0.00..131.12 rows=5100 width=4)
         ->  ProjectSet  (cost=0.00..80.12 rows=5100 width=8)
               ->  Seq Scan on b  (cost=0.00..35.50 rows=2550 width=0)
                     Disabled Nodes: 1

10. create_setop_path() needs to propagate disabled_nodes.

explain select * from b except select * from b limit 1;

 Limit  (cost=0.00..0.80 rows=1 width=8)
   ->  HashSetOp Except  (cost=0.00..160.25 rows=200 width=8)
         ->  Append  (cost=0.00..147.50 rows=5100 width=8)
               Disabled Nodes: 2
               ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..61.00 rows=2550 width=8)
                     Disabled Nodes: 1
                     ->  Seq Scan on b  (cost=0.00..35.50 rows=2550 width=4)
                           Disabled Nodes: 1
               ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..61.00 rows=2550 width=8)
                     Disabled Nodes: 1
                     ->  Seq Scan on b b_1  (cost=0.00..35.50 rows=2550 width=4)
                           Disabled Nodes: 1
(12 rows)

11. create_modifytable_path() needs to propagate disabled_nodes.

explain with cte as (update b set a = a+1 returning *) select * from
cte limit 1;

 Limit  (cost=41.88..41.90 rows=1 width=4)
   CTE cte
     ->  Update on b  (cost=0.00..41.88 rows=2550 width=10)
           ->  Seq Scan on b  (cost=0.00..41.88 rows=2550 width=10)
                 Disabled Nodes: 1
   ->  CTE Scan on cte  (cost=0.00..51.00 rows=2550 width=4)
(6 rows)

12. For the 0002 patch, I do agree that having this visible in EXPLAIN
is a must. I'd much rather see: Disabled: true/false. And just
display this when the disabled_nodes is greater than the sum of the
subpaths. That might be much more complex to implement, but it's
going to make it much easier to track down the disabled nodes in very
large plans.
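
Roughly the kind of check I have in mind, as a sketch only (the struct and
field names here are invented; Append/MergeAppend subplan lists, initplans
and so on would need their counts summed as well):

#include <stdbool.h>

typedef struct PlanNode
{
	int			disabled_nodes; /* count at or below this node */
	struct PlanNode *lefttree;
	struct PlanNode *righttree;
} PlanNode;

/*
 * Print "Disabled: true" only when this node contributes a disabled node
 * beyond what its children already account for.
 */
static bool
node_is_disabled(const PlanNode *node)
{
	int			below = 0;

	if (node->lefttree)
		below += node->lefttree->disabled_nodes;
	if (node->righttree)
		below += node->righttree->disabled_nodes;

	return node->disabled_nodes > below;
}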

David

#75Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#74)
2 attachment(s)
Re: On disable_cost

On Wed, Jul 31, 2024 at 10:01 PM David Rowley <dgrowleyml@gmail.com> wrote:

I've reviewed both patches, here's what I noted down during my review:

Thanks.

0. I've not seen any mention so far about postgres_fdw's
use_remote_estimate. Maybe changing the costs is fixing an issue that
existed before. I'm just not 100% sure on that.

patched:
Foreign Scan on ft (cost=100.00..671.00 rows=2550 width=4)

master:
Foreign Scan on ft (cost=10000000100.00..10000000671.00 rows=2550 width=4)

I kinda think that might be fixing an issue that I don't recall being
reported before. I think we shouldn't really care that much about which
nodes are disabled on the remote server, and not having disable_cost
applied there gives us that.

Hmm, I think it's subjective which behavior is better. If somebody
thought the new behavior was worse, they might want the remote side's
count of disabled nodes to be propagated to the local side, but I'm
disinclined to go there. My guess is that it doesn't matter much
either way what we do here, so I'd rather not add more code.

1. The final sentence of the function header comment needs to be
updated in estimate_path_cost_size().

Fixed.

2. Does cost_tidscan() need to update the header comment to say
tidquals must not be empty?

IMHO, no. The assertions I added to that function were intended as
documentation of what that function was already assuming about the
behavior of its caller. I had to trace through the logic in tidpath.c
for quite a while to understand why cost_tidscan() was not completely
broken. To spare the next person the trouble of working that out, I
added assertions. Now we could additionally add commentary in English
that restates what the assertions already say, but I feel like having
the assertions is good enough. If somebody ever whacks around
tidpath.c such that these assertions start failing, I think it will be
fairly clear to them that they either need to revert their changes in
tidpath.c or upgrade the logic in this function to cope.

3. final_cost_nestloop() seems to initially use the disabled_nodes
from initial_cost_nestloop() but then it goes off and calculates it
again itself. One of these seems redundant.

Oops. Fixed.

The "We could include
disable_cost in the preliminary estimate" comment explains why it was
originally left to final_cost_nestloop(), so maybe worth sticking to
that? I don't quite know the full implications, but it does not seem
worth risking a behaviour change here.

I don't really see how there could be a behavior change here, unless
there's a bug. Dealing with the enable_* flags in initial_cost_XXX
rather than final_cost_XXX could be better or worse from a performance
standpoint and it could make for cleaner or less clean code, but the
user-facing behavior should be identical unless there are bugs.

The reason why I changed this is because of the logic in
add_path_precheck(): it exits early as soon as it sees a path whose
total cost is greater than the cost of the proposed new path. Since
the patch's aim is to treat disabled_nodes as a high-order component of
the cost, we need to make the same decision by comparing the count of
disabled_nodes first and then if that is equal, we need to compare the
total_cost. We can't do that if we don't have the count of
disabled_nodes for the proposed new path.

I think this may be a bit hard to understand, so let me give a
concrete example. Suppose we're planning some join where one side can
only be planned with a sequential scan and sequential scans are
disabled. We have ten paths in the path list and they have costs of
1e10+100, 1e10+200, ..., 1e10+1000. Now add_path_precheck() is asked
to consider a new path where there is a disabled node on BOTH sides of
the join -- the one side has the disabled sequential scan, but now the
other side also has something disabled, so the cost is let's say
2e10+79. add_path_precheck() can see at once that this path is a
loser: it can't possibly dominate any path that already exists,
because it costs more than any of them. But when you take disable_cost
out, things look quite different. Now you have a proposed path with a
total_cost of 79 and a path list with costs of 100, ..., 1000. If
you're not allowed to know anything about disabled_nodes, the new path
looks like it might be valuable. You might decide to construct it and
try inserting into the pathlist, which will end up being useless, and
even if you don't, you're going to compare its pathkeys and
parameterization to each of the 10 existing paths before giving up.
Bummer.
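
To spell out that ordering, here is a minimal standalone sketch of the
comparison rule the patch applies everywhere (the local Cost typedef just
stands in for the planner's; the real comparisons also consider startup
cost, pathkeys, and parameterization where relevant):

typedef double Cost;			/* stand-in for the planner's typedef */

/*
 * Sketch only: the disabled-node count is a high-order component of the
 * cost, so it is compared first and cost merely breaks ties.
 */
static int
compare_disabled_then_cost(int disabled1, Cost cost1,
						   int disabled2, Cost cost2)
{
	if (disabled1 != disabled2)
		return (disabled1 < disabled2) ? -1 : +1;
	if (cost1 != cost2)
		return (cost1 < cost2) ? -1 : +1;
	return 0;
}

With that ordering, the precheck can throw out the 2e10+79-style path
immediately: its disabled count already exceeds that of everything in the
pathlist, no matter how small its total cost looks.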

So, to avoid getting much stupider than it is currently,
add_path_precheck() needs a preliminary estimate of the number of
disabled nodes just like it needs a preliminary estimate of the total
cost. And to avoid regressions, that estimate needs to be pretty good.
A naive estimate would be to just add up the number of disabled_nodes
on the inner and outer paths, but that would be a regression in the
merge-join case, because initial_cost_mergejoin() calls cost_sort()
for the inner and outer sides and that will add disable_cost if sorts
are disabled. If you didn't take the effect of cost_sort() into
account, you might think that your number of disabled_nodes was going
to be substantially lower than it really would be, leading to wasted
work as described in the last paragraph. Plus, since
initial_cost_mergejoin() is incurring the overhead of calling
cost_sort() anyway to get the total cost numbers, it would be
silly not to save the count of disabled nodes: if we didn't, we'd have to
redo the cost_sort() call in final_cost_mergejoin(), which would be
expensive.

If we wanted to make our estimate of the # of disabled nodes exactly
comparable to what we now do with disable_cost, we would postpone if
(!enable_WHATEVERjoin) ++disabled_nodes to the final_cost_XXX
functions and do all of the other accounting related to disabled nodes
at the initial_cost_XXX phase. But I do not like that approach.
Postponing one trivial portion of the disabled_nodes calculation to a
later time won't save any significant number of CPU cycles, but it
might confuse people reading the code. You then have to know that the
disabled_nodes count that gets passed to final_cost_XXX is not yet the
final count, but that you may still need to add 1 for the join itself
(but not for the implicit sorts that the join requires, which have
already been accounted for). That's the kind of odd definition that
breeds bugs. Besides, it's not as if moving that tiny bit of logic to
the initial_cost_XXX functions has no upside: it could allow
add_path_precheck() to exit earlier, thus saving cycles.

(For the record, the explanation above took about 3 hours to write, so
I hope it's managed to be both correct and convincing. This stuff is
really complicated.)

4. I wonder if it's worth doing a quick refactor of the code in
initial_cost_mergejoin() to get rid of the duplicate code in the "if
(outersortkeys)" and "if (innersortkeys)" branches. It seems ok to do
outer_path = &sort_path. Likewise for inner_path.

I don't think that's better.

5. final_cost_hashjoin() does the same thing as #3

Argh. Fixed.

6. createplan.c adds #include "nodes/print.h" but doesn't seem to add
any code that might use anything in there.

Fixed.

8. There's something weird with CTEs too.

I'd expect the final sort to have disabled_nodes == 2 since
disabled_cost has been added twice in master.

Right now, disabled node counts don't propagate through SubPlans (see
SS_process_ctes). Maybe that needs to be changed, but aside from
looking weird, does it do any harm?

7. create_lockrows_path() needs to propagate disabled_nodes.
9. create_set_projection_path() needs to propagate disabled_nodes too:
10. create_setop_path() needs to propagate disabled_nodes.
11. create_modifytable_path() needs to propagate disabled_nodes.

I changed all of these, but I think these examples only establish that
those nodes DO NOT propagate disabled_nodes, not that they need to. If
we're past the point of making any choices based on costs, then
maintaining disabled_nodes or not doing so won't affect correctness.
That's not to say these aren't good to tidy up, and some of them may
well be bugs, but I don't think your test cases prove that. What
primarily matters is whether the enable_BLAH GUCs get respected; the
exact contents of the EXPLAIN output are somewhat more arguable.

12. For the 0002 patch, I do agree that having this visible in EXPLAIN
is a must. I'd much rather see: Disabled: true/false. And just
display this when the disabled_nodes is greater than the sum of the
subpaths. That might be much more complex to implement, but it's
going to make it much easier to track down the disabled nodes in very
large plans.

I think it's going to be very unpleasant if we have the planner add
things up and then try to have EXPLAIN subtract them back out again.
One problem with that is that all of the test cases where you just
showed disabled_nodes not propagating upward wouldn't actually show
anything any more, because disabled_nodes would not have been greater
in the parent than in the child. So those are oversights in the code
that are easy to spot now but would become hard to spot with this
implementation. Another problem is that the EXPLAIN code itself could
contain bugs, or slightly more broadly, get out of sync with the logic
that decides what to add up. It won't be obvious what's happening:
some node that is actually disabled just won't appear to be, or the
other way around, and it will be hard to understand what happened,
because you won't be able to see the raw counts of disabled nodes that
would allow you to deduce where the error actually is.

One idea that occurs to me is to store TWO counts in each path node
and each plan node: the count of self-exclusive disabled nodes, and
the count of self-inclusive disabled nodes. Then EXPLAIN can just test
whether they differ. If the difference is 1, the node is disabled; if 0,
it's enabled; if anything else, there's a bug (and it could print the
delta, or each value separately, to help localize such bugs). The
problem with that is that it eats up more space in
performance-critical data structures, but perhaps that's OK: I don't
know.
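
Concretely, I mean something like this -- entirely made-up names, just to
illustrate the shape of the idea, not code from the attached patches:

#include <stdbool.h>

typedef struct DisabledCounts
{
	int			excluding_self; /* disabled nodes strictly below this one */
	int			including_self; /* disabled nodes at or below this one */
} DisabledCounts;

/*
 * EXPLAIN test: a delta of 1 means the node itself is disabled, 0 means
 * it isn't, and anything else indicates a bookkeeping bug that EXPLAIN
 * could report by printing both raw counts.
 */
static bool
node_itself_disabled(const DisabledCounts *c)
{
	return (c->including_self - c->excluding_self) == 1;
}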

Another thought is that right now you just see the disable_cost values
added up with the rest of the cost. So maybe propagating upward is not
really such a bad behavior; it's what we have now.

This point probably needs more thought and discussion, but I'm out of
time to work on this for today, and out of mental energy too. So for
now here's v5 as I have it.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v5-0001-Treat-number-of-disabled-nodes-in-a-path-as-a-sep.patch (application/octet-stream)
From f22345a1346ae82bc1422812a39daa29ab081eab Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 31 Jul 2024 11:17:25 -0400
Subject: [PATCH v5 1/2] Treat number of disabled nodes in a path as a separate
 cost metric.

Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishable in cost seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that count a high-order component of both the startup and
total cost. Hence, the
path list is now sorted by disabled_nodes and then by total_cost,
instead of just by the latter, and likewise for the partial path list.
It is important that this number is a count and not simply a Boolean;
else, as soon as we're unable to respect disabled path types in all
portions of the path, we stop trying to avoid them where we can.

Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.

Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since we do not postpone
path selection across subquery boundaries (see make_subplan).
---
 contrib/file_fdw/file_fdw.c                   |   1 +
 contrib/postgres_fdw/postgres_fdw.c           |  46 +++-
 contrib/postgres_fdw/postgres_fdw.h           |   1 +
 src/backend/optimizer/path/costsize.c         | 155 +++++++++----
 src/backend/optimizer/path/joinpath.c         |  15 +-
 src/backend/optimizer/plan/createplan.c       |   2 +
 src/backend/optimizer/plan/planner.c          |   1 +
 src/backend/optimizer/prep/prepunion.c        |   6 +-
 src/backend/optimizer/util/pathnode.c         | 212 +++++++++++++-----
 src/include/nodes/pathnodes.h                 |   2 +
 src/include/optimizer/cost.h                  |  10 +-
 src/include/optimizer/pathnode.h              |  12 +-
 src/test/isolation/specs/horizons.spec        |   1 -
 src/test/regress/expected/btree_index.out     |  12 +-
 src/test/regress/expected/select_parallel.out |   8 +-
 15 files changed, 357 insertions(+), 127 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 249d82d3a0..d16821f8e1 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -576,6 +576,7 @@ fileGetForeignPaths(PlannerInfo *root,
 			 create_foreignscan_path(root, baserel,
 									 NULL,	/* default pathtarget */
 									 baserel->rows,
+									 0,
 									 startup_cost,
 									 total_cost,
 									 NIL,	/* no pathkeys */
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index fc65d81e21..adc62576d1 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -430,6 +430,7 @@ static void estimate_path_cost_size(PlannerInfo *root,
 									List *pathkeys,
 									PgFdwPathExtraData *fpextra,
 									double *p_rows, int *p_width,
+									int *p_disabled_nodes,
 									Cost *p_startup_cost, Cost *p_total_cost);
 static void get_remote_estimate(const char *sql,
 								PGconn *conn,
@@ -442,6 +443,7 @@ static void adjust_foreign_grouping_path_cost(PlannerInfo *root,
 											  double retrieved_rows,
 											  double width,
 											  double limit_tuples,
+											  int *disabled_nodes,
 											  Cost *p_startup_cost,
 											  Cost *p_run_cost);
 static bool ec_member_matches_foreign(PlannerInfo *root, RelOptInfo *rel,
@@ -735,6 +737,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		 */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 
 		/* Report estimated baserel size to planner. */
@@ -765,6 +768,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		/* Fill in basically-bogus cost estimates for use later. */
 		estimate_path_cost_size(root, baserel, NIL, NIL, NULL,
 								&fpinfo->rows, &fpinfo->width,
+								&fpinfo->disabled_nodes,
 								&fpinfo->startup_cost, &fpinfo->total_cost);
 	}
 
@@ -1030,6 +1034,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 	path = create_foreignscan_path(root, baserel,
 								   NULL,	/* default pathtarget */
 								   fpinfo->rows,
+								   fpinfo->disabled_nodes,
 								   fpinfo->startup_cost,
 								   fpinfo->total_cost,
 								   NIL, /* no pathkeys */
@@ -1184,13 +1189,14 @@ postgresGetForeignPaths(PlannerInfo *root,
 		ParamPathInfo *param_info = (ParamPathInfo *) lfirst(lc);
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 
 		/* Get a cost estimate from the remote */
 		estimate_path_cost_size(root, baserel,
 								param_info->ppi_clauses, NIL, NULL,
-								&rows, &width,
+								&rows, &width, &disabled_nodes,
 								&startup_cost, &total_cost);
 
 		/*
@@ -1203,6 +1209,7 @@ postgresGetForeignPaths(PlannerInfo *root,
 		path = create_foreignscan_path(root, baserel,
 									   NULL,	/* default pathtarget */
 									   rows,
+									   disabled_nodes,
 									   startup_cost,
 									   total_cost,
 									   NIL, /* no pathkeys */
@@ -3079,7 +3086,7 @@ postgresExecForeignTruncate(List *rels,
  * final sort and the LIMIT restriction.
  *
  * The function returns the cost and size estimates in p_rows, p_width,
- * p_startup_cost and p_total_cost variables.
+ * p_disabled_nodes, p_startup_cost and p_total_cost variables.
  */
 static void
 estimate_path_cost_size(PlannerInfo *root,
@@ -3088,12 +3095,14 @@ estimate_path_cost_size(PlannerInfo *root,
 						List *pathkeys,
 						PgFdwPathExtraData *fpextra,
 						double *p_rows, int *p_width,
+						int *p_disabled_nodes,
 						Cost *p_startup_cost, Cost *p_total_cost)
 {
 	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
 	double		rows;
 	double		retrieved_rows;
 	int			width;
+	int			disabled_nodes = 0;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -3483,6 +3492,7 @@ estimate_path_cost_size(PlannerInfo *root,
 				adjust_foreign_grouping_path_cost(root, pathkeys,
 												  retrieved_rows, width,
 												  fpextra->limit_tuples,
+												  &disabled_nodes,
 												  &startup_cost, &run_cost);
 			}
 			else
@@ -3577,6 +3587,7 @@ estimate_path_cost_size(PlannerInfo *root,
 	/* Return results. */
 	*p_rows = rows;
 	*p_width = width;
+	*p_disabled_nodes = disabled_nodes;
 	*p_startup_cost = startup_cost;
 	*p_total_cost = total_cost;
 }
@@ -3637,6 +3648,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 								  double retrieved_rows,
 								  double width,
 								  double limit_tuples,
+								  int *p_disabled_nodes,
 								  Cost *p_startup_cost,
 								  Cost *p_run_cost)
 {
@@ -3656,6 +3668,7 @@ adjust_foreign_grouping_path_cost(PlannerInfo *root,
 		cost_sort(&sort_path,
 				  root,
 				  pathkeys,
+				  0,
 				  *p_startup_cost + *p_run_cost,
 				  retrieved_rows,
 				  width,
@@ -6147,13 +6160,15 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 	{
 		double		rows;
 		int			width;
+		int			disabled_nodes;
 		Cost		startup_cost;
 		Cost		total_cost;
 		List	   *useful_pathkeys = lfirst(lc);
 		Path	   *sorted_epq_path;
 
 		estimate_path_cost_size(root, rel, NIL, useful_pathkeys, NULL,
-								&rows, &width, &startup_cost, &total_cost);
+								&rows, &width, &disabled_nodes,
+								&startup_cost, &total_cost);
 
 		/*
 		 * The EPQ path must be at least as well sorted as the path itself, in
@@ -6175,6 +6190,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreignscan_path(root, rel,
 											 NULL,
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 useful_pathkeys,
@@ -6188,6 +6204,7 @@ add_paths_with_pathkeys_for_rel(PlannerInfo *root, RelOptInfo *rel,
 					 create_foreign_join_path(root, rel,
 											  NULL,
 											  rows,
+											  disabled_nodes,
 											  startup_cost,
 											  total_cost,
 											  useful_pathkeys,
@@ -6335,6 +6352,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 	ForeignPath *joinpath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	Path	   *epq_path;		/* Path to create plan to be executed when
@@ -6424,12 +6442,14 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 
 	/* Estimate costs for bare join relation */
 	estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	/* Now update this information in the joinrel */
 	joinrel->rows = rows;
 	joinrel->reltarget->width = width;
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6441,6 +6461,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
 										joinrel,
 										NULL,	/* default pathtarget */
 										rows,
+										disabled_nodes,
 										startup_cost,
 										total_cost,
 										NIL,	/* no pathkeys */
@@ -6768,6 +6789,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	ForeignPath *grouppath;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
@@ -6818,11 +6840,13 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the cost of push down */
 	estimate_path_cost_size(root, grouped_rel, NIL, NIL, NULL,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/* Now update this information in the fpinfo */
 	fpinfo->rows = rows;
 	fpinfo->width = width;
+	fpinfo->disabled_nodes = disabled_nodes;
 	fpinfo->startup_cost = startup_cost;
 	fpinfo->total_cost = total_cost;
 
@@ -6831,6 +6855,7 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										  grouped_rel,
 										  grouped_rel->reltarget,
 										  rows,
+										  disabled_nodes,
 										  startup_cost,
 										  total_cost,
 										  NIL,	/* no pathkeys */
@@ -6859,6 +6884,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	PgFdwPathExtraData *fpextra;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -6952,7 +6978,8 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Estimate the costs of performing the final sort remotely */
 	estimate_path_cost_size(root, input_rel, NIL, root->sort_pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 
 	/*
 	 * Build the fdw_private list that will be used by postgresGetForeignPlan.
@@ -6965,6 +6992,7 @@ add_foreign_ordered_paths(PlannerInfo *root, RelOptInfo *input_rel,
 											 input_rel,
 											 root->upper_targets[UPPERREL_ORDERED],
 											 rows,
+											 disabled_nodes,
 											 startup_cost,
 											 total_cost,
 											 root->sort_pathkeys,
@@ -6998,6 +7026,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 	bool		save_use_remote_estimate = false;
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 	List	   *fdw_private;
@@ -7082,6 +7111,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 													   path->parent,
 													   path->pathtarget,
 													   path->rows,
+													   path->disabled_nodes,
 													   path->startup_cost,
 													   path->total_cost,
 													   path->pathkeys,
@@ -7199,7 +7229,8 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 		ifpinfo->use_remote_estimate = false;
 	}
 	estimate_path_cost_size(root, input_rel, NIL, pathkeys, fpextra,
-							&rows, &width, &startup_cost, &total_cost);
+							&rows, &width, &disabled_nodes,
+							&startup_cost, &total_cost);
 	if (!fpextra->has_final_sort)
 		ifpinfo->use_remote_estimate = save_use_remote_estimate;
 
@@ -7218,6 +7249,7 @@ add_foreign_final_paths(PlannerInfo *root, RelOptInfo *input_rel,
 										   input_rel,
 										   root->upper_targets[UPPERREL_FINAL],
 										   rows,
+										   disabled_nodes,
 										   startup_cost,
 										   total_cost,
 										   pathkeys,
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 37c1575af6..9e501660d1 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -62,6 +62,7 @@ typedef struct PgFdwRelationInfo
 	/* Estimated size and cost for a scan, join, or grouping/aggregation. */
 	double		rows;
 	int			width;
+	int			disabled_nodes;
 	Cost		startup_cost;
 	Cost		total_cost;
 
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 79991b1980..e1523d15df 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -50,6 +50,17 @@
  * so beware of division-by-zero.)	The LIMIT is applied as a top-level
  * plan node.
  *
+ * Each path stores the total number of disabled nodes that exist at or
+ * below that point in the plan tree. This is regarded as a component of
+ * the cost, and paths with fewer disabled nodes should be regarded as
+ * cheaper than those with more. Disabled nodes occur when the user sets
+ * a GUC like enable_seqscan=false. We can't necessarily respect such a
+ * setting in every part of the plan tree, but we want to respect it in as many
+ * parts of the plan tree as possible. Simpler schemes like storing a Boolean
+ * here rather than a count fail to do that. We used to disable nodes by
+ * adding a large constant to the startup cost, but that distorted planning
+ * in other ways.
+ *
  * For largely historical reasons, most of the routines in this module use
  * the passed result Path only to store their results (rows, startup_cost and
  * total_cost) into.  All the input data they need is passed as separate
@@ -301,9 +312,6 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_seqscan)
-		startup_cost += disable_cost;
-
 	/* fetch estimated page cost for tablespace containing table */
 	get_tablespace_page_costs(baserel->reltablespace,
 							  NULL,
@@ -346,6 +354,7 @@ cost_seqscan(Path *path, PlannerInfo *root,
 		path->rows = clamp_row_est(path->rows / parallel_divisor);
 	}
 
+	path->disabled_nodes = enable_seqscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + cpu_run_cost + disk_run_cost;
 }
@@ -418,6 +427,7 @@ cost_samplescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -456,6 +466,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows;
 
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = startup_cost;
 	path->path.total_cost = (startup_cost + run_cost);
 }
@@ -473,6 +484,7 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 void
 cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 				  RelOptInfo *rel, ParamPathInfo *param_info,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double *rows)
 {
@@ -490,9 +502,6 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	else
 		path->path.rows = rel->rows;
 
-	if (!enable_gathermerge)
-		startup_cost += disable_cost;
-
 	/*
 	 * Add one to the number of workers to account for the leader.  This might
 	 * be overgenerous since the leader will do less work than other workers
@@ -523,6 +532,8 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	startup_cost += parallel_setup_cost;
 	run_cost += parallel_tuple_cost * path->path.rows * 1.05;
 
+	path->path.disabled_nodes = input_disabled_nodes
+		+ (enable_gathermerge ? 0 : 1);
 	path->path.startup_cost = startup_cost + input_startup_cost;
 	path->path.total_cost = (startup_cost + run_cost + input_total_cost);
 }
@@ -603,9 +614,8 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
 											  path->indexclauses);
 	}
 
-	if (!enable_indexscan)
-		startup_cost += disable_cost;
 	/* we don't need to check enable_indexonlyscan; indxpath.c does that */
+	path->path.disabled_nodes = enable_indexscan ? 0 : 1;
 
 	/*
 	 * Call index-access-method-specific code to estimate the processing cost
@@ -1038,9 +1048,6 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	else
 		path->rows = baserel->rows;
 
-	if (!enable_bitmapscan)
-		startup_cost += disable_cost;
-
 	pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
 										 loop_count, &indexTotalCost,
 										 &tuples_fetched);
@@ -1102,6 +1109,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = enable_bitmapscan ? 0 : 1;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1187,6 +1195,7 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
 	}
 	path->bitmapselectivity = selec;
 	path->path.rows = 0;		/* per above, not used */
+	path->path.disabled_nodes = 0;
 	path->path.startup_cost = totalCost;
 	path->path.total_cost = totalCost;
 }
@@ -1261,6 +1270,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	/* Should only be applied to base relations */
 	Assert(baserel->relid > 0);
 	Assert(baserel->rtekind == RTE_RELATION);
+	Assert(tidquals != NIL);
 
 	/* Mark the path with the correct row estimate */
 	if (param_info)
@@ -1275,6 +1285,14 @@ cost_tidscan(Path *path, PlannerInfo *root,
 		RestrictInfo *rinfo = lfirst_node(RestrictInfo, l);
 		Expr	   *qual = rinfo->clause;
 
+		/*
+		 * We must use a TID scan for CurrentOfExpr; in any other case, we
+		 * should be generating a TID scan only if enable_tidscan=true. Also,
+		 * if CurrentOfExpr is the qual, there should be only one.
+		 */
+		Assert(enable_tidscan || IsA(qual, CurrentOfExpr));
+		Assert(list_length(tidquals) == 1 || !IsA(qual, CurrentOfExpr));
+
 		if (IsA(qual, ScalarArrayOpExpr))
 		{
 			/* Each element of the array yields 1 tuple */
@@ -1322,6 +1340,12 @@ cost_tidscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/*
+	 * There are assertions above verifying that we only reach this function
+	 * either when enable_tidscan=true or when the TID scan is the only legal
+	 * path, so it's safe to set disabled_nodes to zero here.
+	 */
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1414,6 +1438,9 @@ cost_tidrangescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	/* we should not generate this path type when enable_tidscan=false */
+	Assert(enable_tidscan);
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1466,6 +1493,7 @@ cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
 	 * SubqueryScan node, plus cpu_tuple_cost to account for selection and
 	 * projection overhead.
 	 */
+	path->path.disabled_nodes = path->subpath->disabled_nodes;
 	path->path.startup_cost = path->subpath->startup_cost;
 	path->path.total_cost = path->subpath->total_cost;
 
@@ -1556,6 +1584,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1612,6 +1641,7 @@ cost_tablefuncscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1659,6 +1689,7 @@ cost_valuesscan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1706,6 +1737,7 @@ cost_ctescan(Path *path, PlannerInfo *root,
 	startup_cost += path->pathtarget->cost.startup;
 	run_cost += path->pathtarget->cost.per_tuple * path->rows;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1743,6 +1775,7 @@ cost_namedtuplestorescan(Path *path, PlannerInfo *root,
 	cpu_per_tuple += cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1777,6 +1810,7 @@ cost_resultscan(Path *path, PlannerInfo *root,
 	cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
 	run_cost += cpu_per_tuple * baserel->tuples;
 
+	path->disabled_nodes = 0;
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -1816,6 +1850,7 @@ cost_recursive_union(Path *runion, Path *nrterm, Path *rterm)
 	 */
 	total_cost += cpu_tuple_cost * total_rows;
 
+	runion->disabled_nodes = nrterm->disabled_nodes + rterm->disabled_nodes;
 	runion->startup_cost = startup_cost;
 	runion->total_cost = total_cost;
 	runion->rows = total_rows;
@@ -1964,6 +1999,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
 void
 cost_incremental_sort(Path *path,
 					  PlannerInfo *root, List *pathkeys, int presorted_keys,
+					  int input_disabled_nodes,
 					  Cost input_startup_cost, Cost input_total_cost,
 					  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 					  double limit_tuples)
@@ -2083,6 +2119,11 @@ cost_incremental_sort(Path *path,
 	run_cost += 2.0 * cpu_tuple_cost * input_groups;
 
 	path->rows = input_tuples;
+
+	/* should not generate these paths when enable_incremental_sort=false */
+	Assert(enable_incremental_sort);
+	path->disabled_nodes = input_disabled_nodes;
+
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2101,7 +2142,8 @@ cost_incremental_sort(Path *path,
  */
 void
 cost_sort(Path *path, PlannerInfo *root,
-		  List *pathkeys, Cost input_cost, double tuples, int width,
+		  List *pathkeys, int input_disabled_nodes,
+		  Cost input_cost, double tuples, int width,
 		  Cost comparison_cost, int sort_mem,
 		  double limit_tuples)
 
@@ -2114,12 +2156,10 @@ cost_sort(Path *path, PlannerInfo *root,
 				   comparison_cost, sort_mem,
 				   limit_tuples);
 
-	if (!enable_sort)
-		startup_cost += disable_cost;
-
 	startup_cost += input_cost;
 
 	path->rows = tuples;
+	path->disabled_nodes = input_disabled_nodes + (enable_sort ? 0 : 1);
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2211,6 +2251,7 @@ cost_append(AppendPath *apath)
 {
 	ListCell   *l;
 
+	apath->path.disabled_nodes = 0;
 	apath->path.startup_cost = 0;
 	apath->path.total_cost = 0;
 	apath->path.rows = 0;
@@ -2232,12 +2273,16 @@ cost_append(AppendPath *apath)
 			 */
 			apath->path.startup_cost = firstsubpath->startup_cost;
 
-			/* Compute rows and costs as sums of subplan rows and costs. */
+			/*
+			 * Compute rows, number of disabled nodes, and total cost as sums
+			 * of underlying subplan values.
+			 */
 			foreach(l, apath->subpaths)
 			{
 				Path	   *subpath = (Path *) lfirst(l);
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.total_cost += subpath->total_cost;
 			}
 		}
@@ -2277,6 +2322,7 @@ cost_append(AppendPath *apath)
 					cost_sort(&sort_path,
 							  NULL, /* doesn't currently need root */
 							  pathkeys,
+							  subpath->disabled_nodes,
 							  subpath->total_cost,
 							  subpath->rows,
 							  subpath->pathtarget->width,
@@ -2287,6 +2333,7 @@ cost_append(AppendPath *apath)
 				}
 
 				apath->path.rows += subpath->rows;
+				apath->path.disabled_nodes += subpath->disabled_nodes;
 				apath->path.startup_cost += subpath->startup_cost;
 				apath->path.total_cost += subpath->total_cost;
 			}
@@ -2335,6 +2382,7 @@ cost_append(AppendPath *apath)
 				apath->path.total_cost += subpath->total_cost;
 			}
 
+			apath->path.disabled_nodes += subpath->disabled_nodes;
 			apath->path.rows = clamp_row_est(apath->path.rows);
 
 			i++;
@@ -2375,6 +2423,7 @@ cost_append(AppendPath *apath)
  *
  * 'pathkeys' is a list of sort keys
  * 'n_streams' is the number of input streams
+ * 'input_disabled_nodes' is the sum of the input streams' disabled node counts
  * 'input_startup_cost' is the sum of the input streams' startup costs
  * 'input_total_cost' is the sum of the input streams' total costs
  * 'tuples' is the number of tuples in all the streams
@@ -2382,6 +2431,7 @@ cost_append(AppendPath *apath)
 void
 cost_merge_append(Path *path, PlannerInfo *root,
 				  List *pathkeys, int n_streams,
+				  int input_disabled_nodes,
 				  Cost input_startup_cost, Cost input_total_cost,
 				  double tuples)
 {
@@ -2412,6 +2462,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
 	 */
 	run_cost += cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER * tuples;
 
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost + input_startup_cost;
 	path->total_cost = startup_cost + run_cost + input_total_cost;
 }
@@ -2430,6 +2481,7 @@ cost_merge_append(Path *path, PlannerInfo *root,
  */
 void
 cost_material(Path *path,
+			  int input_disabled_nodes,
 			  Cost input_startup_cost, Cost input_total_cost,
 			  double tuples, int width)
 {
@@ -2467,6 +2519,7 @@ cost_material(Path *path,
 		run_cost += seq_page_cost * npages;
 	}
 
+	path->disabled_nodes = input_disabled_nodes + (enable_material ? 0 : 1);
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
 }
@@ -2630,6 +2683,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 		 int numGroupCols, double numGroups,
 		 List *quals,
+		 int disabled_nodes,
 		 Cost input_startup_cost, Cost input_total_cost,
 		 double input_tuples, double input_width)
 {
@@ -2685,10 +2739,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		startup_cost = input_startup_cost;
 		total_cost = input_total_cost;
 		if (aggstrategy == AGG_MIXED && !enable_hashagg)
-		{
-			startup_cost += disable_cost;
-			total_cost += disable_cost;
-		}
+			++disabled_nodes;
 		/* calcs phrased this way to match HASHED case, see note above */
 		total_cost += aggcosts->transCost.startup;
 		total_cost += aggcosts->transCost.per_tuple * input_tuples;
@@ -2703,7 +2754,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		/* must be AGG_HASHED */
 		startup_cost = input_total_cost;
 		if (!enable_hashagg)
-			startup_cost += disable_cost;
+			++disabled_nodes;
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		/* cost of computing hash value */
@@ -2812,6 +2863,7 @@ cost_agg(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3046,6 +3098,7 @@ get_windowclause_startup_tuples(PlannerInfo *root, WindowClause *wc,
 void
 cost_windowagg(Path *path, PlannerInfo *root,
 			   List *windowFuncs, WindowClause *winclause,
+			   int input_disabled_nodes,
 			   Cost input_startup_cost, Cost input_total_cost,
 			   double input_tuples)
 {
@@ -3111,6 +3164,7 @@ cost_windowagg(Path *path, PlannerInfo *root,
 	total_cost += cpu_tuple_cost * input_tuples;
 
 	path->rows = input_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 
@@ -3142,6 +3196,7 @@ void
 cost_group(Path *path, PlannerInfo *root,
 		   int numGroupCols, double numGroups,
 		   List *quals,
+		   int input_disabled_nodes,
 		   Cost input_startup_cost, Cost input_total_cost,
 		   double input_tuples)
 {
@@ -3180,6 +3235,7 @@ cost_group(Path *path, PlannerInfo *root,
 	}
 
 	path->rows = output_tuples;
+	path->disabled_nodes = input_disabled_nodes;
 	path->startup_cost = startup_cost;
 	path->total_cost = total_cost;
 }
@@ -3214,6 +3270,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  Path *outer_path, Path *inner_path,
 					  JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3222,6 +3279,11 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Cost		inner_run_cost;
 	Cost		inner_rescan_run_cost;
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_nestloop ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* estimate costs to rescan the inner relation */
 	cost_rescan(root, inner_path,
 				&inner_rescan_start_cost,
@@ -3269,6 +3331,7 @@ initial_cost_nestloop(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_nestloop */
@@ -3298,6 +3361,9 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	QualCost	restrict_qual_cost;
 	double		ntuples;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (outer_path_rows <= 0)
 		outer_path_rows = 1;
@@ -3318,14 +3384,6 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_nestloop)
-		startup_cost += disable_cost;
-
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
 	if (path->jpath.jointype == JOIN_SEMI || path->jpath.jointype == JOIN_ANTI ||
@@ -3497,6 +3555,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					   List *outersortkeys, List *innersortkeys,
 					   JoinPathExtraData *extra)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -3617,6 +3676,8 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	Assert(outerstartsel <= outerendsel);
 	Assert(innerstartsel <= innerendsel);
 
+	disabled_nodes = enable_mergejoin ? 0 : 1;
+
 	/* cost of source data */
 
 	if (outersortkeys)			/* do we need to sort outer? */
@@ -3624,12 +3685,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  outersortkeys,
+				  outer_path->disabled_nodes,
 				  outer_path->total_cost,
 				  outer_path_rows,
 				  outer_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* outerstartsel;
@@ -3638,6 +3701,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += outer_path->disabled_nodes;
 		startup_cost += outer_path->startup_cost;
 		startup_cost += (outer_path->total_cost - outer_path->startup_cost)
 			* outerstartsel;
@@ -3650,12 +3714,14 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 		cost_sort(&sort_path,
 				  root,
 				  innersortkeys,
+				  inner_path->disabled_nodes,
 				  inner_path->total_cost,
 				  inner_path_rows,
 				  inner_path->pathtarget->width,
 				  0.0,
 				  work_mem,
 				  -1.0);
+		disabled_nodes += sort_path.disabled_nodes;
 		startup_cost += sort_path.startup_cost;
 		startup_cost += (sort_path.total_cost - sort_path.startup_cost)
 			* innerstartsel;
@@ -3664,6 +3730,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	}
 	else
 	{
+		disabled_nodes += inner_path->disabled_nodes;
 		startup_cost += inner_path->startup_cost;
 		startup_cost += (inner_path->total_cost - inner_path->startup_cost)
 			* innerstartsel;
@@ -3682,6 +3749,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost + inner_run_cost;
 	/* Save private data for final_cost_mergejoin */
@@ -3746,6 +3814,9 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 				rescannedtuples;
 	double		rescanratio;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
@@ -3765,14 +3836,6 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_mergejoin)
-		startup_cost += disable_cost;
-
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
 	 * separately.
@@ -4056,6 +4119,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 					  JoinPathExtraData *extra,
 					  bool parallel_hash)
 {
+	int			disabled_nodes;
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
 	double		outer_path_rows = outer_path->rows;
@@ -4067,6 +4131,11 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	int			num_skew_mcvs;
 	size_t		space_allowed;	/* unused */
 
+	/* Count up disabled nodes. */
+	disabled_nodes = enable_hashjoin ? 0 : 1;
+	disabled_nodes += inner_path->disabled_nodes;
+	disabled_nodes += outer_path->disabled_nodes;
+
 	/* cost of source data */
 	startup_cost += outer_path->startup_cost;
 	run_cost += outer_path->total_cost - outer_path->startup_cost;
@@ -4136,6 +4205,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 	/* CPU costs left for later */
 
 	/* Public result fields */
+	workspace->disabled_nodes = disabled_nodes;
 	workspace->startup_cost = startup_cost;
 	workspace->total_cost = startup_cost + run_cost;
 	/* Save private data for final_cost_hashjoin */
@@ -4180,6 +4250,9 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	Selectivity innermcvfreq;
 	ListCell   *hcl;
 
+	/* Set the number of disabled nodes. */
+	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
+
 	/* Mark the path with the correct row estimate */
 	if (path->jpath.path.param_info)
 		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
@@ -4195,14 +4268,6 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 			clamp_row_est(path->jpath.path.rows / parallel_divisor);
 	}
 
-	/*
-	 * We could include disable_cost in the preliminary estimate, but that
-	 * would amount to optimizing for the case where the join method is
-	 * disabled, which doesn't seem like the way to bet.
-	 */
-	if (!enable_hashjoin)
-		startup_cost += disable_cost;
-
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
 
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index e858f59600..b0e8c94dfc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -915,7 +915,7 @@ try_nestloop_path(PlannerInfo *root,
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -999,7 +999,8 @@ try_partial_nestloop_path(PlannerInfo *root,
 	 */
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -1096,7 +1097,7 @@ try_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  pathkeys, required_outer))
 	{
@@ -1168,7 +1169,8 @@ try_partial_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -1237,7 +1239,7 @@ try_hashjoin_path(PlannerInfo *root,
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, false);
 
-	if (add_path_precheck(joinrel,
+	if (add_path_precheck(joinrel, workspace.disabled_nodes,
 						  workspace.startup_cost, workspace.total_cost,
 						  NIL, required_outer))
 	{
@@ -1298,7 +1300,8 @@ try_partial_hashjoin_path(PlannerInfo *root,
 	 */
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, parallel_hash);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
+	if (!add_partial_path_precheck(joinrel, workspace.disabled_nodes,
+								   workspace.total_cost, NIL))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index fe5a323cfd..43123e6ce2 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -5451,6 +5451,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
 			  lefttree->total_cost,
+			  0,				/* a Plan contains no count of disabled nodes */
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6545,6 +6546,7 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
+				  0,			/* a Plan contains no count of disabled nodes */
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 948afd9094..b5827d3980 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6748,6 +6748,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
 	/* Estimate the cost of seq scan + sort */
 	seqScanPath = create_seqscan_path(root, rel, NULL, 0);
 	cost_sort(&seqScanAndSortPath, root, NIL,
+			  seqScanPath->disabled_nodes,
 			  seqScanPath->total_cost, rel->tuples, rel->reltarget->width,
 			  comparisonCost, maintenance_work_mem, -1.0);
 
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 1c69c6e97e..a0baf6d4a1 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1346,6 +1346,7 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	cost_agg(&hashed_p, root, AGG_HASHED, NULL,
 			 numGroupCols, dNumGroups,
 			 NIL,
+			 input_path->disabled_nodes,
 			 input_path->startup_cost, input_path->total_cost,
 			 input_path->rows, input_path->pathtarget->width);
 
@@ -1353,14 +1354,17 @@ choose_hashed_setop(PlannerInfo *root, List *groupClauses,
 	 * Now for the sorted case.  Note that the input is *always* unsorted,
 	 * since it was made by appending unrelated sub-relations together.
 	 */
+	sorted_p.disabled_nodes = input_path->disabled_nodes;
 	sorted_p.startup_cost = input_path->startup_cost;
 	sorted_p.total_cost = input_path->total_cost;
 	/* XXX cost_sort doesn't actually look at pathkeys, so just pass NIL */
-	cost_sort(&sorted_p, root, NIL, sorted_p.total_cost,
+	cost_sort(&sorted_p, root, NIL, sorted_p.disabled_nodes,
+			  sorted_p.total_cost,
 			  input_path->rows, input_path->pathtarget->width,
 			  0.0, work_mem, -1.0);
 	cost_group(&sorted_p, root, numGroupCols, dNumGroups,
 			   NIL,
+			   sorted_p.disabled_nodes,
 			   sorted_p.startup_cost, sorted_p.total_cost,
 			   input_path->rows);
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 54e042a8a5..fc97bf6ee2 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -68,6 +68,15 @@ static bool pathlist_is_reparameterizable_by_child(List *pathlist,
 int
 compare_path_costs(Path *path1, Path *path2, CostSelector criterion)
 {
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (criterion == STARTUP_COST)
 	{
 		if (path1->startup_cost < path2->startup_cost)
@@ -118,6 +127,15 @@ compare_fractional_path_costs(Path *path1, Path *path2,
 	Cost		cost1,
 				cost2;
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return -1;
+		else
+			return +1;
+	}
+
 	if (fraction <= 0.0 || fraction >= 1.0)
 		return compare_path_costs(path1, path2, TOTAL_COST);
 	cost1 = path1->startup_cost +
@@ -166,6 +184,15 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
 #define CONSIDER_PATH_STARTUP_COST(p)  \
 	((p)->param_info == NULL ? (p)->parent->consider_startup : (p)->parent->consider_param_startup)
 
+	/* Number of disabled nodes, if different, trumps all else. */
+	if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
+	{
+		if (path1->disabled_nodes < path2->disabled_nodes)
+			return COSTS_BETTER1;
+		else
+			return COSTS_BETTER2;
+	}
+
 	/*
 	 * Check total cost first since it's more likely to be different; many
 	 * paths have zero startup cost.
@@ -362,15 +389,29 @@ set_cheapest(RelOptInfo *parent_rel)
  * add_path
  *	  Consider a potential implementation path for the specified parent rel,
  *	  and add it to the rel's pathlist if it is worthy of consideration.
+ *
  *	  A path is worthy if it has a better sort order (better pathkeys) or
- *	  cheaper cost (on either dimension), or generates fewer rows, than any
- *	  existing path that has the same or superset parameterization rels.
- *	  We also consider parallel-safe paths more worthy than others.
+ *	  cheaper cost (as defined below), or generates fewer rows, than any
+ *    existing path that has the same or superset parameterization rels.  We
+ *    also consider parallel-safe paths more worthy than others.
+ *
+ *    Cheaper cost can mean either a cheaper total cost or a cheaper startup
+ *    cost; if one path is cheaper in one of these aspects and another is
+ *    cheaper in the other, we keep both. However, when some path type is
+ *    disabled (e.g. due to enable_seqscan=false), the number of times that
+ *    a disabled path type is used is considered to be a higher-order
+ *    component of the cost. Hence, if path A uses no disabled path type,
+ *    and path B uses 1 or more disabled path types, A is cheaper, no matter
+ *    what we estimate for the startup and total costs. The startup and total
+ *    cost essentially act as a tiebreak when comparing paths that use equal
+ *    numbers of disabled path nodes; but in practice this tiebreak is almost
+ *    always used, since normally no path types are disabled.
  *
- *	  We also remove from the rel's pathlist any old paths that are dominated
- *	  by new_path --- that is, new_path is cheaper, at least as well ordered,
- *	  generates no more rows, requires no outer rels not required by the old
- *	  path, and is no less parallel-safe.
+ *	  In addition to possibly adding new_path, we also remove from the rel's
+ *    pathlist any old paths that are dominated by new_path --- that is,
+ *    new_path is cheaper, at least as well ordered, generates no more rows,
+ *    requires no outer rels not required by the old path, and is no less
+ *    parallel-safe.
  *
  *	  In most cases, a path with a superset parameterization will generate
  *	  fewer rows (since it has more join clauses to apply), so that those two
@@ -389,10 +430,10 @@ set_cheapest(RelOptInfo *parent_rel)
  *	  parent_rel->consider_param_startup is true for a parameterized one.
  *	  Again, this allows discarding useless paths sooner.
  *
- *	  The pathlist is kept sorted by total_cost, with cheaper paths
- *	  at the front.  Within this routine, that's simply a speed hack:
- *	  doing it that way makes it more likely that we will reject an inferior
- *	  path after a few comparisons, rather than many comparisons.
+ *	  The pathlist is kept sorted by disabled_nodes and then by total_cost,
+ *    with cheaper paths at the front.  Within this routine, that's simply a
+ *    speed hack: doing it that way makes it more likely that we will reject
+ *    an inferior path after a few comparisons, rather than many comparisons.
  *	  However, add_path_precheck relies on this ordering to exit early
  *	  when possible.
  *
@@ -593,8 +634,13 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
 		}
 		else
 		{
-			/* new belongs after this old path if it has cost >= old's */
-			if (new_path->total_cost >= old_path->total_cost)
+			/*
+			 * new belongs after this old path if it has more disabled nodes
+			 * or if it has the same number of nodes but a greater total cost
+			 */
+			if (new_path->disabled_nodes > old_path->disabled_nodes ||
+				(new_path->disabled_nodes == old_path->disabled_nodes &&
+				 new_path->total_cost >= old_path->total_cost))
 				insert_at = foreach_current_index(p1) + 1;
 		}
 
@@ -639,7 +685,7 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
  * so the required information has to be passed piecemeal.
  */
 bool
-add_path_precheck(RelOptInfo *parent_rel,
+add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 				  Cost startup_cost, Cost total_cost,
 				  List *pathkeys, Relids required_outer)
 {
@@ -658,6 +704,20 @@ add_path_precheck(RelOptInfo *parent_rel,
 		Path	   *old_path = (Path *) lfirst(p1);
 		PathKeysComparison keyscmp;
 
+		/*
+		 * Since the pathlist is sorted by disabled_nodes and then by
+		 * total_cost, we can stop looking once we reach a path with more
+		 * disabled nodes, or the same number of disabled nodes plus a
+		 * total_cost larger than the new path's.
+		 */
+		if (unlikely(old_path->disabled_nodes != disabled_nodes))
+		{
+			if (disabled_nodes < old_path->disabled_nodes)
+				break;
+		}
+		else if (total_cost <= old_path->total_cost * STD_FUZZ_FACTOR)
+			break;
+
 		/*
 		 * We are looking for an old_path with the same parameterization (and
 		 * by assumption the same rowcount) that dominates the new path on
@@ -666,39 +726,27 @@ add_path_precheck(RelOptInfo *parent_rel,
 		 *
 		 * Cost comparisons here should match compare_path_costs_fuzzily.
 		 */
-		if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+		/* new path can win on startup cost only if consider_startup */
+		if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
+			!consider_startup)
 		{
-			/* new path can win on startup cost only if consider_startup */
-			if (startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR ||
-				!consider_startup)
+			/* new path loses on cost, so check pathkeys... */
+			List	   *old_path_pathkeys;
+
+			old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
+			keyscmp = compare_pathkeys(new_path_pathkeys,
+									   old_path_pathkeys);
+			if (keyscmp == PATHKEYS_EQUAL ||
+				keyscmp == PATHKEYS_BETTER2)
 			{
-				/* new path loses on cost, so check pathkeys... */
-				List	   *old_path_pathkeys;
-
-				old_path_pathkeys = old_path->param_info ? NIL : old_path->pathkeys;
-				keyscmp = compare_pathkeys(new_path_pathkeys,
-										   old_path_pathkeys);
-				if (keyscmp == PATHKEYS_EQUAL ||
-					keyscmp == PATHKEYS_BETTER2)
+				/* new path does not win on pathkeys... */
+				if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
 				{
-					/* new path does not win on pathkeys... */
-					if (bms_equal(required_outer, PATH_REQ_OUTER(old_path)))
-					{
-						/* Found an old path that dominates the new one */
-						return false;
-					}
+					/* Found an old path that dominates the new one */
+					return false;
 				}
 			}
 		}
-		else
-		{
-			/*
-			 * Since the pathlist is sorted by total_cost, we can stop looking
-			 * once we reach a path with a total_cost larger than the new
-			 * path's.
-			 */
-			break;
-		}
 	}
 
 	return true;
@@ -734,7 +782,7 @@ add_path_precheck(RelOptInfo *parent_rel,
  *	  produce the same number of rows.  Neither do we need to consider startup
  *	  costs: parallelism is only used for plans that will be run to completion.
  *	  Therefore, this routine is much simpler than add_path: it needs to
- *	  consider only pathkeys and total cost.
+ *	  consider only disabled nodes, pathkeys and total cost.
  *
  *	  As with add_path, we pfree paths that are found to be dominated by
  *	  another partial path; this requires that there be no other references to
@@ -775,7 +823,15 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Unless pathkeys are incompatible, keep just one of the two paths. */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+			if (unlikely(new_path->disabled_nodes != old_path->disabled_nodes))
+			{
+				if (new_path->disabled_nodes > old_path->disabled_nodes)
+					accept_new = false;
+				else
+					remove_old = true;
+			}
+			else if (new_path->total_cost > old_path->total_cost
+					 * STD_FUZZ_FACTOR)
 			{
 				/* New path costs more; keep it only if pathkeys are better. */
 				if (keyscmp != PATHKEYS_BETTER1)
@@ -862,8 +918,8 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
  * is surely a loser.
  */
 bool
-add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
-						  List *pathkeys)
+add_partial_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
+						  Cost total_cost, List *pathkeys)
 {
 	ListCell   *p1;
 
@@ -906,8 +962,8 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 	 * partial path; the resulting plans, if run in parallel, will be run to
 	 * completion.
 	 */
-	if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
-						   NULL))
+	if (!add_path_precheck(parent_rel, disabled_nodes, total_cost, total_cost,
+						   pathkeys, NULL))
 		return false;
 
 	return true;
@@ -1419,6 +1475,7 @@ create_merge_append_path(PlannerInfo *root,
 						 Relids required_outer)
 {
 	MergeAppendPath *pathnode = makeNode(MergeAppendPath);
+	int			input_disabled_nodes;
 	Cost		input_startup_cost;
 	Cost		input_total_cost;
 	ListCell   *l;
@@ -1452,6 +1509,7 @@ create_merge_append_path(PlannerInfo *root,
 	 * Add up the sizes and costs of the input paths.
 	 */
 	pathnode->path.rows = 0;
+	input_disabled_nodes = 0;
 	input_startup_cost = 0;
 	input_total_cost = 0;
 	foreach(l, subpaths)
@@ -1468,6 +1526,7 @@ create_merge_append_path(PlannerInfo *root,
 		if (pathkeys_contained_in(pathkeys, subpath->pathkeys))
 		{
 			/* Subpath is adequately ordered, we won't need to sort it */
+			input_disabled_nodes += subpath->disabled_nodes;
 			input_startup_cost += subpath->startup_cost;
 			input_total_cost += subpath->total_cost;
 		}
@@ -1479,12 +1538,14 @@ create_merge_append_path(PlannerInfo *root,
 			cost_sort(&sort_path,
 					  root,
 					  pathkeys,
+					  subpath->disabled_nodes,
 					  subpath->total_cost,
 					  subpath->rows,
 					  subpath->pathtarget->width,
 					  0.0,
 					  work_mem,
 					  pathnode->limit_tuples);
+			input_disabled_nodes += sort_path.disabled_nodes;
 			input_startup_cost += sort_path.startup_cost;
 			input_total_cost += sort_path.total_cost;
 		}
@@ -1500,12 +1561,14 @@ create_merge_append_path(PlannerInfo *root,
 		((Path *) linitial(subpaths))->parallel_aware ==
 		pathnode->path.parallel_aware)
 	{
+		pathnode->path.disabled_nodes = input_disabled_nodes;
 		pathnode->path.startup_cost = input_startup_cost;
 		pathnode->path.total_cost = input_total_cost;
 	}
 	else
 		cost_merge_append(&pathnode->path, root,
 						  pathkeys, list_length(subpaths),
+						  input_disabled_nodes,
 						  input_startup_cost, input_total_cost,
 						  pathnode->path.rows);
 
@@ -1587,6 +1650,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
 	pathnode->subpath = subpath;
 
 	cost_material(&pathnode->path,
+				  subpath->disabled_nodes,
 				  subpath->startup_cost,
 				  subpath->total_cost,
 				  subpath->rows,
@@ -1633,6 +1697,10 @@ create_memoize_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	 */
 	pathnode->est_entries = 0;
 
+	/* we should not generate this path type when enable_memoize=false */
+	Assert(enable_memoize);
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
+
 	/*
 	 * Add a small additional charge for caching the first entry.  All the
 	 * harder calculations for rescans are performed in cost_memoize_rescan().
@@ -1732,6 +1800,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	{
 		pathnode->umethod = UNIQUE_PATH_NOOP;
 		pathnode->path.rows = rel->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost;
 		pathnode->path.total_cost = subpath->total_cost;
 		pathnode->path.pathkeys = subpath->pathkeys;
@@ -1770,6 +1839,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 			{
 				pathnode->umethod = UNIQUE_PATH_NOOP;
 				pathnode->path.rows = rel->rows;
+				pathnode->path.disabled_nodes = subpath->disabled_nodes;
 				pathnode->path.startup_cost = subpath->startup_cost;
 				pathnode->path.total_cost = subpath->total_cost;
 				pathnode->path.pathkeys = subpath->pathkeys;
@@ -1797,6 +1867,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 		 * Estimate cost for sort+unique implementation
 		 */
 		cost_sort(&sort_path, root, NIL,
+				  subpath->disabled_nodes,
 				  subpath->total_cost,
 				  rel->rows,
 				  subpath->pathtarget->width,
@@ -1834,6 +1905,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 					 AGG_HASHED, NULL,
 					 numCols, pathnode->path.rows,
 					 NIL,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 rel->rows,
@@ -1842,7 +1914,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (sjinfo->semi_can_btree && sjinfo->semi_can_hash)
 	{
-		if (agg_path.total_cost < sort_path.total_cost)
+		if (agg_path.disabled_nodes < sort_path.disabled_nodes ||
+			(agg_path.disabled_nodes == sort_path.disabled_nodes &&
+			 agg_path.total_cost < sort_path.total_cost))
 			pathnode->umethod = UNIQUE_PATH_HASH;
 		else
 			pathnode->umethod = UNIQUE_PATH_SORT;
@@ -1860,11 +1934,13 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 
 	if (pathnode->umethod == UNIQUE_PATH_HASH)
 	{
+		pathnode->path.disabled_nodes = agg_path.disabled_nodes;
 		pathnode->path.startup_cost = agg_path.startup_cost;
 		pathnode->path.total_cost = agg_path.total_cost;
 	}
 	else
 	{
+		pathnode->path.disabled_nodes = sort_path.disabled_nodes;
 		pathnode->path.startup_cost = sort_path.startup_cost;
 		pathnode->path.total_cost = sort_path.total_cost;
 	}
@@ -1888,6 +1964,7 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 						 Relids required_outer, double *rows)
 {
 	GatherMergePath *pathnode = makeNode(GatherMergePath);
+	int			input_disabled_nodes = 0;
 	Cost		input_startup_cost = 0;
 	Cost		input_total_cost = 0;
 
@@ -1915,11 +1992,13 @@ create_gather_merge_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->path.pathtarget = target ? target : rel->reltarget;
 
+	input_disabled_nodes += subpath->disabled_nodes;
 	input_startup_cost += subpath->startup_cost;
 	input_total_cost += subpath->total_cost;
 
 	cost_gather_merge(pathnode, root, rel, pathnode->path.param_info,
-					  input_startup_cost, input_total_cost, rows);
+					  input_disabled_nodes, input_startup_cost,
+					  input_total_cost, rows);
 
 	return pathnode;
 }
@@ -2227,7 +2306,8 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 						PathTarget *target,
-						double rows, Cost startup_cost, Cost total_cost,
+						double rows, int disabled_nodes,
+						Cost startup_cost, Cost total_cost,
 						List *pathkeys,
 						Relids required_outer,
 						Path *fdw_outerpath,
@@ -2248,6 +2328,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2273,7 +2354,8 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 						 PathTarget *target,
-						 double rows, Cost startup_cost, Cost total_cost,
+						 double rows, int disabled_nodes,
+						 Cost startup_cost, Cost total_cost,
 						 List *pathkeys,
 						 Relids required_outer,
 						 Path *fdw_outerpath,
@@ -2300,6 +2382,7 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2325,7 +2408,8 @@ create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 ForeignPath *
 create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 						  PathTarget *target,
-						  double rows, Cost startup_cost, Cost total_cost,
+						  double rows, int disabled_nodes,
+						  Cost startup_cost, Cost total_cost,
 						  List *pathkeys,
 						  Path *fdw_outerpath,
 						  List *fdw_restrictinfo,
@@ -2347,6 +2431,7 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.parallel_safe = rel->consider_parallel;
 	pathnode->path.parallel_workers = 0;
 	pathnode->path.rows = rows;
+	pathnode->path.disabled_nodes = disabled_nodes;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
 	pathnode->path.pathkeys = pathkeys;
@@ -2734,6 +2819,7 @@ create_projection_path(PlannerInfo *root,
 		 * Set cost of plan as subpath's cost, adjusted for tlist replacement.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			(target->cost.startup - oldtarget->cost.startup);
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2750,6 +2836,7 @@ create_projection_path(PlannerInfo *root,
 		 * evaluating the tlist.  There is no qual to worry about.
 		 */
 		pathnode->path.rows = subpath->rows;
+		pathnode->path.disabled_nodes = subpath->disabled_nodes;
 		pathnode->path.startup_cost = subpath->startup_cost +
 			target->cost.startup;
 		pathnode->path.total_cost = subpath->total_cost +
@@ -2917,6 +3004,7 @@ create_set_projection_path(PlannerInfo *root,
 	 * This is slightly bizarre maybe, but it's what 9.6 did; we may revisit
 	 * this estimate later.
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.rows = subpath->rows * tlist_rows;
 	pathnode->path.startup_cost = subpath->startup_cost +
 		target->cost.startup;
@@ -2967,6 +3055,7 @@ create_incremental_sort_path(PlannerInfo *root,
 
 	cost_incremental_sort(&pathnode->path,
 						  root, pathkeys, presorted_keys,
+						  subpath->disabled_nodes,
 						  subpath->startup_cost,
 						  subpath->total_cost,
 						  subpath->rows,
@@ -3013,6 +3102,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->subpath = subpath;
 
 	cost_sort(&pathnode->path, root, pathkeys,
+			  subpath->disabled_nodes,
 			  subpath->total_cost,
 			  subpath->rows,
 			  subpath->pathtarget->width,
@@ -3065,6 +3155,7 @@ create_group_path(PlannerInfo *root,
 			   list_length(groupClause),
 			   numGroups,
 			   qual,
+			   subpath->disabled_nodes,
 			   subpath->startup_cost, subpath->total_cost,
 			   subpath->rows);
 
@@ -3122,6 +3213,7 @@ create_upper_unique_path(PlannerInfo *root,
 	 * all columns get compared at most of the tuples.  (XXX probably this is
 	 * an overestimate.)
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_operator_cost * subpath->rows * numCols;
@@ -3200,6 +3292,7 @@ create_agg_path(PlannerInfo *root,
 			 aggstrategy, aggcosts,
 			 list_length(groupClause), numGroups,
 			 qual,
+			 subpath->disabled_nodes,
 			 subpath->startup_cost, subpath->total_cost,
 			 subpath->rows, subpath->pathtarget->width);
 
@@ -3308,6 +3401,7 @@ create_groupingsets_path(PlannerInfo *root,
 					 numGroupCols,
 					 rollup->numGroups,
 					 having_qual,
+					 subpath->disabled_nodes,
 					 subpath->startup_cost,
 					 subpath->total_cost,
 					 subpath->rows,
@@ -3333,7 +3427,7 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
-						 0.0, 0.0,
+						 0, 0.0, 0.0,
 						 subpath->rows,
 						 subpath->pathtarget->width);
 				if (!rollup->is_hashed)
@@ -3342,7 +3436,7 @@ create_groupingsets_path(PlannerInfo *root,
 			else
 			{
 				/* Account for cost of sort, but don't charge input cost again */
-				cost_sort(&sort_path, root, NIL,
+				cost_sort(&sort_path, root, NIL, 0,
 						  0.0,
 						  subpath->rows,
 						  subpath->pathtarget->width,
@@ -3358,12 +3452,14 @@ create_groupingsets_path(PlannerInfo *root,
 						 numGroupCols,
 						 rollup->numGroups,
 						 having_qual,
+						 sort_path.disabled_nodes,
 						 sort_path.startup_cost,
 						 sort_path.total_cost,
 						 sort_path.rows,
 						 subpath->pathtarget->width);
 			}
 
+			pathnode->path.disabled_nodes += agg_path.disabled_nodes;
 			pathnode->path.total_cost += agg_path.total_cost;
 			pathnode->path.rows += agg_path.rows;
 		}
@@ -3395,6 +3491,7 @@ create_minmaxagg_path(PlannerInfo *root,
 {
 	MinMaxAggPath *pathnode = makeNode(MinMaxAggPath);
 	Cost		initplan_cost;
+	int			initplan_disabled_nodes = 0;
 	ListCell   *lc;
 
 	/* The topmost generated Plan node will be a Result */
@@ -3419,12 +3516,14 @@ create_minmaxagg_path(PlannerInfo *root,
 	{
 		MinMaxAggInfo *mminfo = (MinMaxAggInfo *) lfirst(lc);
 
+		initplan_disabled_nodes += mminfo->path->disabled_nodes;
 		initplan_cost += mminfo->pathcost;
 		if (!mminfo->path->parallel_safe)
 			pathnode->path.parallel_safe = false;
 	}
 
 	/* add tlist eval cost for each output row, plus cpu_tuple_cost */
+	pathnode->path.disabled_nodes = initplan_disabled_nodes;
 	pathnode->path.startup_cost = initplan_cost + target->cost.startup;
 	pathnode->path.total_cost = initplan_cost + target->cost.startup +
 		target->cost.per_tuple + cpu_tuple_cost;
@@ -3517,6 +3616,7 @@ create_windowagg_path(PlannerInfo *root,
 	cost_windowagg(&pathnode->path, root,
 				   windowFuncs,
 				   winclause,
+				   subpath->disabled_nodes,
 				   subpath->startup_cost,
 				   subpath->total_cost,
 				   subpath->rows);
@@ -3584,6 +3684,7 @@ create_setop_path(PlannerInfo *root,
 	 * Charge one cpu_operator_cost per comparison per input tuple. We assume
 	 * all columns get compared at most of the tuples.
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_operator_cost * subpath->rows * list_length(distinctList);
@@ -3683,6 +3784,7 @@ create_lockrows_path(PlannerInfo *root, RelOptInfo *rel,
 	 * possible refetches, but it's hard to say how much.  For now, use
 	 * cpu_tuple_cost per row.
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost +
 		cpu_tuple_cost * subpath->rows;
@@ -3759,6 +3861,7 @@ create_modifytable_path(PlannerInfo *root, RelOptInfo *rel,
 	 * costs to change any higher-level planning choices.  But we might want
 	 * to make it look better sometime.
 	 */
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost;
 	if (returningLists != NIL)
@@ -3835,6 +3938,7 @@ create_limit_path(PlannerInfo *root, RelOptInfo *rel,
 		subpath->parallel_safe;
 	pathnode->path.parallel_workers = subpath->parallel_workers;
 	pathnode->path.rows = subpath->rows;
+	pathnode->path.disabled_nodes = subpath->disabled_nodes;
 	pathnode->path.startup_cost = subpath->startup_cost;
 	pathnode->path.total_cost = subpath->total_cost;
 	pathnode->path.pathkeys = subpath->pathkeys;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 14ccfc1ac1..540d021592 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1658,6 +1658,7 @@ typedef struct Path
 
 	/* estimated size/costs for path (see costsize.c for more info) */
 	Cardinality rows;			/* estimated number of result tuples */
+	int			disabled_nodes; /* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
@@ -3333,6 +3334,7 @@ typedef struct
 typedef struct JoinCostWorkspace
 {
 	/* Preliminary cost estimates --- must not be larger than final ones! */
+	int			disabled_nodes;
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 57861bfb44..854a782944 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -108,35 +108,42 @@ extern void cost_resultscan(Path *path, PlannerInfo *root,
 							RelOptInfo *baserel, ParamPathInfo *param_info);
 extern void cost_recursive_union(Path *runion, Path *nrterm, Path *rterm);
 extern void cost_sort(Path *path, PlannerInfo *root,
-					  List *pathkeys, Cost input_cost, double tuples, int width,
+					  List *pathkeys, int disabled_nodes,
+					  Cost input_cost, double tuples, int width,
 					  Cost comparison_cost, int sort_mem,
 					  double limit_tuples);
 extern void cost_incremental_sort(Path *path,
 								  PlannerInfo *root, List *pathkeys, int presorted_keys,
+								  int input_disabled_nodes,
 								  Cost input_startup_cost, Cost input_total_cost,
 								  double input_tuples, int width, Cost comparison_cost, int sort_mem,
 								  double limit_tuples);
 extern void cost_append(AppendPath *apath);
 extern void cost_merge_append(Path *path, PlannerInfo *root,
 							  List *pathkeys, int n_streams,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double tuples);
 extern void cost_material(Path *path,
+						  int input_disabled_nodes,
 						  Cost input_startup_cost, Cost input_total_cost,
 						  double tuples, int width);
 extern void cost_agg(Path *path, PlannerInfo *root,
 					 AggStrategy aggstrategy, const AggClauseCosts *aggcosts,
 					 int numGroupCols, double numGroups,
 					 List *quals,
+					 int input_disabled_nodes,
 					 Cost input_startup_cost, Cost input_total_cost,
 					 double input_tuples, double input_width);
 extern void cost_windowagg(Path *path, PlannerInfo *root,
 						   List *windowFuncs, WindowClause *winclause,
+						   int input_disabled_nodes,
 						   Cost input_startup_cost, Cost input_total_cost,
 						   double input_tuples);
 extern void cost_group(Path *path, PlannerInfo *root,
 					   int numGroupCols, double numGroups,
 					   List *quals,
+					   int input_disabled_nodes,
 					   Cost input_startup_cost, Cost input_total_cost,
 					   double input_tuples);
 extern void initial_cost_nestloop(PlannerInfo *root,
@@ -171,6 +178,7 @@ extern void cost_gather(GatherPath *path, PlannerInfo *root,
 						RelOptInfo *rel, ParamPathInfo *param_info, double *rows);
 extern void cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 							  RelOptInfo *rel, ParamPathInfo *param_info,
+							  int input_disabled_nodes,
 							  Cost input_startup_cost, Cost input_total_cost,
 							  double *rows);
 extern void cost_subplan(PlannerInfo *root, SubPlan *subplan, Plan *plan);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index f00bd55f39..1035e6560c 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -27,11 +27,12 @@ extern int	compare_fractional_path_costs(Path *path1, Path *path2,
 										  double fraction);
 extern void set_cheapest(RelOptInfo *parent_rel);
 extern void add_path(RelOptInfo *parent_rel, Path *new_path);
-extern bool add_path_precheck(RelOptInfo *parent_rel,
+extern bool add_path_precheck(RelOptInfo *parent_rel, int disabled_nodes,
 							  Cost startup_cost, Cost total_cost,
 							  List *pathkeys, Relids required_outer);
 extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
 extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
+									  int disabled_nodes,
 									  Cost total_cost, List *pathkeys);
 
 extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
@@ -124,7 +125,8 @@ extern Path *create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 									   Relids required_outer);
 extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											PathTarget *target,
-											double rows, Cost startup_cost, Cost total_cost,
+											double rows, int disabled_nodes,
+											Cost startup_cost, Cost total_cost,
 											List *pathkeys,
 											Relids required_outer,
 											Path *fdw_outerpath,
@@ -132,7 +134,8 @@ extern ForeignPath *create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 											List *fdw_private);
 extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 PathTarget *target,
-											 double rows, Cost startup_cost, Cost total_cost,
+											 double rows, int disabled_nodes,
+											 Cost startup_cost, Cost total_cost,
 											 List *pathkeys,
 											 Relids required_outer,
 											 Path *fdw_outerpath,
@@ -140,7 +143,8 @@ extern ForeignPath *create_foreign_join_path(PlannerInfo *root, RelOptInfo *rel,
 											 List *fdw_private);
 extern ForeignPath *create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
 											  PathTarget *target,
-											  double rows, Cost startup_cost, Cost total_cost,
+											  double rows, int disabled_nodes,
+											  Cost startup_cost, Cost total_cost,
 											  List *pathkeys,
 											  Path *fdw_outerpath,
 											  List *fdw_restrictinfo,
diff --git a/src/test/isolation/specs/horizons.spec b/src/test/isolation/specs/horizons.spec
index d5239ff228..3f987f943d 100644
--- a/src/test/isolation/specs/horizons.spec
+++ b/src/test/isolation/specs/horizons.spec
@@ -40,7 +40,6 @@ session pruner
 setup
 {
     SET enable_seqscan = false;
-    SET enable_indexscan = false;
     SET enable_bitmapscan = false;
 }
 
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 510646cbce..092233cc9d 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -332,11 +332,13 @@ select proname from pg_proc where proname ilike '00%foo' order by 1;
 
 explain (costs off)
 select proname from pg_proc where proname ilike 'ri%foo' order by 1;
-                           QUERY PLAN                            
------------------------------------------------------------------
- Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc
-   Filter: (proname ~~* 'ri%foo'::text)
-(2 rows)
+                  QUERY PLAN                  
+----------------------------------------------
+ Sort
+   Sort Key: proname
+   ->  Seq Scan on pg_proc
+         Filter: (proname ~~* 'ri%foo'::text)
+(4 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 5a603f86b7..9bad3fc464 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -538,15 +538,17 @@ explain (costs off)
 ------------------------------------------------------------
  Aggregate
    ->  Nested Loop
-         ->  Seq Scan on tenk2
-               Filter: (thousand = 0)
+         ->  Gather
+               Workers Planned: 4
+               ->  Parallel Seq Scan on tenk2
+                     Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
                ->  Parallel Bitmap Heap Scan on tenk1
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(10 rows)
+(12 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
-- 
2.39.3 (Apple Git-145)

Attachment: v5-0002-Show-number-of-disabled-nodes-in-EXPLAIN-ANALYZE-.patch (application/octet-stream)
From 21a3f772937044ffbb5979f25cdad846a25c8359 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Wed, 31 Jul 2024 11:35:53 -0400
Subject: [PATCH v5 2/2] Show number of disabled nodes in EXPLAIN ANALYZE
 output.

Now that disable_cost is not included in the cost estimate, there's
no visible sign in EXPLAIN output of which plan nodes are disabled.
Fix that by propagating the number of disabled nodes from Path to
Plan, and then showing it in the EXPLAIN output.
---
 src/backend/commands/explain.c                |  4 ++++
 src/backend/optimizer/plan/createplan.c       |  8 +++++--
 src/include/nodes/plannodes.h                 |  1 +
 src/test/regress/expected/aggregates.out      | 21 ++++++++++++++++---
 src/test/regress/expected/btree_index.out     |  4 +++-
 .../regress/expected/collate.icu.utf8.out     |  6 ++++--
 .../regress/expected/incremental_sort.out     |  5 ++++-
 src/test/regress/expected/inherit.out         |  4 +++-
 src/test/regress/expected/join.out            |  4 +++-
 src/test/regress/expected/memoize.out         |  8 +++++--
 src/test/regress/expected/select_parallel.out |  6 +++++-
 src/test/regress/expected/union.out           |  3 ++-
 12 files changed, 59 insertions(+), 15 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5771aabf40..11df4a04d4 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1894,6 +1894,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
+	if (plan->disabled_nodes != 0)
+		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
+							   es);
+
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
 	{
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 43123e6ce2..7a5bb98fd5 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -2571,6 +2571,7 @@ create_minmaxagg_plan(PlannerInfo *root, MinMaxAggPath *best_path)
 								   0, NULL, NULL, NULL);
 
 		/* Must apply correct cost/width data to Limit node */
+		plan->disabled_nodes = mminfo->path->disabled_nodes;
 		plan->startup_cost = mminfo->path->startup_cost;
 		plan->total_cost = mminfo->pathcost;
 		plan->plan_rows = 1;
@@ -5403,6 +5404,7 @@ order_qual_clauses(PlannerInfo *root, List *clauses)
 static void
 copy_generic_path_info(Plan *dest, Path *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->rows;
@@ -5418,6 +5420,7 @@ copy_generic_path_info(Plan *dest, Path *src)
 static void
 copy_plan_costsize(Plan *dest, Plan *src)
 {
+	dest->disabled_nodes = src->disabled_nodes;
 	dest->startup_cost = src->startup_cost;
 	dest->total_cost = src->total_cost;
 	dest->plan_rows = src->plan_rows;
@@ -5451,7 +5454,7 @@ label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
 
 	cost_sort(&sort_path, root, NIL,
-			  0,				/* a Plan contains no count of disabled nodes */
+			  plan->plan.disabled_nodes,
 			  lefttree->total_cost,
 			  lefttree->plan_rows,
 			  lefttree->plan_width,
 			  0.0,
@@ -6546,11 +6549,12 @@ materialize_finished_plan(Plan *subplan)
 
 	/* Set cost data */
 	cost_material(&matpath,
-				  0,			/* a Plan contains no count of disabled nodes */
+				  subplan->disabled_nodes,
 				  subplan->startup_cost,
 				  subplan->total_cost,
 				  subplan->plan_rows,
 				  subplan->plan_width);
+	matplan->disabled_nodes = subplan->disabled_nodes;
 	matplan->startup_cost = matpath.startup_cost + initplan_cost;
 	matplan->total_cost = matpath.total_cost + initplan_cost;
 	matplan->plan_rows = subplan->plan_rows;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1aeeaec95e..62cd6a6666 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -125,6 +125,7 @@ typedef struct Plan
 	/*
 	 * estimated execution costs for plan (see costsize.c for more info)
 	 */
+	int			disabled_nodes; /* count of disabled nodes */
 	Cost		startup_cost;	/* cost expended before fetching any tuples */
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index a5596ab210..8ac13b562c 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2920,18 +2920,23 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
+   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
+         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(12 rows)
+                           Disabled Nodes: 1
+(17 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2953,19 +2958,24 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -2975,19 +2985,24 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
+   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
+         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
+               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
+                           Disabled Nodes: 1
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-(13 rows)
+                           Disabled Nodes: 1
+(18 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 092233cc9d..b350efe128 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -335,10 +335,12 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                   QUERY PLAN                  
 ----------------------------------------------
  Sort
+   Disabled Nodes: 1
    Sort Key: proname
    ->  Seq Scan on pg_proc
+         Disabled Nodes: 1
          Filter: (proname ~~* 'ri%foo'::text)
-(4 rows)
+(6 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7d59fb4431..31345295c1 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -989,8 +989,9 @@ select * from collate_test1 where b ilike 'abc';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'abc'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'abc';
  a |  b  
@@ -1004,8 +1005,9 @@ select * from collate_test1 where b ilike 'ABC';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
+   Disabled Nodes: 1
    Filter: (b ~~* 'ABC'::text)
-(2 rows)
+(3 rows)
 
 select * from collate_test1 where b ilike 'ABC';
  a |  b  
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 5fd54a10b1..79f0d37a87 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -701,16 +701,19 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
+   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
+         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
+               Disabled Nodes: 1
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(10 rows)
+(13 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index ad73213414..dbb748a2d2 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,6 +1614,7 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
+   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1623,10 +1624,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
+               Disabled Nodes: 1
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(13 rows)
+(15 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 53f70d72ed..31fb7d142e 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8000,13 +8000,15 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
+   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
+         Disabled Nodes: 1
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(7 rows)
+(9 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 96906104d7..df2ca5ba4e 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -333,14 +333,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(8 rows)
+(10 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -348,14 +350,16 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
+   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
+         Disabled Nodes: 1
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(8 rows)
+(10 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 9bad3fc464..c2e9458c35 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -537,10 +537,14 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
+   Disabled Nodes: 1
    ->  Nested Loop
+         Disabled Nodes: 1
          ->  Gather
+               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
+                     Disabled Nodes: 1
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -548,7 +552,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(12 rows)
+(16 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0fd0e1c38b..0456d48c93 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,11 +822,12 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
+   Disabled Nodes: 1
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
          ->  Result
-(5 rows)
+(6 rows)
 
 reset enable_hashagg;
 --
-- 
2.39.3 (Apple Git-145)

#76David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#75)
Re: On disable_cost

On Fri, 2 Aug 2024 at 06:03, Robert Haas <robertmhaas@gmail.com> wrote:

I think this may be a bit hard to understand, so let me give a
concrete example. Suppose we're planning some join where one side can
only be planned with a sequential scan and sequential scans are
disabled. We have ten paths in the path list and they have costs of
1e10+100, 1e10+200, ..., 1e10+1000. Now add_path_precheck() is asked
to consider a new path where there is a disabled node on BOTH sides of
the join -- the one side has the disabled sequential scan, but now the
other side also has something disabled, so the cost is let's say
2e10+79. add_path_precheck() can see at once that this path is a
loser: it can't possibly dominate any path that already exists,
because it costs more than any of them. But when you take disable_cost
out, things look quite different. Now you have a proposed path with a
total_cost of 79 and a path list with costs of 100, ..., 1000. If
you're not allowed to know anything about disabled_nodes, the new path
looks like it might be valuable. You might decide to construct it and
try inserting into the pathlist, which will end up being useless, and
even if you don't, you're going to compare its pathkeys and
parameterization to each of the 10 existing paths before giving up.
Bummer.

OK, so it sounds like you'd like to optimise this code so that the
planner does a little less work when node types are disabled. The
existing comment does mention explicitly that we don't want to do
that:

/*
* We could include disable_cost in the preliminary estimate, but that
* would amount to optimizing for the case where the join method is
* disabled, which doesn't seem like the way to bet.
*/

As far as I understand it from reading the comments in that file, I
see no guarantee offered that the initial cost will never exceed the
final cost. So what you're proposing could end up rejecting, on its
initial cost, a path whose final cost would have been the cheapest.
Imagine you're considering a Nested Loop and a Hash Join, both of
which are disabled. Merge Join is unavailable as the join column
types are not sortable. If the hash join costs 99 and the initial
nested loop costs 110, but the final nested loop ends up costing 90,
then the nested loop could be rejected before we even get to perform
the final costing for it. The current code will run
final_cost_nestloop() and find that 90 is cheaper than 99, whereas
what you want to do is stop bothering with the nested loop when we
see its initial cost come out at 110.
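
To make that hazard concrete, here is a toy sketch (invented numbers
and simplified logic, not the real planner code) of a precheck that
looks at disabled-node counts first and then at the preliminary cost,
the way the patch's add_path_precheck() does:

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model only.  The pathlist holds a hash join with total cost 99;
 * a nested loop arrives with a preliminary estimate of 110 that would
 * drop to 90 at final costing.  Both join types are disabled, so the
 * disabled-node counts tie and the cost comparison decides.
 */
static bool
toy_precheck(int new_disabled_nodes, double new_preliminary_cost,
			 int old_disabled_nodes, double old_total_cost)
{
	/* disabled-node counts, when different, trump everything else */
	if (new_disabled_nodes != old_disabled_nodes)
		return new_disabled_nodes < old_disabled_nodes;

	/* otherwise the preliminary cost has to beat the existing path */
	return new_preliminary_cost < old_total_cost;
}

int
main(void)
{
	/*
	 * Prints "no": the nested loop is discarded on its preliminary cost
	 * of 110, even though a final cost of 90 would have beaten 99.
	 */
	printf("keep nested loop? %s\n",
		   toy_precheck(1, 110.0, 1, 99.0) ? "yes" : "no");
	return 0;
}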

Perhaps it's actually fine if the initial costs are always less than
the final costs, since in that case we won't ever reject any path
based on its initial cost that we wouldn't also have rejected based
on its final cost. However, since there don't seem to be any comments
mentioning this guarantee, and since you're just doing this to
squeeze more performance out of the planner, it seems risky to do it
for that reason alone.

I'd say that if you want to do this, you should justify it on its own
merits, with some performance numbers and some evidence that we don't
produce inferior plans as a result. But per what I quoted above,
you're not doing that; you're doing this as a performance
optimisation.

I'm not planning on pushing this any further. I've just tried to
highlight that there's the possibility of a behavioural change. You're
claiming there isn't one. I claim there is.

David

#77Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#76)
Re: On disable_cost

On Thu, Aug 1, 2024 at 11:34 PM David Rowley <dgrowleyml@gmail.com> wrote:

I'm not planning on pushing this any further. I've just tried to
highlight that there's the possibility of a behavioural change. You're
claiming there isn't one. I claim there is.

I don't know what to tell you. The original version of the patch
didn't change this stuff, and the result did not work. So I looked
into the problem and fixed it. I may have done that wrongly, or there
may be debatable points, but it seems like your argument is
essentially that I shouldn't have done any of this and I should just
take it all back out, and I know that doesn't work because it's the
first thing I tried.

--
Robert Haas
EDB: http://www.enterprisedb.com

#78David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#77)
Re: On disable_cost

On Sat, 3 Aug 2024 at 00:17, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Aug 1, 2024 at 11:34 PM David Rowley <dgrowleyml@gmail.com> wrote:

I'm not planning on pushing this any further. I've just tried to
highlight that there's the possibility of a behavioural change. You're
claiming there isn't one. I claim there is.

I don't know what to tell you. The original version of the patch
didn't change this stuff, and the result did not work. So I looked
into the problem and fixed it. I may have done that wrongly, or there
may be debatable points, but it seems like your argument is
essentially that I shouldn't have done any of this and I should just
take it all back out, and I know that doesn't work because it's the
first thing I tried.

I've just read what you wrote again and I now realise something I didn't before.

I now think neither of us got it right. To stay aligned with the
current behaviour, what you'd need to do is have
initial_cost_nestloop() add the disabled_nodes for the join's
subnodes *only*, and have final_cost_nestloop() add the additional
disabled_nodes if enable_nestloop = off. That way you maintain the
existing behaviour of not optimising for disabled node types and
don't risk plan changes if the final cost comes out cheaper than the
initial cost.
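
In rough code terms, that would look something like this (the field
and function names follow the patch, but this is only an untested
sketch of the idea). In initial_cost_nestloop(), charge only what the
child paths already carry:

    workspace->disabled_nodes =
        outer_path->disabled_nodes + inner_path->disabled_nodes;

and then in final_cost_nestloop(), add the join's own penalty only at
final costing:

    path->jpath.path.disabled_nodes = workspace->disabled_nodes;
    if (!enable_nestloop)
        path->jpath.path.disabled_nodes++;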

David

#79Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#78)
Re: On disable_cost

On Fri, Aug 2, 2024 at 9:13 AM David Rowley <dgrowleyml@gmail.com> wrote:

I now think neither of us got it right. I now think what you'd need to
do to be aligned to the current behaviour is have
initial_cost_nestloop() add the disabled_nodes for the join's subnodes
*only* and have final_cost_nestloop() add the additional
disabled_nodes if enable_nestloop = off. That way you maintain the
existing behaviour of not optimising for disabled node types and don't
risk plan changes if the final cost comes out cheaper than the initial
cost.

All three initial_cost_XXX functions have a comment that says "This
must quickly produce lower-bound estimates of the path's startup and
total costs," i.e. the final cost should never be cheaper. I'm pretty
sure that it was the design intention here that no path ever gets
rejected at the initial cost stage that would have been accepted at
the final cost stage.

(You can also see, as a matter of implementation, that they extract
the startup_cost and run_cost from the workspace and then add to those
values.)
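
For what it's worth, a toy sketch of the kind of early test those
lower-bound estimates feed (not the real add_path_precheck(), just the
shape of the argument), showing why the quick estimate must never exceed
what final costing will produce, for the disabled-node count as well as
the cost:

#include <stdbool.h>

/*
 * Toy sketch: decide whether a candidate join is worth costing fully,
 * given quick lower-bound estimates and the best path seen so far.
 * Safe only if the quick values never exceed the final ones; otherwise
 * a path that final costing would have made the winner gets discarded.
 */
static bool
worth_costing_fully(int quick_disabled, double quick_cost,
                    int best_disabled, double best_cost)
{
    if (quick_disabled != best_disabled)
        return quick_disabled < best_disabled;
    return quick_cost <= best_cost;
}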

--
Robert Haas
EDB: http://www.enterprisedb.com

#80Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#79)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 2, 2024 at 9:13 AM David Rowley <dgrowleyml@gmail.com> wrote:

... That way you maintain the
existing behaviour of not optimising for disabled node types and don't
risk plan changes if the final cost comes out cheaper than the initial
cost.

All three initial_cost_XXX functions have a comment that says "This
must quickly produce lower-bound estimates of the path's startup and
total costs," i.e. the final cost should never be cheaper. I'm pretty
sure that it was the design intention here that no path ever gets
rejected at the initial cost stage that would have been accepted at
the final cost stage.

That absolutely is the expectation, and we'd better be careful not
to break it.

regards, tom lane

#81Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#80)
Re: On disable_cost

On Fri, Aug 2, 2024 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

That absolutely is the expectation, and we'd better be careful not
to break it.

I have every intention of not breaking it. :-)

--
Robert Haas
EDB: http://www.enterprisedb.com

#82Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#81)
Re: On disable_cost

On Fri, Aug 2, 2024 at 12:53 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Aug 2, 2024 at 12:51 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

That absolutely is the expectation, and we'd better be careful not
to break it.

I have every intention of not breaking it. :-)

I went ahead and committed these patches. I know there's some debate
over whether we want to show the # of disabled nodes and if so whether
it should be controlled by COSTS, and I suspect I haven't completely
allayed David's concerns about the initial_cost_XXX functions although
I think that I did the right thing. But, I don't have the impression
that anyone is desperately opposed to the basic concept, so I think it
makes sense to put these into the tree and see what happens. We have
quite a bit of time left in this release cycle to uncover bugs, hear
from users or other developers, etc. about what problems there may be
with this. If we end up deciding to reverse course or need to fix a
bunch of stuff, so be it, but let's see what the feedback is.

--
Robert Haas
EDB: http://www.enterprisedb.com

#83Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Robert Haas (#73)
Re: On disable_cost

On Wed, 31 Jul 2024 at 18:23, Robert Haas <robertmhaas@gmail.com> wrote:

- If we do commit 0002, I think it's a good idea to have the number of
disabled nodes displayed even with COSTS OFF, because it's stable, and
it's pretty useful to be able to see this in the regression output. I
have found while working on this that I often need to adjust the .sql
files to say EXPLAIN (COSTS ON) instead of EXPLAIN (COSTS OFF) in
order to understand what's happening. Right now, there's no real
alternative because costs aren't stable, but disabled-node counts
should be stable, so I feel this would be a step forward. Apart from
that, I also think it's good for features to have regression test
coverage, and since we use COSTS OFF everywhere or at least nearly
everywhere in the regression test, if we don't print out the disabled
node counts when COSTS OFF is used, then we don't cover that case in
our tests. Bummer.

Are the disabled node counts still expected to be stable even with
GEQO? If not, maybe we should have a way to turn them off after all.
Although I agree that always disabling them when COSTS OFF is set is
probably also undesirable. How about a new option, e.g. EXPLAIN
(DISABLED OFF)

#84Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema-Nio (#83)
Re: On disable_cost

On Thu, Aug 22, 2024 at 8:07 AM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:

Are the disabled node counts still expected to be stable even with
GEQO? If not, maybe we should have a way to turn them off after all.
Although I agree that always disabling them when COSTS OFF is set is
probably also undesirable. How about a new option, e.g. EXPLAIN
(DISABLED OFF)

Hmm, I hadn't thought about that. There are no GEQO-specific changes
in this patch, which AFAIK is OK, because I think GEQO just relies on
the core planning machinery to decide everything about the cost of
paths, and is really only experimenting with different join orders. So
I think if it picks the same join order, it should get the same count
of disabled nodes everywhere. If it doesn't pick the same order,
you'll get a different plan entirely.

I don't think I quite want to jump into inventing a new EXPLAIN option
right this minute. I'm not against the idea, but I don't want to jump
into engineering solutions before I understand what the problems are,
so I think we should give this a little time. I'll be a bit surprised
if this doesn't elicit a few strong reactions, but I want to see what
people are actually sad (or, potentially, happy) about.

--
Robert Haas
EDB: http://www.enterprisedb.com

#85Jonathan S. Katz
jkatz@postgresql.org
In reply to: Robert Haas (#82)
Re: On disable_cost

On 8/21/24 10:29 AM, Robert Haas wrote:

I went ahead and committed these patches. I know there's some debate
over whether we want to show the # of disabled nodes and if so whether
it should be controlled by COSTS, and I suspect I haven't completely
allayed David's concerns about the initial_cost_XXX functions although
I think that I did the right thing. But, I don't have the impression
that anyone is desperately opposed to the basic concept, so I think it
makes sense to put these into the tree and see what happens. We have
quite a bit of time left in this release cycle to uncover bugs, hear
from users or other developers, etc. about what problems there may be
with this. If we end up deciding to reverse course or need to fix a
bunch of stuff, so be it, but let's see what the feedback is.

We hit an issue with pgvector[0] where a regular `SELECT count(*) FROM
table`[1] is attempting to scan the index on the vector column when
`enable_seqscan` is disabled. Credit to Andrew Kane (CC'd) for flagging it.

I was able to trace this back to e2225346. Here is a reproducer:

Setup
=====

CREATE EXTENSION vector;

CREATE OR REPLACE FUNCTION public.generate_random_normalized_vector(dim
integer)
RETURNS vector
LANGUAGE SQL
AS $$
SELECT public.l2_normalize(array_agg(random()::real)::vector)
FROM generate_series(1, $1);
$$;

CREATE TABLE test (id int, embedding vector(128));
INSERT INTO test
SELECT n, public.generate_random_normalized_vector(128)
FROM generate_series(1,5) n;

CREATE INDEX ON test USING hnsw (embedding vector_cosine_ops);

Test
====

SET enable_seqscan TO off;
EXPLAIN ANALYZE
SELECT count(*) FROM test;

Before e2225346:
----------------

Aggregate (cost=10000041965.00..10000041965.01 rows=1 width=8) (actual time=189.864..189.864 rows=1 loops=1)
  ->  Seq Scan on test (cost=10000000000.00..10000040715.00 rows=5 width=0) (actual time=0.018..168.294 rows=5 loops=1)
(4 rows)

With e2225346:
-------------
ERROR: cannot scan hnsw index without order

Some things to note with the ivfflat/hnsw index AMs[3] in pgvector are
that they're used for "ORDER BY" scans exclusively. They currently don't
support index only scans (noting as I tried reproducing the issue with
GIST and couldn't do so because of that), but we wouldn't want to do a
full table "count(*)" on an IVFFlat/HNSW index anyway as it'd be more
expensive than just a full table scan.

Thanks,

Jonathan

[0]: https://github.com/pgvector/pgvector
[1]: https://github.com/pgvector/pgvector/actions/runs/10519052945
[2]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e2225346
[3]: https://github.com/pgvector/pgvector/blob/master/src/hnsw.c#L192

#86Robert Haas
robertmhaas@gmail.com
In reply to: Jonathan S. Katz (#85)
Re: On disable_cost

On Fri, Aug 23, 2024 at 11:17 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:

We hit an issue with pgvector[0] where a regular `SELECT count(*) FROM
table`[1] is attempting to scan the index on the vector column when
`enable_seqscan` is disabled. Credit to Andrew Kane (CC'd) for flagging it.

I was able to trace this back to e2225346. Here is a reproducer:

If I change EXPLAIN ANALYZE in this test to just EXPLAIN, I get this:

Aggregate (cost=179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00..179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00
rows=1 width=8)
-> Index Only Scan using test_embedding_idx on test
(cost=179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00..179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00
rows=5 width=0)

It took me a moment to wrap my head around this: the cost estimate is
312 decimal digits long. Apparently hnswcostestimate() just returns
DBL_MAX when there are no scan keys because it really, really doesn't
want to do that. Before e2225346, that kept this plan from being
generated because it was (much) larger than disable_cost. But now it
doesn't, because 1 disabled node makes a path more expensive than any
possible non-disabled path. Since that was the whole point of the
patch, I don't feel too bad about it.
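
(For what it's worth, that 312-character cost is exactly DBL_MAX rendered
the way EXPLAIN renders costs. A standalone check:)

#include <stdio.h>
#include <float.h>
#include <string.h>

int
main(void)
{
    char    buf[512];

    /*
     * EXPLAIN prints costs with two decimal places, so DBL_MAX (about
     * 1.797e308) comes out as a 309-digit integer part plus ".00".
     */
    snprintf(buf, sizeof(buf), "%.2f", DBL_MAX);
    printf("%zu characters, starting %.20s...\n", strlen(buf), buf);
    return 0;
}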

I find it a little weird that hnsw thinks itself able to return all
the tuples in an order the user chooses, but unable to return all of
the tuples in an arbitrary order. In core, we have precedent for index
types that can't return individual tuples at all (gin, brin) but not
one that is able to return tuples in concept but has a panic attack if
you don't know how you want them sorted. I don't quite see why you
couldn't just treat that case the same as ORDER BY
the_first_column_of_the_index, or any other arbitrary rule that you
want to make up. Sure, it might be more expensive than a sequential
scan, but the user said they didn't want a sequential scan. I'm not
quite sure why pgvector thinks it gets to decide that it knows better
than the user, or the rest of the optimizer. I don't even think I
really believe it would always be worse: I've seen cases where a table
was badly bloated and mostly empty but its indexes were not bloated,
and in that case an index scan can be a HUGE winner even though it
would normally be a lot worse than a sequential scan.

If you don't want to fix hnsw to work the way the core optimizer
thinks it should, or if there's some reason it can't be done,
alternatives might include (1) having the cost estimate function hack
the count of disabled nodes and (2) adding some kind of core support
for an index cost estimator refusing a path entirely. I haven't tested
(1) so I don't know for sure that there are no issues, but I think we
have to do all of our cost estimating before we can think about adding
the path so I feel like there's a decent chance it would do what you
want.

Also, while I did take the initiative to download pgvector and compile
it and hook up a debugger and figure out what was going on here, I'm
not really too sure that's my job. I do think I have a responsibility
to help maintainers of out-of-core extensions who have problems as a
result of my commits, but I also think it's fair to hope that those
maintainers will try to minimize the amount of time that I need to
spend trying to read code that I did not write and do not maintain.
Fortunately, this wasn't hard to figure out, but in a way that's kind
of the point. That DBL_MAX hack was put there by somebody who must've
understood that they were trying to use a very large cost to disable a
certain path shape completely, and it seems to me that if that person
had studied this case and the commit message for e2225346, they would
have likely understood what had happened pretty quickly. Do you think
that's an unfair feeling on my part?

--
Robert Haas
EDB: http://www.enterprisedb.com

#87Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#86)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 23, 2024 at 11:17 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:

We hit an issue with pgvector[0] where a regular `SELECT count(*) FROM
table`[1] is attempting to scan the index on the vector column when
`enable_seqscan` is disabled. Credit to Andrew Kane (CC'd) for flagging it.

It took me a moment to wrap my head around this: the cost estimate is
312 decimal digits long. Apparently hnswcostestimate() just returns
DBL_MAX when there are no scan keys because it really, really doesn't
want to do that. Before e2225346, that kept this plan from being
generated because it was (much) larger than disable_cost. But now it
doesn't, because 1 disabled node makes a path more expensive than any
possible non-disabled path. Since that was the whole point of the
patch, I don't feel too bad about it.

Yeah, I don't think it's necessary for v18 to be bug-compatible with
this hack.

If you don't want to fix hnsw to work the way the core optimizer
thinks it should, or if there's some reason it can't be done,
alternatives might include (1) having the cost estimate function hack
the count of disabled nodes and (2) adding some kind of core support
for an index cost estimator refusing a path entirely. I haven't tested
(1) so I don't know for sure that there are no issues, but I think we
have to do all of our cost estimating before we can think about adding
the path so I feel like there's a decent chance it would do what you
want.

It looks like amcostestimate could change the path's disabled_nodes
count, since that's set up before invoking amcostestimate. I guess
it could be set to INT_MAX to have a comparable solution to before.

I agree with you that it is not great that hnsw is refusing this case
rather than finding a way to make it work, so I'm not excited about
putting in support for refusing it in a less klugy way.

regards, tom lane

#88Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#87)
Re: On disable_cost

On Fri, Aug 23, 2024 at 1:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

It looks like amcostestimate could change the path's disabled_nodes
count, since that's set up before invoking amcostestimate. I guess
it could be set to INT_MAX to have a comparable solution to before.

It's probably better to add a more modest value, to avoid overflow.
You could add a million or so and be far away from overflow while
presumably still being more disabled than any other path.
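
As a rough sketch of what that could look like from the AM side (the
estimator name is hypothetical, the signature is the documented
amcostestimate interface, and the disabled_nodes access assumes the
post-e2225346 Path layout):

#include "postgres.h"

#include "nodes/pathnodes.h"

/*
 * Hypothetical cost estimator sketch: refuse unordered scans by piling
 * on disabled nodes instead of returning an absurd cost.
 */
static void
sketch_costestimate(PlannerInfo *root, IndexPath *path, double loop_count,
                    Cost *indexStartupCost, Cost *indexTotalCost,
                    Selectivity *indexSelectivity, double *indexCorrelation,
                    double *indexPages)
{
    if (path->indexorderbys == NIL)
        path->path.disabled_nodes += 1000000;   /* "more disabled" than anything else */

    /* ordinary, finite estimates for the rest */
    *indexStartupCost = 0;
    *indexTotalCost = 1000.0;
    *indexSelectivity = 1.0;
    *indexCorrelation = 0;
    *indexPages = 1;
}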

--
Robert Haas
EDB: http://www.enterprisedb.com

#89Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#88)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 23, 2024 at 1:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

It looks like amcostestimate could change the path's disabled_nodes
count, since that's set up before invoking amcostestimate. I guess
it could be set to INT_MAX to have a comparable solution to before.

It's probably better to add a more modest value, to avoid overflow.
You could add a million or so and be far away from overflow while
presumably still being more disabled than any other path.

But that'd only matter if the path survived its first add_path
tournament, which it shouldn't. If it does then you're at risk
of the same run-time failure reported here.

(Having said that, you're likely right that "a million or so"
would be a safer choice, since it doesn't require the assumption
that the path fails instantly.)

regards, tom lane

#90Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Robert Haas (#86)
Re: On disable_cost

On 23/08/2024 20:11, Robert Haas wrote:

I find it a little weird that hnsw thinks itself able to return all
the tuples in an order the user chooses, but unable to return all of
the tuples in an arbitrary order.

HNSW is weird in many ways:

- There is no inherent sort order. It cannot do "ORDER BY column", only
kNN-sort like "ORDER BY column <-> value".

- It's approximate. It's not guaranteed to return the same set of rows
as a sequential scan + sort.

- The number of results it returns is limited by the hnsw.ef_search GUC,
default 100.

- It collects all the results (up to hnsw.ef_search) in memory, and only
then returns them. So if you tried to use it with a large number of
results, it can simply run out of memory.

Arguably all of those are bugs in HNSW, but it is what it is. The
algorithm is inherently approximate. Despite that, it's useful in practice.

In core, we have precedent for index
types that can't return individual tuples at all (gin, brin) but not
one that is able to return tuples in concept but has a panic attack if
you don't know how you want them sorted.

Well, we do also have gin_fuzzy_search_limit. Two wrongs don't make it
right, though; I'd love to get rid of that hack too somehow.

I don't quite see why you
couldn't just treat that case the same as ORDER BY
the_first_column_of_the_index, or any other arbitrary rule that you
want to make up. Sure, it might be more expensive than a sequential
scan, but the user said they didn't want a sequential scan. I'm not
quite sure why pgvector thinks it gets to decide that it knows better
than the user, or the rest of the optimizer. I don't even think I
really believe it would always be worse: I've seen cases where a table
was badly bloated and mostly empty but its indexes were not bloated,
and in that case an index scan can be a HUGE winner even though it
would normally be a lot worse than a sequential scan.

Sure, you could make it work. It could construct a vector out of thin
air to compare with, when there's no scan key, or implement a completely
different codepath that traverses the full graph in no particular order.

If you don't want to fix hnsw to work the way the core optimizer
thinks it should, or if there's some reason it can't be done,
alternatives might include (1) having the cost estimate function hack
the count of disabled nodes and (2) adding some kind of core support
for an index cost estimator refusing a path entirely. I haven't tested
(1) so I don't know for sure that there are no issues, but I think we
have to do all of our cost estimating before we can think about adding
the path so I feel like there's a decent chance it would do what you
want.

It would seem useful for an index AM to be able to say "nope, I can't do
this". I don't remember how exactly this stuff works, but I'm surprised
it doesn't already exist.

--
Heikki Linnakangas
Neon (https://neon.tech)

#91Jonathan S. Katz
jkatz@postgresql.org
In reply to: Robert Haas (#86)
Re: On disable_cost

On 8/23/24 1:11 PM, Robert Haas wrote:

On Fri, Aug 23, 2024 at 11:17 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:

We hit an issue with pgvector[0] where a regular `SELECT count(*) FROM
table`[1] is attempting to scan the index on the vector column when
`enable_seqscan` is disabled. Credit to Andrew Kane (CC'd) for flagging it.

I was able to trace this back to e2225346. Here is a reproducer:

If I change EXPLAIN ANALYZE in this test to just EXPLAIN, I get this:

Aggregate (cost=179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00..179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00
rows=1 width=8)
-> Index Only Scan using test_embedding_idx on test
(cost=179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00..179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.00
rows=5 width=0)

It took me a moment to wrap my head around this: the cost estimate is
312 decimal digits long. Apparently hnswcostestimate() just returns
DBL_MAX when there are no scan keys because it really, really doesn't
want to do that. Before e2225346, that kept this plan from being
generated because it was (much) larger than disable_cost. But now it
doesn't, because 1 disabled node makes a path more expensive than any
possible non-disabled path. Since that was the whole point of the
patch, I don't feel too bad about it.

I find it a little weird that hnsw thinks itself able to return all
the tuples in an order the user chooses, but unable to return all of
the tuples in an arbitrary order.

For HNSW, "order" is approximated - even when it's returning "in the
order the user chooses," the scan is making the best guess at what the
correct order is based on the index structure. At the traditional "leaf"
level of an index, you're actually traversing a graph-based neighborhood
of values. And maybe we could say "Hey, if you get the equivalent of a
count(*), just do the count at the bottom layer (Layer 0)" but I think
this would be very expensive.

In core, we have precedent for index
types that can't return individual tuples at all (gin, brin) but not
one that is able to return tuples in concept but has a panic attack if
you don't know how you want them sorted. I don't quite see why you
couldn't just treat that case the same as ORDER BY
the_first_column_of_the_index, or any other arbitrary rule that you
want to make up. Sure, it might be more expensive than a sequential
scan, but the user said they didn't want a sequential scan. I'm not
quite sure why pgvector thinks it gets to decide that it knows better
than the user, or the rest of the optimizer. I don't even think I
really believe it would always be worse: I've seen cases where a table
was badly bloated and mostly empty but its indexes were not bloated,
and in that case an index scan can be a HUGE winner even though it
would normally be a lot worse than a sequential scan.

The challenge here is that HNSW is used specifically for approximating
ordering; it's not used to directly filter results in the traditional
sense (e.g. via. a WHERE clause). It's a bit different than the others
mentioned in that regard. However, maybe there are other options to
consider here based on this work.

If you don't want to fix hnsw to work the way the core optimizer
thinks it should, or if there's some reason it can't be done,
alternatives might include (1) having the cost estimate function hack
the count of disabled nodes and (2) adding some kind of core support
for an index cost estimator refusing a path entirely. I haven't tested
(1) so I don't know for sure that there are no issues, but I think we
have to do all of our cost estimating before we can think about adding
the path so I feel like there's a decent chance it would do what you
want.

Thanks for the options.

Also, while I did take the initiative to download pgvector and compile
it and hook up a debugger and figure out what was going on here, I'm
not really too sure that's my job. I do think I have a responsibility
to help maintainers of out-of-core extensions who have problems as a
result of my commits, but I also think it's fair to hope that those
maintainers will try to minimize the amount of time that I need to
spend trying to read code that I did not write and do not maintain.
Fortunately, this wasn't hard to figure out, but in a way that's kind
of the point. That DBL_MAX hack was put there by somebody who must've
understood that they were trying to use a very large cost to disable a
certain path shape completely, and it seems to me that if that person
had studied this case and the commit message for e2225346, they would
have likely understood what had happened pretty quickly. Do you think
that's an unfair feeling on my part?

I don't think extension maintainers necessarily have the same level of
PostgreSQL internals expertise as you or many of the other people who frequent
-hackers, so I think it's fair for them to ask questions or raise issues
with patches they don't understand. I was able to glean from the commit
message that this was the commit that likely changed the behavior in
pgvector, but I can't immediately glean looking through the code as to
why. (And using your logic, should an extension maintainer understand
the optimizer code when PostgreSQL is providing an interface to the
extension maintainer to encapsulate its interactions)?

You can always push back and say "Well, maybe try this, or try that" -
which would be a mentoring approach that could push it back on the
extension maintainer, which is valid, but I don't see why an extension
maintainer can't raise an issue or ask a question here.

Thanks,

Jonathan

#92Robert Haas
robertmhaas@gmail.com
In reply to: Jonathan S. Katz (#91)
Re: On disable_cost

On Fri, Aug 23, 2024 at 2:20 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:

I don't think extension maintainers necessarily have the same level of
PostgreSQL internals expertise as you or many of the other people who frequent
-hackers, so I think it's fair for them to ask questions or raise issues
with patches they don't understand. I was able to glean from the commit
message that this was the commit that likely changed the behavior in
pgvector, but I can't immediately glean looking through the code as to
why. (And using your logic, should an extension maintainer understand
the optimizer code when PostgreSQL is providing an interface to the
extension maintainer to encapsulate its interactions)?

You can always push back and say "Well, maybe try this, or try that" -
which would be a mentoring approach that could push it back on the
extension maintainer, which is valid, but I don't see why an extension
maintainer can't raise an issue or ask a question here.

I'm certainly not saying that extension maintainers can't raise issues
or ask questions here. I just feel that the problem could have been
analyzed a bit more before posting.

--
Robert Haas
EDB: http://www.enterprisedb.com

#93Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#90)
Re: On disable_cost

On Fri, Aug 23, 2024 at 2:18 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

It would seem useful for an index AM to be able to say "nope, I can't do
this". I don't remember how exactly this stuff works, but I'm surprised
it doesn't already exist.

Yeah, I think so, too. While this particular problem is due to a
problem with an out-of-core AM that may be doing some slightly
questionable things, there's not really any reason why we couldn't
have similar problems in core for some other reason. For example, we
could change amcostestimate's signature so that an extension can
return true or false, with false meaning that the path can't be
supported. We could then change cost_index so that it can also return
true or false, and then change create_index_path so it has the option
to return NULL. Callers of create_index_path could then be adjusted
not to call add_path when NULL is returned.

There might be a more elegant way to do it with more refactoring, but
the above seems good enough.

--
Robert Haas
EDB: http://www.enterprisedb.com

#94Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#93)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Aug 23, 2024 at 2:18 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

It would seem useful for an index AM to be able to say "nope, I can't do
this". I don't remember how exactly this stuff works, but I'm surprised
it doesn't already exist.

Yeah, I think so, too. While this particular problem is due to a
problem with an out-of-core AM that may be doing some slightly
questionable things, there's not really any reason why we couldn't
have similar problems in core for some other reason. For example, we
could change amcostestimate's signature so that an extension can
return true or false, with false meaning that the path can't be
supported. We could then change cost_index so that it can also return
true or false, and then change create_index_path so it has the option
to return NULL. Callers of create_index_path could then be adjusted
not to call add_path when NULL is returned.

If we're going to do this, I'd prefer a solution that doesn't force
API changes onto the vast majority of index AMs that don't have a
problem here.

One way could be to formalize the hack we were just discussing:
"To refuse a proposed path, amcostestimate can set the path's
disabled_nodes value to anything larger than 1". I suspect that
that would actually be sufficient, since the path would then lose
to the seqscan path in add_path even if that were disabled; but
we could put in a hack to prevent it from getting add_path'd at all.

Another way could be to bless what hnsw is already doing:
"To refuse a proposed path, amcostestimate can return an
indexTotalCost of DBL_MAX" (or maybe insisting on +Inf would
be better). That would still require changes comparable to
what you specify above, but only in the core-code call path
not in every AM.

regards, tom lane

#95Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#94)
Re: On disable_cost

On Fri, Aug 23, 2024 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we're going to do this, I'd prefer a solution that doesn't force
API changes onto the vast majority of index AMs that don't have a
problem here.

That's a fair concern.

One way could be to formalize the hack we were just discussing:
"To refuse a proposed path, amcostestimate can set the path's
disabled_nodes value to anything larger than 1". I suspect that
that would actually be sufficient, since the path would then lose
to the seqscan path in add_path even if that were disabled; but
we could put in a hack to prevent it from getting add_path'd at all.

Another way could be to bless what hnsw is already doing:
"To refuse a proposed path, amcostestimate can return an
indexTotalCost of DBL_MAX" (or maybe insisting on +Inf would
be better). That would still require changes comparable to
what you specify above, but only in the core-code call path
not in every AM.

If just setting disabled_nodes to a value larger than one works, I'd
be inclined to not do anything here at all, except possibly document
that you can do that. Otherwise, we should probably change the code
somehow.

I find both of your proposed solutions above to be pretty inelegant,
and I think if this problem occurred with a core AM, I'd push for an
API break rather than accept the ugliness. "This path is not valid
because the AM cannot support it", "this path is crazy expensive", and
"the user told us not to do it this way" are three different things,
and signalling two or more of them in the same way muddies the water
in a way that I don't like. API breaks aren't free, though, so I
certainly understand why you're not very keen to introduce one where
it can reasonably be avoided.

--
Robert Haas
EDB: http://www.enterprisedb.com

#96Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Robert Haas (#95)
Re: On disable_cost

On 23/08/2024 22:05, Robert Haas wrote:

On Fri, Aug 23, 2024 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we're going to do this, I'd prefer a solution that doesn't force
API changes onto the vast majority of index AMs that don't have a
problem here.

That's a fair concern.

Yeah, although I don't think it's too bad. There are not that many
out-of-tree index AM implementations to begin with, and we do change
things often enough that any interesting AM implementation will likely
need a few #ifdef PG_VERSION blocks for each PostgreSQL major version
anyway. pgvector certainly does.

One way could be to formalize the hack we were just discussing:
"To refuse a proposed path, amcostestimate can set the path's
disabled_nodes value to anything larger than 1". I suspect that
that would actually be sufficient, since the path would then lose
to the seqscan path in add_path even if that were disabled; but
we could put in a hack to prevent it from getting add_path'd at all.

Another way could be to bless what hnsw is already doing:
"To refuse a proposed path, amcostestimate can return an
indexTotalCost of DBL_MAX" (or maybe insisting on +Inf would
be better). That would still require changes comparable to
what you specify above, but only in the core-code call path
not in every AM.

If just setting disabled_nodes to a value larger than one works, I'd
be inclined to not do anything here at all, except possibly document
that you can do that. Otherwise, we should probably change the code
somehow.

Modifying the passed-in Path feels hacky. amcostestimate currently
returns all the estimates in *output parameters, it doesn't modify the
Path at all.

I find both of your proposed solutions above to be pretty inelegant,
and I think if this problem occurred with a core AM, I'd push for an
API break rather than accept the ugliness. "This path is not valid
because the AM cannot support it", "this path is crazy expensive", and
"the user told us not to do it this way" are three different things,
and signalling two or more of them in the same way muddies the water
in a way that I don't like. API breaks aren't free, though, so I
certainly understand why you're not very keen to introduce one where
it can reasonably be avoided.

The +Inf approach seems fine to me. Or perhaps NaN. Your proposal would
certainly be the cleanest interface if we don't mind incurring churn to
AM implementations.

--
Heikki Linnakangas
Neon (https://neon.tech)

#97Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#95)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

I find both of your proposed solutions above to be pretty inelegant,

They are that. If we were working in a green field I'd not propose
such things ... but we aren't. I believe there are now a fair number
of out-of-core index AMs, so I'd rather not break all of them if we
don't have to.

and I think if this problem occurred with a core AM, I'd push for an
API break rather than accept the ugliness. "This path is not valid
because the AM cannot support it", "this path is crazy expensive", and
"the user told us not to do it this way" are three different things,
and signalling two or more of them in the same way muddies the water
in a way that I don't like.

I think it's not that bad, because we can limit the knowledge of this
hack to the amcostestimate interface, which doesn't really deal in
"the user told us not to do it this way" at all. That argues against
my first proposal though (having amcostestimate touch disabled_nodes
directly). I now think that a reasonable compromise is to say that
setting indexTotalCost to +Inf signals that "the AM cannot support
it". That's not conflated too much with the other case, since even a
crazy-expensive cost estimate surely ought to be finite. We can have
cost_index untangle that case into a separate failure return so that
the within-the-core-optimizer APIs remain clean.

While that would require hnsw to make a small code change (return
+Inf not DBL_MAX), that coding should work in back branches too,
so they don't even need a version check.
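
Roughly, a sketch of the two halves (illustrative only; exactly where the
core code untangles the refusal is still up for grabs):

    /* AM side, in its amcostestimate function: signal "cannot do this scan" */
    *indexTotalCost = get_float8_infinity();    /* utils/float.h */

    /*
     * Core side: treat an infinite estimate as a refusal rather than as a
     * cost, e.g. when create_index_path() sees what cost_index() computed:
     */
    if (isinf(path->indextotalcost))
        return NULL;    /* caller then skips add_path() for this index */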

regards, tom lane

#98Jonathan S. Katz
jkatz@postgresql.org
In reply to: Robert Haas (#92)
Re: On disable_cost

On 8/23/24 2:29 PM, Robert Haas wrote:

On Fri, Aug 23, 2024 at 2:20 PM Jonathan S. Katz <jkatz@postgresql.org> wrote:

I don't think extension maintainers necessarily have the same level of
PostgreSQL internals expertise as you or many of the other people who frequent
-hackers, so I think it's fair for them to ask questions or raise issues
with patches they don't understand. I was able to glean from the commit
message that this was the commit that likely changed the behavior in
pgvector, but I can't immediately glean looking through the code as to
why. (And using your logic, should an extension maintainer understand
the optimizer code when PostgreSQL is providing an interface to the
extension maintainer to encapsulate its interactions)?

You can always push back and say "Well, maybe try this, or try that" -
which would be a mentoring approach that could push it back on the
extension maintainer, which is valid, but I don't see why an extension
maintainer can't raise an issue or ask a question here.

I'm certainly not saying that extension maintainers can't raise issues
or ask questions here. I just feel that the problem could have been
analyzed a bit more before posting.

This assumes that the person posting the problem has the requisite
expertise to determine what the issue is. Frankly, I was happy I was
able to at least trace the issue down to the particular commit and
brought what appeared to be a reliable reproducer, in absence of knowing
if 1/ this was actually an issue with PG or pgvector, 2/ does it
actually require a fix, or 3/ what the problem could actually be, given
a lack of understanding of the full inner working of the optimizer.

Based on the above, I'm not sure what bar this needed to clear to begin
a discussion on the mailing list (which further downthread, seems to be
raising some interesting points).

Jonathan

#99Jonathan S. Katz
jkatz@postgresql.org
In reply to: Tom Lane (#97)
Re: On disable_cost

On 8/23/24 3:32 PM, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I find both of your proposed solutions above to be pretty inelegant,

They are that. If we were working in a green field I'd not propose
such things ... but we aren't. I believe there are now a fair number
of out-of-core index AMs, so I'd rather not break all of them if we
don't have to.

For distribution of index AMs in the wild, it's certainly > 1 now, and
increasing. They're not the easiest extension types to build out, so
it's not as widely distributed as some of the other APIs, but there are
a bunch out there, as well as language-specific libs (e.g. pgrx for
Rust) that offer wrappers around them.

and I think if this problem occurred with a core AM, I'd push for an
API break rather than accept the ugliness. "This path is not valid
because the AM cannot support it", "this path is crazy expensive", and
"the user told us not to do it this way" are three different things,
and signalling two or more of them in the same way muddies the water
in a way that I don't like.

I think it's not that bad, because we can limit the knowledge of this
hack to the amcostestimate interface, which doesn't really deal in
"the user told us not to do it this way" at all. That argues against
my first proposal though (having amcostestimate touch disabled_nodes
directly). I now think that a reasonable compromise is to say that
setting indexTotalCost to +Inf signals that "the AM cannot support
it". That's not conflated too much with the other case, since even a
crazy-expensive cost estimate surely ought to be finite. We can have
cost_index untangle that case into a separate failure return so that
the within-the-core-optimizer APIs remain clean.

While that would require hnsw to make a small code change (return
+Inf not DBL_MAX), that coding should work in back branches too,
so they don't even need a version check.

+1 for this approach (I'll do a quick test in my pgvector workspace just
to ensure it gets the same results in the older version).

Jonathan

#100Jonathan S. Katz
jkatz@postgresql.org
In reply to: Jonathan S. Katz (#99)
Re: On disable_cost

On 8/23/24 5:33 PM, Jonathan S. Katz wrote:

On 8/23/24 3:32 PM, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I think it's not that bad, because we can limit the knowledge of this
hack to the amcostestimate interface, which doesn't really deal in
"the user told us not to do it this way" at all.  That argues against
my first proposal though (having amcostestimate touch disabled_nodes
directly).  I now think that a reasonable compromise is to say that
setting indexTotalCost to +Inf signals that "the AM cannot support
it".  That's not conflated too much with the other case, since even a
crazy-expensive cost estimate surely ought to be finite.  We can have
cost_index untangle that case into a separate failure return so that
the within-the-core-optimizer APIs remain clean.

While that would require hnsw to make a small code change (return
+Inf not DBL_MAX), that coding should work in back branches too,
so they don't even need a version check.

+1 for this approach (I'll do a quick test in my pgvector workspace just
to ensure it gets the same results in the older version).

...and I confirmed the +Inf approach on PG16 + pgvector still gives
the same expected result.

Thanks,

Jonathan

#101Alexander Lakhin
exclusion@gmail.com
In reply to: Robert Haas (#82)
Re: On disable_cost

Hello Robert,

21.08.2024 17:29, Robert Haas wrote:

I went ahead and committed these patches. ...

Please take a look at the following code:
static void
label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
{
...
    cost_sort(&sort_path, root, NIL,
              lefttree->total_cost,
              plan->plan.disabled_nodes,
              lefttree->plan_rows,
              lefttree->plan_width,
              0.0,
              work_mem,
              limit_tuples);

Given the cost_sort() declaration:
void
cost_sort(Path *path, PlannerInfo *root,
          List *pathkeys, int input_disabled_nodes,
          Cost input_cost, double tuples, int width,
          Cost comparison_cost, int sort_mem,
          double limit_tuples)

Aren't the input_disabled_nodes and input_cost arguments swapped in the
above call?

(I've discovered this with UBSan, which complained
createplan.c:5457:6: runtime error: 4.40465e+09 is outside the range of representable values of type 'int'
while executing a query with a large estimated cost.)
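
Presumably the fix is just to pass them in the declared order, i.e.
something like:

    cost_sort(&sort_path, root, NIL,
              plan->plan.disabled_nodes,    /* input_disabled_nodes */
              lefttree->total_cost,         /* input_cost */
              lefttree->plan_rows,
              lefttree->plan_width,
              0.0,
              work_mem,
              limit_tuples);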

Best regards,
Alexander

#102Richard Guo
guofenglinux@gmail.com
In reply to: Alexander Lakhin (#101)
Re: On disable_cost

On Fri, Sep 6, 2024 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

static void
label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
{
...
cost_sort(&sort_path, root, NIL,
lefttree->total_cost,
plan->plan.disabled_nodes,
lefttree->plan_rows,
lefttree->plan_width,
0.0,
work_mem,
limit_tuples);

Given the cost_sort() declaration:
void
cost_sort(Path *path, PlannerInfo *root,
List *pathkeys, int input_disabled_nodes,
Cost input_cost, double tuples, int width,
Cost comparison_cost, int sort_mem,
double limit_tuples)

Aren't the input_disabled_nodes and input_cost arguments swapped in the
above call?

Nice catch! I checked other callers to cost_sort, and they are all
good.

(I'm a little surprised that this does not cause any plan diffs in the
regression tests.)

Thanks
Richard

#103Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#102)
Re: On disable_cost

On Fri, Sep 6, 2024 at 5:27 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Sep 6, 2024 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

static void
label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)

(I'm a little surprised that this does not cause any plan diffs in the
regression tests.)

Ah I see. label_sort_with_costsize is only used to label the Sort
node nicely for EXPLAIN, and usually we do not display the cost
numbers in regression tests.

Thanks
Richard

#104Alexander Lakhin
exclusion@gmail.com
In reply to: Richard Guo (#103)
Re: On disable_cost

Hello Richard,

06.09.2024 12:51, Richard Guo wrote:

Ah I see. label_sort_with_costsize is only used to label the Sort
node nicely for EXPLAIN, and usually we do not display the cost
numbers in regression tests.

In fact, I see the error with the following (EXPLAIN-less) query:
create table t (x int);

select * from t natural inner join
(select * from (values(1)) v(x)
  union all
 select 1 from t t1 full join t t2 using (x),
               t t3 full join t t4 using (x)
);

2024-09-06 10:01:48.034 UTC [696535:5] psql LOG:  statement: select * from t natural inner join
    (select * from (values(1)) v(x)
      union all
     select 1 from t t1 full join t t2 using (x),
                   t t3 full join t t4 using (x)
    );
createplan.c:5457:6: runtime error: 4.99254e+09 is outside the range of representable values of type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior createplan.c:5457:6 in

(An UBSan-enabled build --with-blocksize=32 is required for this query to
trigger the failure.)

Best regards,
Alexander

#105Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#102)
Re: On disable_cost

On Fri, Sep 6, 2024 at 5:27 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Sep 6, 2024 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:

static void
label_sort_with_costsize(PlannerInfo *root, Sort *plan, double limit_tuples)
{
...
cost_sort(&sort_path, root, NIL,
lefttree->total_cost,
plan->plan.disabled_nodes,
lefttree->plan_rows,
lefttree->plan_width,
0.0,
work_mem,
limit_tuples);

Given the cost_sort() declaration:
void
cost_sort(Path *path, PlannerInfo *root,
List *pathkeys, int input_disabled_nodes,
Cost input_cost, double tuples, int width,
Cost comparison_cost, int sort_mem,
double limit_tuples)

Aren't the input_disabled_nodes and input_cost arguments swapped in the
above call?

Nice catch! I checked other callers to cost_sort, and they are all
good.

Fixed.

Thanks
Richard

#106Robert Haas
robertmhaas@gmail.com
In reply to: Richard Guo (#105)
Re: On disable_cost

On Mon, Sep 9, 2024 at 12:09 AM Richard Guo <guofenglinux@gmail.com> wrote:

Fixed.

Thanks to Alexander for the very good catch and to Richard for pushing the fix.

(I started to respond to this last week but didn't quite get to it
before I ran out of time/energy.)

--
Robert Haas
EDB: http://www.enterprisedb.com

#107Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Robert Haas (#82)
Re: On disable_cost

On Wed, 2024-08-21 at 10:29 -0400, Robert Haas wrote:

I went ahead and committed these patches. I know there's some debate
over whether we want to show the # of disabled nodes and if so whether
it should be controlled by COSTS, and I suspect I haven't completely
allayed David's concerns about the initial_cost_XXX functions although
I think that I did the right thing. But, I don't have the impression
that anyone is desperately opposed to the basic concept, so I think it
makes sense to put these into the tree and see what happens. We have
quite a bit of time left in this release cycle to uncover bugs, hear
from users or other developers, etc. about what problems there may be
with this. If we end up deciding to reverse course or need to fix a
bunch of stuff, so be it, but let's see what the feedback is.

I am somewhat unhappy about the "Disabled Nodes" in EXPLAIN.

First, the commit message confused me: it claims that the information
is displayed with EXPLAIN ANALYZE, but it's shown with every EXPLAIN.

But that's not important. My complaints are:

1. The "disabled nodes" are always displayed.
I'd be happier if it were only shown for COSTS ON, but I think it
would be best if they were only shown with VERBOSE ON.

After all, the messages are pretty verbose...

2. The "disabled nodes" are not only shown at the nodes where nodes
were actually disabled, but also at every nodes above these nodes.

This would be fine:

Sort
  ->  Nested Loop Join
        ->  Hash Join
              ->  Index Scan
                    Disabled Nodes: 1
              ->  Hash
                    ->  Index Scan
                          Disabled Nodes: 1
        ->  Index Scan
              Disabled Nodes: 1

This is annoying:

Sort
  Disabled Nodes: 3
  ->  Nested Loop Join
        Disabled Nodes: 3
        ->  Hash Join
              Disabled Nodes: 2
              ->  Index Scan
                    Disabled Nodes: 1
              ->  Hash
                    ->  Index Scan
                          Disabled Nodes: 1
        ->  Index Scan
              Disabled Nodes: 1

I have no idea how #2 could be implemented, but it would be nice to have.
Please, please, can we show the "disabled nodes" only with VERBOSE?

Yours,
Laurenz Albe

#108David Rowley
dgrowleyml@gmail.com
In reply to: Laurenz Albe (#107)
1 attachment(s)
Re: On disable_cost

On Fri, 27 Sept 2024 at 20:42, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

2. The "disabled nodes" are not only shown at the nodes where nodes
were actually disabled, but also at every nodes above these nodes.

I'm also not a fan either and I'd like to see this output improved.

It seems like it's easy enough to implement some logic to detect when
a given node is disabled just by checking if the disabled_nodes count
is higher than the sum of the disabled_nodes field of the node's
children. If there are no children (a scan node) and disabled_nodes >
0 then it must be disabled. There's even a nice fast path where we
don't need to check the children if disabled_nodes == 0.

Here's a POC grade patch of how I'd rather see it looking.

I opted to have a boolean field as I didn't see any need for an
integer count. I also changed things around so we always display the
boolean property in non-text EXPLAIN. Normally, we don't mind being
more verbose there.

I also fixed a bug in make_sort() where disabled_nodes isn't being set
properly. I'll do an independent patch for that if this goes nowhere.

David

Attachments:

poc_improve_disabled_nodes_explain_output.patch (application/octet-stream)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ee1bcb84e2..d36b8902c8 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1363,6 +1363,67 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
 	return planstate_tree_walker(planstate, ExplainPreScanNode, rels_used);
 }
 
+/*
+ * plan_is_disabled
+ *		Checks if the given plan node type was disabled during query planning.
+ *		This is evident by the disable_node field being higher than the sum of
+ *		the disabled_node field from the plan's children.
+ */
+static bool
+plan_is_disabled(Plan *plan)
+{
+	int child_disabled_nodes;
+
+	/* The node is certainly not disabled if this is zero */
+	if (plan->disabled_nodes == 0)
+		return false;
+
+	child_disabled_nodes = 0;
+	if (outerPlan(plan))
+		child_disabled_nodes += outerPlan(plan)->disabled_nodes;
+	if (innerPlan(plan))
+		child_disabled_nodes += innerPlan(plan)->disabled_nodes;
+	else
+	{
+		if (IsA(plan, Append))
+		{
+			ListCell *lc;
+			Append *aplan = (Append *) plan;
+
+			foreach(lc, aplan->appendplans)
+			{
+				Plan *subplan = lfirst(lc);
+
+				child_disabled_nodes += subplan->disabled_nodes;
+			}
+		}
+		else if (IsA(plan, MergeAppend))
+		{
+			ListCell *lc;
+			MergeAppend *maplan = (MergeAppend *) plan;
+
+			foreach(lc, maplan->mergeplans)
+			{
+				Plan *subplan = lfirst(lc);
+
+				child_disabled_nodes += subplan->disabled_nodes;
+			}
+		}
+		else
+		{
+			/*
+			 * Must be a scan plan.  They have no children that can be
+			 * disabled.
+			 */
+		}
+	}
+
+	if (plan->disabled_nodes > child_disabled_nodes)
+		return true;
+
+	return false;
+}
+
 /*
  * ExplainNode -
  *	  Appends a description of a plan tree to es->str
@@ -1399,6 +1460,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	ExplainWorkersState *save_workers_state = es->workers_state;
 	int			save_indent = es->indent;
 	bool		haschildren;
+	bool		isdisabled;
 
 	/*
 	 * Prepare per-worker output buffers, if needed.  We'll append the data in
@@ -1914,9 +1976,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
-	if (plan->disabled_nodes != 0)
-		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
-							   es);
+
+	isdisabled = plan_is_disabled(plan);
+	if (es->format != EXPLAIN_FORMAT_TEXT || isdisabled)
+		ExplainPropertyBool("Disabled", isdisabled, es);
 
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..4070f1b588 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -6076,6 +6076,7 @@ make_sort(Plan *lefttree, int numCols,
 
 	plan = &node->plan;
 	plan->targetlist = lefttree->targetlist;
+	plan->disabled_nodes += lefttree->disabled_nodes + (enable_sort == false);
 	plan->qual = NIL;
 	plan->lefttree = lefttree;
 	plan->righttree = NULL;
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 8ac13b562c..c9235028dc 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2920,23 +2920,20 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
-   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
-         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(17 rows)
+                           Disabled: true
+(14 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2958,24 +2955,21 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
-   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
-         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(18 rows)
+                           Disabled: true
+(15 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -2985,24 +2979,21 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
-   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
-         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(18 rows)
+                           Disabled: true
+(15 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index b350efe128..b3ecfa0e81 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -335,12 +335,11 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                   QUERY PLAN                  
 ----------------------------------------------
  Sort
-   Disabled Nodes: 1
    Sort Key: proname
    ->  Seq Scan on pg_proc
-         Disabled Nodes: 1
+         Disabled: true
          Filter: (proname ~~* 'ri%foo'::text)
-(6 rows)
+(5 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index d01c304c24..dcbdaa0388 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -104,6 +104,7 @@ select explain_filter('explain (analyze, buffers, format xml) select * from int8
        <Actual-Total-Time>N.N</Actual-Total-Time>      +
        <Actual-Rows>N</Actual-Rows>                    +
        <Actual-Loops>N</Actual-Loops>                  +
+       <Disabled>false</Disabled>                      +
        <Shared-Hit-Blocks>N</Shared-Hit-Blocks>        +
        <Shared-Read-Blocks>N</Shared-Read-Blocks>      +
        <Shared-Dirtied-Blocks>N</Shared-Dirtied-Blocks>+
@@ -152,6 +153,7 @@ select explain_filter('explain (analyze, serialize, buffers, format yaml) select
      Actual Total Time: N.N   +
      Actual Rows: N           +
      Actual Loops: N          +
+     Disabled: false          +
      Shared Hit Blocks: N     +
      Shared Read Blocks: N    +
      Shared Dirtied Blocks: N +
@@ -213,6 +215,7 @@ select explain_filter('explain (buffers, format json) select * from int8_tbl i8'
        "Total Cost": N.N,          +
        "Plan Rows": N,             +
        "Plan Width": N,            +
+       "Disabled": false,          +
        "Shared Hit Blocks": N,     +
        "Shared Read Blocks": N,    +
        "Shared Dirtied Blocks": N, +
@@ -262,6 +265,7 @@ select explain_filter('explain (analyze, buffers, format json) select * from int
        "Actual Total Time": N.N,    +
        "Actual Rows": N,            +
        "Actual Loops": N,           +
+       "Disabled": false,           +
        "Shared Hit Blocks": N,      +
        "Shared Read Blocks": N,     +
        "Shared Dirtied Blocks": N,  +
@@ -370,6 +374,7 @@ select explain_filter('explain (memory, summary, format yaml) select * from int8
      Total Cost: N.N          +
      Plan Rows: N             +
      Plan Width: N            +
+     Disabled: false          +
    Planning:                  +
      Memory Used: N           +
      Memory Allocated: N      +
@@ -394,7 +399,8 @@ select explain_filter('explain (memory, analyze, format json) select * from int8
        "Actual Startup Time": N.N, +
        "Actual Total Time": N.N,   +
        "Actual Rows": N,           +
-       "Actual Loops": N           +
+       "Actual Loops": N,          +
+       "Disabled": false           +
      },                            +
      "Planning": {                 +
        "Memory Used": N,           +
@@ -497,6 +503,7 @@ select jsonb_pretty(
                                  "string4"                  +
                              ],                             +
                              "Schema": "public",            +
+                             "Disabled": false,             +
                              "Node Type": "Seq Scan",       +
                              "Plan Rows": 0,                +
                              "Plan Width": 0,               +
@@ -540,6 +547,7 @@ select jsonb_pretty(
                          "stringu2",                        +
                          "string4"                          +
                      ],                                     +
+                     "Disabled": false,                     +
                      "Sort Key": [                          +
                          "tenk1.tenthous"                   +
                      ],                                     +
@@ -586,6 +594,7 @@ select jsonb_pretty(
                  "stringu2",                                +
                  "string4"                                  +
              ],                                             +
+             "Disabled": false,                             +
              "Node Type": "Gather Merge",                   +
              "Plan Rows": 0,                                +
              "Plan Width": 0,                               +
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 79f0d37a87..cd9b7b7eea 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -538,6 +538,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
 -------------------------------------------------
  [                                              +
      {                                          +
+         "Disabled": false,                     +
          "Sort Key": [                          +
              "t.a",                             +
              "t.b"                              +
@@ -701,19 +702,17 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
-   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
-         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
-               Disabled Nodes: 1
+               Disabled: true
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(13 rows)
+(11 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
@@ -744,6 +743,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
 -------------------------------------------------
  [                                              +
      {                                          +
+         "Disabled": false,                     +
          "Sort Key": [                          +
              "t.a",                             +
              "t.b"                              +
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index dbb748a2d2..c9defd7e9d 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,7 +1614,6 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
-   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1624,11 +1623,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
-               Disabled Nodes: 1
+               Disabled: true
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(15 rows)
+(14 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
index 5cb9cde030..fdd0f6c8f2 100644
--- a/src/test/regress/expected/insert_conflict.out
+++ b/src/test/regress/expected/insert_conflict.out
@@ -218,6 +218,7 @@ explain (costs off, format json) insert into insertconflicttest values (0, 'Bilb
        "Async Capable": false,                                         +
        "Relation Name": "insertconflicttest",                          +
        "Alias": "insertconflicttest",                                  +
+       "Disabled": false,                                              +
        "Conflict Resolution": "UPDATE",                                +
        "Conflict Arbiter Indexes": ["key_index"],                      +
        "Conflict Filter": "(insertconflicttest.fruit <> 'Lime'::text)",+
@@ -226,7 +227,8 @@ explain (costs off, format json) insert into insertconflicttest values (0, 'Bilb
            "Node Type": "Result",                                      +
            "Parent Relationship": "Outer",                             +
            "Parallel Aware": false,                                    +
-           "Async Capable": false                                      +
+           "Async Capable": false,                                     +
+           "Disabled": false                                           +
          }                                                             +
        ]                                                               +
      }                                                                 +
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 31fb7d142e..e6a3da30f0 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8000,15 +8000,14 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
-   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
-         Disabled Nodes: 1
+         Disabled: true
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(9 rows)
+(8 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 9ee09fe2f5..f6b8329cd6 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -303,16 +303,15 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
-   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
-         Disabled Nodes: 1
+         Disabled: true
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(10 rows)
+(9 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -320,16 +319,15 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
-   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
-         Disabled Nodes: 1
+         Disabled: true
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(10 rows)
+(9 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2c63aa85a6..d17ade278b 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -537,14 +537,11 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
-   Disabled Nodes: 1
    ->  Nested Loop
-         Disabled Nodes: 1
          ->  Gather
-               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
-                     Disabled Nodes: 1
+                     Disabled: true
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -552,7 +549,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(16 rows)
+(13 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index 7a698934ac..d62d32241d 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -474,6 +474,7 @@ SELECT * FROM
        "Async Capable": false,                                                                                                                                                                              +
        "Table Function Name": "json_table",                                                                                                                                                                 +
        "Alias": "json_table_func",                                                                                                                                                                          +
+       "Disabled": false,                                                                                                                                                                                   +
        "Output": ["id", "\"int\"", "text"],                                                                                                                                                                 +
        "Table Function Call": "JSON_TABLE('null'::jsonb, '$[*]' AS json_table_path_0 PASSING 3 AS a, '\"foo\"'::jsonb AS \"b c\" COLUMNS (id FOR ORDINALITY, \"int\" integer PATH '$', text text PATH '$'))"+
      }                                                                                                                                                                                                      +
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0456d48c93..c73631a9a1 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,7 +822,7 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
-   Disabled Nodes: 1
+   Disabled: true
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 361a6f9b27..fb5f345855 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1577,6 +1577,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1588,6 +1589,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1597,6 +1599,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d26e10441e..ef7dc03c69 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1209,6 +1209,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1220,6 +1221,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1229,6 +1231,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 73c2851d3f..4a9cdd2afe 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1563,6 +1563,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1574,6 +1575,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1583,6 +1585,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
#109Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#108)
Re: On disable_cost

On Sat, 2024-09-28 at 00:04 +1200, David Rowley wrote:

On Fri, 27 Sept 2024 at 20:42, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

2. The "disabled nodes" are not only shown at the nodes where nodes
were actually disabled, but also at every nodes above these nodes.

I'm not a fan of that either, and I'd like to see this output improved.

It seems like it's easy enough to implement some logic to detect when
a given node is disabled just by checking if the disabled_nodes count
is higher than the sum of the disabled_nodes fields of the node's
children. If there are no children (a scan node) and disabled_nodes >
0 then it must be disabled. There's even a nice fast path where we
don't need to check the children if disabled_nodes == 0.

Here's a POC grade patch of how I'd rather see it looking.

I opted to have a boolean field as I didn't see any need for an
integer count. I also changed things around so we always display the
boolean property in non-text EXPLAIN. Normally, we don't mind being
more verbose there.

I also fixed a bug in make_sort() where disabled_nodes isn't being set
properly. I'll do an independent patch for that if this goes nowhere.

Thanks, and the patch looks good.

Why did you change "Disabled" from an integer to a boolean?
If you see a join where two plans were disabled, that's useful information.

I would still prefer to see the disabled nodes only in VERBOSE explain,
but I'm satisfied if the disabled nodes don't show up all over the place.

Yours,
Laurenz Albe

#110David Rowley
dgrowleyml@gmail.com
In reply to: Laurenz Albe (#109)
Re: On disable_cost

On Tue, 1 Oct 2024 at 06:17, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Why did you change "Disabled" from an integer to a boolean?

I just don't think "Disabled Nodes" is all that self-documenting and
I'm also unsure why the full integer value of disabled_nodes is
required over just displaying the boolean value of if the node is
disabled or not. Won't readers look at the remainder of the plan to
determine information about which other nodes are disabled? Do we need
to give them a running total?

If you see a join where two plans were disabled, that's useful information.

I'm not sure if I follow what you mean here. The patch will show
"Disabled: true" for both the inner and outer side of the join if both
of those are disabled. The difference is that my patch does not show
the join itself is disabled like master does. I thought that's what
you were complaining about. Can you show an example of what you mean?

David

#111Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#110)
Re: On disable_cost

On Wed, Oct 2, 2024 at 4:55 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 1 Oct 2024 at 06:17, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Why did you change "Disabled" from an integer to a boolean?

I just don't think "Disabled Nodes" is all that self-documenting and
I'm also unsure why the full integer value of disabled_nodes is
required over just displaying the boolean value of if the node is
disabled or not. Won't readers look at the remainder of the plan to
determine information about which other nodes are disabled? Do we need
to give them a running total?

I don't think this will produce the right answer in all cases because
disabled node counts don't propagate across subquery levels.

--
Robert Haas
EDB: http://www.enterprisedb.com

#112Robert Haas
robertmhaas@gmail.com
In reply to: Laurenz Albe (#107)
Re: On disable_cost

On Fri, Sep 27, 2024 at 4:42 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

1. The "disabled nodes" are always displayed.
I'd be happier if it were only shown for COSTS ON, but I think it
would be best if they were only shown with VERBOSE ON.

After all, the messages are pretty verbose...

I agree that the messages are more verbose than what we did before
(add a large value to the cost). But I would have thought it wouldn't
matter much because most of the time nothing will be disabled. And I
would think if you get a plan that has some nodes disabled, you would
want to know about that.

I actually thought it was rather nice that this system lets you show
the disabled-nodes information even when COSTS OFF. Regression tests
need to suppress costs because it can vary by platform, but the count
of disabled nodes is stable enough to display.

--
Robert Haas
EDB: http://www.enterprisedb.com

#113Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#110)
Re: On disable_cost

On Wed, 2024-10-02 at 21:55 +1300, David Rowley wrote:

On Tue, 1 Oct 2024 at 06:17, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Why did you change "Disabled" from an integer to a boolean?

I just don't think "Disabled Nodes" is all that self-documenting and
I'm also unsure why the full integer value of disabled_nodes is
required over just displaying the boolean value of if the node is
disabled or not. Won't readers look at the remainder of the plan to
determine information about which other nodes are disabled? Do we need
to give them a running total?

I didn't want a running total, but maybe I misunderstood what a disabled
node is; see below.

If you see a join where two plans were disabled, that's useful information.

I'm not sure if I follow what you mean here.  The patch will show
"Disabled: true" for both the inner and outer side of the join if both
of those are disabled.  The difference is that my patch does not show
the join itself is disabled like master does. I thought that's what
you were complaining about. Can you show an example of what you mean?

I ran the following example, and now I am confused.

CREATE TABLE tab_a (id integer);

CREATE TABLE tab_b (id integer);

SET enable_nestloop = off;
SET enable_hashjoin = off;

EXPLAIN SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
═════════════════════════════════════════════════════════════════════
 Merge Join (cost=359.57..860.00 rows=32512 width=4)
   Merge Cond: (tab_a.id = tab_b.id)
   ->  Sort (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: tab_a.id
         ->  Seq Scan on tab_a (cost=0.00..35.50 rows=2550 width=4)
   ->  Sort (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: tab_b.id
         ->  Seq Scan on tab_b (cost=0.00..35.50 rows=2550 width=4)

I would have expected to see "Disabled nodes: 2" with the merge join,
because both the nested loop join and the hash join have been disabled.

Why is there no disabled node shown?

Yours,
Laurenz Albe

#114Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Robert Haas (#112)
Re: On disable_cost

On Wed, 2024-10-02 at 10:08 -0400, Robert Haas wrote:

On Fri, Sep 27, 2024 at 4:42 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

1. The "disabled nodes" are always displayed.
    I'd be happier if it were only shown for COSTS ON, but I think it
    would be best if they were only shown with VERBOSE ON.

    After all, the messages are pretty verbose...

I agree that the messages are more verbose than what we did before
(add a large value to the cost). But I would have thought it wouldn't
matter much because most of the time nothing will be disabled. And I
would think if you get a plan that has some nodes disabled, you would
want to know about that.

I'm alright with that, but I certainly don't want to see them propagated
through the tree. If you have a three page execution plan, and now it
is four pages long because some sequential scan at the lower end was
disabled and I get "Disabled nodes: 1" on every third line, that is
going to make me unhappy.

I actually thought it was rather nice that this system lets you show
the disabled-nodes information even when COSTS OFF. Regression tests
need to suppress costs because it can vary by platform, but the count
of disabled nodes is stable enough to display.

VERBOSE can be used with COSTS OFF, so that would work nicely if the
disabled nodes were only shown with EXPLAIN (VERBOSE).

I don't think that the feature is bad, I just would prefer it disabled
by default.

Yours,
Laurenz Albe

#115Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Laurenz Albe (#113)
Re: On disable_cost

Hi!

On 02.10.2024 21:04, Laurenz Albe wrote:

I didn't want a running total, but maybe I misunderstood what a disabled
node is; see below.

If you see a join where two plans were disabled, that's useful information.

I'm not sure if I follow what you mean here.  The patch will show
"Disabled: true" for both the inner and outer side of the join if both
of those are disabled.  The difference is that my patch does not show
the join itself is disabled like master does. I thought that's what
you were complaining about. Can you show an example of what you mean?

I ran the following example, and now I am confused.

CREATE TABLE tab_a (id integer);

CREATE TABLE tab_b (id integer);

SET enable_nestloop = off;
SET enable_hashjoin = off;

EXPLAIN SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
═════════════════════════════════════════════════════════════════════
 Merge Join (cost=359.57..860.00 rows=32512 width=4)
   Merge Cond: (tab_a.id = tab_b.id)
   ->  Sort (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: tab_a.id
         ->  Seq Scan on tab_a (cost=0.00..35.50 rows=2550 width=4)
   ->  Sort (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: tab_b.id
         ->  Seq Scan on tab_b (cost=0.00..35.50 rows=2550 width=4)

I would have expected to see "Disabled nodes: 2" with the merge join,
because both the nested loop join and the hash join have been disabled.

Why is there no disabled node shown?

Disabled Nodes shows the number of disabled nodes; you simply don't
have any here with the mergejoin, because the hashjoin and nestedloop
paths were not selected. The reason is the compare_path_costs_fuzzily
function, which decides which path is better based on the lower number
of disabled nodes; the hashjoin and nestedloop paths each have one
more disabled node than the mergejoin. If you disable mergejoin as
well, I think that output will appear.
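
To illustrate the idea, here is a small standalone toy comparison (this
is not the real compare_path_costs_fuzzily code, and the cost numbers
are invented): the disabled-node count is compared first, and cost only
matters when the counts are equal.

#include <stdio.h>

/* Toy model of a Path, reduced to the two fields that matter here. */
typedef struct PathInfo
{
    int     disabled_nodes;  /* disabled nodes in this path's plan tree */
    double  total_cost;
} PathInfo;

/* Returns -1 if a is better, 1 if b is better, 0 if they tie. */
static int
compare_paths(const PathInfo *a, const PathInfo *b)
{
    /* The number of disabled nodes, if different, trumps all else. */
    if (a->disabled_nodes != b->disabled_nodes)
        return (a->disabled_nodes < b->disabled_nodes) ? -1 : 1;
    if (a->total_cost != b->total_cost)
        return (a->total_cost < b->total_cost) ? -1 : 1;
    return 0;
}

int
main(void)
{
    PathInfo    merge_join = {0, 860.0};    /* enabled, but more expensive */
    PathInfo    hash_join = {1, 750.0};     /* disabled, but cheaper */

    /* The merge join wins despite its higher cost. */
    printf("%s\n", compare_paths(&merge_join, &hash_join) < 0
           ? "merge join wins" : "hash join wins");
    return 0;
}

So even if one of the disabled join types were estimated to be cheaper,
the merge join would still be kept, because it has no disabled nodes.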

--
Regards,
Alena Rybakina
Postgres Professional

#116Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Alena Rybakina (#115)
Re: On disable_cost

If you disable mergejoin as well, I think that output will appear.

I did that, and the disabled nodes were displayed in the EXPLAIN output:

alena@postgres=#  CREATE TABLE tab_a (id integer);
alena@postgres=#  CREATE TABLE tab_b (id integer);
alena@postgres=#  SET enable_nestloop = off;
alena@postgres=#  SET enable_hashjoin = off;
alena@postgres=#  SET enable_mergejoin = off;

alena@postgres=# EXPLAIN SELECT * FROM tab_a JOIN tab_b USING (id);
                             QUERY PLAN
---------------------------------------------------------------------
 Nested Loop  (cost=0.00..97614.88 rows=32512 width=4)
   Disabled Nodes: 1
   Join Filter: (tab_a.id = tab_b.id)
   ->  Seq Scan on tab_a  (cost=0.00..35.50 rows=2550 width=4)
   ->  Materialize  (cost=0.00..48.25 rows=2550 width=4)
         ->  Seq Scan on tab_b  (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

The number of disabled nodes increases if seqscan is also disabled:
alena@postgres=# set enable_seqscan =off;
SET
alena@postgres=# EXPLAIN SELECT * FROM tab_a JOIN tab_b USING (id);
                             QUERY PLAN
---------------------------------------------------------------------
 Nested Loop  (cost=0.00..97614.88 rows=32512 width=4)
   Disabled Nodes: 3
   Join Filter: (tab_a.id = tab_b.id)
   ->  Seq Scan on tab_a  (cost=0.00..35.50 rows=2550 width=4)
         Disabled Nodes: 1
   ->  Materialize  (cost=0.00..48.25 rows=2550 width=4)
         Disabled Nodes: 1
         ->  Seq Scan on tab_b  (cost=0.00..35.50 rows=2550 width=4)
               Disabled Nodes: 1
(9 rows)

Here is an example where seqscan is also disabled. The number of
disabled nodes at the join is the sum of all disabled subnodes plus the
nested loop itself (which is also disabled): 1 for the Seq Scan on
tab_a, plus 1 for the Seq Scan on tab_b carried up through the
Materialize, plus 1 for the Nested Loop, giving 3.
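
To make that arithmetic concrete, here is a tiny standalone model (this
is not PostgreSQL source code, and compute_disabled_nodes is only an
illustration helper): each node's count is the sum of its children's
counts, plus one if the node's own plan type is disabled. The plan
shape mirrors the plan above.

#include <stdio.h>
#include <stdbool.h>

typedef struct Node
{
    const char  *label;
    bool         type_is_disabled;  /* e.g. a Seq Scan with enable_seqscan=off */
    int          nchildren;
    struct Node *children[2];
    int          disabled_nodes;    /* the running total shown by EXPLAIN */
} Node;

/* Bottom-up accumulation: children's totals plus one if this node is disabled. */
static int
compute_disabled_nodes(Node *n)
{
    int         total = n->type_is_disabled ? 1 : 0;

    for (int i = 0; i < n->nchildren; i++)
        total += compute_disabled_nodes(n->children[i]);
    n->disabled_nodes = total;
    return total;
}

int
main(void)
{
    Node        scan_a = {"Seq Scan on tab_a", true, 0, {NULL, NULL}, 0};
    Node        scan_b = {"Seq Scan on tab_b", true, 0, {NULL, NULL}, 0};
    Node        mat = {"Materialize", false, 1, {&scan_b, NULL}, 0};
    Node        nl = {"Nested Loop", true, 2, {&scan_a, &mat}, 0};

    compute_disabled_nodes(&nl);
    printf("%-20s Disabled Nodes: %d\n", nl.label, nl.disabled_nodes);   /* 3 */
    printf("%-20s Disabled Nodes: %d\n", mat.label, mat.disabled_nodes); /* 1 */
    return 0;
}

This is also why a node can show a nonzero count without itself being
disabled, as the Materialize does here.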

Honestly, I like this patch. Before it, disabling an algorithm in the
optimizer just increased the cost enormously, and I'm not sure that was
a reliable solution, because the cost can already be very large even
without disabling anything, for example due to high cardinality.

With this patch the mechanism is simple and more honest in my opinion:
we just count the number of disabled nodes and discard the paths with
the larger count.

--
Regards,
Alena Rybakina
Postgres Professional

#117Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alena Rybakina (#115)
Re: On disable_cost

On Wed, 2024-10-02 at 21:13 +0300, Alena Rybakina wrote:

  CREATE TABLE tab_a (id integer);

  CREATE TABLE tab_b (id integer);

  SET enable_nestloop = off;
  SET enable_hashjoin = off;

  EXPLAIN SELECT * FROM tab_a JOIN tab_b USING (id);

                               QUERY PLAN
  ═════════════════════════════════════════════════════════════════════
   Merge Join (cost=359.57..860.00 rows=32512 width=4)
     Merge Cond: (tab_a.id = tab_b.id)
     -> Sort (cost=179.78..186.16 rows=2550 width=4)
           Sort Key: tab_a.id
           -> Seq Scan on tab_a (cost=0.00..35.50 rows=2550 width=4)
     -> Sort (cost=179.78..186.16 rows=2550 width=4)
           Sort Key: tab_b.id
           -> Seq Scan on tab_b (cost=0.00..35.50 rows=2550 width=4)

I would have expected to see "Disabled nodes: 2" with the merge join,
because both the nested loop join and the hash join have been disabled.

Why is there no disabled node shown?

    

    Disabled Nodes shows the number of disabled nodes; you simply don't
    have any here with the mergejoin, because the hashjoin and nestedloop
    paths were not selected. The reason is the compare_path_costs_fuzzily
    function, which decides which path is better based on the lower number
    of disabled nodes; the hashjoin and nestedloop paths each have one
    more disabled node than the mergejoin. If you disable mergejoin as
    well, I think that output will appear.

I see; the merge join happened to be the preferred join path, so nothing
had to be excluded.

/* reset all parameters */

EXPLAIN (COSTS OFF) SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
═════════════════════════════════════
 Merge Join
   Merge Cond: (tab_a.id = tab_b.id)
   ->  Sort
         Sort Key: tab_a.id
         ->  Seq Scan on tab_a
   ->  Sort
         Sort Key: tab_b.id
         ->  Seq Scan on tab_b

So now if I disable merge joins, I should get a different strategy and see
a disabled node, right?

SET enable_mergejoin = off;

EXPLAIN (COSTS OFF) SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
════════════════════════════════════
 Hash Join
   Hash Cond: (tab_a.id = tab_b.id)
   ->  Seq Scan on tab_a
   ->  Hash
         ->  Seq Scan on tab_b

No disabled node shown... Ok, I still don't get it.

Yours,
Laurenz Albe

#118Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alena Rybakina (#116)
Re: On disable_cost

On Wed, 2024-10-02 at 21:31 +0300, Alena Rybakina wrote:

Honestly, I like this patch. Before it, disabling an algorithm in the
optimizer just increased the cost enormously, and I'm not sure that was
a reliable solution, because the cost can already be very large even
without disabling anything, for example due to high cardinality.

With this patch the mechanism is simple and more honest in my opinion:
we just count the number of disabled nodes and discard the paths with
the larger count.

I have no issue with this way of handling disabled plan nodes, I only
complained about the verbosity of the EXPLAIN output.

I don't want to see disabled nodes propagated all the way up the tree,
and I would like the output suppressed by default.

Yours,
Laurenz Albe

#119Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Laurenz Albe (#117)
Re: On disable_cost

I see; the merge join happened to be the preferred join path, so nothing
had to be excluded.

/* reset all parameters */

EXPLAIN (COSTS OFF) SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
═════════════════════════════════════
 Merge Join
   Merge Cond: (tab_a.id = tab_b.id)
   ->  Sort
         Sort Key: tab_a.id
         ->  Seq Scan on tab_a
   ->  Sort
         Sort Key: tab_b.id
         ->  Seq Scan on tab_b

So now if I disable merge joins, I should get a different strategy and see
a disabled node, right?

SET enable_mergejoin = off;

EXPLAIN (COSTS OFF) SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
════════════════════════════════════
 Hash Join
   Hash Cond: (tab_a.id = tab_b.id)
   ->  Seq Scan on tab_a
   ->  Hash
         ->  Seq Scan on tab_b

No disabled node shown... Ok, I still don't get it.

Right, you don't see it here, and that is expected.

The compare_path_costs_fuzzily function is fundamental to determining
which path is kept - the new path or one of the old paths already added
to the relation's pathlist (see the add_path function, which calls
compare_path_costs_fuzzily).

One of the criteria it uses is the number of disabled nodes. These
lines are from the compare_path_costs_fuzzily function:

/* Number of disabled nodes, if different, trumps all else. */
if (unlikely(path1->disabled_nodes != path2->disabled_nodes))
{
    if (path1->disabled_nodes < path2->disabled_nodes)
        return COSTS_BETTER1;
    else
        return COSTS_BETTER2;

}

Since mergejoin is disabled, the merge join path's number of disabled
nodes is 1. Hashjoin is still enabled, so the hash join path's number
of disabled nodes is 0. Thus the hash join is chosen, because it has
fewer disabled nodes than the merge join.

The hash join itself is not disabled, so there is no note in the query
plan saying that it is.

EXPLAIN (COSTS OFF) SELECT * FROM tab_a JOIN tab_b USING (id);

QUERY PLAN
════════════════════════════════════
 Hash Join
   Hash Cond: (tab_a.id = tab_b.id)
   ->  Seq Scan on tab_a
   ->  Hash
         ->  Seq Scan on tab_b

--
Regards,
Alena Rybakina
Postgres Professional

#120Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Laurenz Albe (#118)
Re: On disable_cost

On 02.10.2024 22:08, Laurenz Albe wrote:

On Wed, 2024-10-02 at 21:31 +0300, Alena Rybakina wrote:

Honestly, I like this patch. Before it, disabling an algorithm in the
optimizer just increased the cost enormously, and I'm not sure that was
a reliable solution, because the cost can already be very large even
without disabling anything, for example due to high cardinality.

With this patch the mechanism is simple and more honest in my opinion:
we just count the number of disabled nodes and discard the paths with
the larger count.

I have no issue with this way of handling disabled plan nodes, I only
complained about the verbosity of the EXPLAIN output.

I'm willing to agree with you. I think we should not display it all the
time.

I don't want to see disabled nodes propagated all the way up the tree,
and I would like the output suppressed by default.

I may have misunderstood your message, but the disabled-node count must
propagate up the tree, otherwise it will be incorrect.

Let's consider an example. If we disable seqscan, then a hash join path
built on top of a seq scan cannot report 0 disabled nodes; such a path
should in principle lose to a path that avoids the seq scan, for
example one that uses an index scan instead. Because the count
propagates up, the hash join path containing the index scan has fewer
disabled nodes and is the one the optimizer finally uses.

--
Regards,
Alena Rybakina
Postgres Professional

#121David Rowley
dgrowleyml@gmail.com
In reply to: Alena Rybakina (#120)
Re: On disable_cost

On Thu, 3 Oct 2024 at 08:41, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

I may have misunderstood your message, but disabled nodes number must propagate up the tree, otherwise it will be incorrect.

I think there are two misunderstandings on this thread:

1) You're misunderstanding what Laurenz is complaining about. He's
only concerned with the EXPLAIN output, not how disabled_nodes works
internally.
2) Laurenz is misunderstanding what "Disabled Nodes" means. It has
nothing to do with other Paths which were considered and rejected. It
might be better named as "Disabled Degree". It tracks how many plan
nodes below and including this node are disabled.

Because of #2, I think I now understand why Laurenz was interested in
only showing this with VERBOSE. If it worked the way Laurenz thought,
I'd probably agree with him.

Overall, I think we need to do something here. There's no
documentation about what Disabled Nodes means so we either need to
make it easier to understand without documenting it or add something
to the documents about it. If Laurenz, who has a huge amount of
PostgreSQL experience, didn't catch it, then what hope is there for the
average user?

David

#122David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#111)
1 attachment(s)
Re: On disable_cost

On Thu, 3 Oct 2024 at 03:04, Robert Haas <robertmhaas@gmail.com> wrote:

I don't think this will produce the right answer in all cases because
disabled node counts don't propagate across subquery levels.

I see my patch didn't behave correctly when faced with a SubqueryScan
as SubqueryScan does not use the "lefttree" field and has a "subplan"
field instead. The patch will need special handling for that (fixed in
the attached patch).

I can't quite find the area you're talking about where the
disabled_nodes don't propagate through subquery levels. Looking at
cost_subqueryscan(), I see propagation of disabled_nodes. If the
SubqueryScan node isn't present then the propagation just occurs
normally as it does with other path types. e.g. master does:

# set enable_Seqscan=0;
# explain (costs off) select * from (select * from pg_class offset 0)
order by oid;
QUERY PLAN
----------------------------
 Sort
   Disabled Nodes: 1
   Sort Key: pg_class.oid
   ->  Seq Scan on pg_class
         Disabled Nodes: 1
(5 rows)

Can you provide an example of what you mean?

I've attached an updated PoC patch which I think gets the SubqueryScan
stuff correct. I've not spent time testing everything because, if
nobody likes the patch's EXPLAIN output, I don't want to waste more
time on the patch.

I understand you're keen on keeping the output as it is in master. It
would be good to hear if other people agree with you on this. I
imagine you'd rather work on other things, but it's easier to discuss
this now than after PG18 is out.

For me, I find master's output overly verbose and not all that easy to
identify the disabled nodes in, as it requires scanning all the
disabled_nodes values and finding the nodes where the value is one
higher than the sum of the node's children's disabled_nodes. For example, if
a Nested Loop has "Disabled Nodes: 17" and the inner side of the join
has "Disabled Nodes: 9" and the outer side has "Disabled Nodes: 8",
it's not that easy to determine if the nested loop is disabled or not.
Of course, you only need to do 8+9=17 and see it isn't, but when faced
with run-time pruning done at executor startup, some
Append/MergeAppend nodes might be missing from EXPLAIN and when that
happens, you can't just manually add the Disabled Nodes up. Here's
what I mean:

setup:

create table lp (a int) partition by list(a);
create table lp1 partition of lp for values in(1);
create table lp2 partition of lp for values in(2);
set enable_seqscan=0;
prepare q1(int) as select * from lp where a = $1 order by a;
set plan_cache_mode=force_generic_plan;
explain (analyze, costs off, timing off, summary off) execute q1(1);

master:

 Append (actual rows=0 loops=1)
   Disabled Nodes: 2
   Subplans Removed: 1
   ->  Seq Scan on lp1 lp_1 (actual rows=0 loops=1)
         Disabled Nodes: 1
         Filter: (a = $1)

patched:

 Append (actual rows=0 loops=1)
   Subplans Removed: 1
   ->  Seq Scan on lp1 lp_1 (actual rows=0 loops=1)
         Disabled: true
         Filter: (a = $1)

With master, it looks like Seq Scan and Append are disabled. With the
patched version, you can see it isn't.

David

Attachments:

poc_improve_disabled_nodes_explain_output_v2.patchapplication/octet-stream; name=poc_improve_disabled_nodes_explain_output_v2.patchDownload
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ee1bcb84e2..1e9105298b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1363,6 +1363,80 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
 	return planstate_tree_walker(planstate, ExplainPreScanNode, rels_used);
 }
 
+/*
+ * plan_is_disabled
+ *		Checks if the given plan node type was disabled during query planning.
+ *		This is evident by the disable_node field being higher than the sum of
+ *		the disabled_node field from the plan's children.
+ */
+static bool
+plan_is_disabled(Plan *plan)
+{
+	int child_disabled_nodes;
+
+	/* The node is certainly not disabled if this is zero */
+	if (plan->disabled_nodes == 0)
+		return false;
+
+	child_disabled_nodes = 0;
+
+	/*
+	 * Handle special nodes first.  Children of BitmapOrs and BitmapAnds
+	 * can't be disabled, so no need to handle those specifically.
+	 */
+	if (IsA(plan, Append))
+	{
+		ListCell *lc;
+		Append *aplan = (Append *) plan;
+
+		foreach(lc, aplan->appendplans)
+		{
+			Plan *subplan = lfirst(lc);
+
+			child_disabled_nodes += subplan->disabled_nodes;
+		}
+	}
+	else if (IsA(plan, MergeAppend))
+	{
+		ListCell *lc;
+		MergeAppend *maplan = (MergeAppend *) plan;
+
+		foreach(lc, maplan->mergeplans)
+		{
+			Plan *subplan = lfirst(lc);
+
+			child_disabled_nodes += subplan->disabled_nodes;
+		}
+	}
+	else if (IsA(plan, SubqueryScan))
+		child_disabled_nodes += ((SubqueryScan *) plan)->subplan->disabled_nodes;
+	else if (IsA(plan, CustomScan))
+	{
+		ListCell *lc;
+		CustomScan *cplan = (CustomScan *) plan;
+
+		foreach(lc, cplan->custom_plans)
+		{
+			Plan *subplan = lfirst(lc);
+
+			child_disabled_nodes += subplan->disabled_nodes;
+		}
+	}
+	else
+	{
+		/* else, sum up disabled_nodes from the plan's inner and outer side */
+		if (outerPlan(plan))
+			child_disabled_nodes += outerPlan(plan)->disabled_nodes;
+		if (innerPlan(plan))
+			child_disabled_nodes += innerPlan(plan)->disabled_nodes;
+	}
+
+	if (plan->disabled_nodes > child_disabled_nodes)
+		return true;
+
+	return false;
+}
+
 /*
  * ExplainNode -
  *	  Appends a description of a plan tree to es->str
@@ -1399,6 +1473,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	ExplainWorkersState *save_workers_state = es->workers_state;
 	int			save_indent = es->indent;
 	bool		haschildren;
+	bool		isdisabled;
 
 	/*
 	 * Prepare per-worker output buffers, if needed.  We'll append the data in
@@ -1914,9 +1989,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
-	if (plan->disabled_nodes != 0)
-		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
-							   es);
+
+	isdisabled = plan_is_disabled(plan);
+	if (es->format != EXPLAIN_FORMAT_TEXT || isdisabled)
+		ExplainPropertyBool("Disabled", isdisabled, es);
 
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..4070f1b588 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -6076,6 +6076,7 @@ make_sort(Plan *lefttree, int numCols,
 
 	plan = &node->plan;
 	plan->targetlist = lefttree->targetlist;
+	plan->disabled_nodes += lefttree->disabled_nodes + (enable_sort == false);
 	plan->qual = NIL;
 	plan->lefttree = lefttree;
 	plan->righttree = NULL;
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 8ac13b562c..c9235028dc 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2920,23 +2920,20 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
-   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
-         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(17 rows)
+                           Disabled: true
+(14 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2958,24 +2955,21 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
-   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
-         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(18 rows)
+                           Disabled: true
+(15 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -2985,24 +2979,21 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
-   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
-         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(18 rows)
+                           Disabled: true
+(15 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index b350efe128..b3ecfa0e81 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -335,12 +335,11 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                   QUERY PLAN                  
 ----------------------------------------------
  Sort
-   Disabled Nodes: 1
    Sort Key: proname
    ->  Seq Scan on pg_proc
-         Disabled Nodes: 1
+         Disabled: true
          Filter: (proname ~~* 'ri%foo'::text)
-(6 rows)
+(5 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index d01c304c24..dcbdaa0388 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -104,6 +104,7 @@ select explain_filter('explain (analyze, buffers, format xml) select * from int8
        <Actual-Total-Time>N.N</Actual-Total-Time>      +
        <Actual-Rows>N</Actual-Rows>                    +
        <Actual-Loops>N</Actual-Loops>                  +
+       <Disabled>false</Disabled>                      +
        <Shared-Hit-Blocks>N</Shared-Hit-Blocks>        +
        <Shared-Read-Blocks>N</Shared-Read-Blocks>      +
        <Shared-Dirtied-Blocks>N</Shared-Dirtied-Blocks>+
@@ -152,6 +153,7 @@ select explain_filter('explain (analyze, serialize, buffers, format yaml) select
      Actual Total Time: N.N   +
      Actual Rows: N           +
      Actual Loops: N          +
+     Disabled: false          +
      Shared Hit Blocks: N     +
      Shared Read Blocks: N    +
      Shared Dirtied Blocks: N +
@@ -213,6 +215,7 @@ select explain_filter('explain (buffers, format json) select * from int8_tbl i8'
        "Total Cost": N.N,          +
        "Plan Rows": N,             +
        "Plan Width": N,            +
+       "Disabled": false,          +
        "Shared Hit Blocks": N,     +
        "Shared Read Blocks": N,    +
        "Shared Dirtied Blocks": N, +
@@ -262,6 +265,7 @@ select explain_filter('explain (analyze, buffers, format json) select * from int
        "Actual Total Time": N.N,    +
        "Actual Rows": N,            +
        "Actual Loops": N,           +
+       "Disabled": false,           +
        "Shared Hit Blocks": N,      +
        "Shared Read Blocks": N,     +
        "Shared Dirtied Blocks": N,  +
@@ -370,6 +374,7 @@ select explain_filter('explain (memory, summary, format yaml) select * from int8
      Total Cost: N.N          +
      Plan Rows: N             +
      Plan Width: N            +
+     Disabled: false          +
    Planning:                  +
      Memory Used: N           +
      Memory Allocated: N      +
@@ -394,7 +399,8 @@ select explain_filter('explain (memory, analyze, format json) select * from int8
        "Actual Startup Time": N.N, +
        "Actual Total Time": N.N,   +
        "Actual Rows": N,           +
-       "Actual Loops": N           +
+       "Actual Loops": N,          +
+       "Disabled": false           +
      },                            +
      "Planning": {                 +
        "Memory Used": N,           +
@@ -497,6 +503,7 @@ select jsonb_pretty(
                                  "string4"                  +
                              ],                             +
                              "Schema": "public",            +
+                             "Disabled": false,             +
                              "Node Type": "Seq Scan",       +
                              "Plan Rows": 0,                +
                              "Plan Width": 0,               +
@@ -540,6 +547,7 @@ select jsonb_pretty(
                          "stringu2",                        +
                          "string4"                          +
                      ],                                     +
+                     "Disabled": false,                     +
                      "Sort Key": [                          +
                          "tenk1.tenthous"                   +
                      ],                                     +
@@ -586,6 +594,7 @@ select jsonb_pretty(
                  "stringu2",                                +
                  "string4"                                  +
              ],                                             +
+             "Disabled": false,                             +
              "Node Type": "Gather Merge",                   +
              "Plan Rows": 0,                                +
              "Plan Width": 0,                               +
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index 79f0d37a87..cd9b7b7eea 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -538,6 +538,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
 -------------------------------------------------
  [                                              +
      {                                          +
+         "Disabled": false,                     +
          "Sort Key": [                          +
              "t.a",                             +
              "t.b"                              +
@@ -701,19 +702,17 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
-   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
-         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
-               Disabled Nodes: 1
+               Disabled: true
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(13 rows)
+(11 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
@@ -744,6 +743,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
 -------------------------------------------------
  [                                              +
      {                                          +
+         "Disabled": false,                     +
          "Sort Key": [                          +
              "t.a",                             +
              "t.b"                              +
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index dbb748a2d2..c9defd7e9d 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,7 +1614,6 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
-   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1624,11 +1623,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
-               Disabled Nodes: 1
+               Disabled: true
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(15 rows)
+(14 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
index 5cb9cde030..fdd0f6c8f2 100644
--- a/src/test/regress/expected/insert_conflict.out
+++ b/src/test/regress/expected/insert_conflict.out
@@ -218,6 +218,7 @@ explain (costs off, format json) insert into insertconflicttest values (0, 'Bilb
        "Async Capable": false,                                         +
        "Relation Name": "insertconflicttest",                          +
        "Alias": "insertconflicttest",                                  +
+       "Disabled": false,                                              +
        "Conflict Resolution": "UPDATE",                                +
        "Conflict Arbiter Indexes": ["key_index"],                      +
        "Conflict Filter": "(insertconflicttest.fruit <> 'Lime'::text)",+
@@ -226,7 +227,8 @@ explain (costs off, format json) insert into insertconflicttest values (0, 'Bilb
            "Node Type": "Result",                                      +
            "Parent Relationship": "Outer",                             +
            "Parallel Aware": false,                                    +
-           "Async Capable": false                                      +
+           "Async Capable": false,                                     +
+           "Disabled": false                                           +
          }                                                             +
        ]                                                               +
      }                                                                 +
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 31fb7d142e..e6a3da30f0 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8000,15 +8000,14 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
-   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
-         Disabled Nodes: 1
+         Disabled: true
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(9 rows)
+(8 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 9ee09fe2f5..f6b8329cd6 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -303,16 +303,15 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
-   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
-         Disabled Nodes: 1
+         Disabled: true
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(10 rows)
+(9 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -320,16 +319,15 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
-   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
-         Disabled Nodes: 1
+         Disabled: true
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(10 rows)
+(9 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2c63aa85a6..d17ade278b 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -537,14 +537,11 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
-   Disabled Nodes: 1
    ->  Nested Loop
-         Disabled Nodes: 1
          ->  Gather
-               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
-                     Disabled Nodes: 1
+                     Disabled: true
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -552,7 +549,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(16 rows)
+(13 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index 7a698934ac..d62d32241d 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -474,6 +474,7 @@ SELECT * FROM
        "Async Capable": false,                                                                                                                                                                              +
        "Table Function Name": "json_table",                                                                                                                                                                 +
        "Alias": "json_table_func",                                                                                                                                                                          +
+       "Disabled": false,                                                                                                                                                                                   +
        "Output": ["id", "\"int\"", "text"],                                                                                                                                                                 +
        "Table Function Call": "JSON_TABLE('null'::jsonb, '$[*]' AS json_table_path_0 PASSING 3 AS a, '\"foo\"'::jsonb AS \"b c\" COLUMNS (id FOR ORDINALITY, \"int\" integer PATH '$', text text PATH '$'))"+
      }                                                                                                                                                                                                      +
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0456d48c93..c73631a9a1 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,7 +822,7 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
-   Disabled Nodes: 1
+   Disabled: true
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 361a6f9b27..fb5f345855 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1577,6 +1577,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1588,6 +1589,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1597,6 +1599,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d26e10441e..ef7dc03c69 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1209,6 +1209,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1220,6 +1221,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1229,6 +1231,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 73c2851d3f..4a9cdd2afe 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1563,6 +1563,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1574,6 +1575,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1583,6 +1585,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
#123Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#121)
Re: On disable_cost

On Thu, 2024-10-03 at 11:44 +1300, David Rowley wrote:

2) Laurenz is misunderstanding what "Disabled Nodes" means. It has
nothing to do with other Paths which were considered and rejected. It
might be better named as "Disabled Degree". It tracks how many plan
nodes below and including this node are disabled.

Because of #2, I think I now understand why Laurenz was interested in
only showing this with VERBOSE. If it worked the way Laurenz thought,
I'd probably agree with him.

Ah, thanks, now I see the light.
You only see a "disabled node" if the optimizer chose a node you explicitly
disabled, like a sequential scan, a nested loop join or a sort.

I completely agree with you: it should always be displayed, and a boolean is
the appropriate way. The display just shouldn't be propagated up the tree
to nodes that were not actually disabled.

Perhaps a line of documentation on the EXPLAIN reference page or on the
"Using EXPLAIN" page would be in order.

Yours,
Laurenz Albe

#124Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#122)
Re: On disable_cost

On Wed, Oct 2, 2024 at 8:11 PM David Rowley <dgrowleyml@gmail.com> wrote:

I can't quite find the area you're talking about where the
disabled_nodes don't propagate through subquery levels. Looking at
cost_subqueryscan(), I see propagation of disabled_nodes. If the
SubqueryScan node isn't present then the propagation just occurs
normally as it does with other path types. e.g. master does:

Yeah, that case seems to work OK. But for example, consider this:

robert.haas=# explain with recursive foo as (select count(*) from
pgbench_accounts union all select aid from pgbench_accounts a, foo
where aid > foo.count) select * from pgbench_accounts, foo where aid =
foo.count;
                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=245288.11..348310.09 rows=3333331 width=105)
   Disabled Nodes: 1
   Hash Cond: (foo.count = pgbench_accounts.aid)
   CTE foo
     ->  Recursive Union  (cost=2890.00..239835.11 rows=3333331 width=8)
           Disabled Nodes: 2
           ->  Aggregate  (cost=2890.00..2890.01 rows=1 width=8)
                 Disabled Nodes: 1
                 ->  Seq Scan on pgbench_accounts pgbench_accounts_1  (cost=0.00..2640.00 rows=100000 width=0)
                       Disabled Nodes: 1
           ->  Subquery Scan on "*SELECT* 2"  (cost=369.02..20361.18 rows=333333 width=8)
                 Disabled Nodes: 1
                 ->  Nested Loop  (cost=369.02..16194.52 rows=333333 width=4)
                       Disabled Nodes: 1
                       ->  WorkTable Scan on foo foo_1  (cost=0.00..0.20 rows=10 width=8)
                       ->  Bitmap Heap Scan on pgbench_accounts a  (cost=369.02..1286.10 rows=33333 width=4)
                             Disabled Nodes: 1
                             Recheck Cond: (aid > foo_1.count)
                             ->  Bitmap Index Scan on pgbench_accounts_pkey  (cost=0.00..360.69 rows=33333 width=0)
                                   Index Cond: (aid > foo_1.count)
   ->  CTE Scan on foo  (cost=0.00..66666.62 rows=3333331 width=8)
   ->  Hash  (cost=2640.00..2640.00 rows=100000 width=97)
         Disabled Nodes: 1
         ->  Seq Scan on pgbench_accounts  (cost=0.00..2640.00 rows=100000 width=97)
               Disabled Nodes: 1
(25 rows)

You might expect that the number of disabled nodes for the hash join
would include the number for the CTE attached to it, but it doesn't. I
suspect similar things will happen when a node has an InitPlan or
SubPlan node attached to it. (I have not tested whether your patch
gets these cases right.)

I understand you're keen on keeping the output as it is in master. It
would be good to hear if other people agree with you on this. I
imagine you'd rather work on other things, but it's easier to discuss
this now than after PG18 is out.

For sure. To be clear, it's not that I love the current output. It's
that I'm worried that it will be hard to get the thing that you and
Laurenz want to be fully reliable, and I think there's a chance that
not only might it contain bugs now, but it might turn out that people
changing logic in this area in the future introduce more bugs.
plan_is_disabled() has to get exactly the correct answer for the child
nodes every time, or the answer is wrong, and I'm not as confident as
you are that your logic is fully correct (which doesn't mean that I
can prove to you that it is incorrect; I don't even know that it is).
I agree that if we're going to change this, it's much better to do it
sooner rather than later, because then we've got time to debug it if
needed.

--
Robert Haas
EDB: http://www.enterprisedb.com

#125Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: David Rowley (#121)
Re: On disable_cost

On 03.10.2024 01:44, David Rowley wrote:

On Thu, 3 Oct 2024 at 08:41, Alena Rybakina<a.rybakina@postgrespro.ru> wrote:

I may have misunderstood your message, but disabled nodes number must propagate up the tree, otherwise it will be incorrect.

I think there are two misunderstandings on this thread:

1) You're misunderstanding what Laurenz is complaining about. He's
only concerned with the EXPLAIN output, not how disasbled_nodes works
internally.

Sorry, maybe you're right, I misunderstood his request [0]. But I tried
to answer his question why disabled nodes aren't displayed by explaining
how it works.

2) Laurenz is misunderstanding what "Disabled Nodes" means. It has
nothing to do with other Paths which were considered and rejected. It
might be better named as "Disabled Degree". It tracks how many plan
nodes below and including this node are disabled.

yes, I agree with you and that's exactly what I tried to explain with
examples.

Unfortunately I was unable to generalize this conclusion correctly. Thanks)

Because of #2, I think I now understand why Laurenz was interested in
only showing this with VERBOSE. If it worked the way Laurenz thought,
I'd probably agree with him.

Overall, I think we need to do something here. There's no
documentation about what Disabled Nodes means so we either need to
make it easier to understand without documenting it or add something
to the documents about it. If Laurenz, who has a huge amount of
PostgreSQL experience didn't catch it, then what hope is there for the
average user?

I think you are right, most users will perceive this parameter as the
number of rejected paths, and not in any other way.

[0]: /messages/by-id/0cdd3504502aac827acb3ae615eda09aeb883f74.camel@cybertec.at

--
Regards,
Alena Rybakina
Postgres Professional

#126Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Alena Rybakina (#125)
1 attachment(s)
Re: On disable_cost

Overall, I think we need to do something here. There's no
documentation about what Disabled Nodes means so we either need to
make it easier to understand without documenting it or add something
to the documents about it. If Laurenz, who has a huge amount of
PostgreSQL experience didn't catch it, then what hope is there for the
average user?

I think you are right, most users will perceive this parameter as the
number of rejected paths, and not in any other way.

To be honest, I don't have much experience writing documentation, but I
think we should add a little more information to doc/src/sgml/perform.sgml.

It already describes how to read EXPLAIN output, so a description of
"Disabled nodes" fits naturally there.

I prepared a patch that includes the information we can add.

--
Regards,
Alena Rybakina
Postgres Professional

Attachments:

0001-Documentation-about-Disabled-nodes.-We-need-to-descr.patch (text/x-patch)
From 3670f14326bf3ac97242042e63046f7658d13709 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <a.rybakina@postgrespro.ru>
Date: Thu, 3 Oct 2024 20:31:25 +0300
Subject: [PATCH] Documentation about Disabled nodes. We need to describe this,
 as this parameter can be perceived ambiguously.

---
 doc/src/sgml/perform.sgml | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b65245..9f7339f3388 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -130,6 +130,42 @@ EXPLAIN SELECT * FROM tenk1;
     </itemizedlist>
    </para>
 
+   <para>
+    The parameter <literal>Disabled nodes</literal> appears if one of the
+    <xref linkend="runtime-config-query-enable"/> parameters was disabled. It tracks how many plan
+    nodes below and including this node are disabled.
+   </para>
+
+   <para>
+    Here is an example, just to show what the output looks like:
+
+<screen>
+SET enable_seqscan = off;
+SET
+SET enable_nestloop = off;
+SET
+EXPLAIN SELECT * FROM tenk1 t1 join tenk1 t2 on t1.unique1 > t2.unique2;
+                                 QUERY PLAN                                 
+----------------------------------------------------------------------------
+ Nested Loop  (cost=0.00..1500915.00 rows=33333333 width=488)
+   Disabled Nodes: 3
+   Join Filter: (t1.unique1 > t2.unique2)
+   ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=10000 width=244)
+         Disabled Nodes: 1
+   ->  Materialize  (cost=0.00..495.00 rows=10000 width=244)
+         Disabled Nodes: 1
+         ->  Seq Scan on tenk1 t2  (cost=0.00..445.00 rows=10000 width=244)
+               Disabled Nodes: 1
+</screen>
+   </para>
+
+   <para>
+    The number of <literal>Disabled nodes</literal> is equal to 3 in the nested-loop join node
+    because it is disabled itself and it contains two disabled sequential scan nodes. Each
+    sequential scan node is disabled and has a <literal>Disabled nodes</literal> count of one in the
+    query plan.
+   </para>
+
    <para>
     The costs are measured in arbitrary units determined by the planner's
     cost parameters (see <xref linkend="runtime-config-query-constants"/>).
-- 
2.34.1

#127Robert Haas
robertmhaas@gmail.com
In reply to: Alena Rybakina (#126)
Re: On disable_cost

On Thu, Oct 3, 2024 at 1:35 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

I think you are right, most users will perceive this parameter as the number of rejected paths, and not in any other way.

To be honest, I don't have much experience writing documentation, but I think we should add a little more information to doc/src/sgml/perform.sgml.

It contains a description about "explain queries", so the description of "Disabled nodes" is available there.

I prepared a patch that includes the information we can add.

One general thing to think about is that we really document very
little about EXPLAIN. That might not be good, but we should consider
whether it will look strange if we document a bunch of stuff about
this and still don't talk about anything else.

(This is not a comment on this specific patch, which I have not
examined. It's just a general thought.)

--
Robert Haas
EDB: http://www.enterprisedb.com

#128Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Robert Haas (#127)
Re: On disable_cost

On Thu, 2024-10-03 at 14:24 -0400, Robert Haas wrote:

On Thu, Oct 3, 2024 at 1:35 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

I prepared a patch that includes the information we can add.

One general thing to think about is that we really document very
little about EXPLAIN. That might not be good, but we should consider
whether it will look strange if we document a bunch of stuff about
this and still don't talk about anything else.

(This is not a comment on this specific patch, which I have not
examined. It's just a general thought.)

The "EXPLAIN Basics" already mention "enable_seqscan", so I think it is
alright to expand on that a bit.

Here is my take on a documentation patch (assuming David's "Disabled: true"
wording):

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b65245..db906841472 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,28 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
     discussed <link linkend="using-explain-analyze">below</link>.
    </para>
+   <para>
+    Some plan node types cannot be completely disabled.  For example, there is
+    no other access method than a sequential scan for a table with no index.
+    If you told the planner to disregard a certain node type, but it is forced
+    to use it nonetheless, you will see the plan node marked as
+    <quote>Disabled</quote> in the output of <command>EXPLAIN</command>:
+
+<screen>
+CREATE TABLE dummy (t text);
+
+SET enable_seqscan = off;
+
+EXPLAIN SELECT * FROM dummy;
+
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on dummy  (cost=0.00..23.60 rows=1360 width=32)
+   Disabled: true
+</screen>
+
+   </para>
+
    <para>
     <indexterm>
      <primary>subplan</primary>

Yours,
Laurenz Albe

#129David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#124)
Re: On disable_cost

On Fri, 4 Oct 2024 at 02:15, Robert Haas <robertmhaas@gmail.com> wrote:

robert.haas=# explain with recursive foo as (select count(*) from
pgbench_accounts union all select aid from pgbench_accounts a, foo
where aid > foo.count) select * from pgbench_accounts, foo where aid =
foo.count;

You might expect that the number of disabled nodes for the hash join
would include the number for the CTE attached to it, but it doesn't. I
suspect similar things will happen when a node has an InitPlan or
SubPlan node attached to it. (I have not tested whether your patch
gets these cases right.)

It looks fine with the patch. The crux of the new logic is just
summing up the disabled_nodes from the child nodes and checking if the
disabled_nodes of the current node is higher than that sum. That's not
exactly hard logic. The biggest risk seems to be not correctly
visiting all the child nodes. I got that wrong with SubqueryScan in my
first PoC.
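
The core check boils down to roughly this (a sketch of the idea in
plan_is_disabled(), not the exact code; sum_child_disabled_nodes() is just
shorthand for the per-node-type loops over the outer/inner sides or the
subplan lists):

    /* sum_child_disabled_nodes() stands in for the per-child loops */
    child_disabled_nodes = sum_child_disabled_nodes(plan);
    node_itself_is_disabled = plan->disabled_nodes > child_disabled_nodes;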

For sure. To be clear, it's not that I love the current output. It's
that I'm worried that it will be hard to get the thing that you and
Laurenz want to be fully reliable, and I think there's a chance that
not only might it contain bugs now, but it might turn out that people
changing logic in this area in the future introduce more bugs.
plan_is_disabled() has to get exactly the correct answer for the child
nodes every time, or the answer is wrong, and I'm not as confident as
you are that your logic is fully correct (which doesn't mean that I
can prove to you that it is incorrect; I don't even know that it is).

One thing the patch did cause me to find is the missing propagation of
disabled_nodes in make_sort(). It was very obviously wrong with the
patched EXPLAIN output and wasn't so obvious with the current output,
so perhaps you could look at this patch as a better way of ensuring
the disable_node propagation is correct. That's much harder logic to
get right than what I've added to explain.c as it's spread out in many
places.

Take this case, for example:

create table lp (a int) partition by list(a);
create table lp1 partition of lp for values in(1);
create table lp2 partition of lp for values in(2);
create index on lp1(a);
insert into lp select 1 from generate_Series(1,1000000);
analyze lp;
set enable_sort=0;
explain (costs off) select * from lp order by a;

master gives:

 Append
   Disabled Nodes: 1
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

which isn't correct. Append appears disabled, but it's not. Sort is.
Before I fixed that in the patch, I was incorrectly getting the
"Disabled: true" under the Append. I feel we're more likely to get bug
reports alerting us to incorrect logic when the disabled property only
appears on disabled nodes as there are far fewer of them to look at
and therefore it's more obvious when they're misplaced.

The patched version correctly gives us:

 Append
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Disabled: true
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

David

#130Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#129)
Re: On disable_cost

On Thu, Oct 3, 2024 at 5:52 PM David Rowley <dgrowleyml@gmail.com> wrote:

It looks fine with the patch. The crux of the new logic is just
summing up the disabled_nodes from the child nodes and checking if the
disabled_nodes of the current node is higher than that sum. That's not
exactly hard logic. The biggest risk seems to be not correctly
visiting all the child nodes. I got that wrong with SubqueryScan in my
first PoC.

Right, visiting too many or too few child nodes would be bad. The
other worry I have is about things that aren't fully Pathified --
maybe a certain node doesn't have a representation in the Path tree
but gets injected at Plan time. In that case there might be a risk of
the Disabled marker showing up in the wrong place. We do this with
sorts, for example, in the merge-join case.
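
For a node that's injected at plan time like that, the propagation has to be
done in createplan.c itself, along the lines of the make_sort() hunk in your
patch (just a sketch of the idea, not the exact code):

    /* carry up the child's count, plus one if this node type is disabled */
    plan->disabled_nodes = lefttree->disabled_nodes + (enable_sort ? 0 : 1);

Any place that builds such a node and forgets to do this would put the
Disabled marker on the wrong node.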

which isn't correct. Append appears disabled, but it's not. Sort is.
Before I fixed that in the patch, I was incorrectly getting the
"Disabled: true" under the Append. I feel we're more likely to get bug
reports alerting us to incorrect logic when the disabled property only
appears on disabled nodes as there are far fewer of them to look at
and therefore it's more obvious when they're misplaced.

It's certainly possible that you're correct, and the fact that you
have this example in hand makes it more likely. I tend to gravitate
toward displaying things exactly as they exist internally because I've
had so many bad experiences with having to try to reverse-engineer the
value stored internally from whatever is printed. This problem isn't
limited to EXPLAIN, but to give one EXPLAIN-related example, we take
the row count and divide by nloops and round off to an integer and I
cannot tell you how many times that has made my life more difficult
because I really want to see planstate->instrument->ntuples, not
round(planstate->instrument->ntuples/nloops); likewise, I absolutely
loathe the fact that we round off plan->plan_rows to an integer. I
guess we do that so that we "don't confuse people," but what it means
is that I can't see the information I need to help people fix
problems, and frankly what it means is that I myself am confused. I
tend to feel like if the problem is that a user does not understand
what we are printing, that problem can be fixed by the user learning
more until they understand; but if the problem is that we don't print
enough information to understand what is truly happening inside the
data structure, there is no way out from under that problem without
recompiling, which is not where you want to be when something goes
wrong in production. Of course that idea can be taken too far. If you
refuse to translate information into a more human-understandable form
even when you can do so reliably, then you're just making life hard
for users for no benefit, and you might be right that this is such a
case. I'm not here to act like I have all the right answers. I'm just
explaining the reasoning behind what I did; and I hope that it makes
some sense to you.
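
(To put a hypothetical number on the rounding complaint: an inner scan that
returned 14 tuples in total across 10 loops is shown as rows=1, because 14/10
is rounded to the nearest integer, and multiplying back gives 10 rather than
the 14 rows that actually happened.)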

--
Robert Haas
EDB: http://www.enterprisedb.com

#131David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#130)
Re: On disable_cost

On Sat, 5 Oct 2024 at 03:03, Robert Haas <robertmhaas@gmail.com> wrote:

I tend to gravitate
toward displaying things exactly as they exist internally because I've
had so many bad experiences with having to try to reverse-engineer the
value stored internally from whatever is printed.

Thanks for explaining your point of view. I've not shifted my opinion
any, so I guess we just disagree. I feel a strong enough dislike for
the current EXPLAIN output to feel it's worth working harder to have a
better output.

I won't push my point any further unless someone else appears
supporting Laurenz and I. Thank you for working on getting rid of the
disabled_cost. I think what we have is now much better than before.
The EXPLAIN output is the only part I dislike about this work.

I'd encourage anyone else on the sidelines who has an opinion on how
to display the disabled-ness of a plan node in EXPLAIN to speak up
now, even if it's just a +1 to something someone has already written.
It would be nice to see what more people think.

David

#132Shayon Mukherjee
shayonj@gmail.com
In reply to: David Rowley (#131)
Re: On disable_cost

On Sat, Oct 5, 2024 at 1:37 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Sat, 5 Oct 2024 at 03:03, Robert Haas <robertmhaas@gmail.com> wrote:

I tend to gravitate
toward displaying things exactly as they exist internally because I've
had so many bad experiences with having to try to reverse-engineer the
value stored internally from whatever is printed.

Thanks for explaining your point of view. I've not shifted my opinion
any, so I guess we just disagree. I feel a strong enough dislike for
the current EXPLAIN output to feel it's worth working harder to have a
better output.

I won't push my point any further unless someone else appears
supporting Laurenz and I. Thank you for working on getting rid of the
disabled_cost. I think what we have is now much better than before.
The EXPLAIN output is the only part I dislike about this work.

I'd encourage anyone else on the sidelines who has an opinion on how
to display the disabled-ness of a plan node in EXPLAIN to speak up
now, even if it's just a +1 to something someone has already written.
It would be nice to see what more people think.

David

Hello,

Just like Laurenz, I was initially confused about what "Disabled Nodes"
means. Although I am not a Postgres hacker/committer, here is my $0.02c
perspective as a newcomer if it's useful (:-D):

- I appreciate Robert's concerns regarding the EXPLAIN output, which shows
information closely tied to how it is stored and planned by the planner
code, leaving less room for surprises.
- However, the EXPLAIN from `master` currently still throws me off. For
instance, consider the output below where the outer loop shows two
instances of `Disabled Nodes: 3` and the inner loop shows two instances of
`Disabled Nodes: 1`.

```
SET enable_hashjoin = off;
SET enable_mergejoin = off;
SET enable_indexscan = off;
SET enable_bitmapscan = off;
SET enable_nestloop = off;
SET enable_seqscan = off;

EXPLAIN (ANALYZE, COSTS ON)
SELECT *
FROM pg_class c
JOIN pg_attribute a ON c.oid = a.attrelid
WHERE c.relkind = 'r'
AND a.attnum > 0
LIMIT 10;

                                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..7.40 rows=10 width=511) (actual time=0.038..0.043 rows=10 loops=1)
   Disabled Nodes: 3
   ->  Nested Loop  (cost=0.28..290.15 rows=407 width=511) (actual time=0.038..0.042 rows=10 loops=1)
         Disabled Nodes: 3
         ->  Seq Scan on pg_class c  (cost=0.00..19.19 rows=68 width=273) (actual time=0.004..0.004 rows=1 loops=1)
               Disabled Nodes: 1
               Filter: (relkind = 'r'::"char")
         ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.28..3.92 rows=6 width=238) (actual time=0.030..0.032 rows=10 loops=1)
               Disabled Nodes: 1
               Index Cond: ((attrelid = c.oid) AND (attnum > 0))
```

- In contrast, I think the PATCH's output is much clearer and makes reading
the plan more intuitive. It clearly indicates which nodes were disabled in
a nested output & plan, making it easier to count them if needed (using
grep, etc.).

 Limit  (cost=0.28..7.40 rows=10 width=511) (actual time=0.031..0.037 rows=10 loops=1)
   ->  Nested Loop  (cost=0.28..290.15 rows=407 width=511) (actual time=0.031..0.035 rows=10 loops=1)
         Disabled: true
         ->  Seq Scan on pg_class c  (cost=0.00..19.19 rows=68 width=273) (actual time=0.011..0.011 rows=1 loops=1)
               Disabled: true
               Filter: (relkind = 'r'::"char")
         ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.28..3.92 rows=6 width=238) (actual time=0.015..0.016 rows=10 loops=1)
               Disabled: true
               Index Cond: ((attrelid = c.oid) AND (attnum > 0))
Planning Time: 2.469 ms
Execution Time: 0.120 ms
(11 rows)

- Now, while the above output would make it easier for me as a
developer/user to understand the performance of my query and the decisions
the planner took, I do appreciate Robert's concerns about tracing issues back
to the Postgres code if there is something wrong with the planner or
disabled logic itself. That said, not being able to replicate this behavior
with this PATCH is perhaps a good sign.

All that said, I still prefer the boolean attribute and placement under the
node. However, I think `Disabled: true` can still be confusing. `Disabled
Node: true/false` is clearer and leaves less room for conflict with other
features that might be disabled by the planner in the future.

Thanks
Shayon

#133Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#131)
Re: On disable_cost

David Rowley <dgrowleyml@gmail.com> writes:

I'd encourage anyone else on the sidelines who has an opinion on how
to display the disabled-ness of a plan node in EXPLAIN to speak up
now, even if it's just a +1 to something someone has already written.
It would be nice to see what more people think.

FWIW, I do not like the current display one bit.

I think "Disabled: true" on only the nodes that are themselves disabled
would be a very substantial readability improvement.

regards, tom lane

#134Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: David Rowley (#129)
Re: On disable_cost

Hi!
On 04.10.2024 00:52, David Rowley wrote:

One thing the patch did cause me to find is the missing propagation of
disabled_nodes in make_sort(). It was very obviously wrong with the
patched EXPLAIN output and wasn't so obvious with the current output,
so perhaps you could look at this patch as a better way of ensuring
the disable_node propagation is correct. That's much harder logic to
get right than what I've added to explain.c as it's spread out in many
places.

Take this case, for example:

create table lp (a int) partition by list(a);
create table lp1 partition of lp for values in(1);
create table lp2 partition of lp for values in(2);
create index on lp1(a);
insert into lp select 1 from generate_Series(1,1000000);
analyze lp;
set enable_sort=0;
explain (costs off) select * from lp order by a;

master gives:

 Append
   Disabled Nodes: 1
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

which isn't correct. Append appears disabled, but it's not. Sort is.
Before I fixed that in the patch, I was incorrectly getting the
"Disabled: true" under the Append. I feel we're more likely to get bug
reports alerting us to incorrect logic when the disabled property only
appears on disabled nodes as there are far fewer of them to look at
and therefore it's more obvious when they're misplaced.

The patched version correctly gives us:

 Append
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Disabled: true
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

To be honest, I don't understand why we don't count disabled
nodes for the Append here. As I understand it, this is because the
partitioned table can also be scanned by an index. Besides
MergeAppend, in general it's difficult for me to generalize which
nodes this rule applies to; can you explain?

--
Regards,
Alena Rybakina
Postgres Professional

#135Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alena Rybakina (#134)
1 attachment(s)
Re: On disable_cost

BTW, getting off the question of EXPLAIN output for a moment,
I don't understand why disable_cost is still a thing. The
one remaining usage seems trivial to replace, as attached.

regards, tom lane

Attachments:

fully-remove-disable_cost.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index e1523d15df..a676ed2ef6 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -137,9 +137,6 @@ double		parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
 double		recursive_worktable_factor = DEFAULT_RECURSIVE_WORKTABLE_FACTOR;
 
 int			effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
-
-Cost		disable_cost = 1.0e10;
-
 int			max_parallel_workers_per_gather = 2;
 
 bool		enable_seqscan = true;
@@ -4355,15 +4352,15 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 
 	/*
 	 * If the bucket holding the inner MCV would exceed hash_mem, we don't
-	 * want to hash unless there is really no other alternative, so apply
-	 * disable_cost.  (The executor normally copes with excessive memory usage
+	 * want to hash unless there is really no other alternative, so mark path
+	 * as disabled.  (The executor normally copes with excessive memory usage
 	 * by splitting batches, but obviously it cannot separate equal values
 	 * that way, so it will be unable to drive the batch size below hash_mem
 	 * when this is true.)
 	 */
 	if (relation_byte_size(clamp_row_est(inner_path_rows * innermcvfreq),
 						   inner_path->pathtarget->width) > get_hash_memory_limit())
-		startup_cost += disable_cost;
+		path->jpath.path.disabled_nodes++;
 
 	/*
 	 * Compute cost of the hashquals and qpquals (other restriction clauses)
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 854a782944..ebd0e93f5e 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -47,7 +47,6 @@ typedef enum
  */
 
 /* parameter variables and flags (see also optimizer.h) */
-extern PGDLLIMPORT Cost disable_cost;
 extern PGDLLIMPORT int max_parallel_workers_per_gather;
 extern PGDLLIMPORT bool enable_seqscan;
 extern PGDLLIMPORT bool enable_indexscan;
#136David Rowley
dgrowleyml@gmail.com
In reply to: Alena Rybakina (#134)
Re: On disable_cost

On Sun, 6 Oct 2024 at 06:29, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

On 04.10.2024 00:52, David Rowley wrote:
 Append
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Disabled: true
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

To be honest, I don’t understand at all why we don’t count disabled nodes for append here? As I understand it, this is due to the fact that the partitioned table can also be scanned by an index. Besides mergeappend, in general it’s difficult for me to generalize for which nodes this rule applies, can you explain here?

There are no special rules here about what to display based on the node
type. Maybe you think there are some special rules because of the
special cases for Append and MergeAppend in the patch? Those are
handled specially as they don't use the Plan's lefttree and righttree
fields.

Are you saying that the "Disabled: true" should propagate to the root
of the plan tree? The fact that master does that is what Laurenz and
I are complaining about. I'm not sure I follow what you're asking.

David

#137David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#135)
Re: On disable_cost

On Sun, 6 Oct 2024 at 08:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, getting off the question of EXPLAIN output for a moment,
I don't understand why disable_cost is still a thing. The
one remaining usage seems trivial to replace, as attached.

I didn't notice that any of these remained. I agree we should get rid
of it. The patch looks fine to me.

David

#138Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: David Rowley (#136)
Re: On disable_cost

On 06.10.2024 02:22, David Rowley wrote:

To be honest, I don’t understand at all why we don’t count disabled nodes for append here? As I understand it, this is due to the fact that the partitioned table can also be scanned by an index. Besides mergeappend, in general it’s difficult for me to generalize for which nodes this rule applies, can you explain here?

There are no special rules here of what to display based on the node
type. Maybe you think there are some special rules because of the
special cases for Append and MergeAppend in the patch? Those are
handled specially as they don't use the Plan's lefttree and righttree
fields.

To be honest, I didn't initially understand why we don't display
information about disabled nodes for Append and MergeAppend, which is why I
asked about other cases.  Thank you for your explanation, it was helpful!

I also checked the code to see what fields these nodes (Append and
MergeAppend) have and how they are processed.
To sum up, they only gather the output of their child nodes. I agree that
they do not need an additional disabled-node display of their own.

Are you saying that the "Disabled: true" should propagate to the root
of the plan tree? That fact that master does that is what Laurenz and
I are complaining about. I'm not sure if I follow what you're asking.

I agree that it's better to display such information for a specific
disabled node. It's clearer what's going on and what it means.

--
Regards,
Alena Rybakina
Postgres Professional

#139Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Laurenz Albe (#128)
Re: On disable_cost

On 03.10.2024 23:10, Laurenz Albe wrote:

On Thu, 2024-10-03 at 14:24 -0400, Robert Haas wrote:

On Thu, Oct 3, 2024 at 1:35 PM Alena Rybakina<a.rybakina@postgrespro.ru> wrote:

I prepared a patch that includes the information we can add.

One general thing to think about is that we really document very
little about EXPLAIN. That might not be good, but we should consider
whether it will look strange if we document a bunch of stuff about
this and still don't talk about anything else.

(This is not a comment on this specific patch, which I have not
examined. It's just a general thought.)

I think we should still add it because it might cause a lot of
misunderstanding.

The "EXPLAIN Basics" already mention "enable_seqscan", so I think it is
alright to expand on that a bit.

Here is my take on a documentation patch (assuming David's "Disabled: true"
wording):

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b65245..db906841472 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,28 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
discussed <link linkend="using-explain-analyze">below</link>.
</para>
+   <para>
+    Some plan node types cannot be completely disabled.  For example, there is
+    no other access method than a sequential scan for a table with no index.
+    If you told the planner to disregard a certain node type, but it is forced
+    to use it nonetheless, you will see the plan node marked as
+    <quote>Disabled</quote> in the output of <command>EXPLAIN</command>:
+
+<screen>
+CREATE TABLE dummy (t text);
+
+SET enable_seqscan = off;
+
+EXPLAIN SELECT * FROM dummy;
+
+                        QUERY PLAN
+----------------------------------------------------------
+ Seq Scan on dummy  (cost=0.00..23.60 rows=1360 width=32)
+   Disabled: true
+</screen>
+
+   </para>
+
<para>
<indexterm>
<primary>subplan</primary>

Sorry for the late reply, I needed time to look into this feature to
respond to your email.
I think this is not entirely correct. I tested the last version of the patch
[0]: I created a table and disabled sequential scanning, so there were no
other options for the optimizer to scan table t1. It still displayed that
it has disabled nodes. However, you are right that this marker will not
appear for nodes that only gather data from their children, such as Append,
MergeAppend, Gather, etc., and I agree with you that we should add
information about that. I also think it's worth adding that the marker does
not appear for the postgres_fdw extension.

--
Regards,
Alena Rybakina
Postgres Professional

#140Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#135)
Re: On disable_cost

On Sat, Oct 5, 2024 at 3:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, getting off the question of EXPLAIN output for a moment,
I don't understand why disable_cost is still a thing. The
one remaining usage seems trivial to replace, as attached.

I believe I commented on that somewhere upthread, but maybe I meant to
and didn't or maybe you didn't see it in the flurry of emails.
Basically, I wasn't confident that it made sense to treat this as the
same kind of thing as other cases where we increment disabled_nodes.
Because I couldn't make up my mind what to do and didn't get clear
feedback from anybody else, I did nothing.

The thing is, if somebody says enable_mergejoin=false, presumably they
REALLY, REALLY don't want a merge join. If we start using that same
mechanism for other purposes -- like making sure that a hash join
doesn't overrun work_mem -- then the user might get a merge join
anyway, because we've represented a hash join that is big, but not
disabled, in the same way that we represent a merge join that is
actually disabled. I'm pretty uncomfortable with that. Sure, the user
probably doesn't want us to overrun work_mem either, but when push
comes to shove, shouldn't a very explicit user instruction like "don't
use a merge join, I don't want that!" take precedence over any sort of
planner estimate? Estimates can be wrong, and the user is in charge.
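
For anyone who wants to poke at that tension, here is a rough sketch of the
kind of setup that should trip the MCV-vs-hash_mem check (the table and
sizes are made up, and whether the planner actually ends up preferring the
forbidden merge join over the over-memory hash join depends on the cost
estimates it produces):

CREATE TABLE skewed AS
  SELECT 1 AS k, repeat('x', 100) AS pad FROM generate_series(1, 200000);
ANALYZE skewed;

SET work_mem = '64kB';         -- shrink hash_mem so the k = 1 bucket can't fit
SET enable_mergejoin = off;    -- the user explicitly forbids merge joins
SET enable_nestloop = off;     -- keep the contest between hash and merge join

EXPLAIN SELECT count(*) FROM skewed a JOIN skewed b ON a.k = b.k;

On master the over-memory hash join keeps a zero disabled-node count and
only carries the inflated cost, so it still beats the user-disabled merge
join; with the patch both paths count one disabled node and the plain cost
estimates decide which of the two the user ends up with.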

--
Robert Haas
EDB: http://www.enterprisedb.com

#141David G. Johnston
david.g.johnston@gmail.com
In reply to: David Rowley (#131)
Re: On disable_cost

On Fri, Oct 4, 2024 at 10:37 PM David Rowley <dgrowleyml@gmail.com> wrote:

I'd encourage anyone else on the sidelines who has an opinion on how
to display the disabled-ness of a plan node in EXPLAIN to speak up
now, even if it's just a +1 to something someone has already written.
It would be nice to see what more people think.

As a DBA, when I set one or more of the enable_* settings to false and
explain a query, I need to know:

1. that the plan shown to me is constrained,
2. which constraints are in place, and
3. which constraints were violated.

The Settings option to Explain fulfills my second need. It is not a
precise solution nor is it automatic. Because of these two things it
really doesn't qualify as fulfilling the first need.

To fulfill the first need I would want to see a data block containing the
following information:
How many (>= 1) enable_* settings are set to false. This is the bare
requirement, but we can also include a count of how many violations exist,
thus aggregating the count of the third need. This information is not
specific to any node and thus should be displayed outside of the execution
tree, the specific choice consistent with the output format under
consideration.

The detail for the third need, violations, is tied to specific executor
nodes. The information provided here should tell me which specific setting
was violated and, if possible, why. This is effectively three pieces of
information: "Disabled: * (footnote)". The word "Disabled" is the indicator
that this node type was requested to not be included in the query plan. The
* tells me exactly which of the disabled settings is at play here, reducing
the cognitive burden of memorizing which node types map to which settings.
The footnote would be a reference into the documentation under the enable_*
setting that explains why this node is appearing in the query plan even
though I explicitly asked for it to be excluded. In a verbose output (add a
new "violations" option for this) it would even be possible to save the
trip to the documentation by adding the footnote text to the explain
output.

Now, existing proposals include another piece of data - for every node
calculate how many violations occur in its tree (inclusive). I'm not
understanding the motivation for this data. Knowing which nodes are
violations seems like it is enough. I could always count, and processing
tools could add this aggregate to their displays, but the count itself only
seems useful at the scope of the entire query plan. And possibly sub-plans.
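
For what it's worth, a rough sketch of that kind of post-processing (a
hypothetical helper function; it assumes the per-node boolean is always
emitted in the non-text formats, as in the proposed patch) could walk the
JSON plan and compute the plan-wide aggregate itself:

CREATE FUNCTION explain_jsonb(query text) RETURNS jsonb
LANGUAGE plpgsql AS
$$
DECLARE
    plan_text text;
BEGIN
    -- EXPLAIN in a non-text format returns the whole document as one row
    EXECUTE 'EXPLAIN (COSTS OFF, FORMAT JSON) ' || query INTO plan_text;
    RETURN plan_text::jsonb;
END;
$$;

SET enable_sort = off;

-- Recurse through the nested "Plans" arrays and count nodes marked disabled
WITH RECURSIVE node AS (
    SELECT explain_jsonb('SELECT * FROM pg_class ORDER BY relpages') -> 0 -> 'Plan' AS n
  UNION ALL
    SELECT child
    FROM node, jsonb_array_elements(n -> 'Plans') AS child
)
SELECT count(*) FILTER (WHERE (n ->> 'Disabled')::boolean) AS disabled_nodes
FROM node;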

So, by way of example:

set enable_sort=0;
explain (costs off, settings, violations) select * from lp order by a;

 Append
   ->  Index Only Scan using lp1_a_idx on lp1 lp_1
   ->  Sort
         Disabled: Sort (A)
         Sort Key: lp_2.a
         ->  Seq Scan on lp2 lp_2

Disabled Planner Settings: 1
Disabled Node Violations: 1
Settings:
...
enable_sort = false
....
Violation Reasons:
Sort (A): The query contains an order by clause over data coming from a
table being sequentially scanned. This scan's output must be sorted to
fulfill the order by requirement.

I was considering doing a layout like:

Sort (disabled_sort.A) (cost...) (actual...)

but having its own line on the nodes that have a violation seems
reasonable. It should be noticeable when violations occur, and this format
does stand out. The less pronounced format would be more appropriate for
the "Disabled: #" display that would appear on every single node, which I
believe is counter-productive. Only marking the violation provides the
same amount of detail and allows for the computation of those counts should
the need arise. As a DBA, though, I do not know how to use that count in a
meaningful way.

In text format we place additional information at the bottom of the query
result. It is worth considering whether to place information before the
plan tree instead. If that is acceptable, the two "Disabled Planner %:"
counts should be moved to before the node tree. This immediately tells the
explain output consumer that this plan is constrained and that other
options, like settings and violations, should be added to the explain
command to show additional details. But the two counts and the node detail
"Disabled: * (footnote)" will always be visible.

The footnote definitely is its own feature added to increase usability.
I'm expecting it to not be accepted given the current design of explain,
and also it seems quite difficult to get good data out of the planner to
make the display accurate. But if we tell someone that a setting they
disable is violated they are going to wonder why.

David J.

#142Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#131)
Re: On disable_cost

On Sat, Oct 5, 2024 at 1:37 AM David Rowley <dgrowleyml@gmail.com> wrote:

Thanks for explaining your point of view. I've not shifted my opinion
any, so I guess we just disagree. I feel a strong enough dislike for
the current EXPLAIN output to feel it's worth working harder to have a
better output.

I won't push my point any further unless someone else appears
supporting Laurenz and I. Thank you for working on getting rid of the
disabled_cost. I think what we have is now much better than before.
The EXPLAIN output is the only part I dislike about this work.

I'd encourage anyone else on the sidelines who has an opinion on how
to display the disabled-ness of a plan node in EXPLAIN to speak up
now, even if it's just a +1 to something someone has already written.
It would be nice to see what more people think.

I think you have adequate consensus to proceed with this. I'd just ask
that you don't disappear completely if it turns out that there are
problems. I accept that my commit created this problem and I'm
certainly willing to be involved too if we need to sort out more
things.

--
Robert Haas
EDB: http://www.enterprisedb.com

#143Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Robert Haas (#127)
Re: On disable_cost

On 2024-Oct-03, Robert Haas wrote:

One general thing to think about is that we really document very
little about EXPLAIN. That might not be good, but we should consider
whether it will look strange if we document a bunch of stuff about
this and still don't talk about anything else.

I completely agree that we document very little about EXPLAIN. However,
I disagree that we should continue to do so. I'd rather take the
opportunity to _add_ more details that we currently omit, and make the
documentation more complete. A short blurb about Disabled Nodes such as
the one Laurenz proposed seems an excellent way to start; we can add
more later, as people propose them. We don't have to stop here, and we
don't have to stay at the status quo regarding other points.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")

#144Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#143)
Re: On disable_cost

On Mon, Oct 7, 2024 at 11:28 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Oct-03, Robert Haas wrote:

One general thing to think about is that we really document very
little about EXPLAIN. That might not be good, but we should consider
whether it will look strange if we document a bunch of stuff about
this and still don't talk about anything else.

I completely agree that we document very little about EXPLAIN. However,
I disagree that we should continue to do so. I'd rather take the
opportunity to _add_ more details that we currently omit, and make the
documentation more complete. A short blurb about Disabled Nodes such as
the one Laurenz proposed seems an excellent way to start; we can add
more later, as people propose them. We don't have to stop here, and we
don't have to stay at statu quo re. other points.

Sure, that all makes sense. I was just raising it as a point to consider.

--
Robert Haas
EDB: http://www.enterprisedb.com

#145Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Robert Haas (#142)
Re: On disable_cost

On Mon, 2024-10-07 at 11:14 -0400, Robert Haas wrote:

I accept that my commit created this problem and I'm
certainly willing to be involved too if we need to sort out more
things.

Thank you. I think it is great that disabled nodes are now handled
better, so I appreciate the change as such. But I had to focus on
the one fly in the ointment; you know how it is...

Yours,
Laurenz Albe

#146Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alena Rybakina (#139)
Re: On disable_cost

On Mon, 2024-10-07 at 10:17 +0300, Alena Rybakina wrote:

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b65245..db906841472 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,28 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
     discussed <link linkend="using-explain-analyze">below</link>.
    </para>
+   <para>
+    Some plan node types cannot be completely disabled.  For example, there is
+    no other access method than a sequential scan for a table with no index.
+    If you told the planner to disregard a certain node type, but it is forced
+    to use it nonetheless, you will see the plan node marked as
+    <quote>Disabled</quote> in the output of <command>EXPLAIN</command>:
+
+<screen>
+CREATE TABLE dummy (t text);
+
+SET enable_seqscan = off;
+
+EXPLAIN SELECT * FROM dummy;
+
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Seq Scan on dummy  (cost=0.00..23.60 rows=1360 width=32)
+   Disabled: true
+</screen>
+
+   </para>
+
    <para>
     <indexterm>
      <primary>subplan</primary>

I think this is not entirely correct. I tested last version of the
patch [0]: I created a table and disabled sequential scanning, so
there were no other options for optimizer to scan table t1. it still
displayed that it has disabled nodes.

Isn't that exactly what my doc patch shows?

However you are right that this display will not appear for all
nodes that only contain a data collection procedure, such as Append,
MergeAppend, Gather, etc. And I agree with you that we should
information about it. I also think it’s worth adding additional
information that this option does not appear in the postgres_fdw
extension.

I cannot quite follow that either...

Yours,
Laurenz Albe

#147Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#140)
Re: On disable_cost

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Oct 5, 2024 at 3:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

BTW, getting off the question of EXPLAIN output for a moment,
I don't understand why disable_cost is still a thing. The
one remaining usage seems trivial to replace, as attached.

I believe I commented on that somewhere upthread, but maybe I meant to
and didn't or maybe you didn't see it in the flurry of emails.
Basically, I wasn't confident that it made sense to treat this as the
same kind of thing as other cases where we increment disabled_nodes.

I don't buy your argument that this case is so special that it
warrants preserving disable_cost. I certainly didn't think it
was special when I added it.

There may be another way to do this that doesn't rely on disabling
the path in the same way as the user-accessible knobs do, but
I don't really believe it's worth the trouble to think of one.
And I definitely don't want to keep disable_cost around even for
just one usage, because then we've not fixed the user-experience
aspect of this (that is, "why does this plan have a ridiculously high
cost?"), nor have we fixed all the concerns you had about higher-level
planning decisions being skewed by that cost.

One other point here is that if disable_cost remains exposed as a
global variable (as it is in HEAD), there is no reason to expect
that any extensions that are using it will get on board with the
new approach.

regards, tom lane

#148Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Laurenz Albe (#146)
Re: On disable_cost

On 07.10.2024 19:02, Laurenz Albe wrote:

On Mon, 2024-10-07 at 10:17 +0300, Alena Rybakina wrote:

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b65245..db906841472 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,28 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
     discussed <link linkend="using-explain-analyze">below</link>.
    </para>
+   <para>
+    Some plan node types cannot be completely disabled.  For example, there is
+    no other access method than a sequential scan for a table with no index.
+    If you told the planner to disregard a certain node type, but it is forced
+    to use it nonetheless, you will see the plan node marked as
+    <quote>Disabled</quote> in the output of <command>EXPLAIN</command>:
+
+<screen>
+CREATE TABLE dummy (t text);
+
+SET enable_seqscan = off;
+
+EXPLAIN SELECT * FROM dummy;
+
+                        QUERY PLAN
+----------------------------------------------------------
+ Seq Scan on dummy  (cost=0.00..23.60 rows=1360 width=32)
+   Disabled: true
+</screen>
+
+   </para>
+
    <para>
     <indexterm>
      <primary>subplan</primary>

I think this is not entirely correct. I tested last version of the
patch [0]: I created a table and disabled sequential scanning, so
there were no other options for optimizer to scan table t1. it still
displayed that it has disabled nodes.

Isn't that exactly what my doc patch shows?

Sorry, you are right and this is correct. I think I misunderstood at
first because I was tired.

However you are right that this display will not appear for all
nodes that only contain a data collection procedure, such as Append,
MergeAppend, Gather, etc. And I agree with you that we should
information about it. I also think it’s worth adding additional
information that this option does not appear in the postgres_fdw
extension.

I cannot quite follow that either...

I meant this [0].

The disabled description won't be displayed if MergeAppend and Append
nodes are used in the query plan. I tried to generalize it, but without
success. I'm not sure that these nodes can be called accumulating data.
But I tried to describe this case in the documentation.

For the postgres_fdw extension, disabled nodes won't show either [1]. I
think we should add information about it too.

[0]: /messages/by-id/CAApHDvpMyKJpLGWRmR3+3g4DxrSf6iRpwTRCXMorU0HvgWbocw@mail.gmail.com

[1]: /messages/by-id/CA+TgmoZRwy8202vxbUPBeZd_Tx5NYVtmpvBnJnOzZS3b81cpkg@mail.gmail.com

--
Regards,
Alena Rybakina
Postgres Professional

#149Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alena Rybakina (#148)
Re: On disable_cost

On Tue, 2024-10-08 at 18:12 +0300, Alena Rybakina wrote:

However you are right that this display will not appear for all
nodes that only contain a data collection procedure, such as Append,
MergeAppend, Gather, etc. And I agree with you that we should
information about it. I also think it’s worth adding additional
information that this option does not appear in the postgres_fdw
extension.

I cannot quite follow that either...

I meant this [0].
[0] /messages/by-id/CAApHDvpMyKJpLGWRmR3+3g4DxrSf6iRpwTRCXMorU0HvgWbocw@mail.gmail.com

Th disabled description won't display if the MergeAppend and Append
nodes are used in the query plan. I tried to generalize it, but without
success. I'm not sure that these nodes can be called accumulating data.
But I tried to describe this case in the documentation.

You mean you rediscovered the bug that David's patch fixes?

About postgres_fdw extension disabled nodes won't show [1]. I think we
should add information about it too.

[1] /messages/by-id/CA+TgmoZRwy8202vxbUPBeZd_Tx5NYVtmpvBnJnOzZS3b81cpkg@mail.gmail.com

You cannot disable a foreign scan...

Or do you want to see "disabled" if the remote query uses a disabled node?
I think that would be out of scope...

Yours,
Laurenz Albe

#150Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#147)
Re: On disable_cost

On Mon, Oct 7, 2024 at 6:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't buy your argument that this case is so special that it
warrants preserving disable_cost. I certainly didn't think it
was special when I added it.

That's fair. I'm telling you what I think, not what you have to think. :-)

There may be another way to do this that doesn't rely on disabling
the path in the same way as the user-accessible knobs do, but
I don't really believe it's worth the trouble to think of one.
And I definitely don't want to keep disable_cost around even for
just one usage, because then we've not fixed the user-experience
aspect of this (that is, "why does this plan have a ridiculously high
cost?"), nor have we fixed all the concerns you had about higher-level
planning decisions being skewed by that cost.

We don't represent the memory usage of a plan in general, so the
uses-too-much-memory case has to be represented in some other way,
either incrementing disabled nodes or adding something to the cost.
Neither is wholly accurate, in my view. We either conflate too much
memory usage with "runs for a long time" or with "the user said not to
do it".

What I'm actually worried about here is that getting rid of this last
use of disable_cost will interact badly with the work I've proposed
over on the "allowing extensions to control planner behavior" thread.
While the exact details of what we do there have yet to be finalized,
the concept definitely revolves around extensions having a way to
disable certain paths. If an extension gets to control the disabled
nodes knob and the planner gets to control the cost knob, the
extension can be certain of being able to choose the outcome it wants
out of those that are possible. If the core planner also fiddles with
the disabled nodes knob, that is no longer certain. I don't want
extensions to have to invent some weird hack to work around that case,
and I'm sure neither of us wants to have a third thing in core that is
an even-higher-order component of the cost than disabled_nodes.

One other point here is that if disable_cost remains exposed as a
global variable (as it is in HEAD), there is no reason to expect
that any extensions that are using it will get on board with the
new approach.

Yes, I think if we keep this one use of disable_cost we should rename
it to something like too_much_memory_cost or whatever.

--
Robert Haas
EDB: http://www.enterprisedb.com

#151Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Laurenz Albe (#149)
Re: On disable_cost

On 08.10.2024 18:49, Laurenz Albe wrote:

On Tue, 2024-10-08 at 18:12 +0300, Alena Rybakina wrote:

However you are right that this display will not appear for all
nodes that only contain a data collection procedure, such as Append,
MergeAppend, Gather, etc. And I agree with you that we should
information about it. I also think it’s worth adding additional
information that this option does not appear in the postgres_fdw
extension.

I cannot quite follow that either...

I meant this [0].
[0] /messages/by-id/CAApHDvpMyKJpLGWRmR3+3g4DxrSf6iRpwTRCXMorU0HvgWbocw@mail.gmail.com

Th disabled description won't display if the MergeAppend and Append
nodes are used in the query plan. I tried to generalize it, but without
success. I'm not sure that these nodes can be called accumulating data.
But I tried to describe this case in the documentation.

You mean you rediscovered the bug that David's patch fixes?

Sorry, it works fine. I apparently tested the wrong version of the
patch. Sorry for the noise.

About postgres_fdw extension disabled nodes won't show [1]. I think we
should add information about it too.

[1] /messages/by-id/CA+TgmoZRwy8202vxbUPBeZd_Tx5NYVtmpvBnJnOzZS3b81cpkg@mail.gmail.com

You cannot disable a foreign scan...

Or do you want to see "disabled" if the remote query uses a disabled node?
I think that would be out of scope...

Yes, you are right.

--
Regards,
Alena Rybakina
Postgres Professional

#152David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#142)
1 attachment(s)
Re: On disable_cost

On Tue, 8 Oct 2024 at 04:14, Robert Haas <robertmhaas@gmail.com> wrote:

I think you have adequate consensus to proceed with this. I'd just ask
that you don't disappear completely if it turns out that there are
problems. I accept that my commit created this problem and I'm
certainly willing to be involved too if we need to sort out more
things.

Thanks. I've attached a polished-up version of the earlier patch.

I spent quite a bit of time testing it by manually adjusting
disabled_nodes while attached with my debugger. Doing it that way was
easier as it's often hard and maybe sometimes not possible to get the
disabled node you want in a plan.

I'll loop back to the documentation part and Laurenz's patch after
this part is committed.

If anyone wants to take a look at the attached, please do so.
Otherwise, I'm pretty happy with it and will likely push it on New
Zealand Friday (aka later today).

David

Attachments:

v1-0001-Adjust-EXPLAIN-s-output-for-disabled-nodes.patch (application/octet-stream)
From 843914ed82c2942c93d480cc8063b076f51470f0 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Fri, 27 Sep 2024 23:52:13 +1200
Subject: [PATCH v1] Adjust EXPLAIN's output for disabled nodes

c01743aa4 added EXPLAIN output to display the plan node's disabled_node
count whenever that count is above 0.  Seemingly, there weren't many
people who liked that output as each parent of a disabled node would
also have a "Disabled Nodes" output due to the way disabled_nodes is
accumulated towards the root plan node.  It was often hard and sometimes
impossible to figure out which nodes were disabled from looking at
EXPLAIN.  You might think it would be possible to manually add up the
numbers from the "Disabled Nodes" output of a given node's children to
figure out if that node has a higher disabled_nodes count, but that
wouldn't have worked for Append and Merge Append nodes if some of the
children were run-time pruned during init plan. Those are not
displayed in EXPLAIN.

Here we attempt to improve this output by showing "Disabled: true"
against only the nodes which are explicitly disabled themselves, which
seems to be the most desired output.  This is achieved by simply
checking if the given node has a higher disabled_nodes count than the
sum of its child nodes.

This also fixes a bug in make_sort() which was neglecting to set the
Sort's disabled_nodes field.  With the new output, the choice to not
maintain that field properly was clearly wrong as the disabled-ness of
the node was attributed to the parent instead.

Discussion: https://postgr.es/m/9e4ad616bebb103ec2084bf6f724cfc739e7fabb.camel@cybertec.at
---
 src/backend/commands/explain.c                | 98 ++++++++++++++++++-
 src/backend/optimizer/plan/createplan.c       |  1 +
 src/test/regress/expected/aggregates.out      | 27 ++---
 src/test/regress/expected/btree_index.out     |  5 +-
 src/test/regress/expected/explain.out         | 11 ++-
 .../regress/expected/incremental_sort.out     |  8 +-
 src/test/regress/expected/inherit.out         |  5 +-
 src/test/regress/expected/insert_conflict.out |  4 +-
 src/test/regress/expected/join.out            |  5 +-
 src/test/regress/expected/memoize.out         | 10 +-
 src/test/regress/expected/select_parallel.out |  7 +-
 .../regress/expected/sqljson_jsontable.out    |  1 +
 src/test/regress/expected/union.out           |  2 +-
 src/test/regress/expected/xml.out             |  3 +
 src/test/regress/expected/xml_1.out           |  3 +
 src/test/regress/expected/xml_2.out           |  3 +
 16 files changed, 145 insertions(+), 48 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ee1bcb84e2..18a5af6b91 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1363,6 +1363,96 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
 	return planstate_tree_walker(planstate, ExplainPreScanNode, rels_used);
 }
 
+/*
+ * plan_is_disabled
+ *		Checks if the given plan node type was disabled during query planning.
+ *		This is evident by the disable_node field being higher than the sum of
+ *		the disabled_node field from the plan's children.
+ */
+static bool
+plan_is_disabled(Plan *plan)
+{
+	int			child_disabled_nodes;
+
+	/* The node is certainly not disabled if this is zero */
+	if (plan->disabled_nodes == 0)
+		return false;
+
+	child_disabled_nodes = 0;
+
+	/*
+	 * Handle special nodes first.  Children of BitmapOrs and BitmapAnds can't
+	 * be disabled, so no need to handle those specifically.
+	 */
+	if (IsA(plan, Append))
+	{
+		ListCell   *lc;
+		Append	   *aplan = (Append *) plan;
+
+		/*
+		 * Sum the Append childrens' disabled_nodes.  This purposefully
+		 * includes any run-time pruned children.  Ignoring those could give
+		 * us the incorrect number of disabled nodes.
+		 */
+		foreach(lc, aplan->appendplans)
+		{
+			Plan	   *subplan = lfirst(lc);
+
+			child_disabled_nodes += subplan->disabled_nodes;
+		}
+	}
+	else if (IsA(plan, MergeAppend))
+	{
+		ListCell   *lc;
+		MergeAppend *maplan = (MergeAppend *) plan;
+
+		/*
+		 * Sum the MergeAppend childrens' disabled_nodes.  This purposefully
+		 * includes any run-time pruned children.  Ignoring those could give
+		 * us the incorrect number of disabled nodes.
+		 */
+		foreach(lc, maplan->mergeplans)
+		{
+			Plan	   *subplan = lfirst(lc);
+
+			child_disabled_nodes += subplan->disabled_nodes;
+		}
+	}
+	else if (IsA(plan, SubqueryScan))
+		child_disabled_nodes += ((SubqueryScan *) plan)->subplan->disabled_nodes;
+	else if (IsA(plan, CustomScan))
+	{
+		ListCell   *lc;
+		CustomScan *cplan = (CustomScan *) plan;
+
+		foreach(lc, cplan->custom_plans)
+		{
+			Plan	   *subplan = lfirst(lc);
+
+			child_disabled_nodes += subplan->disabled_nodes;
+		}
+	}
+	else
+	{
+		/*
+		 * Else, sum up disabled_nodes from the plan's inner and outer side.
+		 */
+		if (outerPlan(plan))
+			child_disabled_nodes += outerPlan(plan)->disabled_nodes;
+		if (innerPlan(plan))
+			child_disabled_nodes += innerPlan(plan)->disabled_nodes;
+	}
+
+	/*
+	 * It's disabled if the plan's disable_nodes is higher than the sum of its
+	 * child's plan disabled_nodes.
+	 */
+	if (plan->disabled_nodes > child_disabled_nodes)
+		return true;
+
+	return false;
+}
+
 /*
  * ExplainNode -
  *	  Appends a description of a plan tree to es->str
@@ -1399,6 +1489,7 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	ExplainWorkersState *save_workers_state = es->workers_state;
 	int			save_indent = es->indent;
 	bool		haschildren;
+	bool		isdisabled;
 
 	/*
 	 * Prepare per-worker output buffers, if needed.  We'll append the data in
@@ -1914,9 +2005,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
-	if (plan->disabled_nodes != 0)
-		ExplainPropertyInteger("Disabled Nodes", NULL, plan->disabled_nodes,
-							   es);
+
+	isdisabled = plan_is_disabled(plan);
+	if (es->format != EXPLAIN_FORMAT_TEXT || isdisabled)
+		ExplainPropertyBool("Disabled", isdisabled, es);
 
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 0d195a07ff..c13586c537 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -6149,6 +6149,7 @@ make_sort(Plan *lefttree, int numCols,
 
 	plan = &node->plan;
 	plan->targetlist = lefttree->targetlist;
+	plan->disabled_nodes = lefttree->disabled_nodes + (enable_sort == false);
 	plan->qual = NIL;
 	plan->lefttree = lefttree;
 	plan->righttree = NULL;
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 495deb606e..1e682565d1 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2944,23 +2944,20 @@ GROUP BY c1.w, c1.z;
                      QUERY PLAN                      
 -----------------------------------------------------
  GroupAggregate
-   Disabled Nodes: 2
    Group Key: c1.w, c1.z
    ->  Sort
-         Disabled Nodes: 2
          Sort Key: c1.w, c1.z, c1.x, c1.y
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(17 rows)
+                           Disabled: true
+(14 rows)
 
 SELECT avg(c1.f ORDER BY c1.x, c1.y)
 FROM group_agg_pk c1 JOIN group_agg_pk c2 ON c1.x = c2.x
@@ -2982,24 +2979,21 @@ GROUP BY c1.y,c1.x,c2.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
-   Disabled Nodes: 2
    Group Key: c1.x, c1.y
    ->  Incremental Sort
-         Disabled Nodes: 2
          Sort Key: c1.x, c1.y
          Presorted Key: c1.x
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(18 rows)
+                           Disabled: true
+(15 rows)
 
 EXPLAIN (COSTS OFF)
 SELECT c1.y,c1.x FROM group_agg_pk c1
@@ -3009,24 +3003,21 @@ GROUP BY c1.y,c2.x,c1.x;
                      QUERY PLAN                      
 -----------------------------------------------------
  Group
-   Disabled Nodes: 2
    Group Key: c2.x, c1.y
    ->  Incremental Sort
-         Disabled Nodes: 2
          Sort Key: c2.x, c1.y
          Presorted Key: c2.x
          ->  Merge Join
-               Disabled Nodes: 2
                Merge Cond: (c1.x = c2.x)
                ->  Sort
                      Sort Key: c1.x
                      ->  Seq Scan on group_agg_pk c1
-                           Disabled Nodes: 1
+                           Disabled: true
                ->  Sort
                      Sort Key: c2.x
                      ->  Seq Scan on group_agg_pk c2
-                           Disabled Nodes: 1
-(18 rows)
+                           Disabled: true
+(15 rows)
 
 RESET enable_nestloop;
 RESET enable_hashjoin;
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index d3f4c7e08c..def78ef858 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -335,12 +335,11 @@ select proname from pg_proc where proname ilike 'ri%foo' order by 1;
                   QUERY PLAN                  
 ----------------------------------------------
  Sort
-   Disabled Nodes: 1
    Sort Key: proname
    ->  Seq Scan on pg_proc
-         Disabled Nodes: 1
+         Disabled: true
          Filter: (proname ~~* 'ri%foo'::text)
-(6 rows)
+(5 rows)
 
 reset enable_seqscan;
 reset enable_indexscan;
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index d01c304c24..dcbdaa0388 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -104,6 +104,7 @@ select explain_filter('explain (analyze, buffers, format xml) select * from int8
        <Actual-Total-Time>N.N</Actual-Total-Time>      +
        <Actual-Rows>N</Actual-Rows>                    +
        <Actual-Loops>N</Actual-Loops>                  +
+       <Disabled>false</Disabled>                      +
        <Shared-Hit-Blocks>N</Shared-Hit-Blocks>        +
        <Shared-Read-Blocks>N</Shared-Read-Blocks>      +
        <Shared-Dirtied-Blocks>N</Shared-Dirtied-Blocks>+
@@ -152,6 +153,7 @@ select explain_filter('explain (analyze, serialize, buffers, format yaml) select
      Actual Total Time: N.N   +
      Actual Rows: N           +
      Actual Loops: N          +
+     Disabled: false          +
      Shared Hit Blocks: N     +
      Shared Read Blocks: N    +
      Shared Dirtied Blocks: N +
@@ -213,6 +215,7 @@ select explain_filter('explain (buffers, format json) select * from int8_tbl i8'
        "Total Cost": N.N,          +
        "Plan Rows": N,             +
        "Plan Width": N,            +
+       "Disabled": false,          +
        "Shared Hit Blocks": N,     +
        "Shared Read Blocks": N,    +
        "Shared Dirtied Blocks": N, +
@@ -262,6 +265,7 @@ select explain_filter('explain (analyze, buffers, format json) select * from int
        "Actual Total Time": N.N,    +
        "Actual Rows": N,            +
        "Actual Loops": N,           +
+       "Disabled": false,           +
        "Shared Hit Blocks": N,      +
        "Shared Read Blocks": N,     +
        "Shared Dirtied Blocks": N,  +
@@ -370,6 +374,7 @@ select explain_filter('explain (memory, summary, format yaml) select * from int8
      Total Cost: N.N          +
      Plan Rows: N             +
      Plan Width: N            +
+     Disabled: false          +
    Planning:                  +
      Memory Used: N           +
      Memory Allocated: N      +
@@ -394,7 +399,8 @@ select explain_filter('explain (memory, analyze, format json) select * from int8
        "Actual Startup Time": N.N, +
        "Actual Total Time": N.N,   +
        "Actual Rows": N,           +
-       "Actual Loops": N           +
+       "Actual Loops": N,          +
+       "Disabled": false           +
      },                            +
      "Planning": {                 +
        "Memory Used": N,           +
@@ -497,6 +503,7 @@ select jsonb_pretty(
                                  "string4"                  +
                              ],                             +
                              "Schema": "public",            +
+                             "Disabled": false,             +
                              "Node Type": "Seq Scan",       +
                              "Plan Rows": 0,                +
                              "Plan Width": 0,               +
@@ -540,6 +547,7 @@ select jsonb_pretty(
                          "stringu2",                        +
                          "string4"                          +
                      ],                                     +
+                     "Disabled": false,                     +
                      "Sort Key": [                          +
                          "tenk1.tenthous"                   +
                      ],                                     +
@@ -586,6 +594,7 @@ select jsonb_pretty(
                  "stringu2",                                +
                  "string4"                                  +
              ],                                             +
+             "Disabled": false,                             +
              "Node Type": "Gather Merge",                   +
              "Plan Rows": 0,                                +
              "Plan Width": 0,                               +
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index c561b62b2d..2df7a5db12 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -538,6 +538,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
 -------------------------------------------------
  [                                              +
      {                                          +
+         "Disabled": false,                     +
          "Sort Key": [                          +
              "t.a",                             +
              "t.b"                              +
@@ -701,19 +702,17 @@ explain (costs off) select * from t left join (select * from (select * from t or
                    QUERY PLAN                   
 ------------------------------------------------
  Nested Loop Left Join
-   Disabled Nodes: 1
    Join Filter: (t_1.a = t.a)
    ->  Seq Scan on t
          Filter: (a = ANY ('{1,2}'::integer[]))
    ->  Incremental Sort
-         Disabled Nodes: 1
          Sort Key: t_1.a, t_1.b
          Presorted Key: t_1.a
          ->  Sort
-               Disabled Nodes: 1
+               Disabled: true
                Sort Key: t_1.a
                ->  Seq Scan on t t_1
-(13 rows)
+(11 rows)
 
 select * from t left join (select * from (select * from t order by a) v order by a, b) s on s.a = t.a where t.a in (1, 2);
  a | b | a | b 
@@ -744,6 +743,7 @@ select jsonb_pretty(explain_analyze_inc_sort_nodes_without_memory('select * from
 -------------------------------------------------
  [                                              +
      {                                          +
+         "Disabled": false,                     +
          "Sort Key": [                          +
              "t.a",                             +
              "t.b"                              +
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index dbb748a2d2..c9defd7e9d 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1614,7 +1614,6 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
                                QUERY PLAN                               
 ------------------------------------------------------------------------
  Merge Append
-   Disabled Nodes: 1
    Sort Key: ((1 - matest0.id))
    ->  Index Scan using matest0i on public.matest0 matest0_1
          Output: matest0_1.id, matest0_1.name, (1 - matest0_1.id)
@@ -1624,11 +1623,11 @@ explain (verbose, costs off) select * from matest0 order by 1-id;
          Output: matest0_3.id, matest0_3.name, ((1 - matest0_3.id))
          Sort Key: ((1 - matest0_3.id))
          ->  Seq Scan on public.matest2 matest0_3
-               Disabled Nodes: 1
+               Disabled: true
                Output: matest0_3.id, matest0_3.name, (1 - matest0_3.id)
    ->  Index Scan using matest3i on public.matest3 matest0_4
          Output: matest0_4.id, matest0_4.name, (1 - matest0_4.id)
-(15 rows)
+(14 rows)
 
 select * from matest0 order by 1-id;
  id |  name  
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
index 5cb9cde030..fdd0f6c8f2 100644
--- a/src/test/regress/expected/insert_conflict.out
+++ b/src/test/regress/expected/insert_conflict.out
@@ -218,6 +218,7 @@ explain (costs off, format json) insert into insertconflicttest values (0, 'Bilb
        "Async Capable": false,                                         +
        "Relation Name": "insertconflicttest",                          +
        "Alias": "insertconflicttest",                                  +
+       "Disabled": false,                                              +
        "Conflict Resolution": "UPDATE",                                +
        "Conflict Arbiter Indexes": ["key_index"],                      +
        "Conflict Filter": "(insertconflicttest.fruit <> 'Lime'::text)",+
@@ -226,7 +227,8 @@ explain (costs off, format json) insert into insertconflicttest values (0, 'Bilb
            "Node Type": "Result",                                      +
            "Parent Relationship": "Outer",                             +
            "Parallel Aware": false,                                    +
-           "Async Capable": false                                      +
+           "Async Capable": false,                                     +
+           "Disabled": false                                           +
          }                                                             +
        ]                                                               +
      }                                                                 +
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 12abd3a0e7..756c2e2496 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8010,15 +8010,14 @@ SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS
                        QUERY PLAN                        
 ---------------------------------------------------------
  Nested Loop Anti Join
-   Disabled Nodes: 1
    ->  Seq Scan on skip_fetch t1
-         Disabled Nodes: 1
+         Disabled: true
    ->  Materialize
          ->  Bitmap Heap Scan on skip_fetch t2
                Recheck Cond: (a = 1)
                ->  Bitmap Index Scan on skip_fetch_a_idx
                      Index Cond: (a = 1)
-(9 rows)
+(8 rows)
 
 SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
  a 
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index 9ee09fe2f5..f6b8329cd6 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -303,16 +303,15 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.n >= s2.n;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
-   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
-         Disabled Nodes: 1
+         Disabled: true
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.n
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_n_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (n <= s1.n)
-(10 rows)
+(9 rows)
 
 -- Ensure we get 3 hits and 3 misses
 SELECT explain_memoize('
@@ -320,16 +319,15 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false);
                                  explain_memoize                                  
 ----------------------------------------------------------------------------------
  Nested Loop (actual rows=24 loops=N)
-   Disabled Nodes: 1
    ->  Seq Scan on strtest s1 (actual rows=6 loops=N)
-         Disabled Nodes: 1
+         Disabled: true
    ->  Memoize (actual rows=4 loops=N)
          Cache Key: s1.t
          Cache Mode: binary
          Hits: 3  Misses: 3  Evictions: Zero  Overflows: 0  Memory Usage: NkB
          ->  Index Scan using strtest_t_idx on strtest s2 (actual rows=4 loops=N)
                Index Cond: (t <= s1.t)
-(10 rows)
+(9 rows)
 
 DROP TABLE strtest;
 -- Ensure memoize works with partitionwise join
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2c63aa85a6..d17ade278b 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -537,14 +537,11 @@ explain (costs off)
                          QUERY PLAN                         
 ------------------------------------------------------------
  Aggregate
-   Disabled Nodes: 1
    ->  Nested Loop
-         Disabled Nodes: 1
          ->  Gather
-               Disabled Nodes: 1
                Workers Planned: 4
                ->  Parallel Seq Scan on tenk2
-                     Disabled Nodes: 1
+                     Disabled: true
                      Filter: (thousand = 0)
          ->  Gather
                Workers Planned: 4
@@ -552,7 +549,7 @@ explain (costs off)
                      Recheck Cond: (hundred > 1)
                      ->  Bitmap Index Scan on tenk1_hundred
                            Index Cond: (hundred > 1)
-(16 rows)
+(13 rows)
 
 select count(*) from tenk1, tenk2 where tenk1.hundred > 1 and tenk2.thousand=0;
  count 
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index 7a698934ac..d62d32241d 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -474,6 +474,7 @@ SELECT * FROM
        "Async Capable": false,                                                                                                                                                                              +
        "Table Function Name": "json_table",                                                                                                                                                                 +
        "Alias": "json_table_func",                                                                                                                                                                          +
+       "Disabled": false,                                                                                                                                                                                   +
        "Output": ["id", "\"int\"", "text"],                                                                                                                                                                 +
        "Table Function Call": "JSON_TABLE('null'::jsonb, '$[*]' AS json_table_path_0 PASSING 3 AS a, '\"foo\"'::jsonb AS \"b c\" COLUMNS (id FOR ORDINALITY, \"int\" integer PATH '$', text text PATH '$'))"+
      }                                                                                                                                                                                                      +
diff --git a/src/test/regress/expected/union.out b/src/test/regress/expected/union.out
index 0456d48c93..c73631a9a1 100644
--- a/src/test/regress/expected/union.out
+++ b/src/test/regress/expected/union.out
@@ -822,7 +822,7 @@ explain (costs off) select '123'::xid union select '123'::xid;
         QUERY PLAN         
 ---------------------------
  HashAggregate
-   Disabled Nodes: 1
+   Disabled: true
    Group Key: ('123'::xid)
    ->  Append
          ->  Result
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 361a6f9b27..fb5f345855 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1577,6 +1577,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1588,6 +1589,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1597,6 +1599,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d26e10441e..ef7dc03c69 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1209,6 +1209,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1220,6 +1221,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1229,6 +1231,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 73c2851d3f..4a9cdd2afe 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1563,6 +1563,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
        "Parallel Aware": false,                                                                                                                                                                 +
        "Async Capable": false,                                                                                                                                                                  +
        "Join Type": "Inner",                                                                                                                                                                    +
+       "Disabled": false,                                                                                                                                                                       +
        "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                     +
        "Inner Unique": false,                                                                                                                                                                   +
        "Plans": [                                                                                                                                                                               +
@@ -1574,6 +1575,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Relation Name": "xmldata",                                                                                                                                                          +
            "Schema": "public",                                                                                                                                                                  +
            "Alias": "xmldata",                                                                                                                                                                  +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["xmldata.data"]                                                                                                                                                           +
          },                                                                                                                                                                                     +
          {                                                                                                                                                                                      +
@@ -1583,6 +1585,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
            "Async Capable": false,                                                                                                                                                              +
            "Table Function Name": "xmltable",                                                                                                                                                   +
            "Alias": "f",                                                                                                                                                                        +
+           "Disabled": false,                                                                                                                                                                   +
            "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],                                                                                                                                 +
            "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
            "Filter": "(f.\"COUNTRY_NAME\" = 'Japan'::text)"                                                                                                                                     +
-- 
2.34.1

#153Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: David Rowley (#152)
1 attachment(s)
Re: On disable_cost

Hi!

On 10.10.2024 15:43, David Rowley wrote:

On Tue, 8 Oct 2024 at 04:14, Robert Haas<robertmhaas@gmail.com> wrote:

I think you have adequate consensus to proceed with this. I'd just ask
that you don't disappear completely if it turns out that there are
problems. I accept that my commit created this problem and I'm
certainly willing to be involved too if we need to sort out more
things.

Thanks. I've attached a polished-up version of the earlier patch.

I spent quite a bit of time testing it by manually adjusting
disabled_nodes while attached with my debugger. Doing it that way was
easier as it's often hard and maybe sometimes not possible to get the
disabled node you want in a plan.

I'll loop back to the documentation part and Laurenz's patch after
this part is committed.

If anyone wants to take a look at the attached, please do so.
Otherwise, I'm pretty happy with it and will likely push it on New
Zealand Friday (aka later today).

David

I think you missed updating some of the expected output, and we should fix that.

--
Regards,
Alena Rybakina
Postgres Professional

Attachments:

disable_nodes.diff (text/x-patch; charset=UTF-8)
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 31345295c11..faa376e060c 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -989,7 +989,7 @@ select * from collate_test1 where b ilike 'abc';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
-   Disabled Nodes: 1
+   Disabled: true
    Filter: (b ~~* 'abc'::text)
 (3 rows)
 
@@ -1005,7 +1005,7 @@ select * from collate_test1 where b ilike 'ABC';
           QUERY PLAN           
 -------------------------------
  Seq Scan on collate_test1
-   Disabled Nodes: 1
+   Disabled: true
    Filter: (b ~~* 'ABC'::text)
 (3 rows)
 
#154David Rowley
dgrowleyml@gmail.com
In reply to: Alena Rybakina (#153)
Re: On disable_cost

On Fri, 11 Oct 2024 at 02:02, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

On 10.10.2024 15:43, David Rowley wrote:

If anyone wants to take a look at the attached, please do so.
Otherwise, I'm pretty happy with it and will likely push it on New
Zealand Friday (aka later today).

I think you missed some previous output and we should fix that.

Thanks. I should install ICU...

I've now pushed this change and will look at the docs now.

David

#155Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#154)
Re: On disable_cost

On Fri, 2024-10-11 at 17:24 +1300, David Rowley wrote:

On Fri, 11 Oct 2024 at 02:02, Alena Rybakina <a.rybakina@postgrespro.ru> wrote:

On 10.10.2024 15:43, David Rowley wrote:

If anyone wants to take a look at the attached, please do so.
Otherwise, I'm pretty happy with it and will likely push it on New
Zealand Friday (aka later today).

I think you missed some previous output and we should fix that.

Thanks. I should install ICU...

I've now pushed this change and will look at the docs now.

Thank you for taking care of that!

Yours,
Laurenz Albe

#156David Rowley
dgrowleyml@gmail.com
In reply to: Laurenz Albe (#155)
2 attachment(s)
Re: On disable_cost

On Fri, 11 Oct 2024 at 19:44, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Fri, 2024-10-11 at 17:24 +1300, David Rowley wrote:

I've now pushed this change and will look at the docs now.

Thank you for taking care of that!

I've attached a patch for this. It's very similar to your patch and in
the same location, just the wording is slightly different. I also
opted to use a table that exists in the regression database as the top
of the page mentions "Examples in this section are drawn from the
regression test database".

Attached in patch form and with compiled HTML.

David

Attachments:

using-explain.html (text/html; charset=UTF-8)
disabled_docs.patch (application/octet-stream)
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b6524..861b9cf0bc 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,34 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
     discussed <link linkend="using-explain-analyze">below</link>.
    </para>
 
+   <para>
+    When using the enable/disable flags to disable plan node types, the
+    majority of the flags only deprioritize the corresponding plan node
+    and don't outright disallow the planner's ability to use the plan node
+    type.  This is done so that the planner still maintains the ability to
+    form a plan for a given query.  Otherwise, certain queries would not be
+    possible to execute when certain plan node types are disabled.  This means
+    it is possible that the planner chooses a plan using a node that has been
+    disabled.  When this happens, the <command>EXPLAIN</command> output will
+    indicate this fact.
+
+<screen>
+SET enable_seqscan = off;
+EXPLAIN SELECT * FROM unit;
+
+                       QUERY PLAN
+---------------------------------------------------------
+ Seq Scan on unit  (cost=0.00..21.30 rows=1130 width=44)
+   Disabled: true
+</screen>
+   </para>
+
+   <para>
+    Because the <literal>unit</literal> table has no indexes, there is no
+    other means to read the table data, so the <literal>Seq Scan</literal>
+    is the only option available to the query planner.
+   </para>
+
    <para>
     <indexterm>
      <primary>subplan</primary>
#157Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#156)
Re: On disable_cost

On Fri, 2024-10-11 at 20:45 +1300, David Rowley wrote:

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b6524..861b9cf0bc 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,34 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
discussed <link linkend="using-explain-analyze">below</link>.
</para>
+   <para>
+    When using the enable/disable flags to disable plan node types, the
+    majority of the flags only deprioritize the corresponding plan node

I don't like "deprioritize".
How about "discourage the use of"?

Besides, is that really the majority? I had thought that only a few nodes
are unavoidable (sequential scan, sort, nested loop). But I guess I am wrong.

+    and don't outright disallow the planner's ability to use the plan node
+    type.  This is done so that the planner still maintains the ability to
+    form a plan for a given query.  Otherwise, certain queries would not be
+    possible to execute when certain plan node types are disabled.  This means

"would not be possible to execute" can be simplified to "could be executed".

+    it is possible that the planner chooses a plan using a node that has been
+    disabled.  When this happens, the <command>EXPLAIN</command> output will
+    indicate this fact.
+
+<screen>
+SET enable_seqscan = off;
+EXPLAIN SELECT * FROM unit;
+
+                       QUERY PLAN
+---------------------------------------------------------
+ Seq Scan on unit  (cost=0.00..21.30 rows=1130 width=44)
+   Disabled: true
+</screen>
+   </para>
+
+   <para>
+    Because the <literal>unit</literal> table has no indexes, there is no
+    other means to read the table data, so the <literal>Seq Scan</literal>
+    is the only option available to the query planner.
+   </para>
+

Can we have "sequential scan" instead of "Seq Scan"?
It's somewhat unrelated, but I cannot count how many people I have talked
to who think that it is a "sequence scan".

Yours,
Laurenz Albe

#158David Rowley
dgrowleyml@gmail.com
In reply to: Laurenz Albe (#157)
1 attachment(s)
Re: On disable_cost

On Sat, 12 Oct 2024 at 02:16, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Fri, 2024-10-11 at 20:45 +1300, David Rowley wrote:

+    When using the enable/disable flags to disable plan node types, the
+    majority of the flags only deprioritize the corresponding plan node

I don't like "deprioritize".
How about "discourage the use of"?

Yeah, that's ok for me.

Besides, is that really the majority? I had thought that only a few nodes
are unavoidable (sequential scan, sort, nested loop). But I guess I am wrong.

Ok, I looked and you're right. I did make a quick pass to
approximately figure that out and I came up with:

Soft disable: enable_bitmapscan, enable_gathermerge, enable_hashagg,
enable_hashjoin, enable_indexscan, enable_mergejoin, enable_nestloop,
enable_seqscan, enable_sort

That's 9.

Hard disable: enable_async_append, enable_group_by_reordering,
enable_incremental_sort, enable_indexonlyscan, enable_material,
enable_memoize, enable_parallel_append, enable_parallel_hash,
enable_partition_pruning, enable_partitionwise_aggregate,
enable_partitionwise_join, enable_presorted_aggregate, enable_tidscan

And 13.

(a few are in a grey area, such as enable_hashagg, or enable_tidscan,
which will still be used for a WHERE CURRENT OF <cursor>.)

I changed "majority" to "many"
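
As a rough illustration of the soft vs hard split above (a sketch only, not
part of the original message; the table is hypothetical and the output shape
follows the committed "Disabled: true" format):

    CREATE TEMP TABLE demo (x int);   -- hypothetical table, no indexes
    INSERT INTO demo SELECT generate_series(1, 1000);
    ANALYZE demo;

    SET enable_seqscan = off;         -- "soft": only discouraged
    EXPLAIN (COSTS OFF) SELECT * FROM demo;
    --  Seq Scan on demo
    --    Disabled: true
    -- With no index available, the disabled Seq Scan is still chosen,
    -- and EXPLAIN flags it as disabled.

    SET enable_memoize = off;         -- "hard": Memoize paths are simply
                                      -- never generated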

+    and don't outright disallow the planner's ability to use the plan node
+    type.  This is done so that the planner still maintains the ability to
+    form a plan for a given query.  Otherwise, certain queries would not be
+    possible to execute when certain plan node types are disabled.  This means

"would not be possible to execute" can be simplified to "could be executed".

Changed.

Can we have "sequential scan" instead of "Seq Scan"?
It's somewhat unrelated, but I cannot count how many people I have talked
to who think that it is a "sequence scan".

Yeah, looks like we only call it "Seq Scan" in EXPLAIN and we use
"sequential scan" when talking about it in sentences.

Thanks for looking. Updated patch is attached.

David

Attachments:

disabled_docs_v2.patch (application/octet-stream)
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b6524..e3912bdbf2 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,33 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
     discussed <link linkend="using-explain-analyze">below</link>.
    </para>
 
+   <para>
+    When using the enable/disable flags to disable plan node types, many of
+    the flags only discourage the use of the corresponding plan node and don't
+    outright disallow the planner's ability to use the plan node type.  This
+    is done so that the planner still maintains the ability to form a plan for
+    a given query.  Otherwise, certain queries could be executed when certain
+    plan node types are disabled.  This means it is possible that the planner
+    chooses a plan using a node that has been disabled.  When this happens,
+    the <command>EXPLAIN</command> output will indicate this fact.
+
+<screen>
+SET enable_seqscan = off;
+EXPLAIN SELECT * FROM unit;
+
+                       QUERY PLAN
+---------------------------------------------------------
+ Seq Scan on unit  (cost=0.00..21.30 rows=1130 width=44)
+   Disabled: true
+</screen>
+   </para>
+
+   <para>
+    Because the <literal>unit</literal> table has no indexes, there is no
+    other means to read the table data, so the sequential scan is the only
+    option available to the query planner.
+   </para>
+
    <para>
     <indexterm>
      <primary>subplan</primary>
#159Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#158)
Re: On disable_cost

Thanks for the fixes; I have only a few quibbles now.

On Fri, 2024-10-18 at 23:54 +1300, David Rowley wrote:

--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,33 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
discussed <link linkend="using-explain-analyze">below</link>.
</para>
+   <para>
+    When using the enable/disable flags to disable plan node types, many of
+    the flags only discourage the use of the corresponding plan node and don't
+    outright disallow the planner's ability to use the plan node type.  This
+    is done so that the planner still maintains the ability to form a plan for
+    a given query.  Otherwise, certain queries could be executed when certain

You mean "could *not* be executed".

+ plan node types are disabled. This means it is possible that the planner

The "this" is potentially confusing. Does it refer to queries that cannot be
executed?

+    chooses a plan using a node that has been disabled.  When this happens,
+    the <command>EXPLAIN</command> output will indicate this fact.

Here is my attempt at that paragraph:

When using the enable/disable flags to disable plan node types, many of
the flags only discourage the use of the corresponding plan node and don't
outright disallow the planner's ability to use the plan node type.
Otherwise, certain queries could not be executed for lack of an alternative
to using a disabled plan node. As a consequence, it is possible that the
planner chooses a plan using a node that has been disabled. When this
happens, the <command>EXPLAIN</command> output will indicate this fact.

Yours,
Laurenz Albe

#160David Rowley
dgrowleyml@gmail.com
In reply to: Laurenz Albe (#159)
1 attachment(s)
Re: On disable_cost

On Sat, 19 Oct 2024 at 01:09, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Here is my attempt at that paragraph:

When using the enable/disable flags to disable plan node types, many of
the flags only discourage the use of the corresponding plan node and don't
outright disallow the planner's ability to use the plan node type.
Otherwise, certain queries could not be executed for lack of an alternative
to using a disabled plan node. As a consequence, it is possible that the
planner chooses a plan using a node that has been disabled. When this
happens, the <command>EXPLAIN</command> output will indicate this fact.

I think that looks pretty good. However, I would like to keep the part
saying that the possibility of disabled nodes still being used is on
purpose. Mostly just to make it clear that it's not a bug. We get so
many false bug reports that I feel it's worthwhile mentioning that
explicitly.

Maybe since you dropped that sentence to shorten the paragraph, we
could instead just drop the "Otherwise, certain" sentence.

Also, regarding the concern about using "this": how about we just write "When
the resulting plan contains a disabled node, the
<command>EXPLAIN</command> output will indicate this fact.", which
makes that self-contained.

That becomes:

When using the enable/disable flags to disable plan node types, many of
the flags only discourage the use of the corresponding plan node and don't
outright disallow the planner's ability to use the plan node type. This
is by design so that the planner still maintains the ability to form a
plan for a given query. When the resulting plan contains a disabled node,
the <command>EXPLAIN</command> output will indicate this fact.

David

Attachments:

disabled_docs_v3.patch (application/octet-stream)
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index ff689b6524..cd12b9ce48 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -578,6 +578,31 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
     discussed <link linkend="using-explain-analyze">below</link>.
    </para>
 
+   <para>
+    When using the enable/disable flags to disable plan node types, many of
+    the flags only discourage the use of the corresponding plan node and don't
+    outright disallow the planner's ability to use the plan node type.  This
+    is by design so that the planner still maintains the ability to form a
+    plan for a given query.  When the resulting plan contains a disabled node,
+    the <command>EXPLAIN</command> output will indicate this fact.
+
+<screen>
+SET enable_seqscan = off;
+EXPLAIN SELECT * FROM unit;
+
+                       QUERY PLAN
+---------------------------------------------------------
+ Seq Scan on unit  (cost=0.00..21.30 rows=1130 width=44)
+   Disabled: true
+</screen>
+   </para>
+
+   <para>
+    Because the <literal>unit</literal> table has no indexes, there is no
+    other means to read the table data, so the sequential scan is the only
+    option available to the query planner.
+   </para>
+
    <para>
     <indexterm>
      <primary>subplan</primary>
#161Laurenz Albe
laurenz.albe@cybertec.at
In reply to: David Rowley (#160)
Re: On disable_cost

On Tue, 2024-10-29 at 12:21 +1300, David Rowley wrote:

    When using the enable/disable flags to disable plan node types, many of
    the flags only discourage the use of the corresponding plan node and don't
    outright disallow the planner's ability to use the plan node type.  This
    is by design so that the planner still maintains the ability to form a
    plan for a given query.  When the resulting plan contains a disabled node,
    the <command>EXPLAIN</command> output will indicate this fact.

That patch is good in my opinion.

Yours,
Laurenz Albe

#162David Rowley
dgrowleyml@gmail.com
In reply to: Laurenz Albe (#161)
Re: On disable_cost

On Tue, 29 Oct 2024 at 20:04, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

That patch is good in my opinion.

Thanks for checking. Pushed.

David

#163Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#82)
Re: On disable_cost

On Wed, Aug 21, 2024 at 10:29:23AM -0400, Robert Haas wrote:

I went ahead and committed these patches. I know there's some debate
over whether we want to show the # of disabled nodes and if so whether
it should be controlled by COSTS, and I suspect I haven't completely
allayed David's concerns about the initial_cost_XXX functions although
I think that I did the right thing. But, I don't have the impression
that anyone is desperately opposed to the basic concept, so I think it
makes sense to put these into the tree and see what happens. We have
quite a bit of time left in this release cycle to uncover bugs, hear
from users or other developers, etc. about what problems there may be
with this. If we end up deciding to reverse course or need to fix a
bunch of stuff, so be it, but let's see what the feedback is.

I have been reviewing a bit what was done in the scope of this thread
for some planner hint things, as in e22253467942 and 161320b4b960, and
I just wanted to say thanks for committing these changes.

In pg_hint_plan, we've lately been discussing some limitations with
always appending disable_cost to all the nodes that we want disabled,
up to v17. There were two proposals that revolved around copying more
cost-related logic from the backend planner to get tighter control
over the ordering of the node paths that we want disabled
(pg_hint_plan may need to force a strict ordering, like for joins with
Leading hints), but still relying on disable_cost brings a lot of
limitations. The discussion happened around these two issues, FYI:
https://github.com/ossc-db/pg_hint_plan/pull/207
https://github.com/ossc-db/pg_hint_plan/pull/208

Let's just say that the proposals revolved around copying more cost
routines from the planner back into the module, which was a bad idea
on maintenance grounds. With v18, we do not need to do any of that
anymore, as disabled_nodes takes priority over the costs. Not to
mention that it's more helpful to know if the planner disabled a node
without having to look at a cost close to an infinite value. A bunch
of plans in the regression tests changed while making the code
stable with v18, because we show some nodes as disabled but still used
by the planner, but that was not a big deal in the end.
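
To make the contrast concrete, here is a sketch (not part of the original
message; the "unit" table and the pre-v18 cost figures are illustrative,
reusing the documentation example earlier in the thread):

    SET enable_seqscan = off;
    EXPLAIN SELECT * FROM unit;
    -- v17 and earlier (disable_cost folded into the path costs):
    --  Seq Scan on unit  (cost=10000000000.00..10000000021.30 rows=1130 width=44)
    -- v18 (separate disabled-node counter; costs are left untouched):
    --  Seq Scan on unit  (cost=0.00..21.30 rows=1130 width=44)
    --    Disabled: true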

In short, thanks all for the work done here!
--
Michael

#164Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#163)
Re: On disable_cost

On Mon, Jun 30, 2025 at 8:14 PM Michael Paquier <michael@paquier.xyz> wrote:

In short, thanks all for the work done here!

I'm glad to hear that you found it helpful!

--
Robert Haas
EDB: http://www.enterprisedb.com