Consider low startup cost in add_partial_path

Started by James Colemanover 6 years ago8 messages

jtc331@gmail.com

over 6 years ago

1 attachment(s)

Over in the incremental sort patch discussion we found [1]/messages/by-id/20190720132244.3vgg2uynfpxh3me5@development a case
where a higher cost plan ends up being chosen because a low startup
cost partial path is ignored in favor of a lower total cost partial
path and a limit is a applied on top of that which would normal favor
the lower startup cost plan.

45be99f8cd5d606086e0a458c9c72910ba8a613d originally added
`add_partial_path` with the comment:

Neither do we need to consider startup costs:
parallelism is only used for plans that will be run to completion.
Therefore, this routine is much simpler than add_path: it needs to
consider only pathkeys and total cost.

I'm not entirely sure if that is still true or not--I can't easily
come up with a scenario in which it's not, but I also can't come up
with an inherent reason why such a scenario cannot exist.

We could just continue to include this change as part of the
incremental sort patch itself, but it seemed worth it to me to break
it out for some more targeted discussion, and also include Robert as
the initial author of add_partial_path in the hopes that maybe we
could retrieve some almost 4-year-old memories on why this was
inherently true then, and maybe that would shed some light on whether
it's still inherently true.

I've attached a patch (by Tomas Vondra, also cc'd) to consider startup
cost in add_partial_path, but should we apply the patch we'll also
likely need to apply the same kind of change to
add_partial_path_precheck.

James Coleman

[1]: /messages/by-id/20190720132244.3vgg2uynfpxh3me5@development

Attachments:

consider-startup-cost-in-add-partial-path_v1.patchapplication/octet-stream; name=consider-startup-cost-in-add-partial-path_v1.patchDownload

commit 02b738b5e32326a24687890e066c66f5508b5976
Author: Tomas Vondra <tomas@2ndquadrant.com>
Date:   Sun Jul 28 15:55:54 2019 +0200

    Consider low startup cost when adding partial path
    
    45be99f8cd5d606086e0a458c9c72910ba8a613d added `add_partial_path` with the
    comment:
    
    > Neither do we need to consider startup costs:
    > parallelism is only used for plans that will be run to completion.
    > Therefore, this routine is much simpler than add_path: it needs to
    > consider only pathkeys and total cost.
    
    I'm not entirely sure if that is still true or not--I can't easily come
    up with a scenario in which it's not, but I also can't come up with an
    inherent reason why such a scenario cannot exist.
    
    Regardless, the in-progress incremental sort patch uncovered a new case
    where it definitely no longer holds, and, as a result a higher cost plan
    ends up being chosen because a low startup cost partial path is ignored
    in favor of a lower total cost partial path and a limit is a applied on
    top of that which would normal favor the lower startup cost plan.

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 34acb732ee..5d66fc2177 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -778,41 +778,30 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Unless pathkeys are incompatible, keep just one of the two paths. */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
-			{
-				/* New path costs more; keep it only if pathkeys are better. */
-				if (keyscmp != PATHKEYS_BETTER1)
-					accept_new = false;
-			}
-			else if (old_path->total_cost > new_path->total_cost
-					 * STD_FUZZ_FACTOR)
+			PathCostComparison costcmp;
+
+			/*
+			 * Do a fuzzy cost comparison with standard fuzziness limit.
+			 */
+			costcmp = compare_path_costs_fuzzily(new_path, old_path,
+												 STD_FUZZ_FACTOR);
+
+			if (costcmp == COSTS_BETTER1)
 			{
-				/* Old path costs more; keep it only if pathkeys are better. */
-				if (keyscmp != PATHKEYS_BETTER2)
+				if (keyscmp == PATHKEYS_BETTER1)
 					remove_old = true;
 			}
-			else if (keyscmp == PATHKEYS_BETTER1)
-			{
-				/* Costs are about the same, new path has better pathkeys. */
-				remove_old = true;
-			}
-			else if (keyscmp == PATHKEYS_BETTER2)
+			else if (costcmp == COSTS_BETTER2)
 			{
-				/* Costs are about the same, old path has better pathkeys. */
-				accept_new = false;
-			}
-			else if (old_path->total_cost > new_path->total_cost * 1.0000000001)
-			{
-				/* Pathkeys are the same, and the old path costs more. */
-				remove_old = true;
+				if (keyscmp == PATHKEYS_BETTER2)
+					accept_new = false;
 			}
-			else
+			else if (costcmp == COSTS_EQUAL)
 			{
-				/*
-				 * Pathkeys are the same, and new path isn't materially
-				 * cheaper.
-				 */
-				accept_new = false;
+				if (keyscmp == PATHKEYS_BETTER1)
+					remove_old = true;
+				else if (keyscmp == PATHKEYS_BETTER2)
+					accept_new = false;
 			}
 		}

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: James Coleman (#1)

Re: Consider low startup cost in add_partial_path

On Fri, Sep 27, 2019 at 2:24 PM James Coleman <jtc331@gmail.com> wrote:

Over in the incremental sort patch discussion we found [1] a case
where a higher cost plan ends up being chosen because a low startup
cost partial path is ignored in favor of a lower total cost partial
path and a limit is a applied on top of that which would normal favor
the lower startup cost plan.

45be99f8cd5d606086e0a458c9c72910ba8a613d originally added
`add_partial_path` with the comment:

Neither do we need to consider startup costs:
parallelism is only used for plans that will be run to completion.
Therefore, this routine is much simpler than add_path: it needs to
consider only pathkeys and total cost.

I'm not entirely sure if that is still true or not--I can't easily
come up with a scenario in which it's not, but I also can't come up
with an inherent reason why such a scenario cannot exist.

I think I just didn't think carefully about the Limit case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 6 years ago

In reply to: Robert Haas (#2)

Re: Consider low startup cost in add_partial_path

On Sat, Sep 28, 2019 at 12:16:05AM -0400, Robert Haas wrote:

On Fri, Sep 27, 2019 at 2:24 PM James Coleman <jtc331@gmail.com> wrote:

Over in the incremental sort patch discussion we found [1] a case
where a higher cost plan ends up being chosen because a low startup
cost partial path is ignored in favor of a lower total cost partial
path and a limit is a applied on top of that which would normal favor
the lower startup cost plan.

45be99f8cd5d606086e0a458c9c72910ba8a613d originally added
`add_partial_path` with the comment:

Neither do we need to consider startup costs:
parallelism is only used for plans that will be run to completion.
Therefore, this routine is much simpler than add_path: it needs to
consider only pathkeys and total cost.

I'm not entirely sure if that is still true or not--I can't easily
come up with a scenario in which it's not, but I also can't come up
with an inherent reason why such a scenario cannot exist.

I think I just didn't think carefully about the Limit case.

Thanks! In that case I suggest we treat it as a separate patch/fix,
independent of the incremental sort patch. I don't want to bury it in
that patch series, it's already pretty large.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

James Coleman

jtc331@gmail.com

over 6 years ago

In reply to: Tomas Vondra (#3)

Re: Consider low startup cost in add_partial_path

On Saturday, September 28, 2019, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:

On Sat, Sep 28, 2019 at 12:16:05AM -0400, Robert Haas wrote:

On Fri, Sep 27, 2019 at 2:24 PM James Coleman <jtc331@gmail.com> wrote:

Over in the incremental sort patch discussion we found [1] a case
where a higher cost plan ends up being chosen because a low startup
cost partial path is ignored in favor of a lower total cost partial
path and a limit is a applied on top of that which would normal favor
the lower startup cost plan.

45be99f8cd5d606086e0a458c9c72910ba8a613d originally added
`add_partial_path` with the comment:

Neither do we need to consider startup costs:
parallelism is only used for plans that will be run to completion.
Therefore, this routine is much simpler than add_path: it needs to
consider only pathkeys and total cost.

I'm not entirely sure if that is still true or not--I can't easily
come up with a scenario in which it's not, but I also can't come up
with an inherent reason why such a scenario cannot exist.

I think I just didn't think carefully about the Limit case.

Thanks! In that case I suggest we treat it as a separate patch/fix,
independent of the incremental sort patch. I don't want to bury it in
that patch series, it's already pretty large.

Now the trick is to figure out a way to demonstrate it in test :)

Basically we need:
Path A: Can short circuit with LIMIT but has high total cost
Path B: Can’t short circuit with LIMIT but has lower total cost

(Both must be parallel aware of course.)

Maybe ordering in B can be a sort node and A can be an index scan (perhaps
with very high random page cost?) and force choosing a parallel plan?

I’m trying to describe this to jog my thoughts (not in front of my laptop
right now so can’t try it out).

Any other ideas?

James

James Coleman

jtc331@gmail.com

over 6 years ago

In reply to: James Coleman (#4)

Re: Consider low startup cost in add_partial_path

On Sat, Sep 28, 2019 at 7:21 PM James Coleman <jtc331@gmail.com> wrote:

Now the trick is to figure out a way to demonstrate it in test :)

Basically we need:
Path A: Can short circuit with LIMIT but has high total cost
Path B: Can’t short circuit with LIMIT but has lower total cost

(Both must be parallel aware of course.)

I'm adding one requirement, or clarifying it anyway: the above paths
must be partial paths, and can't just apply at the top level of the
parallel part of the plan. I.e., the lower startup cost has to matter
at a subtree of the parallel portion of the plan.

Maybe ordering in B can be a sort node and A can be an index scan (perhaps with very high random page cost?) and force choosing a parallel plan?

I’m trying to describe this to jog my thoughts (not in front of my laptop right now so can’t try it out).

Any other ideas?

I've been playing with this a good bit, and I'm struggling to come up
with a test case. Because the issue only manifests in a subtree of the
parallel portion of the plan, a scan on a single relation won't do.
Merge join seems like a good area to look at because it requires
ordering, and that ordering can be either the result of an index scan
(short-circuit-able) or an explicit sort (not short-circuit-able). But
I've been unable to make that result in any different plans with
either 2 or 3 relations joined together, ordered, and a limit applied.

In all cases I've been starting with:

set enable_hashjoin = off;
set enable_nestloop = off;
set max_parallel_workers_per_gather = 4;
set min_parallel_index_scan_size = 0;
set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

I've also tried various combinations of random_page_cost,
cpu_index_tuple_cost, cpu_tuple_cost.

Interestingly I've noticed plans joining two relations that look like:

Limit
-> Merge Join
Merge Cond: (t1.pk = t2.pk)
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t1
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t2

Where I would have expected a Gather Merge above a parallelized merge
join. Is that reasonable to expect?

If there doesn't seem to be an obvious way to reproduce the issue
currently, but we know we have a reproduction example along with
incremental sort, what is the path forward for this? Is it reasonable
to try to commit it anyway knowing that it's a "correct" change and
been demonstrated elsewhere?

James

Robert Haas

robertmhaas@gmail.com

over 6 years ago

In reply to: James Coleman (#5)

Re: Consider low startup cost in add_partial_path

On Wed, Oct 2, 2019 at 10:22 AM James Coleman <jtc331@gmail.com> wrote:

In all cases I've been starting with:

set enable_hashjoin = off;
set enable_nestloop = off;
set max_parallel_workers_per_gather = 4;
set min_parallel_index_scan_size = 0;
set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

I've also tried various combinations of random_page_cost,
cpu_index_tuple_cost, cpu_tuple_cost.

Interestingly I've noticed plans joining two relations that look like:

Limit
-> Merge Join
Merge Cond: (t1.pk = t2.pk)
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t1
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t2

Where I would have expected a Gather Merge above a parallelized merge
join. Is that reasonable to expect?

Well, you told the planner that parallel_setup_cost = 0, so starting
workers is free. And you told the planner that parallel_tuple_cost =
0, so shipping tuples from the worker to the leader is also free. So
it is unclear why it should prefer a single Gather Merge over two
Gather Merges: after all, the Gather Merge is free!

If you use give those things some positive cost, even if it's smaller
than the default, you'll probably get a saner-looking plan choice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

James Coleman

jtc331@gmail.com

about 6 years ago

In reply to: Robert Haas (#6)

Re: Consider low startup cost in add_partial_path

On Fri, Oct 4, 2019 at 8:36 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Oct 2, 2019 at 10:22 AM James Coleman <jtc331@gmail.com> wrote:

In all cases I've been starting with:

set enable_hashjoin = off;
set enable_nestloop = off;
set max_parallel_workers_per_gather = 4;
set min_parallel_index_scan_size = 0;
set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

I've also tried various combinations of random_page_cost,
cpu_index_tuple_cost, cpu_tuple_cost.

Interestingly I've noticed plans joining two relations that look like:

Limit
-> Merge Join
Merge Cond: (t1.pk = t2.pk)
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t1
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t2

Where I would have expected a Gather Merge above a parallelized merge
join. Is that reasonable to expect?

Well, you told the planner that parallel_setup_cost = 0, so starting
workers is free. And you told the planner that parallel_tuple_cost =
0, so shipping tuples from the worker to the leader is also free. So
it is unclear why it should prefer a single Gather Merge over two
Gather Merges: after all, the Gather Merge is free!

If you use give those things some positive cost, even if it's smaller
than the default, you'll probably get a saner-looking plan choice.

That makes sense.

Right now I currently see trying to get this a separate test feels a
bit like a distraction.

Given there doesn't seem to be an obvious way to reproduce the issue
currently, but we know we have a reproduction example along with
incremental sort, what is the path forward for this? Is it reasonable
to try to commit it anyway knowing that it's a "correct" change and
been demonstrated elsewhere?

James

Tomas Vondra

tomas.vondra@2ndquadrant.com

almost 6 years ago

In reply to: James Coleman (#7)

3 attachment(s)

Re: Consider low startup cost in add_partial_path

Hi,

For the record, here is the relevant part of the Incremental Sort patch
series, updating add_partial_path and add_partial_path_precheck to also
consider startup cost.

The changes in the first two patches are pretty straight-forward, plus
there's a proposed optimization in the precheck function to only run
compare_pathkeys if entirely necessary. I'm currently evaluating those
changes and I'll post the results to the incremental sort thread.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v54-0001-Consider-low-startup-cost-when-adding-partial-path.patchtext/plain; charset=us-asciiDownload

From 761b935584229243ecc6fd47d83e86d6b1b382c7 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sun, 28 Jul 2019 15:55:54 +0200
Subject: [PATCH 1/5] Consider low startup cost when adding partial path

45be99f8cd5d606086e0a458c9c72910ba8a613d added `add_partial_path` with the
comment:

> Neither do we need to consider startup costs:
> parallelism is only used for plans that will be run to completion.
> Therefore, this routine is much simpler than add_path: it needs to
> consider only pathkeys and total cost.

I'm not entirely sure if that is still true or not--I can't easily come
up with a scenario in which it's not, but I also can't come up with an
inherent reason why such a scenario cannot exist.

Regardless, the in-progress incremental sort patch uncovered a new case
where it definitely no longer holds, and, as a result a higher cost plan
ends up being chosen because a low startup cost partial path is ignored
in favor of a lower total cost partial path and a limit is a applied on
top of that which would normal favor the lower startup cost plan.
---
 src/backend/optimizer/util/pathnode.c | 65 +++++++++++++--------------
 1 file changed, 31 insertions(+), 34 deletions(-)

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 8ba8122ee2..b570bfd3be 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -733,10 +733,11 @@ add_path_precheck(RelOptInfo *parent_rel,
  *
  *	  Because we don't consider parameterized paths here, we also don't
  *	  need to consider the row counts as a measure of quality: every path will
- *	  produce the same number of rows.  Neither do we need to consider startup
- *	  costs: parallelism is only used for plans that will be run to completion.
- *	  Therefore, this routine is much simpler than add_path: it needs to
- *	  consider only pathkeys and total cost.
+ *	  produce the same number of rows.  It may however matter how much the
+ *	  path ordering matches the final ordering, needed by upper parts of the
+ *	  plan. Because that will affect how expensive the incremental sort is,
+ *	  we need to consider both the total and startup path, in addition to
+ *	  pathkeys.
  *
  *	  As with add_path, we pfree paths that are found to be dominated by
  *	  another partial path; this requires that there be no other references to
@@ -774,44 +775,40 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
 		/* Compare pathkeys. */
 		keyscmp = compare_pathkeys(new_path->pathkeys, old_path->pathkeys);
 
-		/* Unless pathkeys are incompatible, keep just one of the two paths. */
+		/*
+		 * Unless pathkeys are incompatible, see if one of the paths dominates
+		 * the other (both in startup and total cost). It may happen that one
+		 * path has lower startup cost, the other has lower total cost.
+		 *
+		 * XXX Perhaps we could do this only when incremental sort is enabled,
+		 * and use the simpler version (comparing just total cost) otherwise?
+		 */
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
-			{
-				/* New path costs more; keep it only if pathkeys are better. */
-				if (keyscmp != PATHKEYS_BETTER1)
-					accept_new = false;
-			}
-			else if (old_path->total_cost > new_path->total_cost
-					 * STD_FUZZ_FACTOR)
+			PathCostComparison costcmp;
+
+			/*
+			 * Do a fuzzy cost comparison with standard fuzziness limit.
+			 */
+			costcmp = compare_path_costs_fuzzily(new_path, old_path,
+												 STD_FUZZ_FACTOR);
+
+			if (costcmp == COSTS_BETTER1)
 			{
-				/* Old path costs more; keep it only if pathkeys are better. */
-				if (keyscmp != PATHKEYS_BETTER2)
+				if (keyscmp == PATHKEYS_BETTER1)
 					remove_old = true;
 			}
-			else if (keyscmp == PATHKEYS_BETTER1)
+			else if (costcmp == COSTS_BETTER2)
 			{
-				/* Costs are about the same, new path has better pathkeys. */
-				remove_old = true;
-			}
-			else if (keyscmp == PATHKEYS_BETTER2)
-			{
-				/* Costs are about the same, old path has better pathkeys. */
-				accept_new = false;
-			}
-			else if (old_path->total_cost > new_path->total_cost * 1.0000000001)
-			{
-				/* Pathkeys are the same, and the old path costs more. */
-				remove_old = true;
+				if (keyscmp == PATHKEYS_BETTER2)
+					accept_new = false;
 			}
-			else
+			else if (costcmp == COSTS_EQUAL)
 			{
-				/*
-				 * Pathkeys are the same, and new path isn't materially
-				 * cheaper.
-				 */
-				accept_new = false;
+				if (keyscmp == PATHKEYS_BETTER1)
+					remove_old = true;
+				else if (keyscmp == PATHKEYS_BETTER2)
+					accept_new = false;
 			}
 		}
 
-- 
2.21.1

v54-0002-rework-add_partial_path_precheck-too.patchtext/plain; charset=us-asciiDownload

From 060234426851cb8f815fb873ab5aaf33b3830143 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Fri, 3 Apr 2020 18:26:29 +0200
Subject: [PATCH 2/5] rework add_partial_path_precheck too

---
 src/backend/optimizer/path/joinpath.c | 12 ++++++++---
 src/backend/optimizer/util/pathnode.c | 31 ++++++++++++---------------
 src/include/optimizer/pathnode.h      |  3 ++-
 3 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index db54a6ba2e..1d0c3e8027 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -515,7 +515,9 @@ try_partial_nestloop_path(PlannerInfo *root,
 	 */
 	initial_cost_nestloop(root, &workspace, jointype,
 						  outer_path, inner_path, extra);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel,
+								   workspace.startup_cost, workspace.total_cost,
+								   pathkeys))
 		return;
 
 	/*
@@ -693,7 +695,9 @@ try_partial_mergejoin_path(PlannerInfo *root,
 						   outersortkeys, innersortkeys,
 						   extra);
 
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
+	if (!add_partial_path_precheck(joinrel,
+								   workspace.startup_cost, workspace.total_cost,
+								   pathkeys))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
@@ -817,7 +821,9 @@ try_partial_hashjoin_path(PlannerInfo *root,
 	 */
 	initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
 						  outer_path, inner_path, extra, parallel_hash);
-	if (!add_partial_path_precheck(joinrel, workspace.total_cost, NIL))
+	if (!add_partial_path_precheck(joinrel,
+								   workspace.startup_cost, workspace.total_cost,
+								   NIL))
 		return;
 
 	/* Might be good enough to be worth trying, so let's try it. */
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index b570bfd3be..7211fc35fd 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -854,15 +854,14 @@ add_partial_path(RelOptInfo *parent_rel, Path *new_path)
  * add_partial_path_precheck
  *	  Check whether a proposed new partial path could possibly get accepted.
  *
- * Unlike add_path_precheck, we can ignore startup cost and parameterization,
- * since they don't matter for partial paths (see add_partial_path).  But
- * we do want to make sure we don't add a partial path if there's already
- * a complete path that dominates it, since in that case the proposed path
- * is surely a loser.
+ * Unlike add_path_precheck, we can ignore parameterization, since it doesn't
+ * matter for partial paths (see add_partial_path).  But we do want to make
+ * sure we don't add a partial path if there's already a complete path that
+ * dominates it, since in that case the proposed path is surely a loser.
  */
 bool
-add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
-						  List *pathkeys)
+add_partial_path_precheck(RelOptInfo *parent_rel, Cost startup_cost,
+						  Cost total_cost, List *pathkeys)
 {
 	ListCell   *p1;
 
@@ -885,11 +884,14 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 		keyscmp = compare_pathkeys(pathkeys, old_path->pathkeys);
 		if (keyscmp != PATHKEYS_DIFFERENT)
 		{
-			if (total_cost > old_path->total_cost * STD_FUZZ_FACTOR &&
-				keyscmp != PATHKEYS_BETTER1)
+			if ((startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR) &&
+				(total_cost > old_path->total_cost * STD_FUZZ_FACTOR) &&
+				(keyscmp != PATHKEYS_BETTER1))
 				return false;
-			if (old_path->total_cost > total_cost * STD_FUZZ_FACTOR &&
-				keyscmp != PATHKEYS_BETTER2)
+
+			if ((old_path->startup_cost > startup_cost * STD_FUZZ_FACTOR) &&
+				(old_path->total_cost > total_cost * STD_FUZZ_FACTOR) &&
+				(keyscmp != PATHKEYS_BETTER2))
 				return true;
 		}
 	}
@@ -899,13 +901,8 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost total_cost,
 	 * clearly good enough that it might replace one.  Compare it to
 	 * non-parallel plans.  If it loses even before accounting for the cost of
 	 * the Gather node, we should definitely reject it.
-	 *
-	 * Note that we pass the total_cost to add_path_precheck twice.  This is
-	 * because it's never advantageous to consider the startup cost of a
-	 * partial path; the resulting plans, if run in parallel, will be run to
-	 * completion.
 	 */
-	if (!add_path_precheck(parent_rel, total_cost, total_cost, pathkeys,
+	if (!add_path_precheck(parent_rel, startup_cost, total_cost, pathkeys,
 						   NULL))
 		return false;
 
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index e450fe112a..e73c5637cc 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -32,7 +32,8 @@ extern bool add_path_precheck(RelOptInfo *parent_rel,
 							  List *pathkeys, Relids required_outer);
 extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
 extern bool add_partial_path_precheck(RelOptInfo *parent_rel,
-									  Cost total_cost, List *pathkeys);
+									  Cost startup_cost, Cost total_cost,
+									  List *pathkeys);
 
 extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
 								 Relids required_outer, int parallel_workers);
-- 
2.21.1

v54-0003-rework-add_partial_path_precheck-check-costs-first.patchtext/plain; charset=us-asciiDownload

From e5dc0ccd72cd4dce25de22982a1d950ae73f1f6a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Fri, 3 Apr 2020 18:43:50 +0200
Subject: [PATCH 3/5] rework add_partial_path_precheck - check costs first

---
 src/backend/optimizer/util/pathnode.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 7211fc35fd..4e798b801a 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -880,18 +880,25 @@ add_partial_path_precheck(RelOptInfo *parent_rel, Cost startup_cost,
 	{
 		Path	   *old_path = (Path *) lfirst(p1);
 		PathKeysComparison keyscmp;
+		bool		compared = false;
 
-		keyscmp = compare_pathkeys(pathkeys, old_path->pathkeys);
-		if (keyscmp != PATHKEYS_DIFFERENT)
+		if ((startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR) &&
+			(total_cost > old_path->total_cost * STD_FUZZ_FACTOR))
 		{
-			if ((startup_cost > old_path->startup_cost * STD_FUZZ_FACTOR) &&
-				(total_cost > old_path->total_cost * STD_FUZZ_FACTOR) &&
-				(keyscmp != PATHKEYS_BETTER1))
+			keyscmp = compare_pathkeys(pathkeys, old_path->pathkeys);
+			compared = true;
+
+			if (keyscmp != PATHKEYS_BETTER1)
 				return false;
+		}
+
+		if ((old_path->startup_cost > startup_cost * STD_FUZZ_FACTOR) &&
+			(old_path->total_cost > total_cost * STD_FUZZ_FACTOR))
+		{
+			if (!compared)
+				keyscmp = compare_pathkeys(pathkeys, old_path->pathkeys);
 
-			if ((old_path->startup_cost > startup_cost * STD_FUZZ_FACTOR) &&
-				(old_path->total_cost > total_cost * STD_FUZZ_FACTOR) &&
-				(keyscmp != PATHKEYS_BETTER2))
+			if (keyscmp != PATHKEYS_BETTER2)
 				return true;
 		}
 	}
-- 
2.21.1