postgres_fdw: oddity in costing aggregate pushdown paths
Hi,
While working on [1]/messages/by-id/5BBC4140.50403@lab.ntt.co.jp, I noticed that since we don't set the selectivity
and cost of the local_conds (i.e., fpinfo->local_conds_sel and
fpinfo->local_conds_cost) properly in add_foreign_grouping_paths and
foreign_grouping_ok, estimate_path_cost_size produces considerably
underestimated results when use_remote_estimate. I think this would
cause problems in later planning steps. So, why not add something like
this to add_foreign_grouping_paths, as done in other functions such as
postgresGetForeignJopinPaths?
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5496,6 +5496,19 @@ add_foreign_grouping_paths(PlannerInfo *root,
RelOptInfo\
*input_rel,
if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual))
return;
+ /*
+ * Compute the selectivity and cost of the local_conds, so we don't have
+ * to do it over again for each path. The best we can do for these
+ * conditions is to estimate selectivity on the basis of local
statistics.
+ */
+ fpinfo->local_conds_sel = clauselist_selectivity(root,
+ fpinfo->local_conds,
+ 0,
+ JOIN_INNER,
+ NULL);
+
+ cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+
/* Estimate the cost of push down */
estimate_path_cost_size(root, grouped_rel, NIL, NIL, &rows,
&width, &startup_cost, &total_cost);
Best regards,
Etsuro Fujita
(2018/11/27 21:55), Etsuro Fujita wrote:
While working on [1], I noticed that since we don't set the selectivity
and cost of the local_conds (i.e., fpinfo->local_conds_sel and
fpinfo->local_conds_cost) properly in add_foreign_grouping_paths and
foreign_grouping_ok, estimate_path_cost_size produces considerably
underestimated results when use_remote_estimate. I think this would
cause problems in later planning steps. So, why not add something like
this to add_foreign_grouping_paths, as done in other functions such as
postgresGetForeignJopinPaths?--- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -5496,6 +5496,19 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo\ *input_rel, if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual)) return;+ /* + * Compute the selectivity and cost of the local_conds, so we don't have + * to do it over again for each path. The best we can do for these + * conditions is to estimate selectivity on the basis of local statistics. + */ + fpinfo->local_conds_sel = clauselist_selectivity(root, + fpinfo->local_conds, + 0, + JOIN_INNER, + NULL); + + cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root); + /* Estimate the cost of push down */ estimate_path_cost_size(root, grouped_rel, NIL, NIL,&rows, &width,&startup_cost,&total_cost);
Here is a patch for that.
BTW another thing I noticed is this comment on costing aggregate
pushdown paths using local statistics in estimate_path_cost_size:
* Also, core does not care about costing HAVING expressions and
* adding that to the costs. So similarly, here too we are not
* considering remote and local conditions for costing.
I think this was true when aggregate pushdown went in, but isn't anymore
because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2. So we
should update estimate_path_cost_size so that it accounts for the
selectivity and cost of the HAVING expressions as well? I think this
comment needs to be updated at least, though.
Best regards,
Etsuro Fujita
Attachments:
postgres-fdw-aggregate-pushdown-costsize-1.patchtext/x-patch; name=postgres-fdw-aggregate-pushdown-costsize-1.patchDownload
*** a/contrib/postgres_fdw/expected/postgres_fdw.out
--- b/contrib/postgres_fdw/expected/postgres_fdw.out
***************
*** 3209,3214 **** select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
--- 3209,3216 ----
Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
(6 rows)
+ -- Update local stats on ft2
+ ANALYZE ft2;
-- Add into extension
alter extension postgres_fdw add operator class my_op_class using btree;
alter extension postgres_fdw add function my_op_cmp(a int, b int);
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 5496,5501 **** add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
--- 5496,5514 ----
if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual))
return;
+ /*
+ * Compute the selectivity and cost of the local_conds, so we don't have
+ * to do it over again for each path. The best we can do for these
+ * conditions is to estimate selectivity on the basis of local statistics.
+ */
+ fpinfo->local_conds_sel = clauselist_selectivity(root,
+ fpinfo->local_conds,
+ 0,
+ JOIN_INNER,
+ NULL);
+
+ cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+
/* Estimate the cost of push down */
estimate_path_cost_size(root, grouped_rel, NIL, NIL, &rows,
&width, &startup_cost, &total_cost);
*** a/contrib/postgres_fdw/sql/postgres_fdw.sql
--- b/contrib/postgres_fdw/sql/postgres_fdw.sql
***************
*** 807,812 **** create operator class my_op_class for type int using btree family my_op_family a
--- 807,815 ----
explain (verbose, costs off)
select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
+ -- Update local stats on ft2
+ ANALYZE ft2;
+
-- Add into extension
alter extension postgres_fdw add operator class my_op_class using btree;
alter extension postgres_fdw add function my_op_cmp(a int, b int);
(2018/11/28 13:38), Etsuro Fujita wrote:
BTW another thing I noticed is this comment on costing aggregate
pushdown paths using local statistics in estimate_path_cost_size:* Also, core does not care about costing HAVING expressions and
* adding that to the costs. So similarly, here too we are not
* considering remote and local conditions for costing.I think this was true when aggregate pushdown went in, but isn't anymore
because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2. So we
should update estimate_path_cost_size so that it accounts for the
selectivity and cost of the HAVING expressions as well?
There seems to be no objections, I updated the patch as such. Attached
is an updated version of the patch.
Best regards,
Etsuro Fujita
Attachments:
postgres-fdw-aggregate-pushdown-costsize-2.patchtext/x-patch; name=postgres-fdw-aggregate-pushdown-costsize-2.patchDownload
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e653c30..dfa6201 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3209,6 +3209,8 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
(6 rows)
+-- Update local stats on ft2
+ANALYZE ft2;
-- Add into extension
alter extension postgres_fdw add operator class my_op_class using btree;
alter extension postgres_fdw add function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d22c974..d9b26b5 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2844,10 +2844,6 @@ estimate_path_cost_size(PlannerInfo *root,
* strategy will be considered at remote side, thus for
* simplicity, we put all startup related costs in startup_cost
* and all finalization and run cost are added in total_cost.
- *
- * Also, core does not care about costing HAVING expressions and
- * adding that to the costs. So similarly, here too we are not
- * considering remote and local conditions for costing.
*/
ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
@@ -2880,10 +2876,25 @@ estimate_path_cost_size(PlannerInfo *root,
input_rows, NULL);
/*
- * Number of rows expected from foreign server will be same as
- * that of number of groups.
+ * Get the retrieved-rows and rows estimates. If there are HAVING
+ * quals, account for the selectivities of the remotely-checked
+ * and locally-checked quals.
*/
- rows = retrieved_rows = numGroups;
+ if (root->parse->havingQual)
+ {
+ retrieved_rows =
+ clamp_row_est(numGroups *
+ clauselist_selectivity(root,
+ fpinfo->remote_conds,
+ 0,
+ JOIN_INNER,
+ NULL));
+ rows = clamp_row_est(retrieved_rows * fpinfo->local_conds_sel);
+ }
+ else
+ {
+ rows = retrieved_rows = numGroups;
+ }
/*-----
* Startup cost includes:
@@ -2909,6 +2920,20 @@ estimate_path_cost_size(PlannerInfo *root,
run_cost += aggcosts.finalCost * numGroups;
run_cost += cpu_tuple_cost * numGroups;
run_cost += ptarget->cost.per_tuple * numGroups;
+
+ /* Accout for the eval cost of HAVING quals, if any */
+ if (root->parse->havingQual)
+ {
+ QualCost remote_cost;
+
+ /* Add in the eval cost of the remotely-checked quals */
+ cost_qual_eval(&remote_cost, fpinfo->remote_conds, root);
+ startup_cost += remote_cost.startup;
+ run_cost += remote_cost.per_tuple * numGroups;
+ /* Add in the eval cost of the locally-checked quals */
+ startup_cost += fpinfo->local_conds_cost.startup;
+ run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
+ }
}
else
{
@@ -5496,6 +5521,19 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual))
return;
+ /*
+ * Compute the selectivity and cost of the local_conds, so we don't have
+ * to do it over again for each path. The best we can do for these
+ * conditions is to estimate selectivity on the basis of local statistics.
+ */
+ fpinfo->local_conds_sel = clauselist_selectivity(root,
+ fpinfo->local_conds,
+ 0,
+ JOIN_INNER,
+ NULL);
+
+ cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+
/* Estimate the cost of push down */
estimate_path_cost_size(root, grouped_rel, NIL, NIL, &rows,
&width, &startup_cost, &total_cost);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 6aa9a7f..f963e99 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -807,6 +807,9 @@ create operator class my_op_class for type int using btree family my_op_family a
explain (verbose, costs off)
select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
+-- Update local stats on ft2
+ANALYZE ft2;
+
-- Add into extension
alter extension postgres_fdw add operator class my_op_class using btree;
alter extension postgres_fdw add function my_op_cmp(a int, b int);
(2018/11/30 18:51), Etsuro Fujita wrote:
(2018/11/28 13:38), Etsuro Fujita wrote:
BTW another thing I noticed is this comment on costing aggregate
pushdown paths using local statistics in estimate_path_cost_size:* Also, core does not care about costing HAVING expressions and
* adding that to the costs. So similarly, here too we are not
* considering remote and local conditions for costing.I think this was true when aggregate pushdown went in, but isn't anymore
because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2. So we
should update estimate_path_cost_size so that it accounts for the
selectivity and cost of the HAVING expressions as well?There seems to be no objections, I updated the patch as such. Attached
is an updated version of the patch.
I revised some comments a bit and added the commit message. Attached is
an updated patch. If there are no objections, I'll apply this to HEAD only.
Best regards,
Etsuro Fujita
Attachments:
postgres-fdw-aggregate-pushdown-costsize-3.patchtext/x-patch; name=postgres-fdw-aggregate-pushdown-costsize-3.patchDownload
From 1596c46b0531b9ca2433352e47e70cf1f22d4f44 Mon Sep 17 00:00:00 2001
From: Etsuro Fujita <efujita@postgresql.org>
Date: Mon, 3 Dec 2018 19:52:47 +0900
Subject: [PATCH] postgres_fdw: Improve cost and size estimation for aggregte
pushdown
In commit 7012b132d07c2b4ea15b0b3cb1ea9f3278801d98, which added aggregate
pushdown to postgres_fdw, we didn't account for the evaluation cost and the
selectivity of HAVING quals attached to ForeignPaths performing aggregate
pushdown, as core had never accounted for that for AggPaths and GroupPaths.
And we didn't set these values of the locally-checked quals (ie, fpinfo's
local_conds_cost and local_conds_sel), which were initialized to zeros, but
since estimate_path_cost_size factors in these to estimate the result size
and the evaluation cost of such a ForeignPath when the use_remote_estimate
option is enabled, this caused it to produce underestimated results in that
mode.
By commit 7b6c07547190f056b0464098bb5a2247129d7aa2 core was changed so that
it accounts for the evaluation cost and the selectivity of HAVING quals in
aggregation paths, so change the postgres_fdw's aggregate pushdown code as
well as such. This not only fixes the underestimation issue mentioned
above, but improves the estimation using local statistics in that function
when that option is diabled.
This would be a bug fix rather than an improvement, but apply it to HEAD
only to avoid destabilizing existing plan choices.
Author: Etsuro Fujita
Discussion: https://postgr.es/m/5BFD3EAD.2060301%40lab.ntt.co.jp
---
contrib/postgres_fdw/expected/postgres_fdw.out | 2 +
contrib/postgres_fdw/postgres_fdw.c | 56 ++++++++++++++++++++++----
contrib/postgres_fdw/sql/postgres_fdw.sql | 3 ++
3 files changed, 54 insertions(+), 7 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e653c30..dfa6201 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3209,6 +3209,8 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
(6 rows)
+-- Update local stats on ft2
+ANALYZE ft2;
-- Add into extension
alter extension postgres_fdw add operator class my_op_class using btree;
alter extension postgres_fdw add function my_op_cmp(a int, b int);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d22c974..159bf4b 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2844,10 +2844,6 @@ estimate_path_cost_size(PlannerInfo *root,
* strategy will be considered at remote side, thus for
* simplicity, we put all startup related costs in startup_cost
* and all finalization and run cost are added in total_cost.
- *
- * Also, core does not care about costing HAVING expressions and
- * adding that to the costs. So similarly, here too we are not
- * considering remote and local conditions for costing.
*/
ofpinfo = (PgFdwRelationInfo *) fpinfo->outerrel->fdw_private;
@@ -2880,10 +2876,26 @@ estimate_path_cost_size(PlannerInfo *root,
input_rows, NULL);
/*
- * Number of rows expected from foreign server will be same as
- * that of number of groups.
+ * Get the retrieved-rows and rows estimates. If there are HAVING
+ * quals, account for their selectivity.
*/
- rows = retrieved_rows = numGroups;
+ if (root->parse->havingQual)
+ {
+ /* Factor in the selectivity of the remotely-checked quals */
+ retrieved_rows =
+ clamp_row_est(numGroups *
+ clauselist_selectivity(root,
+ fpinfo->remote_conds,
+ 0,
+ JOIN_INNER,
+ NULL));
+ /* Factor in the selectivity of the locally-checked quals */
+ rows = clamp_row_est(retrieved_rows * fpinfo->local_conds_sel);
+ }
+ else
+ {
+ rows = retrieved_rows = numGroups;
+ }
/*-----
* Startup cost includes:
@@ -2909,6 +2921,20 @@ estimate_path_cost_size(PlannerInfo *root,
run_cost += aggcosts.finalCost * numGroups;
run_cost += cpu_tuple_cost * numGroups;
run_cost += ptarget->cost.per_tuple * numGroups;
+
+ /* Accout for the eval cost of HAVING quals, if any */
+ if (root->parse->havingQual)
+ {
+ QualCost remote_cost;
+
+ /* Add in the eval cost of the remotely-checked quals */
+ cost_qual_eval(&remote_cost, fpinfo->remote_conds, root);
+ startup_cost += remote_cost.startup;
+ run_cost += remote_cost.per_tuple * numGroups;
+ /* Add in the eval cost of the locally-checked quals */
+ startup_cost += fpinfo->local_conds_cost.startup;
+ run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
+ }
}
else
{
@@ -5496,6 +5522,22 @@ add_foreign_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
if (!foreign_grouping_ok(root, grouped_rel, extra->havingQual))
return;
+ /*
+ * Compute the selectivity and cost of the local_conds, so we don't have
+ * to do it over again for each path. (Currently we create just a single
+ * path here, but in future it would be possible that we build more paths
+ * such as pre-sorted paths as in postgresGetForeignPaths and
+ * postgresGetForeignJoinPaths.) The best we can do for these conditions
+ * is to estimate selectivity on the basis of local statistics.
+ */
+ fpinfo->local_conds_sel = clauselist_selectivity(root,
+ fpinfo->local_conds,
+ 0,
+ JOIN_INNER,
+ NULL);
+
+ cost_qual_eval(&fpinfo->local_conds_cost, fpinfo->local_conds, root);
+
/* Estimate the cost of push down */
estimate_path_cost_size(root, grouped_rel, NIL, NIL, &rows,
&width, &startup_cost, &total_cost);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 6aa9a7f..f963e99 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -807,6 +807,9 @@ create operator class my_op_class for type int using btree family my_op_family a
explain (verbose, costs off)
select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6 and c1 < 100 group by c2;
+-- Update local stats on ft2
+ANALYZE ft2;
+
-- Add into extension
alter extension postgres_fdw add operator class my_op_class using btree;
alter extension postgres_fdw add function my_op_cmp(a int, b int);
--
1.8.3.1
(2018/12/03 20:20), Etsuro Fujita wrote:
(2018/11/30 18:51), Etsuro Fujita wrote:
(2018/11/28 13:38), Etsuro Fujita wrote:
BTW another thing I noticed is this comment on costing aggregate
pushdown paths using local statistics in estimate_path_cost_size:* Also, core does not care about costing HAVING expressions and
* adding that to the costs. So similarly, here too we are not
* considering remote and local conditions for costing.I think this was true when aggregate pushdown went in, but isn't anymore
because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2. So we
should update estimate_path_cost_size so that it accounts for the
selectivity and cost of the HAVING expressions as well?There seems to be no objections, I updated the patch as such. Attached
is an updated version of the patch.I revised some comments a bit and added the commit message. Attached is
an updated patch. If there are no objections, I'll apply this to HEAD only.
Done after fixing typos in the commit message.
Best regards,
Etsuro Fujita
(2018/12/04 17:24), Etsuro Fujita wrote:
(2018/12/03 20:20), Etsuro Fujita wrote:
(2018/11/30 18:51), Etsuro Fujita wrote:
(2018/11/28 13:38), Etsuro Fujita wrote:
BTW another thing I noticed is this comment on costing aggregate
pushdown paths using local statistics in estimate_path_cost_size:* Also, core does not care about costing HAVING expressions and
* adding that to the costs. So similarly, here too we are not
* considering remote and local conditions for costing.I think this was true when aggregate pushdown went in, but isn't anymore
because of commit 7b6c07547190f056b0464098bb5a2247129d7aa2. So we
should update estimate_path_cost_size so that it accounts for the
selectivity and cost of the HAVING expressions as well?There seems to be no objections, I updated the patch as such. Attached
is an updated version of the patch.I revised some comments a bit and added the commit message. Attached is
an updated patch. If there are no objections, I'll apply this to HEAD only.Done after fixing typos in the commit message.
I noticed that I forgot to modify the cost for evaluating the PathTarget
for each output row accordingly to this change :(. Attached is a patch
for that.
Best regards,
Etsuro Fujita
Attachments:
fix-postgres-fdw-grouping-path-cost.patchtext/x-patch; name=fix-postgres-fdw-grouping-path-cost.patchDownload
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 2937,2943 **** estimate_path_cost_size(PlannerInfo *root,
run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
run_cost += aggcosts.finalCost * numGroups;
run_cost += cpu_tuple_cost * numGroups;
! run_cost += ptarget->cost.per_tuple * numGroups;
/* Accout for the eval cost of HAVING quals, if any */
if (root->parse->havingQual)
--- 2937,2943 ----
run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
run_cost += aggcosts.finalCost * numGroups;
run_cost += cpu_tuple_cost * numGroups;
! run_cost += ptarget->cost.per_tuple * rows;
/* Accout for the eval cost of HAVING quals, if any */
if (root->parse->havingQual)
(2018/12/28 17:28), Etsuro Fujita wrote:
I noticed that I forgot to modify the cost for evaluating the PathTarget
for each output row accordingly to this change :(. Attached is a patch
for that.
On reflection, I noticed these on estimate_path_cost_size, other than that:
1) In the case of a foreign-grouping path, we failed to adjust the
PathTarget eval cost when using the remote estimates, which I think
would be needed because the PathTarget expressions cannot always be
pre-computed as entries of the fdw_scan_tlist for the foreign-grouping path.
2) We also failed to factor in the eval cost for the foreign-scan and
foreign-join cases, with/without the remote estimates; in the scan/join
cases, the PathTarget might contain PHVs, so the cost of evaluating PHVs
should be charged. Currently, PHVs are evaluated locally, so the cost
of PHV expressions should also be factored in when using the remote
estimates.
Here is a new version of the patch.
Best regards,
Etsuro Fujita
Attachments:
fix-estimate_path_cost_size.patchtext/x-patch; name=fix-estimate_path_cost_size.patchDownload
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 2ac9d7b..3304ebe 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2667,6 +2667,9 @@ estimate_path_cost_size(PlannerInfo *root,
Cost total_cost;
Cost cpu_per_tuple;
+ /* Make sure the core code has set up the relation's reltarget */
+ Assert(foreignrel->reltarget);
+
/*
* If the table or the server is configured to use remote estimates,
* connect to the foreign server and execute EXPLAIN to estimate the
@@ -2744,6 +2747,25 @@ estimate_path_cost_size(PlannerInfo *root,
cost_qual_eval(&local_cost, local_param_join_conds, root);
startup_cost += local_cost.startup;
total_cost += local_cost.per_tuple * retrieved_rows;
+
+ /* Add in the tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ total_cost += foreignrel->reltarget->cost.per_tuple * rows;
+
+ /*
+ * In the case of a foreign-grouping path, the fdw_scan_tlist would
+ * have contained tlist expressions that will be pushed down to the
+ * remote as-is including at least grouping expressions; adjust the
+ * eval cost.
+ */
+ if (IS_UPPER_REL(foreignrel))
+ {
+ QualCost tlist_cost;
+
+ cost_qual_eval(&tlist_cost, fdw_scan_tlist, root);
+ startup_cost -= tlist_cost.startup;
+ total_cost -= tlist_cost.per_tuple * rows;
+ }
}
else
{
@@ -2842,19 +2864,19 @@ estimate_path_cost_size(PlannerInfo *root,
nrows = clamp_row_est(nrows * fpinfo->joinclause_sel);
run_cost += nrows * remote_conds_cost.per_tuple;
run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
+
+ /* Add in the tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}
else if (IS_UPPER_REL(foreignrel))
{
PgFdwRelationInfo *ofpinfo;
- PathTarget *ptarget = foreignrel->reltarget;
AggClauseCosts aggcosts;
double input_rows;
int numGroupCols;
double numGroups = 1;
- /* Make sure the core code set the pathtarget. */
- Assert(ptarget != NULL);
-
/*
* This cost model is mixture of costing done for sorted and
* hashed aggregates in cost_agg(). We are not sure which
@@ -2918,26 +2940,22 @@ estimate_path_cost_size(PlannerInfo *root,
* Startup cost includes:
* 1. Startup cost for underneath input relation
* 2. Cost of performing aggregation, per cost_agg()
- * 3. Startup cost for PathTarget eval
*-----
*/
startup_cost = ofpinfo->rel_startup_cost;
startup_cost += aggcosts.transCost.startup;
startup_cost += aggcosts.transCost.per_tuple * input_rows;
startup_cost += (cpu_operator_cost * numGroupCols) * input_rows;
- startup_cost += ptarget->cost.startup;
/*-----
* Run time cost includes:
* 1. Run time cost of underneath input relation
* 2. Run time cost of performing aggregation, per cost_agg()
- * 3. PathTarget eval cost for each output row
*-----
*/
run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
run_cost += aggcosts.finalCost * numGroups;
run_cost += cpu_tuple_cost * numGroups;
- run_cost += ptarget->cost.per_tuple * numGroups;
/* Accout for the eval cost of HAVING quals, if any */
if (root->parse->havingQual)
@@ -2952,6 +2970,10 @@ estimate_path_cost_size(PlannerInfo *root,
startup_cost += fpinfo->local_conds_cost.startup;
run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
}
+
+ /* Add in the tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}
else
{
@@ -2970,6 +2992,10 @@ estimate_path_cost_size(PlannerInfo *root,
startup_cost += foreignrel->baserestrictcost.startup;
cpu_per_tuple = cpu_tuple_cost + foreignrel->baserestrictcost.per_tuple;
run_cost += cpu_per_tuple * foreignrel->tuples;
+
+ /* Add int the tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}
/*
(2019/01/04 20:33), Etsuro Fujita wrote:
Here is a new version of the patch.
Attached is an updated version of the patch.
Changes:
* Fixed a stupid bug in the case when use_remote_estimate
* Fixed a typo in a comment I added
* Modified comments I added a little bit further
* Added the commit message
If there are no objections, I'll push that.
Best regards,
Etsuro Fujita
Attachments:
0001-postgres_fdw-Account-for-tlist-eval-costs-in-estimat.patchtext/x-patch; name=0001-postgres_fdw-Account-for-tlist-eval-costs-in-estimat.patchDownload
From 3ff65a6f05a330140bf5de94fbcf6ab2a04f040b Mon Sep 17 00:00:00 2001
From: Etsuro Fujita <efujita@postgresql.org>
Date: Wed, 23 Jan 2019 20:30:30 +0900
Subject: [PATCH] postgres_fdw: Account for tlist eval costs in
estimate_path_cost_size()
Previously, estimate_path_cost_size() didn't account for tlist eval
costs, except when costing a foreign-grouping path using local
statistics, but such costs should be accounted for when costing that path
using remote estimates, because some of the tlist expressions might be
evaluated locally. Also, such costs should be accounted for in the case
of a foreign-scan or foreign-join path, because currently, postgres_fdw
evaluates PlaceHolderVars (if any) locally.
This also fixes an oversight in my commit
f8f6e44676ef38fee7a5bbe4f256a34ea7799ac1.
Like that commit, apply this to HEAD only to avoid destabilizing existing
plan choices.
Author: Etsuro Fujita
Discussion: https://postgr.es/m/5BFD3EAD.2060301%40lab.ntt.co.jp
---
contrib/postgres_fdw/postgres_fdw.c | 41 +++++++++++++++++++++++------
1 file changed, 33 insertions(+), 8 deletions(-)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d85a83abe9..f460fd2ec4 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2499,6 +2499,9 @@ estimate_path_cost_size(PlannerInfo *root,
Cost total_cost;
Cost cpu_per_tuple;
+ /* Make sure the core code has set up the relation's reltarget */
+ Assert(foreignrel->reltarget);
+
/*
* If the table or the server is configured to use remote estimates,
* connect to the foreign server and execute EXPLAIN to estimate the
@@ -2576,6 +2579,24 @@ estimate_path_cost_size(PlannerInfo *root,
cost_qual_eval(&local_cost, local_param_join_conds, root);
startup_cost += local_cost.startup;
total_cost += local_cost.per_tuple * retrieved_rows;
+
+ /*
+ * Add in tlist eval cost for each output row. In case of an
+ * aggregate, some of the tlist expressions such as grouping
+ * exressions will be evaluated remotely, so adjust the costs.
+ */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ total_cost += foreignrel->reltarget->cost.startup;
+ total_cost += foreignrel->reltarget->cost.per_tuple * rows;
+ if (IS_UPPER_REL(foreignrel))
+ {
+ QualCost tlist_cost;
+
+ cost_qual_eval(&tlist_cost, fdw_scan_tlist, root);
+ startup_cost -= tlist_cost.startup;
+ total_cost -= tlist_cost.startup;
+ total_cost -= tlist_cost.per_tuple * rows;
+ }
}
else
{
@@ -2674,19 +2695,19 @@ estimate_path_cost_size(PlannerInfo *root,
nrows = clamp_row_est(nrows * fpinfo->joinclause_sel);
run_cost += nrows * remote_conds_cost.per_tuple;
run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
+
+ /* Add in tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}
else if (IS_UPPER_REL(foreignrel))
{
PgFdwRelationInfo *ofpinfo;
- PathTarget *ptarget = foreignrel->reltarget;
AggClauseCosts aggcosts;
double input_rows;
int numGroupCols;
double numGroups = 1;
- /* Make sure the core code set the pathtarget. */
- Assert(ptarget != NULL);
-
/*
* This cost model is mixture of costing done for sorted and
* hashed aggregates in cost_agg(). We are not sure which
@@ -2750,26 +2771,22 @@ estimate_path_cost_size(PlannerInfo *root,
* Startup cost includes:
* 1. Startup cost for underneath input relation
* 2. Cost of performing aggregation, per cost_agg()
- * 3. Startup cost for PathTarget eval
*-----
*/
startup_cost = ofpinfo->rel_startup_cost;
startup_cost += aggcosts.transCost.startup;
startup_cost += aggcosts.transCost.per_tuple * input_rows;
startup_cost += (cpu_operator_cost * numGroupCols) * input_rows;
- startup_cost += ptarget->cost.startup;
/*-----
* Run time cost includes:
* 1. Run time cost of underneath input relation
* 2. Run time cost of performing aggregation, per cost_agg()
- * 3. PathTarget eval cost for each output row
*-----
*/
run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost;
run_cost += aggcosts.finalCost * numGroups;
run_cost += cpu_tuple_cost * numGroups;
- run_cost += ptarget->cost.per_tuple * numGroups;
/* Accout for the eval cost of HAVING quals, if any */
if (root->parse->havingQual)
@@ -2784,6 +2801,10 @@ estimate_path_cost_size(PlannerInfo *root,
startup_cost += fpinfo->local_conds_cost.startup;
run_cost += fpinfo->local_conds_cost.per_tuple * retrieved_rows;
}
+
+ /* Add in tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}
else
{
@@ -2802,6 +2823,10 @@ estimate_path_cost_size(PlannerInfo *root,
startup_cost += foreignrel->baserestrictcost.startup;
cpu_per_tuple = cpu_tuple_cost + foreignrel->baserestrictcost.per_tuple;
run_cost += cpu_per_tuple * foreignrel->tuples;
+
+ /* Add in tlist eval cost for each output row */
+ startup_cost += foreignrel->reltarget->cost.startup;
+ run_cost += foreignrel->reltarget->cost.per_tuple * rows;
}
/*
--
2.19.2
(2019/01/23 20:35), Etsuro Fujita wrote:
Attached is an updated version of the patch.
Changes:
* Fixed a stupid bug in the case when use_remote_estimate
* Fixed a typo in a comment I added
* Modified comments I added a little bit further
* Added the commit messageIf there are no objections, I'll push that.
Pushed after fixing another typo in a comment.
Best regards,
Etsuro Fujita