postgres_fdw: estimate_path_cost_size fails to re-use cached costs
Hi,
I noticed yet another thing while updating the patch for pushing down
ORDER BY LIMIT. Let me explain. When costing foreign paths on the
basis of local statistics, we calculate/cache the costs of an unsorted
foreign path, and re-use them to estimate the costs of presorted foreign
paths, as shown below. BUT: we fail to re-use them for some typical
queries, such as "select * from ft1 order by a", due to
fpinfo->rel_startup_cost=0, leading to doing the same cost calculation
repeatedly.
/*
* We will come here again and again with different set of pathkeys
* that caller wants to cost. We don't need to calculate the cost of
* bare scan each time. Instead, use the costs if we have cached
them
* already.
*/
if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0)
{
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
}
I think we should use "fpinfo->rel_startup_cost >= 0" here, not
"fpinfo->rel_startup_cost > 0". Also, it would be possible that the
total cost calculated is zero in corner cases (eg, seq_page_cost=0 and
cpu_tuple_cost=0 for the example), so I think we should change the total
cost part as well. Attached is a patch for that.
Best regards,
Etsuro Fujita
Attachments:
yet-another-fix-for-estimate_path_cost_size.patchtext/x-patch; name=yet-another-fix-for-estimate_path_cost_size.patchDownload
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
***************
*** 2625,2631 **** estimate_path_cost_size(PlannerInfo *root,
* bare scan each time. Instead, use the costs if we have cached them
* already.
*/
! if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0)
{
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
--- 2625,2631 ----
* bare scan each time. Instead, use the costs if we have cached them
* already.
*/
! if (fpinfo->rel_startup_cost >= 0 && fpinfo->rel_total_cost >= 0)
{
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
(2019/01/25 20:33), Etsuro Fujita wrote:
I noticed yet another thing while updating the patch for pushing down
ORDER BY LIMIT. Let me explain. When costing foreign paths on the
basis of local statistics, we calculate/cache the costs of an unsorted
foreign path, and re-use them to estimate the costs of presorted foreign
paths, as shown below. BUT: we fail to re-use them for some typical
queries, such as "select * from ft1 order by a", due to
fpinfo->rel_startup_cost=0, leading to doing the same cost calculation
repeatedly./*
* We will come here again and again with different set of pathkeys
* that caller wants to cost. We don't need to calculate the cost of
* bare scan each time. Instead, use the costs if we have cached
them
* already.
*/
if (fpinfo->rel_startup_cost> 0&& fpinfo->rel_total_cost> 0)
{
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
}I think we should use "fpinfo->rel_startup_cost>= 0" here, not
"fpinfo->rel_startup_cost> 0". Also, it would be possible that the
total cost calculated is zero in corner cases (eg, seq_page_cost=0 and
cpu_tuple_cost=0 for the example), so I think we should change the total
cost part as well. Attached is a patch for that.
I added the commit message. Updated patch attached. If no objections,
I'll apply that to HEAD only as there are no reports of actual trouble
from this, as far as I know.
Best regards,
Etsuro Fujita
Attachments:
0001-postgres_fdw-Fix-test-for-cached-costs-in-estimate_p.patchtext/x-patch; name=0001-postgres_fdw-Fix-test-for-cached-costs-in-estimate_p.patchDownload
From 24b84c230434e3dec6b20b47a859b834fbaabfd3 Mon Sep 17 00:00:00 2001
From: Etsuro Fujita <efujita@postgresql.org>
Date: Mon, 28 Jan 2019 19:33:46 +0900
Subject: [PATCH] postgres_fdw: Fix test for cached costs in
estimate_path_cost_size().
estimate_path_cost_size() failed to re-use cached costs when the cached
startup/total cost was 0, so it calculated the costs redundantly.
This is an oversight in commit aa09cd242f; but apply the patch to HEAD
only because there are no reports of actual trouble from that.
Author: Etsuro Fujita
Discussion: https://postgr.es/m/5C4AF3F3.4060409%40lab.ntt.co.jp
---
contrib/postgres_fdw/postgres_fdw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 9244fe7571..1a88919cfc 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2625,7 +2625,7 @@ estimate_path_cost_size(PlannerInfo *root,
* bare scan each time. Instead, use the costs if we have cached them
* already.
*/
- if (fpinfo->rel_startup_cost > 0 && fpinfo->rel_total_cost > 0)
+ if (fpinfo->rel_startup_cost >= 0 && fpinfo->rel_total_cost >= 0)
{
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
--
2.19.2
(2019/01/28 19:37), Etsuro Fujita wrote:
(2019/01/25 20:33), Etsuro Fujita wrote:
I noticed yet another thing while updating the patch for pushing down
ORDER BY LIMIT. Let me explain. When costing foreign paths on the
basis of local statistics, we calculate/cache the costs of an unsorted
foreign path, and re-use them to estimate the costs of presorted foreign
paths, as shown below. BUT: we fail to re-use them for some typical
queries, such as "select * from ft1 order by a", due to
fpinfo->rel_startup_cost=0, leading to doing the same cost calculation
repeatedly./*
* We will come here again and again with different set of pathkeys
* that caller wants to cost. We don't need to calculate the cost of
* bare scan each time. Instead, use the costs if we have cached
them
* already.
*/
if (fpinfo->rel_startup_cost> 0&& fpinfo->rel_total_cost> 0)
{
startup_cost = fpinfo->rel_startup_cost;
run_cost = fpinfo->rel_total_cost - fpinfo->rel_startup_cost;
}I think we should use "fpinfo->rel_startup_cost>= 0" here, not
"fpinfo->rel_startup_cost> 0". Also, it would be possible that the
total cost calculated is zero in corner cases (eg, seq_page_cost=0 and
cpu_tuple_cost=0 for the example), so I think we should change the total
cost part as well. Attached is a patch for that.I added the commit message. Updated patch attached. If no objections,
I'll apply that to HEAD only as there are no reports of actual trouble
from this, as far as I know.
Pushed.
Best regards,
Etsuro Fujita