A wrong index choose issue because of inaccurate statistics

Started by Andy Fan over 5 years ago, 8 messages
#1 Andy Fan
zhihui.fan1213@gmail.com
3 attachment(s)

This thread is a follow-up of thread [1], where I didn't describe the issue
and the solution in my mind very well. So I'm starting this thread to fix
that, and also to enrich the topic by taking the advice from Ashutosh, Tomas
and Tom.

Inaccurate statistics are unavoidable and can cause lots of issues; I don't
think we can fix most of them. But the issue I want to talk about here is
like this: select * from t where a = x and b = y, with indexes on (a, b) and
(a, c). The current implementation may end up choosing (a, c), while I think
we should always choose index (a, b) over (a, c).

Why would (a, c) be chosen? If the planner thinks a = x returns only 1 row,
it costs index (a, b) as "an index access with 2 qual checks returning 1 row,
plus a heap fetch of that row", and index (a, c) as "an index access with 1
qual returning 1 row, plus a heap fetch of that row and a filter on the other
qual". There is no cost difference between filtering a qual during the index
scan and filtering it at the heap, so the final costs are exactly the same.
When the costs are exactly the same, whichever path is found first is the one
chosen in the end. You can use the attached reproduce.sql to reproduce this
issue.
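
A minimal sketch of the shape of the problem (this is not the attached
reproduce.sql; the table, data and constants are made up, and whether the
bad choice actually happens depends on the statistics):

CREATE TABLE t (a int, b int, c int);
CREATE INDEX t_a_b ON t (a, b);
CREATE INDEX t_a_c ON t (a, c);
-- analyze while the table is still small, then load more data, so the
-- statistics under-estimate the number of rows matching "a = 1"
ANALYZE t;
INSERT INTO t SELECT 1, i, i FROM generate_series(1, 10000) i;

-- both indexes may be costed identically here, so the planner can end up
-- using (a, c) and filtering "b = 7" at the heap instead of in the index
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 7;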

The solution in my mind is just to hack the cost model to treat a qual
filtered at the heap as slightly more expensive, so that index (a, b) will
always be chosen. The planner will still think only 1 row is returned, which
may be wrong, but that is not the issue I'm trying to fix here, and it does
not impact the final index choice anyway.

The one-line fix below describes the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change does fix the issue above, but will it make some other cases
worse? My answer is no, based on my current knowledge, and this is the most
important place where I want to get advice. The main impact I can see is: it
not only changes the cost difference between (a, b) and (a, c), it also
changes the cost difference between an index scan on (a, c) and a seq scan.
However, the cost difference between an index scan and a seq scan is huge in
practice, so I don't think this impact is harmful.

I tested TPC-H to see if this change causes any unexpected behavior; the data
and indexes are set up based on [2]. The attached normal.log contains the
plans without this patch and patched.log the plans with it. In summary, I
didn't find any unexpected behavior because of this change. All the plans
whose cost changed have the following pattern, which is expected.

Index Scan ...
Index Cond: ...
Filter: ...

Any thought?

[1]: https://postgrespro.com/list/id/8810.1590714246@sss.pgh.pa.us
[2]: https://ankane.org/tpc-h

--
Best Regards
Andy Fan

Attachments:

normal.log
patched.log
reproduce.sql
#2 Andy Fan
zhihui.fan1213@gmail.com
In reply to: Andy Fan (#1)
Re: A wrong index choose issue because of inaccurate statistics

Why would (a, c) be chosen? If the planner thinks a = x returns only 1 row ..

I just did more research and found the above statement is not accurate; the
root cause of this situation is that IndexSelectivity = 0. Even so, I don't
think we can fix anything there, since IndexSelectivity is calculated from
statistics and we don't know whether it is 1% wrong, 10% wrong or more, just
like Tomas said.

The way to fix it is to add a "small" extra cost to qpquals for index scans,
small enough to have no impact on other parts. I will discuss how small is
small later, with details; for now, let's say it is just a GUC variable.

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += stat_stale_cost * list_length(qpquals);
+

If we want to reduce the impact of this change further, we could add this
extra cost only when IndexSelectivity == 0.

How should the value of stat_stale_cost be set? Since the minimum cost for a
query should be one cpu_tuple_cost, which is 0.01 by default, adding a 0.01
cost for each qpqual in an index scan should not make a big difference.
However, we might sometimes set it to 0.13 if we consider that
index->tree_height was estimated wrongly by 1 (whose cost is 50 * 0.0025 =
0.125). I don't know how that happens, but it does seem to happen in
production environments. At the same time, it is unlikely that
index->tree_height is estimated wrongly by 2 or more. So basically we can set
this value to 0 (totally disable this feature), 0.01 (should be OK for most
cases), or 0.13 (a bit aggressive).
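
A usage sketch, assuming stat_stale_cost were exposed as a GUC as proposed
above (it is only a proposal; the parameter does not exist today):

-- 0 disables the extra charge entirely
SET stat_stale_cost = 0;
-- one cpu_tuple_cost per qpqual, which should be OK for most cases
SET stat_stale_cost = 0.01;
-- the aggressive setting, also absorbing an off-by-one tree height estimate
SET stat_stale_cost = 0.13;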

The wrong estimation of IndexSelectivity = 0 might be a common case, and if
people have just two related indexes like (A, B) and (A, C), we have a 50%
chance of making a wrong decision, so I would say this case is worth the
trouble. My current implementation doesn't look great, so any suggestion for
further research is very welcome.

--
Best Regards
Andy Fan

#3 David Rowley
dgrowleyml@gmail.com
In reply to: Andy Fan (#1)
Re: A wrong index choose issue because of inaccurate statistics

On Mon, 1 Jun 2020 at 01:24, Andy Fan <zhihui.fan1213@gmail.com> wrote:

The one-line fix describe the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change do fix the issue above, but will it make some other cases worse? My
answer is no based on my current knowledge, and this is most important place
I want to get advised. The mainly impact I can figure out is: it not only
change the cost difference between (a, b) and (a, c) it also cause the cost
difference between Index scan on (a, c) and seq scan. However the
cost different between index scan and seq scan are huge by practice so
I don't think this impact is harmful.

Didn't that idea already get shot down in the final paragraph on [1]?

I understand that you wish to increase the cost by some seemingly
innocent constant to fix your problem case. Here are my thoughts
about that: Telling lies is not really that easy to pull off. Bad
liars think it's easy and good ones know it's hard. The problem is
that the lies can start small, but then at some point the future you
must fashion some more lies to account for your initial lie. Rinse and
repeat that a few times and before you know it, your course is set
well away from the point of truth. I feel the part about "rinse and
repeat" applies reasonably well to how join costing works. The lie is
likely to be amplified as the join level gets deeper.

I think you need to think of a more generic solution and propose that
instead. There are plenty of other quirks in the planner that can
cause suffering due to inaccurate or non-existing statistics. For
example, due to how we multiply individual selectivity estimates,
having a few correlated columns in a join condition can cause the
number of join rows to be underestimated. Subsequent joins can then
end up making bad choices on which join operator to use based on those
inaccurate row estimates. There's also a problem with WHERE <x> ORDER
BY col LIMIT n; sometimes choosing an index that provides pre-sorted
input to the ORDER BY but cannot use <x> as an indexable condition.
We don't record any stats to make better choices there, maybe we
should, but either way, we're taking a bit of a risk there as all the rows
matching <x> might be right at the end of the index and we might need
to scan the entire thing before hitting the LIMIT. For now, we just
assume completely even distribution of rows. i.e. If there are 50 rows
estimated in the path and the limit is for 5 rows, then we'll assume
we need to read 10% of those before finding all the ones we need. In
reality, everything matching <x> might be 95% through the index and we
could end up reading 100% of rows. That particular problem is not just
caused by the uneven distribution of rows in the index, but also from
selectivity underestimation.
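
A minimal sketch of the query shape being described, with made-up table and
column names; whether the planner actually makes the risky choice depends on
the estimates:

CREATE TABLE t2 (x int, col int);
CREATE INDEX t2_x ON t2 (x);
CREATE INDEX t2_col ON t2 (col);

-- the planner may pick an index scan on t2_col (pre-sorted for the ORDER BY)
-- with "x = 42" only as a filter, assuming matching rows are evenly spread;
-- if every x = 42 row sits near the end of the index, nearly all of it is read
EXPLAIN SELECT * FROM t2 WHERE x = 42 ORDER BY col LIMIT 5;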

I'd more recently wondered if we shouldn't have some sort of "risk"
factor to the cost model. I really don't have ideas on how exactly we
would calculate the risk factor in all cases, but in this case, say
the risk factor always starts out as 1. We could multiply that risk
factor by some >1 const each time we find another index filter qual.
add_path() can prefer lower risk paths when the costs are similar.
Unsure what the exact add_path logic would be. Perhaps a GUC would
need to assist with the decision there. Likewise, with
NestedLoopPaths which have a large number of join quals, the risk
factor could go up a bit with those so that we take a stronger
consideration for hash or merge joins instead.

Anyway, it's pretty much a large research subject which would take a
lot of work to iron out even just the design. It's likely not a
perfect idea, but I think it has a bit more merit than trying to
introduce lies to the cost model to account for a single case where
there is a problem.

David

[1]: /messages/by-id/20200529001602.eu7vuiouuuiclpgb@development

#4 Pavel Stehule
pavel.stehule@gmail.com
In reply to: David Rowley (#3)
Re: A wrong index choose issue because of inaccurate statistics

On Fri, 5 Jun 2020 at 8:19, David Rowley <dgrowleyml@gmail.com> wrote:

On Mon, 1 Jun 2020 at 01:24, Andy Fan <zhihui.fan1213@gmail.com> wrote:

The one-line fix describe the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change do fix the issue above, but will it make some other cases worse? My
answer is no based on my current knowledge, and this is most important place
I want to get advised. The mainly impact I can figure out is: it not only
change the cost difference between (a, b) and (a, c) it also cause the cost
difference between Index scan on (a, c) and seq scan. However the
cost different between index scan and seq scan are huge by practice so
I don't think this impact is harmful.

Didn't that idea already get shot down in the final paragraph on [1]?

I understand that you wish to increase the cost by some seemingly
innocent constant to fix your problem case. Here are my thoughts
about that: Telling lies is not really that easy to pull off. Bad
liers think it's easy and good ones know it's hard. The problem is
that the lies can start small, but then at some point the future you
must fashion some more lies to account for your initial lie. Rinse and
repeat that a few times and before you know it, your course is set
well away from the point of truth. I feel the part about "rinse and
repeat" applies reasonably well to how join costing works. The lie is
likely to be amplified as the join level gets deeper.

I think you need to think of a more generic solution and propose that
instead. There are plenty of other quirks in the planner that can
cause suffering due to inaccurate or non-existing statistics. For
example, due to how we multiply individual selectivity estimates,
having a few correlated columns in a join condition can cause the
number of join rows to be underestimated. Subsequent joins can then
end up making bad choices on which join operator to use based on those
inaccurate row estimates. There's also a problem with WHERE <x> ORDER
BY col LIMIT n; sometimes choosing an index that provides pre-sorted
input to the ORDER BY but cannot use <x> as an indexable condition.
We don't record any stats to make better choices there, maybe we
should, but either way, we're taking a bit risk there as all the rows
matching <x> might be right at the end of the index and we might need
to scan the entire thing before hitting the LIMIT. For now, we just
assume completely even distribution of rows. i.e. If there are 50 rows
estimated in the path and the limit is for 5 rows, then we'll assume
we need to read 10% of those before finding all the ones we need. In
reality, everything matching <x> might be 95% through the index and we
could end up reading 100% of rows. That particular problem is not just
caused by the uneven distribution of rows in the index, but also from
selectivity underestimation.

I'd more recently wondered if we shouldn't have some sort of "risk"
factor to the cost model. I really don't have ideas on how exactly we
would calculate the risk factor in all cases, but in this case, say
the risk factor always starts out as 1. We could multiply that risk
factor by some >1 const each time we find another index filter qual.
add_path() can prefer lower risk paths when the costs are similar.
Unsure what the exact add_path logic would be. Perhaps a GUC would
need to assist with the decision there. Likewise, with
NestedLoopPaths which have a large number of join quals, the risk
factor could go up a bit with those so that we take a stronger
consideration for hash or merge joins instead.

I thought about these ideas too. And I am not alone.

https://hal.archives-ouvertes.fr/hal-01316823/document

Regards

Pavel

Anyway, it's pretty much a large research subject which would take a
lot of work to iron out even just the design. It's likely not a
perfect idea, but I think it has a bit more merit that trying to
introduce lies to the cost modal to account for a single case where
there is a problem.

David

[1]
/messages/by-id/20200529001602.eu7vuiouuuiclpgb@development

#5 Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Pavel Stehule (#4)
Re: A wrong index choose issue because of inaccurate statistics

I know one project where they used PostgreSQL code base to detect
"robust plans". https://dsl.cds.iisc.ac.in/projects/PICASSO/. Some of
the papers cited in https://www.vldb.org/pvldb/vldb2010/papers/D01.pdf
describe the idea.

In short, the idea is to annotate a plan with a "bandwidth", i.e. how
the plan fares as statistics degrade. A plan with a slightly higher
cost that doesn't degrade much with degradation of statistics is
preferred over a low-cost plan whose cost rises sharply with
degradation of statistics. This is similar to what David is
suggesting.

On Fri, Jun 5, 2020 at 12:00 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

On Fri, 5 Jun 2020 at 8:19, David Rowley <dgrowleyml@gmail.com> wrote:

On Mon, 1 Jun 2020 at 01:24, Andy Fan <zhihui.fan1213@gmail.com> wrote:

The one-line fix describe the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change do fix the issue above, but will it make some other cases worse? My
answer is no based on my current knowledge, and this is most important place
I want to get advised. The mainly impact I can figure out is: it not only
change the cost difference between (a, b) and (a, c) it also cause the cost
difference between Index scan on (a, c) and seq scan. However the
cost different between index scan and seq scan are huge by practice so
I don't think this impact is harmful.

Didn't that idea already get shot down in the final paragraph on [1]?

I understand that you wish to increase the cost by some seemingly
innocent constant to fix your problem case. Here are my thoughts
about that: Telling lies is not really that easy to pull off. Bad
liers think it's easy and good ones know it's hard. The problem is
that the lies can start small, but then at some point the future you
must fashion some more lies to account for your initial lie. Rinse and
repeat that a few times and before you know it, your course is set
well away from the point of truth. I feel the part about "rinse and
repeat" applies reasonably well to how join costing works. The lie is
likely to be amplified as the join level gets deeper.

I think you need to think of a more generic solution and propose that
instead. There are plenty of other quirks in the planner that can
cause suffering due to inaccurate or non-existing statistics. For
example, due to how we multiply individual selectivity estimates,
having a few correlated columns in a join condition can cause the
number of join rows to be underestimated. Subsequent joins can then
end up making bad choices on which join operator to use based on those
inaccurate row estimates. There's also a problem with WHERE <x> ORDER
BY col LIMIT n; sometimes choosing an index that provides pre-sorted
input to the ORDER BY but cannot use <x> as an indexable condition.
We don't record any stats to make better choices there, maybe we
should, but either way, we're taking a bit risk there as all the rows
matching <x> might be right at the end of the index and we might need
to scan the entire thing before hitting the LIMIT. For now, we just
assume completely even distribution of rows. i.e. If there are 50 rows
estimated in the path and the limit is for 5 rows, then we'll assume
we need to read 10% of those before finding all the ones we need. In
reality, everything matching <x> might be 95% through the index and we
could end up reading 100% of rows. That particular problem is not just
caused by the uneven distribution of rows in the index, but also from
selectivity underestimation.

I'd more recently wondered if we shouldn't have some sort of "risk"
factor to the cost model. I really don't have ideas on how exactly we
would calculate the risk factor in all cases, but in this case, say
the risk factor always starts out as 1. We could multiply that risk
factor by some >1 const each time we find another index filter qual.
add_path() can prefer lower risk paths when the costs are similar.
Unsure what the exact add_path logic would be. Perhaps a GUC would
need to assist with the decision there. Likewise, with
NestedLoopPaths which have a large number of join quals, the risk
factor could go up a bit with those so that we take a stronger
consideration for hash or merge joins instead.

I thought about these ideas too. And I am not alone.

https://hal.archives-ouvertes.fr/hal-01316823/document

Regards

Pavel

Anyway, it's pretty much a large research subject which would take a
lot of work to iron out even just the design. It's likely not a
perfect idea, but I think it has a bit more merit that trying to
introduce lies to the cost modal to account for a single case where
there is a problem.

David

[1] /messages/by-id/20200529001602.eu7vuiouuuiclpgb@development

--
Best Wishes,
Ashutosh Bapat

#6 Andy Fan
zhihui.fan1213@gmail.com
In reply to: David Rowley (#3)
3 attachment(s)
Re: A wrong index choose issue because of inaccurate statistics

On Fri, Jun 5, 2020 at 2:19 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Mon, 1 Jun 2020 at 01:24, Andy Fan <zhihui.fan1213@gmail.com> wrote:

The one-line fix describe the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change do fix the issue above, but will it make some other cases worse? My
answer is no based on my current knowledge, and this is most important place
I want to get advised. The mainly impact I can figure out is: it not only
change the cost difference between (a, b) and (a, c) it also cause the cost
difference between Index scan on (a, c) and seq scan. However the
cost different between index scan and seq scan are huge by practice so
I don't think this impact is harmful.

Didn't that idea already get shot down in the final paragraph on [1]?

Thanks for chiming in. I take this as: I didn't describe my idea clearly
enough, so both Tomas and Tom didn't spend much time reading it (no offense,
and I understand they have lots of things to do every day), so I
re-summarized the issue to make it easier to read.

In Tomas's reply, he raised concerns about how to fix the issue, since we
don't know how much the estimate is off, 1%, 10% or more; so I want to
emphasize that I don't actually touch that part. The wrong estimation still
plays a bad role in later joins, but that is not the issue I'm trying to fix
here.

I understand that you wish to increase the cost by some seemingly

innocent constant to fix your problem case. Here are my thoughts
about that: Telling lies is not really that easy to pull off. Bad
liers think it's easy and good ones know it's hard. The problem is
that the lies can start small, but then at some point the future you
must fashion some more lies to account for your initial lie. Rinse and
repeat that a few times and before you know it, your course is set
well away from the point of truth. I feel the part about "rinse and
repeat" applies reasonably well to how join costing works. The lie is
likely to be amplified as the join level gets deeper.

I agree with that to some extent. However, we can just provide more options
to users. At the same time, I still believe we should provide such options
very carefully.

Unsure what the exact add_path logic would be. Perhaps a GUC would
need to assist with the decision there. Likewise, with
NestedLoopPaths which have a large number of join quals, the risk
factor could go up a bit with those so that we take a stronger
consideration for hash or merge joins instead.

I probably underestimated the impact for a large number of join quals.
This looks like a weakness we can't ignore comfortably, so I checked the
code further; maybe we don't need a risk factor for this case.

For the query WHERE a = 1 AND b = 2, both Selectivity(a = 1) and
Selectivity(b = 2) are greater than 0 even when the statistics are stale, so
the IndexSelectivity is greater than 0 as well. Based on this,
IndexSelectivity(A, B) should be less than IndexSelectivity(A, C) for the
above query. However, the two indexes still end up with the same cost because
of the code below (in genericcostestimate and btcostestimate):

numIndexTuples = indexSelectivity * index->rel->tuples;
numIndexTuples = rint(numIndexTuples / num_sa_scans);

if (numIndexTuples < 1.0)
    numIndexTuples = 1.0;

numIndexTuples is later used to calculate the cost, and the clamp above eats
the difference between the index selectivities. For example, with 1,000,000
tuples in the table and selectivities of 5e-7 and 9e-7, both indexes get
numIndexTuples below 1.0 and both are clamped to 1.0, so the costs come out
identical.

Many places say we need to "round to integer", but I still haven't figured
out why that is a must. So if we do the "round to integer" only after the
index cost is calculated, the issue can be fixed as well.

The attached is the patch I have in mind. Since this patch may lower index
costs when numIndexTuples < 1.0, it makes the optimizer prefer a nested loop
over a hash join when the loop count is small, which is a common case in our
regression tests where we don't have much data, so there are several plan
changes like that.

Another impact I found in the regression tests is that the optimizer chooses
an IndexScan on a conditional index rather than an IndexOnlyScan on a normal
index. I checked the code and didn't find anything wrong, so I'd say that is
because the data is too small.

I also tested the TPC-H workload; Query 12 changed from a hash join to a
nested loop, and its execution time went from 1605.118 ms to 699.032 ms.

the root cause of this situation is that IndexSelectivity = 0.

This is wrong. I got the 0 from elog(INFO, "%f", IndexSelectivity), which
reported 0 while the value is actually just very small.

--
Best Regards
Andy Fan

Attachments:

v1-0001-Delay-the-round-to-integer-of-numIndexTuples-so-t.patch
From 83759d8d9a7e989ccae5c86c027a69a2757f8085 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Tue, 9 Jun 2020 11:21:33 +0800
Subject: [PATCH v1] Delay the "round to integer" of numIndexTuples so that the
 different

IndexSelectivity can generate different IndexCosts for sure.
---
 src/backend/utils/adt/selfuncs.c             |  15 ++-
 src/test/regress/expected/create_index.out   |   6 +-
 src/test/regress/expected/join.out           |  42 +++----
 src/test/regress/expected/partition_join.out | 125 +++++++++----------
 4 files changed, 90 insertions(+), 98 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 0f02f32ffd..1ceeb5a515 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -6061,10 +6061,11 @@ genericcostestimate(PlannerInfo *root,
 		 * The above calculation counts all the tuples visited across all
 		 * scans induced by ScalarArrayOpExpr nodes.  We want to consider the
 		 * average per-indexscan number, so adjust.  This is a handy place to
-		 * round to integer, too.  (If caller supplied tuple estimate, it's
+		 * round to integer (XXX: what is the purpose of round to integer?), 
+		 * too.  (If caller supplied tuple estimate, it's
 		 * responsible for handling these considerations.)
 		 */
-		numIndexTuples = rint(numIndexTuples / num_sa_scans);
+		numIndexTuples = numIndexTuples / num_sa_scans;
 	}
 
 	/*
@@ -6074,8 +6075,6 @@ genericcostestimate(PlannerInfo *root,
 	 */
 	if (numIndexTuples > index->tuples)
 		numIndexTuples = index->tuples;
-	if (numIndexTuples < 1.0)
-		numIndexTuples = 1.0;
 
 	/*
 	 * Estimate the number of index pages that will be retrieved.
@@ -6090,7 +6089,7 @@ genericcostestimate(PlannerInfo *root,
 	 * and leave it to the caller to add a suitable charge if needed.
 	 */
 	if (index->pages > 1 && index->tuples > 1)
-		numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);
+		numIndexPages = numIndexTuples * index->pages / index->tuples;
 	else
 		numIndexPages = 1.0;
 
@@ -6184,8 +6183,8 @@ genericcostestimate(PlannerInfo *root,
 	costs->indexTotalCost = indexTotalCost;
 	costs->indexSelectivity = indexSelectivity;
 	costs->indexCorrelation = indexCorrelation;
-	costs->numIndexPages = numIndexPages;
-	costs->numIndexTuples = numIndexTuples;
+	costs->numIndexPages = ceil(numIndexPages < 1.0 ? 1.0 : numIndexPages);
+	costs->numIndexTuples = ceil(numIndexTuples < 1.0 ? 1.0: numIndexTuples);
 	costs->spc_random_page_cost = spc_random_page_cost;
 	costs->num_sa_scans = num_sa_scans;
 }
@@ -6386,7 +6385,7 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 		 * ScalarArrayOpExpr quals included in indexBoundQuals, and then round
 		 * to integer.
 		 */
-		numIndexTuples = rint(numIndexTuples / num_sa_scans);
+		numIndexTuples = numIndexTuples / num_sa_scans;
 	}
 
 	/*
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index e3e6634d7e..1685ffb9b0 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1825,15 +1825,15 @@ SELECT count(*) FROM tenk1
 ---------------------------------------------------------------------------------
  Aggregate
    ->  Bitmap Heap Scan on tenk1
-         Recheck Cond: ((hundred = 42) AND ((thousand = 42) OR (thousand = 99)))
+         Recheck Cond: (((thousand = 42) OR (thousand = 99)) AND (hundred = 42))
          ->  BitmapAnd
-               ->  Bitmap Index Scan on tenk1_hundred
-                     Index Cond: (hundred = 42)
                ->  BitmapOr
                      ->  Bitmap Index Scan on tenk1_thous_tenthous
                            Index Cond: (thousand = 42)
                      ->  Bitmap Index Scan on tenk1_thous_tenthous
                            Index Cond: (thousand = 99)
+               ->  Bitmap Index Scan on tenk1_hundred
+                     Index Cond: (hundred = 42)
 (11 rows)
 
 SELECT count(*) FROM tenk1
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index a46b1573bd..efc3a48d3e 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3725,9 +3725,10 @@ explain (costs off)
 select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
   where a.unique2 < 10 and coalesce(b.twothousand, a.twothousand) = 44;
-                                         QUERY PLAN                                          
----------------------------------------------------------------------------------------------
+                             QUERY PLAN                              
+---------------------------------------------------------------------
  Nested Loop Left Join
+   Join Filter: (c.unique2 = COALESCE(b.twothousand, a.twothousand))
    ->  Nested Loop Left Join
          Filter: (COALESCE(b.twothousand, a.twothousand) = 44)
          ->  Index Scan using tenk1_unique2 on tenk1 a
@@ -3737,8 +3738,8 @@ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
                ->  Bitmap Index Scan on tenk1_thous_tenthous
                      Index Cond: (thousand = a.unique1)
    ->  Index Scan using tenk1_unique2 on tenk1 c
-         Index Cond: ((unique2 = COALESCE(b.twothousand, a.twothousand)) AND (unique2 = 44))
-(11 rows)
+         Index Cond: (unique2 = 44)
+(12 rows)
 
 select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 = coalesce(b.twothousand, a.twothousand)
@@ -3763,22 +3764,22 @@ left join
    using (join_key)
   ) foo3
 using (join_key);
-                                QUERY PLAN                                
---------------------------------------------------------------------------
- Nested Loop Left Join
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Hash Right Join
    Output: "*VALUES*".column1, i1.f1, (666)
-   Join Filter: ("*VALUES*".column1 = i1.f1)
-   ->  Values Scan on "*VALUES*"
+   Hash Cond: (i1.f1 = "*VALUES*".column1)
+   ->  Nested Loop Left Join
+         Output: i1.f1, 666
+         ->  Seq Scan on public.int4_tbl i1
+               Output: i1.f1
+         ->  Index Only Scan using tenk1_unique2 on public.tenk1 i2
+               Output: i2.unique2
+               Index Cond: (i2.unique2 = i1.f1)
+   ->  Hash
          Output: "*VALUES*".column1
-   ->  Materialize
-         Output: i1.f1, (666)
-         ->  Nested Loop Left Join
-               Output: i1.f1, 666
-               ->  Seq Scan on public.int4_tbl i1
-                     Output: i1.f1
-               ->  Index Only Scan using tenk1_unique2 on public.tenk1 i2
-                     Output: i2.unique2
-                     Index Cond: (i2.unique2 = i1.f1)
+         ->  Values Scan on "*VALUES*"
+               Output: "*VALUES*".column1
 (14 rows)
 
 select foo1.join_key as foo1_id, foo3.join_key AS foo3_id, bug_field from
@@ -6209,10 +6210,9 @@ where j1.id1 % 1000 = 1 and j2.id1 % 1000 = 1 and j2.id1 >= any (array[1,5]);
    Merge Cond: (j1.id1 = j2.id1)
    Join Filter: (j1.id2 = j2.id2)
    ->  Index Scan using j1_id1_idx on j1
-   ->  Index Only Scan using j2_pkey on j2
+   ->  Index Scan using j2_id1_idx on j2
          Index Cond: (id1 >= ANY ('{1,5}'::integer[]))
-         Filter: ((id1 % 1000) = 1)
-(7 rows)
+(6 rows)
 
 select * from j1
 inner join j2 on j1.id1 = j2.id1 and j1.id2 = j2.id2
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index b45a590b94..525dc65fb3 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -284,8 +284,8 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1 WHERE a < 450) t1 FULL JO
 -- Semi-join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t2.b FROM prt2 t2 WHERE t2.a = 0) AND t1.b = 0 ORDER BY t1.a;
-                    QUERY PLAN                    
---------------------------------------------------
+                          QUERY PLAN                           
+---------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
@@ -303,14 +303,15 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t2.b FROM prt2 t2 WHERE t2.a = 0)
                ->  Hash
                      ->  Seq Scan on prt2_p2 t2_2
                            Filter: (a = 0)
-         ->  Nested Loop Semi Join
-               Join Filter: (t1_3.a = t2_3.b)
-               ->  Seq Scan on prt1_p3 t1_3
-                     Filter: (b = 0)
-               ->  Materialize
+         ->  Nested Loop
+               ->  HashAggregate
+                     Group Key: t2_3.b
                      ->  Seq Scan on prt2_p3 t2_3
                            Filter: (a = 0)
-(24 rows)
+               ->  Index Scan using iprt1_p3_a on prt1_p3 t1_3
+                     Index Cond: (a = t2_3.b)
+                     Filter: (b = 0)
+(25 rows)
 
 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t2.b FROM prt2 t2 WHERE t2.a = 0) AND t1.b = 0 ORDER BY t1.a;
   a  | b |  c   
@@ -643,8 +644,8 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM prt1 t1, prt2 t2, prt1_e t
 
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) LEFT JOIN prt1_e t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t1.b = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
-                          QUERY PLAN                          
---------------------------------------------------------------
+                             QUERY PLAN                              
+---------------------------------------------------------------------
  Sort
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
@@ -668,17 +669,16 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
                            ->  Hash
                                  ->  Seq Scan on prt1_p2 t1_2
                                        Filter: (b = 0)
-         ->  Hash Right Join
-               Hash Cond: (((t3_3.a + t3_3.b) / 2) = t1_3.a)
-               ->  Seq Scan on prt1_e_p3 t3_3
-               ->  Hash
-                     ->  Hash Right Join
-                           Hash Cond: (t2_3.b = t1_3.a)
-                           ->  Seq Scan on prt2_p3 t2_3
-                           ->  Hash
-                                 ->  Seq Scan on prt1_p3 t1_3
-                                       Filter: (b = 0)
-(33 rows)
+         ->  Nested Loop Left Join
+               ->  Hash Right Join
+                     Hash Cond: (t2_3.b = t1_3.a)
+                     ->  Seq Scan on prt2_p3 t2_3
+                     ->  Hash
+                           ->  Seq Scan on prt1_p3 t1_3
+                                 Filter: (b = 0)
+               ->  Index Scan using iprt1_e_p3_ab2 on prt1_e_p3 t3_3
+                     Index Cond: (((a + b) / 2) = t1_3.a)
+(32 rows)
 
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) LEFT JOIN prt1_e t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t1.b = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
   a  |  c   |  b  |  c   | ?column? | c 
@@ -699,8 +699,8 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
 
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
-                            QUERY PLAN                             
--------------------------------------------------------------------
+                             QUERY PLAN                              
+---------------------------------------------------------------------
  Sort
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
@@ -723,15 +723,14 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
                ->  Index Scan using iprt2_p2_b on prt2_p2 t2_2
                      Index Cond: (b = t1_2.a)
          ->  Nested Loop Left Join
-               ->  Hash Right Join
-                     Hash Cond: (t1_3.a = ((t3_3.a + t3_3.b) / 2))
-                     ->  Seq Scan on prt1_p3 t1_3
-                     ->  Hash
-                           ->  Seq Scan on prt1_e_p3 t3_3
-                                 Filter: (c = 0)
+               ->  Nested Loop Left Join
+                     ->  Seq Scan on prt1_e_p3 t3_3
+                           Filter: (c = 0)
+                     ->  Index Scan using iprt1_p3_a on prt1_p3 t1_3
+                           Index Cond: (a = ((t3_3.a + t3_3.b) / 2))
                ->  Index Scan using iprt2_p3_b on prt2_p3 t2_3
                      Index Cond: (b = t1_3.a)
-(30 rows)
+(29 rows)
 
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
   a  |  c   |  b  |  c   | ?column? | c 
@@ -2229,8 +2228,8 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
@@ -2240,19 +2239,17 @@ SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a
                      Filter: (b = 0)
                ->  Hash
                      ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Semi Join
-               Hash Cond: (t1_2.a = t2_2.b)
+         ->  Nested Loop Semi Join
                ->  Seq Scan on prt1_adv_p2 t1_2
                      Filter: (b = 0)
-               ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Semi Join
-               Hash Cond: (t1_3.a = t2_3.b)
+               ->  Index Only Scan using prt2_adv_p2_b_idx on prt2_adv_p2 t2_2
+                     Index Cond: (b = t1_2.a)
+         ->  Nested Loop Semi Join
                ->  Seq Scan on prt1_adv_p3 t1_3
                      Filter: (b = 0)
-               ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
-(21 rows)
+               ->  Index Only Scan using prt2_adv_p3_b_idx on prt2_adv_p3 t2_3
+                     Index Cond: (b = t1_3.a)
+(19 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
   a  | b |  c   
@@ -2315,8 +2312,8 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
@@ -2332,13 +2329,12 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
                      Filter: (b = 0)
                ->  Hash
                      ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
+         ->  Nested Loop Anti Join
                ->  Seq Scan on prt1_adv_p3 t1_3
                      Filter: (b = 0)
-               ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
-(21 rows)
+               ->  Index Only Scan using prt2_adv_p3_b_idx on prt2_adv_p3 t2_3
+                     Index Cond: (b = t1_3.a)
+(20 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
   a  | b |  c   
@@ -2438,8 +2434,8 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
@@ -2449,19 +2445,17 @@ SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a
                      Filter: (b = 0)
                ->  Hash
                      ->  Seq Scan on prt2_adv_p1 t2_1
-         ->  Hash Semi Join
-               Hash Cond: (t1_2.a = t2_2.b)
+         ->  Nested Loop Semi Join
                ->  Seq Scan on prt1_adv_p2 t1_2
                      Filter: (b = 0)
-               ->  Hash
-                     ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Semi Join
-               Hash Cond: (t1_3.a = t2_3.b)
+               ->  Index Only Scan using prt2_adv_p2_b_idx on prt2_adv_p2 t2_2
+                     Index Cond: (b = t1_2.a)
+         ->  Nested Loop Semi Join
                ->  Seq Scan on prt1_adv_p3 t1_3
                      Filter: (b = 0)
-               ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
-(21 rows)
+               ->  Index Only Scan using prt2_adv_p3_b_idx on prt2_adv_p3 t2_3
+                     Index Cond: (b = t1_3.a)
+(19 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
   a  | b |  c   
@@ -2550,8 +2544,8 @@ SELECT t1.b, t1.c, t2.a, t2.c FROM prt2_adv t1 LEFT JOIN prt1_adv t2 ON (t1.b =
 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                      QUERY PLAN                      
-------------------------------------------------------
+                                  QUERY PLAN                                   
+-------------------------------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Append
@@ -2567,13 +2561,12 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
                      Filter: (b = 0)
                ->  Hash
                      ->  Seq Scan on prt2_adv_p2 t2_2
-         ->  Hash Anti Join
-               Hash Cond: (t1_3.a = t2_3.b)
+         ->  Nested Loop Anti Join
                ->  Seq Scan on prt1_adv_p3 t1_3
                      Filter: (b = 0)
-               ->  Hash
-                     ->  Seq Scan on prt2_adv_p3 t2_3
-(21 rows)
+               ->  Index Only Scan using prt2_adv_p3_b_idx on prt2_adv_p3 t2_3
+                     Index Cond: (b = t1_3.a)
+(20 rows)
 
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
   a  | b |  c   
-- 
2.21.0

patched.log
normal.log
#7 Andy Fan
zhihui.fan1213@gmail.com
In reply to: Pavel Stehule (#4)
Re: A wrong index choose issue because of inaccurate statistics

On Fri, Jun 5, 2020 at 2:30 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

On Fri, 5 Jun 2020 at 8:19, David Rowley <dgrowleyml@gmail.com> wrote:

On Mon, 1 Jun 2020 at 01:24, Andy Fan <zhihui.fan1213@gmail.com> wrote:

The one-line fix describe the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change do fix the issue above, but will it make some other cases worse? My
answer is no based on my current knowledge, and this is most important place
I want to get advised. The mainly impact I can figure out is: it not only
change the cost difference between (a, b) and (a, c) it also cause the cost
difference between Index scan on (a, c) and seq scan. However the
cost different between index scan and seq scan are huge by practice so
I don't think this impact is harmful.

Didn't that idea already get shot down in the final paragraph on [1]?

I understand that you wish to increase the cost by some seemingly
innocent constant to fix your problem case. Here are my thoughts
about that: Telling lies is not really that easy to pull off. Bad
liers think it's easy and good ones know it's hard. The problem is
that the lies can start small, but then at some point the future you
must fashion some more lies to account for your initial lie. Rinse and
repeat that a few times and before you know it, your course is set
well away from the point of truth. I feel the part about "rinse and
repeat" applies reasonably well to how join costing works. The lie is
likely to be amplified as the join level gets deeper.

I think you need to think of a more generic solution and propose that
instead. There are plenty of other quirks in the planner that can
cause suffering due to inaccurate or non-existing statistics. For
example, due to how we multiply individual selectivity estimates,
having a few correlated columns in a join condition can cause the
number of join rows to be underestimated. Subsequent joins can then
end up making bad choices on which join operator to use based on those
inaccurate row estimates. There's also a problem with WHERE <x> ORDER
BY col LIMIT n; sometimes choosing an index that provides pre-sorted
input to the ORDER BY but cannot use <x> as an indexable condition.
We don't record any stats to make better choices there, maybe we
should, but either way, we're taking a bit risk there as all the rows
matching <x> might be right at the end of the index and we might need
to scan the entire thing before hitting the LIMIT. For now, we just
assume completely even distribution of rows. i.e. If there are 50 rows
estimated in the path and the limit is for 5 rows, then we'll assume
we need to read 10% of those before finding all the ones we need. In
reality, everything matching <x> might be 95% through the index and we
could end up reading 100% of rows. That particular problem is not just
caused by the uneven distribution of rows in the index, but also from
selectivity underestimation.

I'd more recently wondered if we shouldn't have some sort of "risk"
factor to the cost model. I really don't have ideas on how exactly we
would calculate the risk factor in all cases, but in this case, say
the risk factor always starts out as 1. We could multiply that risk
factor by some >1 const each time we find another index filter qual.
add_path() can prefer lower risk paths when the costs are similar.
Unsure what the exact add_path logic would be. Perhaps a GUC would
need to assist with the decision there. Likewise, with
NestedLoopPaths which have a large number of join quals, the risk
factor could go up a bit with those so that we take a stronger
consideration for hash or merge joins instead.

I thought about these ideas too. And I am not alone.

https://hal.archives-ouvertes.fr/hal-01316823/document

Thanks for the document. After checking it, that looks more like
Oracle's statistics feedback [1]. Hope we can have it someday :)

[1]: https://blogs.oracle.com/optimizer/cardinality-feedback

--
Best Regards
Andy Fan

#8 Andy Fan
zhihui.fan1213@gmail.com
In reply to: Ashutosh Bapat (#5)
Re: A wrong index choose issue because of inaccurate statistics

On Mon, Jun 8, 2020 at 10:16 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

I know one project where they used PostgreSQL code base to detect
"robust plans". https://dsl.cds.iisc.ac.in/projects/PICASSO/. Some of
the papers cited in https://www.vldb.org/pvldb/vldb2010/papers/D01.pdf
describe the idea.

In short, the idea is to annotate a plan with a "bandwidth" i.e. how
does the plan fair with degradation of statistics. A plan which has a
slightly higher cost which doesn't degrade much with degradation of
statistics is preferred over a low cost plan whose cost rises sharply
with degradation of statistics. This is similar to what David is
suggesting.

Great to know about them; thank you for sharing, the links have been bookmarked.

On Fri, Jun 5, 2020 at 12:00 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

On Fri, 5 Jun 2020 at 8:19, David Rowley <dgrowleyml@gmail.com> wrote:

On Mon, 1 Jun 2020 at 01:24, Andy Fan <zhihui.fan1213@gmail.com> wrote:

The one-line fix describe the exact idea in my mind:

+++ b/src/backend/optimizer/path/costsize.c
@@ -730,6 +730,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,

cpu_run_cost += cpu_per_tuple * tuples_fetched;

+       /*
+        * To make the planner more robust to handle some inaccurate statistics
+        * issue, we will add a extra cost to qpquals so that the less qpquals
+        * the lower cost it has.
+        */
+       cpu_run_cost += 0.01 * list_length(qpquals);
+

This change do fix the issue above, but will it make some other cases worse? My
answer is no based on my current knowledge, and this is most important place
I want to get advised. The mainly impact I can figure out is: it not only
change the cost difference between (a, b) and (a, c) it also cause the cost
difference between Index scan on (a, c) and seq scan. However the
cost different between index scan and seq scan are huge by practice so
I don't think this impact is harmful.

Didn't that idea already get shot down in the final paragraph on [1]?

I understand that you wish to increase the cost by some seemingly
innocent constant to fix your problem case. Here are my thoughts
about that: Telling lies is not really that easy to pull off. Bad
liers think it's easy and good ones know it's hard. The problem is
that the lies can start small, but then at some point the future you
must fashion some more lies to account for your initial lie. Rinse and
repeat that a few times and before you know it, your course is set
well away from the point of truth. I feel the part about "rinse and
repeat" applies reasonably well to how join costing works. The lie is
likely to be amplified as the join level gets deeper.

I think you need to think of a more generic solution and propose that
instead. There are plenty of other quirks in the planner that can
cause suffering due to inaccurate or non-existing statistics. For
example, due to how we multiply individual selectivity estimates,
having a few correlated columns in a join condition can cause the
number of join rows to be underestimated. Subsequent joins can then
end up making bad choices on which join operator to use based on those
inaccurate row estimates. There's also a problem with WHERE <x> ORDER
BY col LIMIT n; sometimes choosing an index that provides pre-sorted
input to the ORDER BY but cannot use <x> as an indexable condition.
We don't record any stats to make better choices there, maybe we
should, but either way, we're taking a bit risk there as all the rows
matching <x> might be right at the end of the index and we might need
to scan the entire thing before hitting the LIMIT. For now, we just
assume completely even distribution of rows. i.e. If there are 50 rows
estimated in the path and the limit is for 5 rows, then we'll assume
we need to read 10% of those before finding all the ones we need. In
reality, everything matching <x> might be 95% through the index and we
could end up reading 100% of rows. That particular problem is not just
caused by the uneven distribution of rows in the index, but also from
selectivity underestimation.

I'd more recently wondered if we shouldn't have some sort of "risk"
factor to the cost model. I really don't have ideas on how exactly we
would calculate the risk factor in all cases, but in this case, say
the risk factor always starts out as 1. We could multiply that risk
factor by some >1 const each time we find another index filter qual.
add_path() can prefer lower risk paths when the costs are similar.
Unsure what the exact add_path logic would be. Perhaps a GUC would
need to assist with the decision there. Likewise, with
NestedLoopPaths which have a large number of join quals, the risk
factor could go up a bit with those so that we take a stronger
consideration for hash or merge joins instead.

I thought about these ideas too. And I am not alone.

https://hal.archives-ouvertes.fr/hal-01316823/document

Regards

Pavel

Anyway, it's pretty much a large research subject which would take a
lot of work to iron out even just the design. It's likely not a
perfect idea, but I think it has a bit more merit that trying to
introduce lies to the cost modal to account for a single case where
there is a problem.

David

[1]

/messages/by-id/20200529001602.eu7vuiouuuiclpgb@development

--
Best Wishes,
Ashutosh Bapat

--
Best Regards
Andy Fan