Poor cost estimate with interaction between table correlation and partial indexes
Hi.
I'm looking to get started contributing code to Postgres. A small
issue I'm aware of that I think would make a good first contribution
is a poor cost estimate caused by an interaction between table
correlation and partial indexes. Currently the planner assumes that
when an index is perfectly correlated with a table and a range scan is
performed on the index, all of the table page reads performed by the
index scan except for the first one will be sequential reads. While
this assumption is correct for regular indexes, it is not true for
partial indexes.
The assumption holds for regular indexes because the rows
corresponding to two entries in a regular index that is perfectly
correlated with the table are guaranteed to be next to each other in
the table. On the other hand with a partial index perfectly correlated
with a table, there may be rows in the table in between the two rows
corresponding to two adjacent entries in the index that are not
included in the index because they do not satisfy the partial index
predicate.
To make the cost calculation for this case more accurate, I want to
apply the same estimate as the one currently used to estimate the cost
of a bitmap heap scan. The bitmap heap scan cost estimate applies in
this case because both cases involve reading pages from disk ordered
by the location in the table, but where the pages may not be
consecutive.
For the relevant functions, see cost_index and cost_bitmap_heap_scan in
costsize.c.
Thanks,
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Do you think this is a reasonable approach? Should I start working on
a patch based on the solution I described or is there some other
approach I should look into?
On Sat, Aug 26, 2017 at 05:50:26PM -0700, Michael Malis wrote:
Do you think this is a reasonable approach? Should I start working
on a patch based on the solution I described or is there some other
approach I should look into?
You'll get more traction with a proof-of-concept patch accompanying
the plan than without. Don't bother with any level of care past
proof-of-concept until you get positive feedback.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
(Sorry David. I initially replied only to you)
Ok. I've attached a proof-of-concept patch. I have a few questions
about tests.
What is the typical workflow for adding tests for changes to the
planner? Also, I ran make check and it appears one of the existing
tests is failing. What is a typical way to go about discovering why
the query plan for a specific query changed? Also, how should I go
about changing the old test? Should I replace the old test output with
the new output, or modify the old test slightly so it produces the
same plan as before?
Thanks,
Michael
Attachment: improve-correlated-partial-index-cost-v1.patch (text/x-patch)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 051a854..58224e6 100644
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***************
*** 163,169 **** static void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
static double relation_byte_size(double tuples, int width);
static double page_size(double tuples, int width);
static double get_parallel_divisor(Path *path);
!
/*
* clamp_row_est
--- 163,172 ----
static double relation_byte_size(double tuples, int width);
static double page_size(double tuples, int width);
static double get_parallel_divisor(Path *path);
! static double ordered_page_read_cost(double pages_fetched,
! int baserel_pages,
! double spc_seq_page_cost,
! double spc_random_page_cost);
/*
* clamp_row_est
***************
*** 652,660 **** cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
if (pages_fetched > 0)
{
! min_IO_cost = spc_random_page_cost;
! if (pages_fetched > 1)
! min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
}
else
min_IO_cost = 0;
--- 655,680 ----
if (pages_fetched > 0)
{
! if (index->indpred == NIL)
! {
! min_IO_cost = spc_random_page_cost;
! if (pages_fetched > 1)
! min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
! }
! else
! {
! /*
! * For a partial index perfectly correlated with the table
! * ordering, consecutive pages fetched are not guaranteed to
! * be adjacent in the table. Instead use
! * ordered_page_read_cost to estimate the cost of reading
! * pages from the heap.
! */
! min_IO_cost = ordered_page_read_cost(pages_fetched,
! baserel->pages,
! spc_seq_page_cost,
! spc_random_page_cost);
! }
}
else
min_IO_cost = 0;
***************
*** 934,946 **** cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
Cost indexTotalCost;
QualCost qpqual_cost;
Cost cpu_per_tuple;
- Cost cost_per_page;
Cost cpu_run_cost;
double tuples_fetched;
double pages_fetched;
double spc_seq_page_cost,
spc_random_page_cost;
- double T;
/* Should only be applied to base relations */
Assert(IsA(baserel, RelOptInfo));
--- 954,964 ----
Hmm... It seems the command I used to generate the patch, which I got
from here https://wiki.postgresql.org/wiki/Working_with_Git, truncated
part of the patch. I've attached the file generated by git diff
--patience master improve-partial-index-correlation-calculation
--no-color > improve-correlated-partial-index-cost-v2.patch to this
email. What is the correct command for generating a context diff?
Attachment: improve-correlated-partial-index-cost-v2.patch (text/x-patch)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 051a854..58224e6 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -163,7 +163,10 @@ static void set_rel_width(PlannerInfo *root, RelOptInfo *rel);
static double relation_byte_size(double tuples, int width);
static double page_size(double tuples, int width);
static double get_parallel_divisor(Path *path);
-
+static double ordered_page_read_cost(double pages_fetched,
+ int baserel_pages,
+ double spc_seq_page_cost,
+ double spc_random_page_cost);
/*
* clamp_row_est
@@ -652,9 +655,26 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
if (pages_fetched > 0)
{
- min_IO_cost = spc_random_page_cost;
- if (pages_fetched > 1)
- min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
+ if (index->indpred == NIL)
+ {
+ min_IO_cost = spc_random_page_cost;
+ if (pages_fetched > 1)
+ min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
+ }
+ else
+ {
+ /*
+ * For a partial index perfectly correlated with the table
+ * ordering, consecutive pages fetched are not guaranteed to
+ * be adjacent in the table. Instead use
+ * ordered_page_read_cost to estimate the cost of reading
+ * pages from the heap.
+ */
+ min_IO_cost = ordered_page_read_cost(pages_fetched,
+ baserel->pages,
+ spc_seq_page_cost,
+ spc_random_page_cost);
+ }
}
else
min_IO_cost = 0;
@@ -934,13 +954,11 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
Cost indexTotalCost;
QualCost qpqual_cost;
Cost cpu_per_tuple;
- Cost cost_per_page;
Cost cpu_run_cost;
double tuples_fetched;
double pages_fetched;
double spc_seq_page_cost,
spc_random_page_cost;
- double T;
/* Should only be applied to base relations */
Assert(IsA(baserel, RelOptInfo));
@@ -961,28 +979,16 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
&tuples_fetched);
startup_cost += indexTotalCost;
- T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
/* Fetch estimated page costs for tablespace containing table. */
get_tablespace_page_costs(baserel->reltablespace,
&spc_random_page_cost,
&spc_seq_page_cost);
- /*
- * For small numbers of pages we should charge spc_random_page_cost
- * apiece, while if nearly all the table's pages are being read, it's more
- * appropriate to charge spc_seq_page_cost apiece. The effect is
- * nonlinear, too. For lack of a better idea, interpolate like this to
- * determine the cost per page.
- */
- if (pages_fetched >= 2.0)
- cost_per_page = spc_random_page_cost -
- (spc_random_page_cost - spc_seq_page_cost)
- * sqrt(pages_fetched / T);
- else
- cost_per_page = spc_random_page_cost;
-
- run_cost += pages_fetched * cost_per_page;
+ run_cost += ordered_page_read_cost(pages_fetched,
+ baserel->pages,
+ spc_seq_page_cost,
+ spc_random_page_cost);
/*
* Estimate CPU costs per tuple.
@@ -5183,3 +5189,34 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
return pages_fetched;
}
+
+
+/*
+ * Estimate the total cost of reading possibly nonconsecutive pages from the
+ * heap in order.
+ */
+static double
+ordered_page_read_cost(double pages_fetched, int baserel_pages,
+ double spc_seq_page_cost, double spc_random_page_cost)
+{
+ double cost_per_page;
+ double T;
+
+ T = (baserel_pages > 1) ? (double) baserel_pages : 1.0;
+
+ /*
+ * For small numbers of pages we should charge spc_random_page_cost
+ * apiece, while if nearly all the table's pages are being read, it's more
+ * appropriate to charge spc_seq_page_cost apiece. The effect is
+ * nonlinear, too. For lack of a better idea, interpolate like this to
+ * determine the cost per page.
+ */
+ if (pages_fetched >= 2.0)
+ cost_per_page = spc_random_page_cost -
+ (spc_random_page_cost - spc_seq_page_cost)
+ * sqrt(pages_fetched / T);
+ else
+ cost_per_page = spc_random_page_cost;
+
+ return pages_fetched * cost_per_page;
+}
Michael Malis wrote:
Hmm... It seems the command I used for obtaining a patch I got from
here https://wiki.postgresql.org/wiki/Working_with_Git truncated part
of the patch. I've attached the file generated from git diff
--patience master improve-partial-index-correlation-calculation
--no-color > improve-correlated-partial-index-cost-v2.patch to this
email. What is the correct command for generating a context diff?
Yeah, I've had patches truncated by that too and I've never cared enough
to see about getting it fixed. I think it's a bug in the filterdiff
utility, but I got a stupid answer from the Debian maintainer when I
reported it and didn't care to follow up any further. Eventually I gave
up on using context diffs because of this problem.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Aug 27, 2017 at 8:31 PM, Michael Malis <michaelmalis2@gmail.com> wrote:
(Sorry David. I initially replied only to you)
Ok. I've attached a patch of a proof-of-concept. I have a few
questions about tests. What is typical workflow to add tests for changes to the planner?
Add submitted patches at commitfest.postgresql.org
Also
I ran make check and it appears one of the existing tests is failing.
What is a typical way for going about discovering why the query plan
for a specific query changed?
I don't have any magic answer on this point.
Also, how should I go about changing the
old test? Should I replace the old test output with the new test
output or modify the old test slightly to get it to produce the same
case as before?
That's a judgement call, based on what you think the point of the test was.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company