bitmaps and correlation
It's a new year and I'm getting reflective, so resuming a portion of
conversation we had here:
/messages/by-id/CAMkU=1yVbwEAugaCmKWxjaX15ZduWee45+_DqCw--d_3N_O_=Q@mail.gmail.com
Find attached a patch which implements use of the correlation statistic in costing
for bitmap scans.
An open question in my mind is how to combine the correlation statistic with the
existing cost_per_page computation (each option is sketched below):
. sqrt(a^2+b^2)
. MIN()
. multiply the existing computation by a new correlation component
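To make those options concrete, here is a small standalone sketch (not part of
the attached patch; the function name and everything other than the
spc_*_page_cost inputs are made up for illustration). It derives a
correlation-based per-page cost the same way the patch does, then shows the
three candidate combinations; the patch as attached uses the Min() form, with
the sqrt() form left commented out.

#include <math.h>

/* Hypothetical sketch: ways of combining the existing pages_fetched-based
 * per-page cost with a new correlation-based one. */
static double
combine_cost_per_page(double cost_per_page,     /* existing estimate */
                      double correlation,       /* index correlation, in [-1,1] */
                      double spc_random_page_cost,
                      double spc_seq_page_cost)
{
    double csquared = correlation * correlation;
    /* correlation-based estimate, as computed in the patch */
    double cost_per_page_corr = spc_seq_page_cost +
        (1.0 - csquared) * (spc_random_page_cost - spc_seq_page_cost);

    /* option 1: combine in quadrature */
    double opt1 = sqrt(cost_per_page * cost_per_page +
                       cost_per_page_corr * cost_per_page_corr);
    /* option 2: take the cheaper of the two estimates */
    double opt2 = cost_per_page < cost_per_page_corr ?
        cost_per_page : cost_per_page_corr;
    /* option 3: scale the existing estimate by the correlation-based
     * fraction of random_page_cost (one possible reading of "multiply") */
    double opt3 = cost_per_page * cost_per_page_corr / spc_random_page_cost;

    (void) opt1;
    (void) opt3;
    return opt2;        /* what the attached patch currently does */
}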
On Wed, Dec 20, 2017 at 09:55:40PM -0800, Jeff Janes wrote:
On Tue, Dec 19, 2017 at 7:25 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
I started playing with this weeks ago (probably during Vitaliy's problem
report). Is there any reason cost_bitmap_heap_scan shouldn't interpolate
based on correlation from seq_page_cost to rand_page_cost, same as
cost_index ?
I think that doing something like that is a good idea in general, but someone
has to implement the code, and so far no one seems enthused to do so. You
seem pretty interested in the topic, so....
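(For reference, the cost_index interpolation that the quoted question refers to
looks roughly like this, lightly simplified from costsize.c; in the real code the
two cases use slightly different pages_fetched estimates, the uncorrelated one
coming from the Mackert-Lohman formula and the correlated one from
indexSelectivity * baserel->pages.)

    /* worst case: every fetched heap page is a random read */
    max_IO_cost = pages_fetched * spc_random_page_cost;

    /* best case: one random seek, then sequential reads */
    min_IO_cost = spc_random_page_cost;
    if (pages_fetched > 1)
        min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;

    /* interpolate between them using the squared index correlation */
    csquared = indexCorrelation * indexCorrelation;
    run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);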
I tested the patch using CDR data which was causing issues for us years ago:
/messages/by-id/20160524173914.GA11880@telsasoft.com
Note: since that original problem report:
. the SAN is backed by SSD rather than rotational storage;
. we're using relkind=p partitioned tables;
. PG12 uses pread() rather than lseek()+read(), so the overhead of the separate
seek()+read() calls is avoided (but that probably wasn't a substantial component
of the problem);
Unpatched, I'm unable to get a bitmap scan without disabling both indexscan and seqscan.
| Bitmap Heap Scan on cdrs_huawei_pgwrecord_2018_12_25 (cost=55764.07..1974230.46 rows=2295379 width=1375)
| Recheck Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))
| -> Bitmap Index Scan on cdrs_huawei_pgwrecord_2018_12_25_recordopeningtime_idx (cost=0.00..55190.22 rows=2295379 width=0)
| Index Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))
Patched, I get a bitmap scan when random_page_cost is reduced enough that the
startup cost (the index scan component) is not prohibitive. But for simplicity,
this plan just forces a bitmap scan by setting enable_indexscan=off:
| Bitmap Heap Scan on cdrs_huawei_pgwrecord_2018_12_25 (cost=55764.07..527057.45 rows=2295379 width=1375)
| Recheck Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))
| -> Bitmap Index Scan on cdrs_huawei_pgwrecord_2018_12_25_recordopeningtime_idx (cost=0.00..55190.22 rows=2295379 width=0)
| Index Cond: ((recordopeningtime >= '2018-12-25 05:00:00-06'::timestamp with time zone) AND (recordopeningtime <= '2018-12-25 10:00:00-06'::timestamp with time zone))
That addresses the issue we had, in part. A remaining problem is that
correlation fails to distinguish between a "fresh" index and a fragmented one,
and so heap access through a correlated index may look cheaper than it is. That
is why I still have to lower random_page_cost to get a bitmap scan.
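(To put a number on that: with the default seq_page_cost=1.0 and
random_page_cost=4.0, the patch's correlation-based per-page estimate is
seq + (1 - c^2)*(random - seq), so an illustrative correlation of c = 0.95
yields about 1.0 + 0.0975 * 3.0 = 1.29 per page, i.e. close to sequential cost,
whether the index is freshly built or fragmented.)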
Justin
Attached for real.
Attachments:
v0-bitmap-correlation.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 88780c0..1c25f36 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -548,11 +548,12 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a
+ * bitmap scan doesn't care.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -671,6 +672,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
else
min_IO_cost = 0;
}
+elog(DEBUG4, "index_pages_fetched2 fetched:%d", (int)pages_fetched);
if (partial_path)
{
@@ -711,6 +713,9 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
csquared = indexCorrelation * indexCorrelation;
run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+ Cost heapCost=max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
+ elog(DEBUG4, "index_cost1 startup:%f heap:%f run:%f idxtotal:%f total:%f",
+ indexStartupCost, heapCost, run_cost, indexTotalCost, path->path.total_cost);
/*
* Estimate CPU costs per tuple.
@@ -744,6 +749,9 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
path->path.startup_cost = startup_cost;
path->path.total_cost = startup_cost + run_cost;
+
+ elog(DEBUG4, "index_cost1 startup:%f heap:%f run:%f idxtotal:%f total:%f",
+ indexStartupCost, heapCost, run_cost, indexTotalCost, path->path.total_cost);
}
/*
@@ -876,6 +884,7 @@ index_pages_fetched(double tuples_fetched, BlockNumber pages,
}
pages_fetched = ceil(pages_fetched);
}
+elog(DEBUG4, "index_pages_fetched T=pages=%d total_pages:%d ECC:%d bECC:%d fetched:%d", (int)T, (int)total_pages, (int)effective_cache_size, (int)b, (int)pages_fetched);
return pages_fetched;
}
@@ -888,6 +897,7 @@ index_pages_fetched(double tuples_fetched, BlockNumber pages,
* not completely clear, and detecting duplicates is difficult, so ignore it
* for now.
*/
+
static double
get_indexpath_pages(Path *bitmapqual)
{
@@ -925,6 +935,66 @@ get_indexpath_pages(Path *bitmapqual)
}
/*
+ * get_indexpath_correlation
+ * Recursively determine correlation of an bitmap path by XXX:min/max of correlation of indices scanned.
+ */
+static double
+UNUSED_get_indexpath_correlation(PlannerInfo *root, Path *bitmapqual)
+{
+ double x=0, result = 0;
+ int loop = 0;
+ ListCell *l;
+
+ if (IsA(bitmapqual, IndexPath))
+ {
+ IndexPath *ipath = (IndexPath *) bitmapqual;
+ amcostestimate_function amcostestimate = ipath->indexinfo->amcostestimate;
+ double junk;
+ Cost junk2;
+ Selectivity junk3;
+ amcostestimate(root, ipath, 1/*loop_count*/,
+ &junk2/*&indexStartupCost*/, &junk2/*&indexTotalCost*/,
+ &junk3/*&indexSelectivity*/, &x,
+ &junk/*&index_pages*/);
+ // result = 0.1*x*x;
+ // result = fabs(x);
+ result = (x*x + fabs(x))/2;
+ elog(DEBUG4, "get_indexpath_correlation %f", result);
+ /* The correlation values below here will be between [0,1] and not further squared */
+ } else if (IsA(bitmapqual, BitmapAndPath)) {
+ BitmapAndPath *apath = (BitmapAndPath *) bitmapqual;
+
+ foreach(l, apath->bitmapquals)
+ {
+ x = UNUSED_get_indexpath_correlation(root, (Path *) lfirst(l));
+ // XXX: should use the correlation of the most-selective path..
+ if (loop) result = Max(result, x);
+ else result = x;
+ loop=1;
+ }
+ }
+ else if (IsA(bitmapqual, BitmapOrPath))
+ {
+ BitmapOrPath *opath = (BitmapOrPath *) bitmapqual;
+
+ foreach(l, opath->bitmapquals)
+ {
+ /* Min is probably not right: should use the correlation of the
+ * least-selective path */
+ x = UNUSED_get_indexpath_correlation(root, (Path *) lfirst(l));
+ if (loop) result = Min(result, x);
+ else result = x;
+ loop=1;
+ }
+ }
+
+ else
+ elog(ERROR, "unrecognized node type: %d", nodeTag(bitmapqual));
+
+ return result;
+}
+
+/*
* cost_bitmap_heap_scan
* Determines and returns the cost of scanning a relation using a bitmap
* index-then-heap plan.
@@ -954,7 +1024,6 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
double pages_fetched;
double spc_seq_page_cost,
spc_random_page_cost;
- double T;
/* Should only be applied to base relations */
Assert(IsA(baserel, RelOptInfo));
@@ -975,7 +1044,6 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
&tuples_fetched);
startup_cost += indexTotalCost;
- T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
/* Fetch estimated page costs for tablespace containing table. */
get_tablespace_page_costs(baserel->reltablespace,
@@ -988,12 +1056,35 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ double T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
+ if (pages_fetched >= 2.0) {
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+ elog(DEBUG4, "cost_bitmap_heap_scan old pages:%d fetched:%f fraction:%f cost_per_page:%f", baserel->pages, pages_fetched, pages_fetched/T, cost_per_page);
+
+ // XXX: interpolate based on correlation from seq_page_cost to rand_page_cost?
+ // a highly correlated bitmap scan 1) likely reads fewer pages; and,
+ // 2) at higher "density" (more sequential).
+ // double correlation = get_indexpath_correlation(root, bitmapqual);
+ double correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ double cost_per_page2 = spc_seq_page_cost +
+ (1-correlation*correlation)*(spc_random_page_cost - spc_seq_page_cost); // XXX: *sqrt(pages_fetched/T) ?
+ elog(DEBUG4, "cost_bitmap_heap_scan new corr=%f cost_per_page=%f", correlation, cost_per_page2);
+ // There are two variables: fraction of pages(T) and correlation.
+ // If T is high, giving sequential reads, we want low cost_per_page regardless of correlation.
+ // If correlation is high, we (probably) want low cost per page.
+ // ...the exception is if someone does an =ANY() query on a list of non-consecutive values.
+ // Something like start_time=ANY('2017-01-01', '2017-02-01',...)
+ // which reads small number of rows from pages all across the length of the table.
+ // But index scan doesn't seem to do address that at all, so leave it alone for now.
+ cost_per_page=Min(cost_per_page, cost_per_page2);
+ // cost_per_page=sqrt(cost_per_page*cost_per_page + cost_per_page2*cost_per_page2);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1024,6 +1115,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
path->rows = clamp_row_est(path->rows / parallel_divisor);
}
+ elog(DEBUG4, "cost_bitmap_heap_scan run_cost0: %f", run_cost );
run_cost += cpu_run_cost;
@@ -1031,21 +1123,24 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
+ elog(DEBUG4, "cost_bitmap_heap_scan startup:%f cpu_run_cost:%f run_cost2:%f", startup_cost, cpu_run_cost, run_cost);
+
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
*selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation) *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1059,11 +1154,13 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
{
*cost = path->total_cost;
*selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation) *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
*selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation) *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1086,8 +1183,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1099,22 +1197,32 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec<=minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
+ // ??? XXX && !IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1130,8 +1238,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1144,23 +1253,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec>=maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5390,8 +5508,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5400,13 +5521,15 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
*/
tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);
+ elog(DEBUG4, "compute_bitmap old tuples_fetched:%f indexSelectivity:%f baserel->tuples:%f",
+ tuples_fetched, indexSelectivity, baserel->tuples);
T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
/*
@@ -5414,7 +5537,16 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
+
+ // pages_fetched = pages_fetchedMAX - indexCorrelation*indexCorrelation * (pages_fetchedMIN-pages_fetchedMAX);
+elog(DEBUG4, "compute_bitmap pages_fetched high: %f low: %f corr=%f val: %f", pages_fetchedMAX, pages_fetchedMIN, indexCorrelation, pages_fetched);
+
/*
* Calculate the number of pages fetched from the heap. Then based on
Attached is a fixed and rebasified patch for cfbot.
Included inline for conceptual review.
From f3055a5696924427dda280da702c41d2d2796a24 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v2] Use correlation statistic in costing bitmap scans..
Same as for an index scan, an uncorrelated bitmap (like modulus) which accesses
a certain number of pages across the entire length of a table should have a cost
estimate heavily weighted by random access, compared to a bitmap scan which
accesses the same number of pages across a small portion of the table.
Note, Tom points out that there are cases where a column could be
tightly-clumped without being highly-ordered. Since we have correlation
already, we use that for now, and if someone creates a statistic for
clumpiness, we'll re-evaluate at some later date.
---
src/backend/optimizer/path/costsize.c | 84 ++++++++++++++++++++++++++++-------
src/backend/optimizer/path/indxpath.c | 8 ++--
src/include/nodes/pathnodes.h | 3 ++
src/include/optimizer/cost.h | 2 +-
4 files changed, 77 insertions(+), 20 deletions(-)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index c5f6593..aaac29a 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -549,11 +549,12 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a
+ * bitmap scan doesn't care.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -986,12 +987,33 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ if (pages_fetched >= 2.0) {
+ double correlation, cost_per_page2;
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+
+ // XXX: interpolate based on correlation from seq_page_cost to rand_page_cost?
+ // a highly correlated bitmap scan 1) likely reads fewer pages; and,
+ // 2) at higher "density" (more sequential).
+ // double correlation = get_indexpath_correlation(root, bitmapqual);
+ correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ cost_per_page2 = spc_seq_page_cost +
+ (1-correlation*correlation)*(spc_random_page_cost - spc_seq_page_cost); // XXX: *sqrt(pages_fetched/T) ?
+ // There are two variables: fraction of pages(T) and correlation.
+ // If T is high, giving sequential reads, we want low cost_per_page regardless of correlation.
+ // If correlation is high, we (probably) want low cost per page.
+ // ...the exception is if someone does an =ANY() query on a list of non-consecutive values.
+ // Something like start_time=ANY('2017-01-01', '2017-02-01',...)
+ // which reads small number of rows from pages all across the length of the table.
+ // But index scan doesn't seem to do address that at all, so leave it alone for now.
+ cost_per_page=Min(cost_per_page, cost_per_page2);
+ // cost_per_page=sqrt(cost_per_page*cost_per_page + cost_per_page2*cost_per_page2);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1035,15 +1057,16 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
*selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation) *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1057,11 +1080,13 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
{
*cost = path->total_cost;
*selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation) *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
*selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation) *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1084,8 +1109,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1097,22 +1123,32 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec<=minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
+ // ??? XXX && !IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1128,8 +1164,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1142,23 +1179,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec>=maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5510,8 +5556,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5520,7 +5569,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
@@ -5534,7 +5583,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
/*
* Calculate the number of pages fetched from the heap. Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 37b257c..2a3db34 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1467,8 +1467,8 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
Selectivity nselec;
Selectivity oselec;
- cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
- cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+ cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec, NULL);
+ cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec, NULL);
if (ncost < ocost)
pathinfoarray[i] = pathinfo;
}
@@ -1580,8 +1580,8 @@ path_usage_comparator(const void *a, const void *b)
Selectivity aselec;
Selectivity bselec;
- cost_bitmap_tree_node(pa->path, &acost, &aselec);
- cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+ cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+ cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);
/*
* If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 23a06d7..beaac03 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1181,6 +1181,7 @@ typedef struct IndexPath
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
+ double indexCorrelation;
} IndexPath;
/*
@@ -1261,6 +1262,7 @@ typedef struct BitmapAndPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapOrPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapAndPath;
/*
@@ -1274,6 +1276,7 @@ typedef struct BitmapOrPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapAndPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapOrPath;
/*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index b3d0b4f..9a28665 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas
Path *bitmapqual, double loop_count);
extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
extern void cost_tidscan(Path *path, PlannerInfo *root,
RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info);
extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
--
2.7.4
Attachments:
v2-0001-Use-correlation-statistic-in-costing-bitmap-scans.patch (text/x-diff; charset=us-ascii)
On Sat, Nov 02, 2019 at 03:26:17PM -0500, Justin Pryzby wrote:
Attached is a fixed and rebasified patch for cfbot.
Included inline for conceptual review.
Your patch still causes two regression tests to fail per Mr Robot's
report: join and select. Could you look at those problems? I have
moved the patch to next CF, waiting on author.
--
Michael
On Sun, Dec 01, 2019 at 12:34:37PM +0900, Michael Paquier wrote:
On Sat, Nov 02, 2019 at 03:26:17PM -0500, Justin Pryzby wrote:
Attached is a fixed and rebasified patch for cfbot.
Included inline for conceptual review.
Your patch still causes two regression tests to fail per Mr Robot's
report: join and select. Could you look at those problems? I have
moved the patch to next CF, waiting on author.
The regression failures seem to be due to deliberate, expected plan changes.
I think the patch still needs some discussion, but find attached a rebasified
patch including the regression diffs.
Justin
Attachments:
v3-0001-Use-correlation-statistic-in-costing-bitmap-scans.patch (text/x-diff; charset=us-ascii)
From cb2b02d6288bc0bc5432a7e7f5214ffcddec010e Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v3] Use correlation statistic in costing bitmap scans..
Same as for an index scan, an uncorrelated bitmap (like modulus) which accesses
a certain number of pages across the entire length of a table should have its
cost estimate heavily weighted by random access, compared to a bitmap scan
which accesses the same number of pages across a small portion of the table.
Note, Tom points out that there are cases where a column could be
tightly-"clumped" without being highly-ordered. Since we have correlation
already, we use that until such time as someone implements a new statistic for
clumpiness.
---
src/backend/optimizer/path/costsize.c | 84 ++++++++++++++++++++++++++++------
src/backend/optimizer/path/indxpath.c | 8 ++--
src/include/nodes/pathnodes.h | 3 ++
src/include/optimizer/cost.h | 2 +-
src/test/regress/expected/identity.out | 2 +-
src/test/regress/expected/join.out | 42 ++++++++++-------
src/test/regress/expected/select.out | 16 +++----
src/test/regress/sql/identity.sql | 2 +-
8 files changed, 110 insertions(+), 49 deletions(-)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index c5f6593..aaac29a 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -549,11 +549,12 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a
+ * bitmap scan doesn't care.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -986,12 +987,33 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ if (pages_fetched >= 2.0) {
+ double correlation, cost_per_page2;
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+
+ // XXX: interpolate based on correlation from seq_page_cost to rand_page_cost?
+ // a highly correlated bitmap scan 1) likely reads fewer pages; and,
+ // 2) at higher "density" (more sequential).
+ // double correlation = get_indexpath_correlation(root, bitmapqual);
+ correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ cost_per_page2 = spc_seq_page_cost +
+ (1-correlation*correlation)*(spc_random_page_cost - spc_seq_page_cost); // XXX: *sqrt(pages_fetched/T) ?
+ // There are two variables: fraction of pages(T) and correlation.
+ // If T is high, giving sequential reads, we want low cost_per_page regardless of correlation.
+ // If correlation is high, we (probably) want low cost per page.
+ // ...the exception is if someone does an =ANY() query on a list of non-consecutive values.
+ // Something like start_time=ANY('2017-01-01', '2017-02-01',...)
+ // which reads small number of rows from pages all across the length of the table.
+ // But index scan doesn't seem to do address that at all, so leave it alone for now.
+ cost_per_page=Min(cost_per_page, cost_per_page2);
+ // cost_per_page=sqrt(cost_per_page*cost_per_page + cost_per_page2*cost_per_page2);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1035,15 +1057,16 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
*selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation) *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1057,11 +1080,13 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
{
*cost = path->total_cost;
*selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation) *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
*selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation) *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1084,8 +1109,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1097,22 +1123,32 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec<=minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
+ // ??? XXX && !IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1128,8 +1164,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1142,23 +1179,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec>=maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5510,8 +5556,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5520,7 +5569,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
@@ -5534,7 +5583,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
/*
* Calculate the number of pages fetched from the heap. Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 37b257c..2a3db34 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1467,8 +1467,8 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
Selectivity nselec;
Selectivity oselec;
- cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
- cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+ cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec, NULL);
+ cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec, NULL);
if (ncost < ocost)
pathinfoarray[i] = pathinfo;
}
@@ -1580,8 +1580,8 @@ path_usage_comparator(const void *a, const void *b)
Selectivity aselec;
Selectivity bselec;
- cost_bitmap_tree_node(pa->path, &acost, &aselec);
- cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+ cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+ cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);
/*
* If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 23a06d7..beaac03 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1181,6 +1181,7 @@ typedef struct IndexPath
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
+ double indexCorrelation;
} IndexPath;
/*
@@ -1261,6 +1262,7 @@ typedef struct BitmapAndPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapOrPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapAndPath;
/*
@@ -1274,6 +1276,7 @@ typedef struct BitmapOrPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapAndPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapOrPath;
/*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index b3d0b4f..9a28665 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas
Path *bitmapqual, double loop_count);
extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
extern void cost_tidscan(Path *path, PlannerInfo *root,
RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info);
extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
diff --git a/src/test/regress/expected/identity.out b/src/test/regress/expected/identity.out
index 36a2393..67aacf5 100644
--- a/src/test/regress/expected/identity.out
+++ b/src/test/regress/expected/identity.out
@@ -316,7 +316,7 @@ SELECT * FROM itest6;
102 |
(3 rows)
-SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6';
+SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2,3,4;
table_name | column_name | is_identity | identity_generation
------------+-------------+-------------+---------------------
itest6 | a | YES | BY DEFAULT
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index b58d560..8299426 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3144,22 +3144,26 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------
Nested Loop
-> Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Nested Loop
- -> Seq Scan on onerow
- -> Seq Scan on onerow onerow_1
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Nested Loop
+ -> Seq Scan on onerow
+ -> Seq Scan on onerow onerow_1
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-> Seq Scan on int4_tbl i1
Filter: (f1 = 0)
-(13 rows)
+(17 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
@@ -3212,18 +3216,22 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------
Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Seq Scan on int4_tbl i1
- Filter: (f1 = 0)
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Seq Scan on int4_tbl i1
+ Filter: (f1 = 0)
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-(9 rows)
+(13 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index c441049..675bf63 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -861,17 +861,13 @@ RESET enable_indexscan;
explain (costs off)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
- QUERY PLAN
---------------------------------------------------------------------------------
+ QUERY PLAN
+---------------------------------------------
Bitmap Heap Scan on onek2
- Recheck Cond: (((unique2 = 11) AND (stringu1 < 'B'::name)) OR (unique1 = 0))
- Filter: (stringu1 < 'B'::name)
- -> BitmapOr
- -> Bitmap Index Scan on onek2_u2_prtl
- Index Cond: (unique2 = 11)
- -> Bitmap Index Scan on onek2_u1_prtl
- Index Cond: (unique1 = 0)
-(8 rows)
+ Recheck Cond: (stringu1 < 'B'::name)
+ Filter: ((unique2 = 11) OR (unique1 = 0))
+ -> Bitmap Index Scan on onek2_u2_prtl
+(4 rows)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
diff --git a/src/test/regress/sql/identity.sql b/src/test/regress/sql/identity.sql
index 4b03d24..8dab134 100644
--- a/src/test/regress/sql/identity.sql
+++ b/src/test/regress/sql/identity.sql
@@ -191,7 +191,7 @@ INSERT INTO itest6 DEFAULT VALUES;
INSERT INTO itest6 DEFAULT VALUES;
SELECT * FROM itest6;
-SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6';
+SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2,3,4;
ALTER TABLE itest6 ALTER COLUMN b SET INCREMENT BY 2; -- fail, not identity
--
2.7.4
Find attached cleaned up patch.
For now, I updated the regress/expected/ output, but I think the test may need to be
adjusted so that it still exercises what it was originally written to test.
Attachments:
v1-0001-Use-correlation-statistic-in-costing-bitmap-scans.patch (text/x-diff; charset=us-ascii)
From 36f547d69b8fee25869d6ef3ef26d327a8ba1205 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v1] Use correlation statistic in costing bitmap scans..
Same as for an index scan, a correlated bitmap which accesses pages across a
small portion of the table should have a cost estimate much less than an
uncorrelated scan (like modulus) across the entire length of the table, the
latter having a high component of random access.
Note, Tom points out that there are cases where a column could be
tightly-"clumped" without being highly-ordered. Since we have correlation
already, we use that until such time as someone implements a new statistic for
clumpiness. This patch only intends to make costing of bitmap heap scan on par
with the same cost of index scan without bitmap.
---
contrib/postgres_fdw/expected/postgres_fdw.out | 15 ++--
src/backend/optimizer/path/costsize.c | 94 +++++++++++++++++++++-----
src/backend/optimizer/path/indxpath.c | 10 ++-
src/include/nodes/pathnodes.h | 3 +
src/include/optimizer/cost.h | 2 +-
src/test/regress/expected/create_index.out | 16 ++---
src/test/regress/expected/join.out | 18 +++--
src/test/regress/sql/create_index.sql | 8 ++-
8 files changed, 118 insertions(+), 48 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c915885..fb4c7f2 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2257,11 +2257,12 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
-> Foreign Scan on public.ft1
Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
- -> Materialize
+ -> Sort
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
+ Sort Key: ft2.c1
-> Foreign Scan on public.ft2
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
- Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) ORDER BY "C 1" ASC NULLS LAST FOR UPDATE
+ Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
-> Sort
Output: ft4.c1, ft4.c2, ft4.c3, ft4.*
Sort Key: ft4.c1
@@ -2276,7 +2277,7 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 4" FOR UPDATE
-> Index Scan using local_tbl_pkey on public.local_tbl
Output: local_tbl.c1, local_tbl.c2, local_tbl.c3, local_tbl.ctid
-(47 rows)
+(48 rows)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 FOR UPDATE;
@@ -3318,10 +3319,12 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Sort Key: t1.c2
-> Nested Loop
Output: t1.c2, qry.sum
- -> Index Scan using t1_pkey on "S 1"."T 1" t1
+ -> Bitmap Heap Scan on "S 1"."T 1" t1
Output: t1."C 1", t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- Index Cond: (t1."C 1" < 100)
+ Recheck Cond: (t1."C 1" < 100)
Filter: (t1.c2 < 3)
+ -> Bitmap Index Scan on t1_pkey
+ Index Cond: (t1."C 1" < 100)
-> Subquery Scan on qry
Output: qry.sum, t2.c1
Filter: ((t1.c2 * 2) = qry.sum)
@@ -3329,7 +3332,7 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Output: (sum((t2.c1 + t1."C 1"))), t2.c1
Relations: Aggregate on (public.ft2 t2)
Remote SQL: SELECT sum(("C 1" + $1::integer)), "C 1" FROM "S 1"."T 1" GROUP BY 2
-(16 rows)
+(18 rows)
select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum from ft2 t2 group by t2.c1) qry where t1.c2 * 2 = qry.sum and t1.c2 < 3 and t1."C 1" < 100 order by 1;
c2 | sum
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033..d39e12d 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -549,11 +549,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a bitmap scan
+ * can't start until the index scan completes, so only cares about its
+ * total cost.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -986,12 +988,31 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ if (pages_fetched >= 2.0) {
+ double correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ double cost_per_page_corr;
+ /*
+ * Interpolate based on pages_fetched and correlation from seq_page_cost to rand_page_cost.
+ * A highly correlated bitmap scan 1) likely reads fewer pages (handled in
+ * compute_bitmap_pages); and, 2) at higher "density" (more sequential).
+ */
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+ cost_per_page_corr = spc_random_page_cost -
+ (spc_random_page_cost - spc_seq_page_cost)
+ * (1-correlation*correlation);
+
+ /*
+ * We expect sequential reads and low cost_per_page when *either*
+ * T is high or correlation is high.
+ */
+ cost_per_page = Min(cost_per_page,cost_per_page_corr);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1035,15 +1056,18 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
- *selec = ((IndexPath *) path)->indexselectivity;
+ if (selec)
+ *selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation)
+ *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1056,12 +1080,18 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
else if (IsA(path, BitmapAndPath))
{
*cost = path->total_cost;
- *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
- *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1084,8 +1114,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1097,22 +1128,31 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec <= minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1128,8 +1168,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1142,23 +1183,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec >= maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5510,8 +5560,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5520,7 +5573,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
@@ -5534,7 +5587,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
/*
* Calculate the number of pages fetched from the heap. Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index f42926e..5d111fc 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1483,11 +1483,9 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
/* duplicate clauseids, keep the cheaper one */
Cost ncost;
Cost ocost;
- Selectivity nselec;
- Selectivity oselec;
- cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
- cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+ cost_bitmap_tree_node(pathinfo->path, &ncost, NULL, NULL);
+ cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, NULL, NULL);
if (ncost < ocost)
pathinfoarray[i] = pathinfo;
}
@@ -1599,8 +1597,8 @@ path_usage_comparator(const void *a, const void *b)
Selectivity aselec;
Selectivity bselec;
- cost_bitmap_tree_node(pa->path, &acost, &aselec);
- cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+ cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+ cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);
/*
* If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3d3be19..19ddf5b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1183,6 +1183,7 @@ typedef struct IndexPath
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
+ double indexCorrelation;
} IndexPath;
/*
@@ -1263,6 +1264,7 @@ typedef struct BitmapAndPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapOrPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapAndPath;
/*
@@ -1276,6 +1278,7 @@ typedef struct BitmapOrPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapAndPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapOrPath;
/*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba..7d5edb5 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas
Path *bitmapqual, double loop_count);
extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
extern void cost_tidscan(Path *path, PlannerInfo *root,
RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info);
extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 6446907..5bd8aae 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1770,24 +1770,23 @@ DROP TABLE onek_with_null;
--
-- Check bitmap index path planning
--
+SET cpu_operator_cost=0;
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
- QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1
- Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 3)) OR ((thousand = 42) AND (tenthous = 42)))
+ Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 42)))
-> BitmapOr
-> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 1))
-> Bitmap Index Scan on tenk1_thous_tenthous
- Index Cond: ((thousand = 42) AND (tenthous = 3))
- -> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 42))
-(9 rows)
+(7 rows)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4
---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx
@@ -1818,6 +1817,7 @@ SELECT count(*) FROM tenk1
10
(1 row)
+RESET cpu_operator_cost;
--
-- Check behavior with duplicate index column contents
--
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b..bc011ee 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3584,24 +3584,28 @@ select b.unique1 from
join int4_tbl i1 on b.thousand = f1
right join int4_tbl i2 on i2.f1 = b.tenthous
order by 1;
- QUERY PLAN
------------------------------------------------------------------------------------------
+ QUERY PLAN
+-------------------------------------------------------------------------------
Sort
Sort Key: b.unique1
- -> Nested Loop Left Join
- -> Seq Scan on int4_tbl i2
+ -> Hash Right Join
+ Hash Cond: (b.tenthous = i2.f1)
-> Nested Loop Left Join
Join Filter: (b.unique1 = 42)
-> Nested Loop
-> Nested Loop
-> Seq Scan on int4_tbl i1
- -> Index Scan using tenk1_thous_tenthous on tenk1 b
- Index Cond: ((thousand = i1.f1) AND (tenthous = i2.f1))
+ -> Bitmap Heap Scan on tenk1 b
+ Recheck Cond: (thousand = i1.f1)
+ -> Bitmap Index Scan on tenk1_thous_tenthous
+ Index Cond: (thousand = i1.f1)
-> Index Scan using tenk1_unique1 on tenk1 a
Index Cond: (unique1 = b.unique2)
-> Index Only Scan using tenk1_thous_tenthous on tenk1 c
Index Cond: (thousand = a.thousand)
-(15 rows)
+ -> Hash
+ -> Seq Scan on int4_tbl i2
+(19 rows)
select b.unique1 from
tenk1 a join tenk1 b on a.unique1 = b.unique2
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 3c0c1cd..6607a7b 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -666,11 +666,13 @@ DROP TABLE onek_with_null;
-- Check bitmap index path planning
--
+SET cpu_operator_cost=0;
+
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
@@ -678,6 +680,8 @@ SELECT count(*) FROM tenk1
SELECT count(*) FROM tenk1
WHERE hundred = 42 AND (thousand = 42 OR thousand = 99);
+RESET cpu_operator_cost;
+
--
-- Check behavior with duplicate index column contents
--
--
2.7.4
On Tue, Jan 7, 2020 at 1:29 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
Find attached cleaned up patch.
For now, I updated the regress/expected/ output, but I think the test may need to be
adjusted so that it still exercises what it was originally written to test.
I have noticed that in "cost_index" we have used the indexCorrelation
for computing the run_cost, not the number of pages, whereas in your
patch you have used it for computing the number of pages. Is there a
reason for the difference?
cost_index
{
..
/*
* Now interpolate based on estimated index order correlation to get total
* disk I/O cost for main table accesses.
*/
csquared = indexCorrelation * indexCorrelation;
run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
}
Patch
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX +
indexCorrelation*indexCorrelation*(pages_fetchedMIN -
pages_fetchedMAX);
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jan 07, 2020 at 09:21:03AM +0530, Dilip Kumar wrote:
On Tue, Jan 7, 2020 at 1:29 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
Find attached cleaned up patch.
For now, I updated the regress/expected/ output, but I think the test may need to be
adjusted so that it still exercises what it was originally written to test.

I have noticed that in "cost_index" we have used the indexCorrelation
for computing the run_cost, not the number of pages, whereas in your
patch you have used it for computing the number of pages. Is there a
reason for the difference?
As Jeff has pointed out, high correlation has two effects in cost_index():
1) fewer pages will be read;
2) the pages will be read more sequentially.
cost_index reuses the pages_fetched variable, so effect (1) isn't particularly
clear from the code, but it essentially does this:
/* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
pages_fetched(MAX) = index_pages_fetched(tuples_fetched,
                                         baserel->pages,
                                         (double) index->pages,
                                         root);
max_IO_cost = pages_fetchedMAX * spc_random_page_cost;

/* min_IO_cost is for the perfectly correlated case (csquared=1) */
pages_fetched(MIN) = ceil(indexSelectivity * (double) baserel->pages);
min_IO_cost = spc_random_page_cost + (pages_fetchedMIN - 1) * spc_seq_page_cost;
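To put rough numbers on that blend (round figures of my own, not taken from any
example above): with spc_random_page_cost = 4.0, spc_seq_page_cost = 1.0, 10000
pages in the uncorrelated estimate and 100 in the correlated one, we get
max_IO_cost = 10000 * 4.0 = 40000 and min_IO_cost = 4.0 + 99 * 1.0 = 103; at
correlation 0.9, csquared = 0.81, so the charged IO cost is
40000 + 0.81 * (103 - 40000) =~ 7683.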
My patch 1) changes compute_bitmap_pages() to interpolate pages_fetched using
the correlation; pages_fetchedMIN is new:
Patch:
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
And 2) it also computes cost_per_page by interpolating between seq_page_cost
and random_page_cost:
+ cost_per_page_corr = spc_random_page_cost -
+ (spc_random_page_cost - spc_seq_page_cost)
+ * (1-correlation*correlation);
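To see how the page-count interpolation in 1) behaves, here is a throwaway
stand-alone sketch (the table size, tuple count, and selectivity are made up,
and it mirrors only the formula quoted above; the real compute_bitmap_pages()
also clamps to T and accounts for lossy bitmap pages, which this ignores):

#include <math.h>
#include <stdio.h>

int
main(void)
{
	double	T = 10000.0;			/* baserel->pages */
	double	tuples_fetched = 10000.0;
	double	indexSelectivity = 0.01;
	double	c;

	for (c = 0.0; c <= 1.0; c += 0.5)
	{
		double	pages_fetchedMAX = (2.0 * T * tuples_fetched) /
			(2.0 * T + tuples_fetched);
		double	pages_fetchedMIN = ceil(indexSelectivity * T);
		double	pages_fetched = pages_fetchedMAX +
			c * c * (pages_fetchedMIN - pages_fetchedMAX);

		printf("correlation=%.1f -> pages_fetched=%.0f\n", c, pages_fetched);
	}
	return 0;
}

With these made-up numbers it moves from ~6667 pages at correlation 0 down to
100 pages at correlation 1.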
Thanks for looking. I'll update the name of pages_fetchedMIN/MAX in my patch
for consistency with cost_index.
Justin
On Mon, Jan 06, 2020 at 11:26:06PM -0600, Justin Pryzby wrote:
As Jeff has pointed out, high correlation has two effects in cost_index():
1) fewer pages will be read;
2) the pages will be read more sequentially.
cost_index reuses the pages_fetched variable, so effect (1) isn't particularly clear from the code,
I tried to make this clearer in 0001.

+ cost_per_page_corr = spc_random_page_cost -
+     (spc_random_page_cost - spc_seq_page_cost)
+     * (1-correlation*correlation);

And fixed a bug: this should be c*c, not 1-c*c.
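A quick endpoint check (a hypothetical snippet of mine, not part of the patch;
the function wrapper and the cost constants are invented for illustration)
shows why: at correlation 1 the per-page cost should collapse to
spc_seq_page_cost, and at correlation 0 it should stay at
spc_random_page_cost, whereas the 1-c*c form had those two endpoints swapped.

#include <stdio.h>

/* Mirrors the corrected cost_per_page_corr expression */
static double
cost_per_page_corr(double correlation, double random_cost, double seq_cost)
{
	return random_cost - (random_cost - seq_cost) * (correlation * correlation);
}

int
main(void)
{
	/* assuming random_page_cost = 4.0 and seq_page_cost = 1.0 */
	printf("correlation 0.0 -> %.1f\n", cost_per_page_corr(0.0, 4.0, 1.0));	/* 4.0: random */
	printf("correlation 1.0 -> %.1f\n", cost_per_page_corr(1.0, 4.0, 1.0));	/* 1.0: sequential */
	return 0;
}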
Attachments:
v4-0001-Make-more-clear-the-computation-of-min-max-IO.patch (text/x-diff; charset=us-ascii)
From d0819177ef1c6f86a588e3d2700ecff638f83b4a Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 8 Jan 2020 19:23:51 -0600
Subject: [PATCH v4 1/2] Make more clear the computation of min/max IO..
..and specifically the double use and effect of correlation.
Avoid re-use of the "pages_fetched" variable
---
src/backend/optimizer/path/costsize.c | 47 +++++++++++++++++++----------------
1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033..bdc23a0 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -491,12 +491,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
csquared;
double spc_seq_page_cost,
spc_random_page_cost;
- Cost min_IO_cost,
+ double min_pages_fetched, /* The min and max page count based on index correlation */
+ max_pages_fetched;
+ Cost min_IO_cost, /* The min and max cost based on index correlation */
max_IO_cost;
QualCost qpqual_cost;
Cost cpu_per_tuple;
double tuples_fetched;
- double pages_fetched;
double rand_heap_pages;
double index_pages;
@@ -579,7 +580,8 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* (just after a CLUSTER, for example), the number of pages fetched should
* be exactly selectivity * table_size. What's more, all but the first
* will be sequential fetches, not the random fetches that occur in the
- * uncorrelated case. So if the number of pages is more than 1, we
+ * uncorrelated case (the index is expected to read fewer pages, *and* each
+ * page read is cheaper). So if the number of pages is more than 1, we
* ought to charge
* spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
* For partially-correlated indexes, we ought to charge somewhere between
@@ -604,17 +606,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* pro-rate the costs for one scan. In this case we assume all the
* fetches are random accesses.
*/
- pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
+ max_pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
- rand_heap_pages = pages_fetched;
+ rand_heap_pages = max_pages_fetched;
- max_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+ max_IO_cost = (max_pages_fetched * spc_random_page_cost) / loop_count;
/*
* In the perfectly correlated case, the number of pages touched by
@@ -626,17 +628,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* where such a plan is actually interesting, only one page would get
* fetched per scan anyway, so it shouldn't matter much.)
*/
- pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+ min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
- pages_fetched = index_pages_fetched(pages_fetched * loop_count,
+ min_pages_fetched = index_pages_fetched(min_pages_fetched * loop_count,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
- min_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+ min_IO_cost = (min_pages_fetched * spc_random_page_cost) / loop_count;
}
else
{
@@ -644,30 +646,31 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* Normal case: apply the Mackert and Lohman formula, and then
* interpolate between that and the correlation-derived result.
*/
- pages_fetched = index_pages_fetched(tuples_fetched,
+
+ /* For the perfectly uncorrelated case (csquared=0) */
+ max_pages_fetched = index_pages_fetched(tuples_fetched,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
- rand_heap_pages = pages_fetched;
+ rand_heap_pages = max_pages_fetched;
- /* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
- max_IO_cost = pages_fetched * spc_random_page_cost;
+ max_IO_cost = max_pages_fetched * spc_random_page_cost;
- /* min_IO_cost is for the perfectly correlated case (csquared=1) */
- pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+ /* For the perfectly correlated case (csquared=1) */
+ min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
- if (pages_fetched > 0)
+ if (min_pages_fetched > 0)
{
min_IO_cost = spc_random_page_cost;
- if (pages_fetched > 1)
- min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
+ if (min_pages_fetched > 1)
+ min_IO_cost += (min_pages_fetched - 1) * spc_seq_page_cost;
}
else
min_IO_cost = 0;
--
2.7.4
v4-0002-Use-correlation-statistic-in-costing-bitmap-scans.patch (text/x-diff; charset=us-ascii)
From 2608f1743c750cefa7c9d7658ea8c2a4543aef5f Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v4 2/2] Use correlation statistic in costing bitmap scans..
Same as for an index scan, a correlated bitmap which accesses pages across a
small portion of the table should have a cost estimate much less than an
uncorrelated scan (like modulus) across the entire length of the table, the
latter having a high component of random access.
Note, Tom points out that there are cases where a column could be
tightly-"clumped" without being highly-ordered. Since we have correlation
already, we use that until such time as someone implements a new statistic for
clumpiness. This patch only intends to make costing of bitmap heap scan on par
with the same cost of index scan without bitmap.
---
contrib/postgres_fdw/expected/postgres_fdw.out | 15 ++--
src/backend/optimizer/path/costsize.c | 94 +++++++++++++++++++++-----
src/backend/optimizer/path/indxpath.c | 10 ++-
src/include/nodes/pathnodes.h | 3 +
src/include/optimizer/cost.h | 2 +-
src/test/regress/expected/create_index.out | 16 ++---
src/test/regress/expected/identity.out | 2 +-
src/test/regress/expected/join.out | 42 +++++++-----
src/test/regress/expected/select.out | 16 ++---
src/test/regress/sql/create_index.sql | 8 ++-
src/test/regress/sql/identity.sql | 2 +-
11 files changed, 140 insertions(+), 70 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 0912d6c..732e66c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2269,11 +2269,12 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
-> Foreign Scan on public.ft1
Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
- -> Materialize
+ -> Sort
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
+ Sort Key: ft2.c1
-> Foreign Scan on public.ft2
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
- Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) ORDER BY "C 1" ASC NULLS LAST FOR UPDATE
+ Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
-> Sort
Output: ft4.c1, ft4.c2, ft4.c3, ft4.*
Sort Key: ft4.c1
@@ -2288,7 +2289,7 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 4" FOR UPDATE
-> Index Scan using local_tbl_pkey on public.local_tbl
Output: local_tbl.c1, local_tbl.c2, local_tbl.c3, local_tbl.ctid
-(47 rows)
+(48 rows)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 FOR UPDATE;
@@ -3330,10 +3331,12 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Sort Key: t1.c2
-> Nested Loop
Output: t1.c2, qry.sum
- -> Index Scan using t1_pkey on "S 1"."T 1" t1
+ -> Bitmap Heap Scan on "S 1"."T 1" t1
Output: t1."C 1", t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- Index Cond: (t1."C 1" < 100)
+ Recheck Cond: (t1."C 1" < 100)
Filter: (t1.c2 < 3)
+ -> Bitmap Index Scan on t1_pkey
+ Index Cond: (t1."C 1" < 100)
-> Subquery Scan on qry
Output: qry.sum, t2.c1
Filter: ((t1.c2 * 2) = qry.sum)
@@ -3341,7 +3344,7 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Output: (sum((t2.c1 + t1."C 1"))), t2.c1
Relations: Aggregate on (public.ft2 t2)
Remote SQL: SELECT sum(("C 1" + $1::integer)), "C 1" FROM "S 1"."T 1" GROUP BY 2
-(16 rows)
+(18 rows)
select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum from ft2 t2 group by t2.c1) qry where t1.c2 * 2 = qry.sum and t1.c2 < 3 and t1."C 1" < 100 order by 1;
c2 | sum
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index bdc23a0..9cd314f 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -550,11 +550,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a bitmap scan
+ * can't start until the index scan completes, so only cares about its
+ * total cost.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -989,12 +991,31 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ if (pages_fetched >= 2.0) {
+ double correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ double cost_per_page_corr;
+ /*
+ * Interpolate based on pages_fetched and correlation from seq_page_cost to rand_page_cost.
+ * A highly correlated bitmap scan 1) likely reads fewer pages (handled in
+ * compute_bitmap_pages); and, 2) at higher "density" (more sequential).
+ */
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+ cost_per_page_corr = spc_random_page_cost -
+ (spc_random_page_cost - spc_seq_page_cost)
+ * (correlation*correlation);
+
+ /*
+ * We expect sequential reads and low cost_per_page when *either*
+ * T is high or correlation is high.
+ */
+ cost_per_page = Min(cost_per_page,cost_per_page_corr);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1038,15 +1059,18 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
- *selec = ((IndexPath *) path)->indexselectivity;
+ if (selec)
+ *selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation)
+ *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1059,12 +1083,18 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
else if (IsA(path, BitmapAndPath))
{
*cost = path->total_cost;
- *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
- *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1087,8 +1117,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1100,22 +1131,31 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec <= minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1131,8 +1171,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1145,23 +1186,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec >= maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5513,8 +5563,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5523,7 +5576,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
@@ -5537,7 +5590,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
/*
* Calculate the number of pages fetched from the heap. Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 2a50272..8ace295 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1464,11 +1464,9 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
/* duplicate clauseids, keep the cheaper one */
Cost ncost;
Cost ocost;
- Selectivity nselec;
- Selectivity oselec;
- cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
- cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+ cost_bitmap_tree_node(pathinfo->path, &ncost, NULL, NULL);
+ cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, NULL, NULL);
if (ncost < ocost)
pathinfoarray[i] = pathinfo;
}
@@ -1580,8 +1578,8 @@ path_usage_comparator(const void *a, const void *b)
Selectivity aselec;
Selectivity bselec;
- cost_bitmap_tree_node(pa->path, &acost, &aselec);
- cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+ cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+ cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);
/*
* If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3d3be19..19ddf5b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1183,6 +1183,7 @@ typedef struct IndexPath
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
+ double indexCorrelation;
} IndexPath;
/*
@@ -1263,6 +1264,7 @@ typedef struct BitmapAndPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapOrPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapAndPath;
/*
@@ -1276,6 +1278,7 @@ typedef struct BitmapOrPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapAndPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapOrPath;
/*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba..7d5edb5 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas
Path *bitmapqual, double loop_count);
extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
extern void cost_tidscan(Path *path, PlannerInfo *root,
RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info);
extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 6446907..5bd8aae 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1770,24 +1770,23 @@ DROP TABLE onek_with_null;
--
-- Check bitmap index path planning
--
+SET cpu_operator_cost=0;
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
- QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1
- Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 3)) OR ((thousand = 42) AND (tenthous = 42)))
+ Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 42)))
-> BitmapOr
-> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 1))
-> Bitmap Index Scan on tenk1_thous_tenthous
- Index Cond: ((thousand = 42) AND (tenthous = 3))
- -> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 42))
-(9 rows)
+(7 rows)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4
---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx
@@ -1818,6 +1817,7 @@ SELECT count(*) FROM tenk1
10
(1 row)
+RESET cpu_operator_cost;
--
-- Check behavior with duplicate index column contents
--
diff --git a/src/test/regress/expected/identity.out b/src/test/regress/expected/identity.out
index 36a2393..2642740 100644
--- a/src/test/regress/expected/identity.out
+++ b/src/test/regress/expected/identity.out
@@ -316,7 +316,7 @@ SELECT * FROM itest6;
102 |
(3 rows)
-SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6';
+SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2;
table_name | column_name | is_identity | identity_generation
------------+-------------+-------------+---------------------
itest6 | a | YES | BY DEFAULT
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b..80cfab8 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3144,22 +3144,26 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------
Nested Loop
-> Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Nested Loop
- -> Seq Scan on onerow
- -> Seq Scan on onerow onerow_1
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Nested Loop
+ -> Seq Scan on onerow
+ -> Seq Scan on onerow onerow_1
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-> Seq Scan on int4_tbl i1
Filter: (f1 = 0)
-(13 rows)
+(17 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
@@ -3212,18 +3216,22 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------
Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Seq Scan on int4_tbl i1
- Filter: (f1 = 0)
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Seq Scan on int4_tbl i1
+ Filter: (f1 = 0)
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-(9 rows)
+(13 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index c441049..675bf63 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -861,17 +861,13 @@ RESET enable_indexscan;
explain (costs off)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
- QUERY PLAN
---------------------------------------------------------------------------------
+ QUERY PLAN
+---------------------------------------------
Bitmap Heap Scan on onek2
- Recheck Cond: (((unique2 = 11) AND (stringu1 < 'B'::name)) OR (unique1 = 0))
- Filter: (stringu1 < 'B'::name)
- -> BitmapOr
- -> Bitmap Index Scan on onek2_u2_prtl
- Index Cond: (unique2 = 11)
- -> Bitmap Index Scan on onek2_u1_prtl
- Index Cond: (unique1 = 0)
-(8 rows)
+ Recheck Cond: (stringu1 < 'B'::name)
+ Filter: ((unique2 = 11) OR (unique1 = 0))
+ -> Bitmap Index Scan on onek2_u2_prtl
+(4 rows)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index 3c0c1cd..6607a7b 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -666,11 +666,13 @@ DROP TABLE onek_with_null;
-- Check bitmap index path planning
--
+SET cpu_operator_cost=0;
+
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
@@ -678,6 +680,8 @@ SELECT count(*) FROM tenk1
SELECT count(*) FROM tenk1
WHERE hundred = 42 AND (thousand = 42 OR thousand = 99);
+RESET cpu_operator_cost;
+
--
-- Check behavior with duplicate index column contents
--
diff --git a/src/test/regress/sql/identity.sql b/src/test/regress/sql/identity.sql
index 4b03d24..a19b99b 100644
--- a/src/test/regress/sql/identity.sql
+++ b/src/test/regress/sql/identity.sql
@@ -191,7 +191,7 @@ INSERT INTO itest6 DEFAULT VALUES;
INSERT INTO itest6 DEFAULT VALUES;
SELECT * FROM itest6;
-SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6';
+SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2;
ALTER TABLE itest6 ALTER COLUMN b SET INCREMENT BY 2; -- fail, not identity
--
2.7.4
There were no comments last month, so rebased, fixed tests, and kicked to next
CF.
--
Justin
Attachments:
v5-0001-Make-more-clear-the-computation-of-min-max-IO.patch (text/x-diff; charset=us-ascii)
From e754a93aff10cb435f5ecef923a810b9edc02d68 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 8 Jan 2020 19:23:51 -0600
Subject: [PATCH v5 1/2] Make more clear the computation of min/max IO..
..and specifically the double use and effect of correlation.
Avoid re-use of the "pages_fetched" variable
---
src/backend/optimizer/path/costsize.c | 47 ++++++++++++++-------------
1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..bdc23a075f 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -491,12 +491,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
csquared;
double spc_seq_page_cost,
spc_random_page_cost;
- Cost min_IO_cost,
+ double min_pages_fetched, /* The min and max page count based on index correlation */
+ max_pages_fetched;
+ Cost min_IO_cost, /* The min and max cost based on index correlation */
max_IO_cost;
QualCost qpqual_cost;
Cost cpu_per_tuple;
double tuples_fetched;
- double pages_fetched;
double rand_heap_pages;
double index_pages;
@@ -579,7 +580,8 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* (just after a CLUSTER, for example), the number of pages fetched should
* be exactly selectivity * table_size. What's more, all but the first
* will be sequential fetches, not the random fetches that occur in the
- * uncorrelated case. So if the number of pages is more than 1, we
+ * uncorrelated case (the index is expected to read fewer pages, *and* each
+ * page read is cheaper). So if the number of pages is more than 1, we
* ought to charge
* spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
* For partially-correlated indexes, we ought to charge somewhere between
@@ -604,17 +606,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* pro-rate the costs for one scan. In this case we assume all the
* fetches are random accesses.
*/
- pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
+ max_pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
- rand_heap_pages = pages_fetched;
+ rand_heap_pages = max_pages_fetched;
- max_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+ max_IO_cost = (max_pages_fetched * spc_random_page_cost) / loop_count;
/*
* In the perfectly correlated case, the number of pages touched by
@@ -626,17 +628,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* where such a plan is actually interesting, only one page would get
* fetched per scan anyway, so it shouldn't matter much.)
*/
- pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+ min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
- pages_fetched = index_pages_fetched(pages_fetched * loop_count,
+ min_pages_fetched = index_pages_fetched(min_pages_fetched * loop_count,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
- min_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+ min_IO_cost = (min_pages_fetched * spc_random_page_cost) / loop_count;
}
else
{
@@ -644,30 +646,31 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* Normal case: apply the Mackert and Lohman formula, and then
* interpolate between that and the correlation-derived result.
*/
- pages_fetched = index_pages_fetched(tuples_fetched,
+
+ /* For the perfectly uncorrelated case (csquared=0) */
+ max_pages_fetched = index_pages_fetched(tuples_fetched,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
- rand_heap_pages = pages_fetched;
+ rand_heap_pages = max_pages_fetched;
- /* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
- max_IO_cost = pages_fetched * spc_random_page_cost;
+ max_IO_cost = max_pages_fetched * spc_random_page_cost;
- /* min_IO_cost is for the perfectly correlated case (csquared=1) */
- pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+ /* For the perfectly correlated case (csquared=1) */
+ min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
- if (pages_fetched > 0)
+ if (min_pages_fetched > 0)
{
min_IO_cost = spc_random_page_cost;
- if (pages_fetched > 1)
- min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
+ if (min_pages_fetched > 1)
+ min_IO_cost += (min_pages_fetched - 1) * spc_seq_page_cost;
}
else
min_IO_cost = 0;
--
2.17.0
v5-0002-Use-correlation-statistic-in-costing-bitmap-scans.patch (text/x-diff; charset=us-ascii)
From 27f18ecc49645c0428ff2e785721d47d816afbde Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v5 2/2] Use correlation statistic in costing bitmap scans..
Same as for an index scan, a correlated bitmap which accesses pages across a
small portion of the table should have a cost estimate much less than an
uncorrelated scan (like modulus) across the entire length of the table, the
latter having a high component of random access.
Note, Tom points out that there are cases where a column could be
tightly-"clumped" without being highly-ordered. Since we have correlation
already, we use that until such time as someone implements a new statistic for
clumpiness. This patch only intends to make costing of bitmap heap scan on par
with the same cost of index scan without bitmap.
---
.../postgres_fdw/expected/postgres_fdw.out | 15 +--
src/backend/optimizer/path/costsize.c | 94 +++++++++++++++----
src/backend/optimizer/path/indxpath.c | 10 +-
src/include/nodes/pathnodes.h | 3 +
src/include/optimizer/cost.h | 2 +-
src/test/regress/expected/create_index.out | 16 ++--
src/test/regress/expected/identity.out | 2 +-
src/test/regress/expected/join.out | 42 +++++----
src/test/regress/expected/plancache.out | 24 +++--
src/test/regress/expected/select.out | 16 ++--
src/test/regress/sql/create_index.sql | 8 +-
src/test/regress/sql/identity.sql | 2 +-
12 files changed, 154 insertions(+), 80 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 62c2697920..22d06725c2 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2269,11 +2269,12 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
-> Foreign Scan on public.ft1
Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
- -> Materialize
+ -> Sort
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
+ Sort Key: ft2.c1
-> Foreign Scan on public.ft2
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
- Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) ORDER BY "C 1" ASC NULLS LAST FOR UPDATE
+ Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
-> Sort
Output: ft4.c1, ft4.c2, ft4.c3, ft4.*
Sort Key: ft4.c1
@@ -2288,7 +2289,7 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 4" FOR UPDATE
-> Index Scan using local_tbl_pkey on public.local_tbl
Output: local_tbl.c1, local_tbl.c2, local_tbl.c3, local_tbl.ctid
-(47 rows)
+(48 rows)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 FOR UPDATE;
@@ -3330,10 +3331,12 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Sort Key: t1.c2
-> Nested Loop
Output: t1.c2, qry.sum
- -> Index Scan using t1_pkey on "S 1"."T 1" t1
+ -> Bitmap Heap Scan on "S 1"."T 1" t1
Output: t1."C 1", t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- Index Cond: (t1."C 1" < 100)
+ Recheck Cond: (t1."C 1" < 100)
Filter: (t1.c2 < 3)
+ -> Bitmap Index Scan on t1_pkey
+ Index Cond: (t1."C 1" < 100)
-> Subquery Scan on qry
Output: qry.sum, t2.c1
Filter: ((t1.c2 * 2) = qry.sum)
@@ -3341,7 +3344,7 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Output: (sum((t2.c1 + t1."C 1"))), t2.c1
Relations: Aggregate on (public.ft2 t2)
Remote SQL: SELECT sum(("C 1" + $1::integer)), "C 1" FROM "S 1"."T 1" GROUP BY 2
-(16 rows)
+(18 rows)
select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum from ft2 t2 group by t2.c1) qry where t1.c2 * 2 = qry.sum and t1.c2 < 3 and t1."C 1" < 100 order by 1;
c2 | sum
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index bdc23a075f..9cd314f0b5 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -550,11 +550,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a bitmap scan
+ * can't start until the index scan completes, so only cares about its
+ * total cost.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -989,12 +991,31 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ if (pages_fetched >= 2.0) {
+ double correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ double cost_per_page_corr;
+ /*
+ * Interpolate based on pages_fetched and correlation from seq_page_cost to rand_page_cost.
+ * A highly correlated bitmap scan 1) likely reads fewer pages (handled in
+ * compute_bitmap_pages); and, 2) at higher "density" (more sequential).
+ */
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+ cost_per_page_corr = spc_random_page_cost -
+ (spc_random_page_cost - spc_seq_page_cost)
+ * (correlation*correlation);
+
+ /*
+ * We expect sequential reads and low cost_per_page when *either*
+ * T is high or correlation is high.
+ */
+ cost_per_page = Min(cost_per_page,cost_per_page_corr);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1038,15 +1059,18 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
- *selec = ((IndexPath *) path)->indexselectivity;
+ if (selec)
+ *selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation)
+ *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1059,12 +1083,18 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
else if (IsA(path, BitmapAndPath))
{
*cost = path->total_cost;
- *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
- *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1087,8 +1117,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1100,22 +1131,31 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec <= minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1131,8 +1171,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1145,23 +1186,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec >= maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5513,8 +5563,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5523,7 +5576,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
@@ -5537,7 +5590,12 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
/*
* Calculate the number of pages fetched from the heap. Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 2a50272da6..8ace2952e8 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1464,11 +1464,9 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
/* duplicate clauseids, keep the cheaper one */
Cost ncost;
Cost ocost;
- Selectivity nselec;
- Selectivity oselec;
- cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
- cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+ cost_bitmap_tree_node(pathinfo->path, &ncost, NULL, NULL);
+ cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, NULL, NULL);
if (ncost < ocost)
pathinfoarray[i] = pathinfo;
}
@@ -1580,8 +1578,8 @@ path_usage_comparator(const void *a, const void *b)
Selectivity aselec;
Selectivity bselec;
- cost_bitmap_tree_node(pa->path, &acost, &aselec);
- cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+ cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+ cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);
/*
* If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 0ceb809644..d407f85e6b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1183,6 +1183,7 @@ typedef struct IndexPath
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
+ double indexCorrelation;
} IndexPath;
/*
@@ -1263,6 +1264,7 @@ typedef struct BitmapAndPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapOrPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapAndPath;
/*
@@ -1276,6 +1278,7 @@ typedef struct BitmapOrPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapAndPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapOrPath;
/*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..7d5edb5411 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -79,7 +79,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas
Path *bitmapqual, double loop_count);
extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
extern void cost_tidscan(Path *path, PlannerInfo *root,
RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info);
extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index ae95bb38a6..f2bc6bb214 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1795,24 +1795,23 @@ DROP TABLE onek_with_null;
--
-- Check bitmap index path planning
--
+SET cpu_operator_cost=0;
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
- QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1
- Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 3)) OR ((thousand = 42) AND (tenthous = 42)))
+ Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 42)))
-> BitmapOr
-> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 1))
- -> Bitmap Index Scan on tenk1_thous_tenthous
- Index Cond: ((thousand = 42) AND (tenthous = 3))
-> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 42))
-(9 rows)
+(7 rows)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4
---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx
@@ -1843,6 +1842,7 @@ SELECT count(*) FROM tenk1
10
(1 row)
+RESET cpu_operator_cost;
--
-- Check behavior with duplicate index column contents
--
diff --git a/src/test/regress/expected/identity.out b/src/test/regress/expected/identity.out
index 7322b28765..b90d634b53 100644
--- a/src/test/regress/expected/identity.out
+++ b/src/test/regress/expected/identity.out
@@ -316,7 +316,7 @@ SELECT * FROM itest6;
102 |
(3 rows)
-SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6';
+SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2;
table_name | column_name | is_identity | identity_generation
------------+-------------+-------------+---------------------
itest6 | a | YES | BY DEFAULT
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..80cfab8e88 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3144,22 +3144,26 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------
Nested Loop
-> Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Nested Loop
- -> Seq Scan on onerow
- -> Seq Scan on onerow onerow_1
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Nested Loop
+ -> Seq Scan on onerow
+ -> Seq Scan on onerow onerow_1
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-> Seq Scan on int4_tbl i1
Filter: (f1 = 0)
-(13 rows)
+(17 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
@@ -3212,18 +3216,22 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------
Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Seq Scan on int4_tbl i1
- Filter: (f1 = 0)
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Seq Scan on int4_tbl i1
+ Filter: (f1 = 0)
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-(9 rows)
+(13 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 7d289b8c5e..40c243be9f 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -296,12 +296,14 @@ explain (costs off) execute test_mode_pp(2);
-- force generic plan
set plan_cache_mode to force_generic_plan;
explain (costs off) execute test_mode_pp(2);
- QUERY PLAN
------------------------------
+ QUERY PLAN
+--------------------------------------------------
Aggregate
- -> Seq Scan on test_mode
- Filter: (a = $1)
-(3 rows)
+ -> Bitmap Heap Scan on test_mode
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on test_mode_a_idx
+ Index Cond: (a = $1)
+(5 rows)
-- get to generic plan by 5 executions
set plan_cache_mode to auto;
@@ -337,12 +339,14 @@ execute test_mode_pp(1); -- 5x
-- we should now get a really bad plan
explain (costs off) execute test_mode_pp(2);
- QUERY PLAN
------------------------------
+ QUERY PLAN
+--------------------------------------------------
Aggregate
- -> Seq Scan on test_mode
- Filter: (a = $1)
-(3 rows)
+ -> Bitmap Heap Scan on test_mode
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on test_mode_a_idx
+ Index Cond: (a = $1)
+(5 rows)
-- but we can force a custom plan
set plan_cache_mode to force_custom_plan;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index c441049f41..675bf632d0 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -861,17 +861,13 @@ RESET enable_indexscan;
explain (costs off)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
- QUERY PLAN
---------------------------------------------------------------------------------
+ QUERY PLAN
+---------------------------------------------
Bitmap Heap Scan on onek2
- Recheck Cond: (((unique2 = 11) AND (stringu1 < 'B'::name)) OR (unique1 = 0))
- Filter: (stringu1 < 'B'::name)
- -> BitmapOr
- -> Bitmap Index Scan on onek2_u2_prtl
- Index Cond: (unique2 = 11)
- -> Bitmap Index Scan on onek2_u1_prtl
- Index Cond: (unique1 = 0)
-(8 rows)
+ Recheck Cond: (stringu1 < 'B'::name)
+ Filter: ((unique2 = 11) OR (unique1 = 0))
+ -> Bitmap Index Scan on onek2_u2_prtl
+(4 rows)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index c3246cb296..7e09fc895e 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -691,11 +691,13 @@ DROP TABLE onek_with_null;
-- Check bitmap index path planning
--
+SET cpu_operator_cost=0;
+
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
@@ -703,6 +705,8 @@ SELECT count(*) FROM tenk1
SELECT count(*) FROM tenk1
WHERE hundred = 42 AND (thousand = 42 OR thousand = 99);
+RESET cpu_operator_cost;
+
--
-- Check behavior with duplicate index column contents
--
diff --git a/src/test/regress/sql/identity.sql b/src/test/regress/sql/identity.sql
index b4cdd21bdd..5814197e64 100644
--- a/src/test/regress/sql/identity.sql
+++ b/src/test/regress/sql/identity.sql
@@ -191,7 +191,7 @@ INSERT INTO itest6 DEFAULT VALUES;
INSERT INTO itest6 DEFAULT VALUES;
SELECT * FROM itest6;
-SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6';
+SELECT table_name, column_name, is_identity, identity_generation FROM information_schema.columns WHERE table_name = 'itest6' ORDER BY 1,2;
ALTER TABLE itest6 ALTER COLUMN b SET INCREMENT BY 2; -- fail, not identity
--
2.17.0
Status update for a commitfest entry
According to cfbot, the patch fails to apply. Could you please send a rebased version?
I wonder why this patch has been hanging for so long without a review. Maybe it would help to move the discussion forward if you provided more examples of queries that can benefit from this improvement?
The first patch is simply a refactoring and I don't see any possible objections against it.
The second patch also looks fine to me. The logic is understandable and the code is neat.
It wouldn't hurt to add a comment for this computation, though.
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
The new status of this patch is: Waiting on Author
On Fri, Nov 06, 2020 at 01:51:26PM +0000, Anastasia Lubennikova wrote:
I wonder why this patch has been hanging for so long without a review. Maybe it would help to move the discussion forward if you provided more examples of queries that can benefit from this improvement?
Thanks for looking.
The explanation is that the planner currently gives index scans a cost
"discount" for correlation. Jeff Janes has pointed out that there are two
discounts: 1) fewer pages are read; and 2) a lower cost per page. This patch
aims to give bitmap scans the same benefits. A "dense" bitmap will read fewer
pages, and more sequentially.
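Schematically, the two discounts combine like the standalone C sketch below; it mirrors the formulas the patch puts in compute_bitmap_pages() and cost_bitmap_heap_scan(), but the function itself is only an illustration, not planner code.

#include <math.h>

/*
 * Illustrative sketch only: combines the patch's two interpolations.
 * T = heap pages in the table, tuples_fetched = estimated tuples,
 * selectivity = index selectivity, correlation = index correlation.
 */
double
bitmap_heap_io_cost(double tuples_fetched, double selectivity, double T,
					double correlation,
					double spc_random_page_cost, double spc_seq_page_cost)
{
	double		csquared = correlation * correlation;

	/* Mackert-Lohman estimate: the perfectly uncorrelated case (csquared=0) */
	double		pages_max = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);

	/* Perfectly correlated case (csquared=1): selectivity * table pages */
	double		pages_min = ceil(selectivity * T);

	/* Discount 1: fewer pages fetched as correlation rises */
	double		pages_fetched = pages_max - csquared * (pages_max - pages_min);

	/*
	 * Discount 2: cheaper cost per page; take whichever interpolation
	 * (bitmap density or correlation) predicts more sequential access.
	 */
	double		cost_per_page;

	if (pages_fetched >= 2.0)
	{
		double		by_density = spc_random_page_cost -
			(spc_random_page_cost - spc_seq_page_cost) * sqrt(pages_fetched / T);
		double		by_correlation = spc_random_page_cost -
			(spc_random_page_cost - spc_seq_page_cost) * csquared;

		cost_per_page = by_density < by_correlation ? by_density : by_correlation;
	}
	else
		cost_per_page = spc_random_page_cost;

	return pages_fetched * cost_per_page;
}

For the example table below, where column "a" has correlation ~0.995, the csquared terms pull both the page count and the per-page cost close to the sequential case.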
Tom pointed out that the "correlation" isn't a perfect metric for this, since
the index might be "clumpy" without being highly ordered; the lack of ordering
doesn't matter for bitmap scans, which access the heap in physical order
anyway. In those cases, the correlation logic would fail to reduce the
estimated cost of the bitmap scan, even though the actual cost is lower (same
as now). This is true, but my goal is to give bitmap scans at least the same
benefit as index scans, even if there are additional "discounts" which aren't
yet being considered.
This was an issue for me in the past when the planner chose to scan an index,
but it was slower than projected (for reasons unrelated to this patch). Making
the bitmap cost account for high correlation was one step towards addressing
that. Since then, we switched to BRIN indexes, which force bitmap scans.
/messages/by-id/20160524173914.GA11880@telsasoft.com
Here's an example.
CREATE TABLE t AS SELECT a,b FROM generate_series(1,999) a, generate_series(1,999) b ORDER BY a+b/9.9;
CREATE INDEX ON t(a);
postgres=# SELECT attname, correlation FROM pg_stats WHERE tablename ='t';
a | 0.9951819
b | 0.10415093
postgres=# explain analyze SELECT * FROM t WHERE a BETWEEN 55 AND 77;
Index Scan using t_a_idx on t (cost=0.42..810.89 rows=22683 width=8) (actual time=0.292..66.657 rows=22977 loops=1)
vs (without my patch, with SET enable_indexscan=off):
Bitmap Heap Scan on t (cost=316.93..5073.17 rows=22683 width=8) (actual time=10.810..26.633 rows=22977 loops=1)
vs (with my patch, with SET enable_indexscan=off):
postgres=# explain analyze SELECT * FROM t WHERE a BETWEEN 55 AND 77;
Bitmap Heap Scan on t (cost=316.93..823.84 rows=22683 width=8) (actual time=10.742..33.279 rows=22977 loops=1)
So the bitmap scan is cheaper, but its cost estimate is a lot higher. My patch
improves that but doesn't completely "fix" it - the bitmap scan is still costed
as more expensive, even though it happens to be faster. This is probably not
even a particularly good example, as it's a small table cached in RAM. There
are always going to be cases like this, certainly near the costs where the plan
changes "shape". I think a cost difference of 10 here is very reasonable
(cpu_operator_cost, probably), but a cost difference of 5x is not.
Not many regression tests changed, probably partly because bitmap scans have an
overhead (the heap scan cannot start until after the index scan finishes) and
the regression tests avoid large tables.
If there's no interest in the patch, I guess we should just close it rather
than letting it rot.
The first patch is simply a refactoring and I don't see any possible objections against it.
The second patch also looks fine to me. The logic is understandable and the code is neat.
It wouldn't hurt to add a comment for this computation, though.
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
You're right. It's like this:
// interpolate between c==0: pages_fetched=max and c==1: pages_fetched=min
pages_fetched = min + (max-min)*(1-c**2)
pages_fetched = min + max*(1-c**2) - min*(1-c**2)
pages_fetched = max*(1-c**2) + min - min*(1-c**2)
pages_fetched = max*(1-c**2) + min*(c**2)
pages_fetched = max - max*c**2 + min*(c**2)
pages_fetched = max + min*(c**2) - max*c**2
pages_fetched = max + c**2 * (min-max)
I'm not sure if there's a reason why it's written like that, but (min-max)
looks odd, so I wrote it like:
pages_fetched = max - c**2 * (max-min)
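As a quick sanity check that the rewrite doesn't change anything, here's a throwaway C snippet; the min/max page counts are made-up numbers and only the 0.9951819 correlation comes from the example above.

#include <assert.h>
#include <math.h>
#include <stdio.h>

int
main(void)
{
	double		min = 100.0;	/* made-up "perfectly correlated" page count */
	double		max = 5000.0;	/* made-up Mackert-Lohman page count */
	double		c = 0.9951819;	/* correlation of column "a" in the example */
	double		csq = c * c;

	double		as_patched = max + csq * (min - max);	/* form in the patch */
	double		rewritten = max - csq * (max - min);	/* rewritten form */

	printf("%.6f %.6f\n", as_patched, rewritten);
	assert(fabs(as_patched - rewritten) < 1e-9);
	return 0;
}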
The new status of this patch is: Waiting on Author
Attachments:
v6-0001-Make-more-clear-the-computation-of-min-max-IO.patch (text/x-diff; charset=us-ascii)
From af1e640af2b1a80430191a38b80dde1f2b750757 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Wed, 8 Jan 2020 19:23:51 -0600
Subject: [PATCH v6 1/2] Make more clear the computation of min/max IO..
..and specifically the double use and effect of correlation.
Avoid re-use of the "pages_fetched" variable
---
src/backend/optimizer/path/costsize.c | 47 ++++++++++++++-------------
1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f1dfdc1a4a..083448def7 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -503,12 +503,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
csquared;
double spc_seq_page_cost,
spc_random_page_cost;
- Cost min_IO_cost,
+ double min_pages_fetched, /* The min and max page count based on index correlation */
+ max_pages_fetched;
+ Cost min_IO_cost, /* The min and max cost based on index correlation */
max_IO_cost;
QualCost qpqual_cost;
Cost cpu_per_tuple;
double tuples_fetched;
- double pages_fetched;
double rand_heap_pages;
double index_pages;
@@ -591,7 +592,8 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* (just after a CLUSTER, for example), the number of pages fetched should
* be exactly selectivity * table_size. What's more, all but the first
* will be sequential fetches, not the random fetches that occur in the
- * uncorrelated case. So if the number of pages is more than 1, we
+ * uncorrelated case (the index is expected to read fewer pages, *and* each
+ * page read is cheaper). So if the number of pages is more than 1, we
* ought to charge
* spc_random_page_cost + (pages_fetched - 1) * spc_seq_page_cost
* For partially-correlated indexes, we ought to charge somewhere between
@@ -616,17 +618,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* pro-rate the costs for one scan. In this case we assume all the
* fetches are random accesses.
*/
- pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
+ max_pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
- rand_heap_pages = pages_fetched;
+ rand_heap_pages = max_pages_fetched;
- max_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+ max_IO_cost = (max_pages_fetched * spc_random_page_cost) / loop_count;
/*
* In the perfectly correlated case, the number of pages touched by
@@ -638,17 +640,17 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* where such a plan is actually interesting, only one page would get
* fetched per scan anyway, so it shouldn't matter much.)
*/
- pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+ min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
- pages_fetched = index_pages_fetched(pages_fetched * loop_count,
+ min_pages_fetched = index_pages_fetched(min_pages_fetched * loop_count,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
- min_IO_cost = (pages_fetched * spc_random_page_cost) / loop_count;
+ min_IO_cost = (min_pages_fetched * spc_random_page_cost) / loop_count;
}
else
{
@@ -656,30 +658,31 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
* Normal case: apply the Mackert and Lohman formula, and then
* interpolate between that and the correlation-derived result.
*/
- pages_fetched = index_pages_fetched(tuples_fetched,
+
+ /* For the perfectly uncorrelated case (csquared=0) */
+ max_pages_fetched = index_pages_fetched(tuples_fetched,
baserel->pages,
(double) index->pages,
root);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ max_pages_fetched = ceil(max_pages_fetched * (1.0 - baserel->allvisfrac));
- rand_heap_pages = pages_fetched;
+ rand_heap_pages = max_pages_fetched;
- /* max_IO_cost is for the perfectly uncorrelated case (csquared=0) */
- max_IO_cost = pages_fetched * spc_random_page_cost;
+ max_IO_cost = max_pages_fetched * spc_random_page_cost;
- /* min_IO_cost is for the perfectly correlated case (csquared=1) */
- pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
+ /* For the perfectly correlated case (csquared=1) */
+ min_pages_fetched = ceil(indexSelectivity * (double) baserel->pages);
if (indexonly)
- pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));
+ min_pages_fetched = ceil(min_pages_fetched * (1.0 - baserel->allvisfrac));
- if (pages_fetched > 0)
+ if (min_pages_fetched > 0)
{
min_IO_cost = spc_random_page_cost;
- if (pages_fetched > 1)
- min_IO_cost += (pages_fetched - 1) * spc_seq_page_cost;
+ if (min_pages_fetched > 1)
+ min_IO_cost += (min_pages_fetched - 1) * spc_seq_page_cost;
}
else
min_IO_cost = 0;
--
2.17.0
v6-0002-Use-correlation-statistic-in-costing-bitmap-scans.patch (text/x-diff; charset=us-ascii)
From 2a17c1ce20d53b8140fe9403a22fbee6efc02770 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue, 1 Jan 2019 16:17:28 -0600
Subject: [PATCH v6 2/2] Use correlation statistic in costing bitmap scans..
Same as for an index scan, a correlated bitmap which accesses pages across a
small portion of the table should have a cost estimate much less than an
uncorrelated scan (like modulus) across the entire length of the table, the
latter having a high component of random access.
Note, Tom points out that there are cases where a column could be
tightly-"clumped" without being highly-ordered. Since we have correlation
already, we use that until such time as someone implements a new statistic for
clumpiness. This patch only intends to make costing of bitmap heap scan on par
with the same cost of index scan without bitmap.
---
.../postgres_fdw/expected/postgres_fdw.out | 15 +--
src/backend/optimizer/path/costsize.c | 98 +++++++++++++++----
src/backend/optimizer/path/indxpath.c | 10 +-
src/include/nodes/pathnodes.h | 3 +
src/include/optimizer/cost.h | 2 +-
src/test/regress/expected/create_index.out | 14 ++-
src/test/regress/expected/join.out | 42 ++++----
src/test/regress/expected/plancache.out | 24 +++--
src/test/regress/expected/select.out | 16 ++-
src/test/regress/sql/create_index.sql | 4 +-
10 files changed, 150 insertions(+), 78 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 2d88d06358..9a96f24d43 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2261,11 +2261,12 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
-> Foreign Scan on public.ft1
Output: ft1.c1, ft1.c2, ft1.c3, ft1.c4, ft1.c5, ft1.c6, ft1.c7, ft1.c8, ft1.*
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
- -> Materialize
+ -> Sort
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
+ Sort Key: ft2.c1
-> Foreign Scan on public.ft2
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft2.*
- Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) ORDER BY "C 1" ASC NULLS LAST FOR UPDATE
+ Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 100)) FOR UPDATE
-> Sort
Output: ft4.c1, ft4.c2, ft4.c3, ft4.*
Sort Key: ft4.c1
@@ -2280,7 +2281,7 @@ SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = f
Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 4" FOR UPDATE
-> Index Scan using local_tbl_pkey on public.local_tbl
Output: local_tbl.c1, local_tbl.c2, local_tbl.c3, local_tbl.ctid
-(47 rows)
+(48 rows)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 FOR UPDATE;
@@ -3322,10 +3323,12 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Sort Key: t1.c2
-> Nested Loop
Output: t1.c2, qry.sum
- -> Index Scan using t1_pkey on "S 1"."T 1" t1
+ -> Bitmap Heap Scan on "S 1"."T 1" t1
Output: t1."C 1", t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- Index Cond: (t1."C 1" < 100)
+ Recheck Cond: (t1."C 1" < 100)
Filter: (t1.c2 < 3)
+ -> Bitmap Index Scan on t1_pkey
+ Index Cond: (t1."C 1" < 100)
-> Subquery Scan on qry
Output: qry.sum, t2.c1
Filter: ((t1.c2 * 2) = qry.sum)
@@ -3333,7 +3336,7 @@ select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum fr
Output: (sum((t2.c1 + t1."C 1"))), t2.c1
Relations: Aggregate on (public.ft2 t2)
Remote SQL: SELECT sum(("C 1" + $1::integer)), "C 1" FROM "S 1"."T 1" GROUP BY 2
-(16 rows)
+(18 rows)
select c2, sum from "S 1"."T 1" t1, lateral (select sum(t2.c1 + t1."C 1") sum from ft2 t2 group by t2.c1) qry where t1.c2 * 2 = qry.sum and t1.c2 < 3 and t1."C 1" < 100 order by 1;
c2 | sum
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 083448def7..db90451c73 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -562,11 +562,13 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count,
/*
* Save amcostestimate's results for possible use in bitmap scan planning.
- * We don't bother to save indexStartupCost or indexCorrelation, because a
- * bitmap scan doesn't care about either.
+ * We don't bother to save indexStartupCost, because a bitmap scan
+ * can't start until the index scan completes, so only cares about its
+ * total cost.
*/
path->indextotalcost = indexTotalCost;
path->indexselectivity = indexSelectivity;
+ path->indexCorrelation = indexCorrelation;
/* all costs for touching index itself included here */
startup_cost += indexStartupCost;
@@ -1001,12 +1003,31 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
+ * Note this works at PAGE granularity, so even if we read 1% of a
+ * table's tuples, if we have to read nearly every page, it should be
+ * considered sequential.
*/
- if (pages_fetched >= 2.0)
+ if (pages_fetched >= 2.0) {
+ double correlation = ((IndexPath *)bitmapqual)->indexCorrelation;
+ double cost_per_page_corr;
+ /*
+ * Interpolate based on pages_fetched and correlation from seq_page_cost to rand_page_cost.
+ * A highly correlated bitmap scan 1) likely reads fewer pages (handled in
+ * compute_bitmap_pages); and, 2) at higher "density" (more sequential).
+ */
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
- else
+ cost_per_page_corr = spc_random_page_cost -
+ (spc_random_page_cost - spc_seq_page_cost)
+ * (correlation*correlation);
+
+ /*
+ * We expect sequential reads and low cost_per_page when *either*
+ * T is high or correlation is high.
+ */
+ cost_per_page = Min(cost_per_page,cost_per_page_corr);
+ } else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;
@@ -1050,15 +1071,18 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
/*
* cost_bitmap_tree_node
- * Extract cost and selectivity from a bitmap tree node (index/and/or)
+ * Extract cost, selectivity, and correlation from a bitmap tree node (index/and/or)
*/
void
-cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
+cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation)
{
if (IsA(path, IndexPath))
{
*cost = ((IndexPath *) path)->indextotalcost;
- *selec = ((IndexPath *) path)->indexselectivity;
+ if (selec)
+ *selec = ((IndexPath *) path)->indexselectivity;
+ if (correlation)
+ *correlation = ((IndexPath *) path)->indexCorrelation;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
@@ -1071,12 +1095,18 @@ cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
else if (IsA(path, BitmapAndPath))
{
*cost = path->total_cost;
- *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapAndPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapAndPath *) path)->bitmapcorrelation;
}
else if (IsA(path, BitmapOrPath))
{
*cost = path->total_cost;
- *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (selec)
+ *selec = ((BitmapOrPath *) path)->bitmapselectivity;
+ if (correlation)
+ *correlation = ((BitmapOrPath *) path)->bitmapcorrelation;
}
else
{
@@ -1099,8 +1129,9 @@ void
cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, minsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate AND selectivity on the assumption that the inputs are
@@ -1112,22 +1143,31 @@ cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root)
* definitely too simplistic?
*/
totalCost = 0.0;
- selec = 1.0;
+ minsubselec = selec = 1.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec *= subselec;
+ /* For an AND node, use the correlation of its most-selective subpath */
+ if (subselec <= minsubselec) {
+ correlation = subcorrelation;
+ minsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = selec;
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -1143,8 +1183,9 @@ void
cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
{
Cost totalCost;
- Selectivity selec;
+ Selectivity selec, maxsubselec;
ListCell *l;
+ double correlation;
/*
* We estimate OR selectivity on the assumption that the inputs are
@@ -1157,23 +1198,32 @@ cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root)
* optimized out when the inputs are BitmapIndexScans.
*/
totalCost = 0.0;
- selec = 0.0;
+ maxsubselec = selec = 0.0;
+ correlation = 0;
foreach(l, path->bitmapquals)
{
Path *subpath = (Path *) lfirst(l);
Cost subCost;
Selectivity subselec;
+ double subcorrelation;
- cost_bitmap_tree_node(subpath, &subCost, &subselec);
+ cost_bitmap_tree_node(subpath, &subCost, &subselec, &subcorrelation);
selec += subselec;
+ /* For an OR node, use the correlation of its least-selective subpath */
+ if (subselec >= maxsubselec) {
+ correlation = subcorrelation;
+ maxsubselec = subselec;
+ }
+
totalCost += subCost;
if (l != list_head(path->bitmapquals) &&
!IsA(subpath, IndexPath))
totalCost += 100.0 * cpu_operator_cost;
}
path->bitmapselectivity = Min(selec, 1.0);
+ path->bitmapcorrelation = correlation;
path->path.rows = 0; /* per above, not used */
path->path.startup_cost = totalCost;
path->path.total_cost = totalCost;
@@ -5806,8 +5856,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
{
Cost indexTotalCost;
Selectivity indexSelectivity;
+ double indexCorrelation;
double T;
- double pages_fetched;
+ double pages_fetched,
+ pages_fetchedMIN,
+ pages_fetchedMAX;
double tuples_fetched;
double heap_pages;
long maxentries;
@@ -5816,7 +5869,7 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
*/
- cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
+ cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity, &indexCorrelation);
/*
* Estimate number of main-table pages fetched.
@@ -5830,7 +5883,16 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
*/
- pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+ pages_fetchedMAX = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
+
+ /* pages_fetchedMIN is for the perfectly correlated case (csquared=1) */
+ pages_fetchedMIN = ceil(indexSelectivity * (double) baserel->pages);
+
+ /*
+ * interpolate between MIN and MAX pages based on correlation**2
+ * This is the same computation as in cost_index().
+ */
+ pages_fetched = pages_fetchedMAX - indexCorrelation*indexCorrelation*(pages_fetchedMAX - pages_fetchedMIN);
/*
* Calculate the number of pages fetched from the heap. Then based on
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index bcb1bc6097..773277ce5b 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1454,11 +1454,9 @@ choose_bitmap_and(PlannerInfo *root, RelOptInfo *rel, List *paths)
/* duplicate clauseids, keep the cheaper one */
Cost ncost;
Cost ocost;
- Selectivity nselec;
- Selectivity oselec;
- cost_bitmap_tree_node(pathinfo->path, &ncost, &nselec);
- cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, &oselec);
+ cost_bitmap_tree_node(pathinfo->path, &ncost, NULL, NULL);
+ cost_bitmap_tree_node(pathinfoarray[i]->path, &ocost, NULL, NULL);
if (ncost < ocost)
pathinfoarray[i] = pathinfo;
}
@@ -1570,8 +1568,8 @@ path_usage_comparator(const void *a, const void *b)
Selectivity aselec;
Selectivity bselec;
- cost_bitmap_tree_node(pa->path, &acost, &aselec);
- cost_bitmap_tree_node(pb->path, &bcost, &bselec);
+ cost_bitmap_tree_node(pa->path, &acost, &aselec, NULL);
+ cost_bitmap_tree_node(pb->path, &bcost, &bselec, NULL);
/*
* If costs are the same, sort by selectivity.
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 8f62d61702..4ff0b17ada 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1209,6 +1209,7 @@ typedef struct IndexPath
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
+ double indexCorrelation;
} IndexPath;
/*
@@ -1289,6 +1290,7 @@ typedef struct BitmapAndPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapOrPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapAndPath;
/*
@@ -1302,6 +1304,7 @@ typedef struct BitmapOrPath
Path path;
List *bitmapquals; /* IndexPaths and BitmapAndPaths */
Selectivity bitmapselectivity;
+ double bitmapcorrelation;
} BitmapOrPath;
/*
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 6141654e47..992eb4856d 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -80,7 +80,7 @@ extern void cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *bas
Path *bitmapqual, double loop_count);
extern void cost_bitmap_and_node(BitmapAndPath *path, PlannerInfo *root);
extern void cost_bitmap_or_node(BitmapOrPath *path, PlannerInfo *root);
-extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec);
+extern void cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec, double *correlation);
extern void cost_tidscan(Path *path, PlannerInfo *root,
RelOptInfo *baserel, List *tidquals, ParamPathInfo *param_info);
extern void cost_subqueryscan(SubqueryScanPath *path, PlannerInfo *root,
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 93a8736a3f..bfb78ec8eb 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1797,22 +1797,20 @@ DROP TABLE onek_with_null;
--
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
- QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1
- Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 3)) OR ((thousand = 42) AND (tenthous = 42)))
+ Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 42)))
-> BitmapOr
-> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 1))
- -> Bitmap Index Scan on tenk1_thous_tenthous
- Index Cond: ((thousand = 42) AND (tenthous = 3))
-> Bitmap Index Scan on tenk1_thous_tenthous
Index Cond: ((thousand = 42) AND (tenthous = 42))
-(9 rows)
+(7 rows)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even | stringu1 | stringu2 | string4
---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
42 | 5530 | 0 | 2 | 2 | 2 | 42 | 42 | 42 | 42 | 42 | 84 | 85 | QBAAAA | SEIAAA | OOOOxx
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 6c9a5e26dd..d4b0443ba9 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3144,22 +3144,26 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------------
Nested Loop
-> Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Nested Loop
- -> Seq Scan on onerow
- -> Seq Scan on onerow onerow_1
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Nested Loop
+ -> Seq Scan on onerow
+ -> Seq Scan on onerow onerow_1
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-> Seq Scan on int4_tbl i1
Filter: (f1 = 0)
-(13 rows)
+(17 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
@@ -3212,18 +3216,22 @@ select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
left join tenk1 t2
on (subq1.y1 = t2.unique1)
where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
- QUERY PLAN
------------------------------------------------------------------
+ QUERY PLAN
+------------------------------------------------------
Nested Loop
Join Filter: (t1.stringu1 > t2.stringu2)
- -> Nested Loop
- -> Seq Scan on int4_tbl i1
- Filter: (f1 = 0)
- -> Index Scan using tenk1_unique2 on tenk1 t1
- Index Cond: ((unique2 = (11)) AND (unique2 < 42))
+ -> Hash Join
+ Hash Cond: (t1.unique2 = (11))
+ -> Bitmap Heap Scan on tenk1 t1
+ Recheck Cond: (unique2 < 42)
+ -> Bitmap Index Scan on tenk1_unique2
+ Index Cond: (unique2 < 42)
+ -> Hash
+ -> Seq Scan on int4_tbl i1
+ Filter: (f1 = 0)
-> Index Scan using tenk1_unique1 on tenk1 t2
Index Cond: (unique1 = (3))
-(9 rows)
+(13 rows)
select t1.unique2, t1.stringu1, t2.unique1, t2.stringu2 from
tenk1 t1
diff --git a/src/test/regress/expected/plancache.out b/src/test/regress/expected/plancache.out
index 4e59188196..1727173ccd 100644
--- a/src/test/regress/expected/plancache.out
+++ b/src/test/regress/expected/plancache.out
@@ -311,12 +311,14 @@ select name, generic_plans, custom_plans from pg_prepared_statements
-- force generic plan
set plan_cache_mode to force_generic_plan;
explain (costs off) execute test_mode_pp(2);
- QUERY PLAN
------------------------------
+ QUERY PLAN
+--------------------------------------------------
Aggregate
- -> Seq Scan on test_mode
- Filter: (a = $1)
-(3 rows)
+ -> Bitmap Heap Scan on test_mode
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on test_mode_a_idx
+ Index Cond: (a = $1)
+(5 rows)
select name, generic_plans, custom_plans from pg_prepared_statements
where name = 'test_mode_pp';
@@ -373,12 +375,14 @@ select name, generic_plans, custom_plans from pg_prepared_statements
-- we should now get a really bad plan
explain (costs off) execute test_mode_pp(2);
- QUERY PLAN
------------------------------
+ QUERY PLAN
+--------------------------------------------------
Aggregate
- -> Seq Scan on test_mode
- Filter: (a = $1)
-(3 rows)
+ -> Bitmap Heap Scan on test_mode
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on test_mode_a_idx
+ Index Cond: (a = $1)
+(5 rows)
-- but we can force a custom plan
set plan_cache_mode to force_custom_plan;
diff --git a/src/test/regress/expected/select.out b/src/test/regress/expected/select.out
index c441049f41..675bf632d0 100644
--- a/src/test/regress/expected/select.out
+++ b/src/test/regress/expected/select.out
@@ -861,17 +861,13 @@ RESET enable_indexscan;
explain (costs off)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
- QUERY PLAN
---------------------------------------------------------------------------------
+ QUERY PLAN
+---------------------------------------------
Bitmap Heap Scan on onek2
- Recheck Cond: (((unique2 = 11) AND (stringu1 < 'B'::name)) OR (unique1 = 0))
- Filter: (stringu1 < 'B'::name)
- -> BitmapOr
- -> Bitmap Index Scan on onek2_u2_prtl
- Index Cond: (unique2 = 11)
- -> Bitmap Index Scan on onek2_u1_prtl
- Index Cond: (unique1 = 0)
-(8 rows)
+ Recheck Cond: (stringu1 < 'B'::name)
+ Filter: ((unique2 = 11) OR (unique1 = 0))
+ -> Bitmap Index Scan on onek2_u2_prtl
+(4 rows)
select unique1, unique2 from onek2
where (unique2 = 11 or unique1 = 0) and stringu1 < 'B';
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index b27643cad6..431864826d 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -693,9 +693,9 @@ DROP TABLE onek_with_null;
EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
SELECT * FROM tenk1
- WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
+ WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 42);
EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
--
2.17.0
On 06/11/2020 19:57, Justin Pryzby wrote:
On Fri, Nov 06, 2020 at 01:51:26PM +0000, Anastasia Lubennikova wrote:
The first patch is simply a refactoring and don't see any possible objections against it.
The second patch also looks fine to me. The logic is understandable and the code is neat.
+1
It wouldn't hurt to add a comment for this computation, though.
+ pages_fetched = pages_fetchedMAX + indexCorrelation*indexCorrelation*(pages_fetchedMIN - pages_fetchedMAX);
You're right. It's like this:
// interpolate between c==0: pages_fetched=max and c==1: pages_fetched=min
pages_fetched = min + (max-min)*(1-c**2)
pages_fetched = min + max*(1-c**2) - min*(1-c**2)
pages_fetched = max*(1-c**2) + min - min*(1-c**2)
pages_fetched = max*(1-c**2) + min*(c**2)
pages_fetched = max - max*c**2 + min*(c**2)
pages_fetched = max + min*(c**2) - max*c**2
pages_fetched = max + c**2 * (min-max)
I'm not sure if there's a reason why it's written like that, but (min-max)
looks odd, so I wrote it like:
pages_fetched = max - c**2 * (max-min)
I agree min-max looks odd. max - c**2 * (max-min) looks a bit odd too,
though. Whatever we do here, I'd suggest that we keep it consistent with
cost_index().
Other than that, and a quick pgindent run, this seems ready to me. I'll
mark it as Ready for Committer.
- Heikki
Heikki Linnakangas <hlinnaka@iki.fi> writes:
Other than that, and a quick pgindent run, this seems ready to me. I'll
mark it as Ready for Committer.
I dunno, this seems largely misguided to me.
It's already the case that index correlation is just not the right
stat for this purpose, since it doesn't give you much of a toehold
on whether a particular scan is going to be accessing tightly-clumped
data. For specific kinds of index conditions, such as a range query
on a btree index, maybe you could draw that conclusion ... but this
patch isn't paying any attention to the index condition in use.
And then the rules for bitmap AND and OR correlations, if not just
plucked out of the air, still seem *far* too optimistic. As an
example, even if my individual indexes are perfectly correlated and
so a probe would touch only one page, OR'ing ten such probes together
is likely going to touch ten different pages. But unless I'm
misreading the patch, it's going to report back an OR correlation
that corresponds to touching one page.
Even if we assume that the correlation is nonetheless predictive of
how big a part of the table we'll be examining, I don't see a lot
of basis for deciding that the equations the patch adds to
cost_bitmap_heap_scan are the right ones.
I'd have expected this thread to focus a whole lot more on actual
examples than it has done, so that we could have some confidence
that these equations have something to do with reality.
regards, tom lane
On Sat, Nov 28, 2020 at 5:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
Other than that, and a quick pgindent run, this seems ready to me. I'll
mark it as Ready for Committer.
I dunno, this seems largely misguided to me.
It's already the case that index correlation is just not the right
stat for this purpose, since it doesn't give you much of a toehold
on whether a particular scan is going to be accessing tightly-clumped
data. For specific kinds of index conditions, such as a range query
on a btree index, maybe you could draw that conclusion ... but this
patch isn't paying any attention to the index condition in use.
And then the rules for bitmap AND and OR correlations, if not just
plucked out of the air, still seem *far* too optimistic. As an
example, even if my individual indexes are perfectly correlated and
so a probe would touch only one page, OR'ing ten such probes together
is likely going to touch ten different pages. But unless I'm
misreading the patch, it's going to report back an OR correlation
that corresponds to touching one page.
Even if we assume that the correlation is nonetheless predictive of
how big a part of the table we'll be examining, I don't see a lot
of basis for deciding that the equations the patch adds to
cost_bitmap_heap_scan are the right ones.
I'd have expected this thread to focus a whole lot more on actual
examples than it has done, so that we could have some confidence
that these equations have something to do with reality.
Status update for a commitfest entry.
The discussion has been inactive since the end of the last CF. It
seems to me that we need some discussion on the point Tom mentioned.
It looks like it should be either "Needs Review" or "Ready for Committer" status, but
Justin set it to "Waiting on Author" on 2020-12-03 himself. Are you
working on this, Justin?
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Thu, Jan 28, 2021 at 9:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Status update for a commitfest entry.
The discussion has been inactive since the end of the last CF. It
seems to me that we need some discussion on the point Tom mentioned.
It looks like it should be either "Needs Review" or "Ready for Committer" status, but
Justin set it to "Waiting on Author" on 2020-12-03 himself. Are you
working on this, Justin?
Status update for a commitfest entry.
This patch, which you submitted to this CommitFest, has been awaiting
your attention for more than one month. As such, we have moved it to
"Returned with Feedback" and removed it from the reviewing queue.
Depending on timing, this may be reversible, so let us know if there
are extenuating circumstances. In any case, you are welcome to address
the feedback you have received, and resubmit the patch to the next
CommitFest.
Thank you for contributing to PostgreSQL.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/