Improved Cost Calculation for IndexOnlyScan
In one of my earlier emails [1], I mentioned that there seems to be a
problem with how the cost for index only scans is being calculated.

[1]: /messages/by-id/CANugjhsnh0OBMOYc7qKcC+ZsVvAXCeF7QiidLuFvg6zmHy1C7A@mail.gmail.com
My concern is that there is a significant disconnect between the cost
of an index only scan and its execution time. Having tested this on three
different systems (Docker, a laptop, and a server with RAID 5 SSDs),
at the threshold where the index only scan cost exceeds that of the
sequential scan, the index only scan is still around 30% faster than the
sequential scan.
My initial hunch was that perhaps we need an altogether different approach
to costing index only scans. However, the fix seems fairly simple.
The cost_index function in costsize.c, in the case of an index only scan,
multiplies the number of heap pages fetched by a factor of
(1.0 - baserel->allvisfrac), which is then used to calculate max_IO_cost
and min_IO_cost.
This is closely related to the per-index cost estimation methods, which
internally call the genericcostestimate function. That function primarily
takes the number of index pages and multiplies it by the tablespace's random
page cost (spc_random_page_cost) to get the total disk access cost.
I believe that in the case of an index only scan, we should adjust
spc_random_page_cost based on baserel->allvisfrac so that the random page
cost is charged only for the fraction of the relation that actually needs
to be read, excluding the index page fetches.
With the attached POC patch for this approach (based on the current master
branch), the crossover point moves: previously the planner stopped choosing
an index only scan when roughly 50% of the index/table was being scanned;
with the patch, that threshold goes up to about 70%. At that point the
execution time of the index only scan still remains well below that of the
sequential scan, so this is a significant improvement IMHO.
Following is the simple SQL I used to see how this patch impacts index only
scan costing and execution time. You may tweak the table size or the number
of rows being scanned for your environment.
SQL:
-----
CREATE TABLE index_test AS (SELECT GENERATE_SERIES(1, 1000000)::int id,
'hello world' AS name);
CREATE INDEX on index_test(id);
VACUUM ANALYZE index_test;
EXPLAIN ANALYZE SELECT COUNT(*) FROM index_test WHERE id < 7000000;
--
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL: +923335449950 EMAIL: hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
Attachment: poc_indexonly_costing.patch
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 00c7afc66f..6550cfcc94 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -6003,6 +6003,8 @@ genericcostestimate(PlannerInfo *root,
GenericCosts *costs)
{
IndexOptInfo *index = path->indexinfo;
+ RelOptInfo *baserel = index->rel;
+ bool indexonly = (path->path.pathtype == T_IndexOnlyScan);
List *indexQuals = get_quals_from_indexclauses(path->indexclauses);
List *indexOrderBys = path->indexorderbys;
Cost indexStartupCost;
@@ -6011,7 +6013,9 @@ genericcostestimate(PlannerInfo *root,
double indexCorrelation;
double numIndexPages;
double numIndexTuples;
+ double spc_seq_page_cost;
double spc_random_page_cost;
+ double index_random_page_cost;
double num_sa_scans;
double num_outer_scans;
double num_scans;
@@ -6102,7 +6106,23 @@ genericcostestimate(PlannerInfo *root,
/* fetch estimated page cost for tablespace containing index */
get_tablespace_page_costs(index->reltablespace,
&spc_random_page_cost,
- NULL);
+ &spc_seq_page_cost);
+
+ /* If it is an index only scan, random page cost should be incurred for
+ * the percentage of pages that need to be fetched from disk for the
+ * relation. Considering that, random page cost should be adjusted such
+ * that when no tuples are to be fetched from the table, we end up with a
+ * sequential page cost, otherwise, we hover between that and random
+ * page cost for the tablespace.
+ *
+ * Otherwise, we default to the random page cost for table space.
+ */
+ if (indexonly)
+ index_random_page_cost = Min(spc_seq_page_cost +
+ spc_random_page_cost * (1.0 - baserel->allvisfrac),
+ spc_random_page_cost);
+ else
+ index_random_page_cost = spc_random_page_cost;
/*
* Now compute the disk access costs.
@@ -6142,16 +6162,16 @@ genericcostestimate(PlannerInfo *root,
* share for each outer scan. (Don't pro-rate for ScalarArrayOpExpr,
* since that's internal to the indexscan.)
*/
- indexTotalCost = (pages_fetched * spc_random_page_cost)
+ indexTotalCost = (pages_fetched * index_random_page_cost)
/ num_outer_scans;
}
else
{
/*
- * For a single index scan, we just charge spc_random_page_cost per
+ * For a single index scan, we just charge random page cost per
* page touched.
*/
- indexTotalCost = numIndexPages * spc_random_page_cost;
+ indexTotalCost = numIndexPages * index_random_page_cost;
}
/*
On 29/09/2020 10:06, Hamid Akhtar wrote:
In one of my earlier emails [1], I mentioned that there seems to be a
problem with how the cost for index only scans is being calculated.
[1] /messages/by-id/CANugjhsnh0OBMOYc7qKcC+ZsVvAXCeF7QiidLuFvg6zmHy1C7A@mail.gmail.com
My concern is that there seems to be a bigger disconnect between the
cost of index only scan and the execution time. Having tested this on 3
different systems, docker, laptop and a server with RAID 5 SSD
configured, at the threshold where index only scan cost exceeds that of
sequential scan, index only is still around 30% faster than the
sequential scan.
A 30% discrepancy doesn't sound too bad, to be honest. The exact
threshold depends on so many factors.
My initial hunch was that perhaps we need to consider a different
approach when considering cost for index only scan. However, the
solution seems somewhat simple.

cost_index function in costsize.c, in case of indexonlyscan, multiplies
the number of pages fetched by a factor of (1.0 - baserel->allvisfrac)
which is then used to calculate the max_IO_cost and min_IO_cost.

This is very similar to the per-index cost estimate methods, which internally
call the genericcostestimate function. This function primarily gets the number
of pages for the indexes and multiplies that with random page cost
(spc_random_page_cost) to get the total disk access cost.

I believe that in case of index only scan, we should adjust the
spc_random_page_cost in context of baserel->allvisfrac so that it
accounts for random pages for only the fraction that needs to be read
for the relation and excludes the index page fetches.
That doesn't sound right to me. The genericcostestimate() function
calculates the number of *index* pages fetched. It makes no difference
if it's an Index Scan or an Index Only Scan.
genericcostestimate() could surely be made smarter. Currently, it
multiplies the number of index pages fetched with random_page_cost, even
though a freshly created index is mostly physically ordered by the keys.
seq_page_cost with some fudge factor might be more appropriate, whether
or not it's an Index Only Scan. Not sure what the exact formula should
be, just replacing random_page_cost with seq_page_cost is surely not
right either.
- Heikki
On Tue, Sep 29, 2020 at 1:08 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
So, not actually random replacement here, rather a change with
baserel->allvisfrac taken into consideration (as given below):
----
index_random_page_cost = Min(spc_seq_page_cost + spc_random_page_cost *
(1.0 - baserel->allvisfrac), spc_random_page_cost);
----
Does this make sense?
On 29/09/2020 11:49, Hamid Akhtar wrote:
No. genericcostestimate() is only concerned with accesses to the index,
not the heap accesses that are needed with Index Scans. 'allvisfrac'
should not affect the number of *index* pages fetched in any way.
- Heikki
On Tue, Sep 29, 2020 at 2:57 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Currently, the costing for an index only scan differs only based on
'allvisfrac'. IIUC, the current implementation changes the number of pages
being fetched based on 'allvisfrac'.

This patch makes index only scan specific changes to the
genericcostestimate function. Currently, regardless of the value of
'allvisfrac', it is assumed that the cost of fetching index pages is the
random page cost. That is not aligned with the current cost calculation for
index only scans. Therefore, I'm suggesting to reduce the random page cost
in a similar fashion in the case of an index only scan.
I'm adding this to the commitfest.
On Mon, Oct 12, 2020 at 3:46 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
Retrospectively looking at the patch, I see your point. Your criticism is
valid. I'll revalidate this issue and rework the patch if necessary.