How is execution plan cost calculated for index scan
Hi all:
I want to see the explain plan for a simple query. My question is : How
is the cost calculated?
The cost parameter is:
random_page_cost = 4
seq_page_cost = 1
cpu_tuple_cost =0.01
cpu_operator_cost =0.0025
And the table and its index physical situation are as following:
postgres=# select relpages, reltuples from pg_class where relname =
'pg_proc';
relpages | reltuples
----------+-----------
62 |
2490
postgres=# select relpages, reltuples from pg_class where relname =
'pg_proc_oid_index';
relpages | reltuples
----------+-----------
9 |
2490
The explain plan is:
postgres=# explain SELECT * FROM pg_proc where
oid=1;
QUERY
PLAN
-----------------------------------------------------------------------------------
Index Scan using pg_proc_oid_index on pg_proc (cost=0.00..8.27 rows=1
width=548)
Index Cond: (oid =
1::oid)
(2
rows)
I think in the worst situation ,
Firstly, database need to search for 9 index pages by sequential to find
the index entry. For each index page in memory, every “index tuple” need
to be scanned.
Then , using the key entry, it need to make a random page read for the
real data into memory, then scan the data tuple is scanned until the reall
one is found
(or just directly locate to the data block after read the data page into
memory )
So at least the evaluated max cost should be bigger than 9 index pages *
seq_page_cost , so it should be bigger than 9. Here I haven't added the
random page read cost for data.
But what I got is max is 8.27. How is the result of 8.27 be calculated?
Furthermore, I tried to find the logic in source code, I think it might be
costsize.c in src/backend/optimizer/, by debugging it, I found that:
When I use [ explain SELECT * FROM pg_proc where oid=1;] , I can found
that cost_index function is called.
The result returned for path->path.total_cost is 86698968. And
86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this
is still a little odd.
In the above case, can I say that the cost formula for index scan is
in-- the cost_index function ?
Thanks in advance
On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:
Hi all:
I want to see the explain plan for a simple query. My question is : How
is the cost calculated?The cost parameter is:
random_page_cost = 4
seq_page_cost = 1
cpu_tuple_cost =0.01
cpu_operator_cost =0.0025
The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
cpu_index_tuple_cost + 100* cpu_operator_cost.
I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).
I was surprised the multiplier for cpu_operator_cost was that high.
The two random_page_costs are one for the index leaf page and one for
the table page. Higher pages in the index are assumed to be cached
and thus not charged for IO.
...
Firstly, database need to search for 9 index pages by sequential to find
the index entry. For each index page in memory, every “index tuple” need
to be scanned.
That is not how indexes are traversed.
Cheers,
Jeff
=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes:
I want to see the explain plan for a simple query. My question is : How
is the cost calculated?
In the case you're looking at, it's basically one random index page
fetch plus one random heap page fetch (hence 8.0), plus assorted CPU
costs making up the other 0.27 cost units.
The argument for charging only for the index leaf-page fetch, and not
upper levels of the index btree, is basically that all but the leaf
level are likely to be in cache. This is pretty handwavy I know, but
the costs seem to come out reasonably in line with reality that way.
The result returned for path->path.total_cost is 86698968. And
86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this
is still a little odd.
Your debugger isn't doing you any favors ... that field is a double.
In the above case, can I say that the cost formula for index scan is
in-- the cost_index function ?
cost_index is only responsible for the heap-access part of the charges.
The index-access part is in btcostestimate and genericcostestimate in
utils/adt/selfuncs.c.
regards, tom lane
Hi Jeff
Thank you very much.
I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).
your method is so smart!
Jian Gao
2012/11/9 Jeff Janes <jeff.janes@gmail.com>
Show quoted text
On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:
Hi all:
I want to see the explain plan for a simple query. My question is :
How
is the cost calculated?
The cost parameter is:
random_page_cost = 4
seq_page_cost = 1
cpu_tuple_cost =0.01
cpu_operator_cost =0.0025
The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
cpu_index_tuple_cost + 100* cpu_operator_cost.I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).I was surprised the multiplier for cpu_operator_cost was that high.
The two random_page_costs are one for the index leaf page and one for
the table page. Higher pages in the index are assumed to be cached
and thus not charged for IO....
Firstly, database need to search for 9 index pages by sequential to
find
the index entry. For each index page in memory, every “index tuple”
need
to be scanned.
That is not how indexes are traversed.
Cheers,
Jeff