Thread: How is execution plan cost calculated for index scan
Hi all:
I want to see the explain plan for a simple query. My question is : How is the cost calculated?
The cost parameter is:
random_page_cost = 4
seq_page_cost = 1
cpu_tuple_cost =0.01
cpu_operator_cost =0.0025
And the table and its index physical situation are as following:
postgres=# select relpages, reltuples from pg_class where relname = 'pg_proc';
relpages | reltuples
----------+-----------
62 | 2490
postgres=# select relpages, reltuples from pg_class where relname = 'pg_proc_oid_index';
relpages | reltuples
----------+-----------
9 | 2490
The explain plan is:
postgres=# explain SELECT * FROM pg_proc where oid=1;
QUERY PLAN
-----------------------------------------------------------------------------------
Index Scan using pg_proc_oid_index on pg_proc (cost=0.00..8.27 rows=1 width=548)
Index Cond: (oid = 1::oid)
(2 rows)
I think in the worst situation ,
Firstly, database need to search for 9 index pages by sequential to find the index entry. For each index page in memory, every “index tuple” need to be scanned.
Then , using the key entry, it need to make a random page read for the real data into memory, then scan the data tuple is scanned until the reall one is found
(or just directly locate to the data block after read the data page into memory )
So at least the evaluated max cost should be bigger than 9 index pages * seq_page_cost , so it should be bigger than 9. Here I haven't added the random page read cost for data.
But what I got is max is 8.27. How is the result of 8.27 be calculated?
Furthermore, I tried to find the logic in source code, I think it might be costsize.c in src/backend/optimizer/, by debugging it, I found that:
When I use [ explain SELECT * FROM pg_proc where oid=1;] , I can found that cost_index function is called.
The result returned for path->path.total_cost is 86698968. And 86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this is still a little odd.
In the above case, can I say that the cost formula for index scan is in-- the cost_index function ?
On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote: > Hi all: > > > > I want to see the explain plan for a simple query. My question is : How > is the cost calculated? > > > > The cost parameter is: > > > > random_page_cost = 4 > > seq_page_cost = 1 > > cpu_tuple_cost =0.01 > > cpu_operator_cost =0.0025 The cost is estimates as 2*random_page_cost + cpu_tuple_cost + cpu_index_tuple_cost + 100* cpu_operator_cost. I determined this by changing each cost parameter and running explain, to see how much each one changed the cost estimate (after verifying the overall plan did not change). I was surprised the multiplier for cpu_operator_cost was that high. The two random_page_costs are one for the index leaf page and one for the table page. Higher pages in the index are assumed to be cached and thus not charged for IO. ... > Firstly, database need to search for 9 index pages by sequential to find > the index entry. For each index page in memory, every “index tuple” need > to be scanned. That is not how indexes are traversed. Cheers, Jeff
=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes: > I want to see the explain plan for a simple query. My question is : How > is the cost calculated? In the case you're looking at, it's basically one random index page fetch plus one random heap page fetch (hence 8.0), plus assorted CPU costs making up the other 0.27 cost units. The argument for charging only for the index leaf-page fetch, and not upper levels of the index btree, is basically that all but the leaf level are likely to be in cache. This is pretty handwavy I know, but the costs seem to come out reasonably in line with reality that way. > The result returned for path->path.total_cost is 86698968. And > 86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this > is still a little odd. Your debugger isn't doing you any favors ... that field is a double. > In the above case, can I say that the cost formula for index scan is > in-- the cost_index function ? cost_index is only responsible for the heap-access part of the charges. The index-access part is in btcostestimate and genericcostestimate in utils/adt/selfuncs.c. regards, tom lane
On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
> Hi all:
>
>
>
> I want to see the explain plan for a simple query. My question is : How
> is the cost calculated?
>
>
>
> The cost parameter is:
>
>
>
> random_page_cost = 4
>
> seq_page_cost = 1
>
> cpu_tuple_cost =0.01
>
> cpu_operator_cost =0.0025
cpu_index_tuple_cost + 100* cpu_operator_cost.
I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).
I was surprised the multiplier for cpu_operator_cost was that high.
The two random_page_costs are one for the index leaf page and one for
the table page. Higher pages in the index are assumed to be cached
and thus not charged for IO.
...That is not how indexes are traversed.
> Firstly, database need to search for 9 index pages by sequential to find
> the index entry. For each index page in memory, every “index tuple” need
> to be scanned.
Cheers,
Jeff