Home > mailing lists

Thread: How is execution plan cost calculated for index scan

How is execution plan cost calculated for index scan

From

高健

Date:

08 November 2012, 07:17:18

Hi all:

I want to see the explain plan for a simple query. My question is : How is the cost calculated?

The cost parameter is:

random_page_cost = 4

seq_page_cost = 1

cpu_tuple_cost =0.01

cpu_operator_cost =0.0025

And the table and its index physical situation are as following:

postgres=# select relpages, reltuples from pg_class where relname = 'pg_proc';

relpages | reltuples

----------+-----------

62 | 2490

postgres=# select relpages, reltuples from pg_class where relname = 'pg_proc_oid_index';

relpages | reltuples

----------+-----------

9 | 2490

The explain plan is:

postgres=# explain SELECT * FROM pg_proc where oid=1;

QUERY PLAN

-----------------------------------------------------------------------------------

Index Scan using pg_proc_oid_index on pg_proc (cost=0.00..8.27 rows=1 width=548)

Index Cond: (oid = 1::oid)

(2 rows)

I think in the worst situation ,

Firstly, database need to search for 9 index pages by sequential to find the index entry. For each index page in memory, every “index tuple” need to be scanned.

Then , using the key entry, it need to make a random page read for the real data into memory, then scan the data tuple is scanned until the reall one is found

(or just directly locate to the data block after read the data page into memory )

So at least the evaluated max cost should be bigger than 9 index pages * seq_page_cost , so it should be bigger than 9. Here I haven't added the random page read cost for data.

But what I got is max is 8.27. How is the result of 8.27 be calculated?

Furthermore, I tried to find the logic in source code, I think it might be costsize.c in src/backend/optimizer/, by debugging it, I found that:

When I use [ explain SELECT * FROM pg_proc where oid=1;] , I can found that cost_index function is called.

The result returned for path->path.total_cost is 86698968. And 86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this is still a little odd.

In the above case, can I say that the cost formula for index scan is in-- the cost_index function ?

Thanks in advance

Re: How is execution plan cost calculated for index scan

From

Jeff Janes

Date:

08 November 2012, 16:40:42

On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:
> Hi all:
>
>
>
> I  want to see the explain plan for a simple query.   My question is :  How
> is  the cost  calculated?
>
>
>
> The cost parameter is:
>
>
>
>  random_page_cost    = 4
>
>  seq_page_cost          = 1
>
>  cpu_tuple_cost          =0.01
>
>  cpu_operator_cost     =0.0025

The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
cpu_index_tuple_cost + 100* cpu_operator_cost.

I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).

I was surprised the multiplier for cpu_operator_cost was that high.

The two random_page_costs are one for the index leaf page and one for
the table page.  Higher pages in the index are assumed to be cached
and thus not charged for IO.

...

> Firstly, database need to search for 9  index pages by sequential  to find
> the index entry.  For each index page in memory, every  “index tuple” need
> to be scanned.

That is not how indexes are traversed.

Cheers,

Jeff

Re: How is execution plan cost calculated for index scan

From

Tom Lane

Date:

08 November 2012, 16:45:45

=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes:
> I  want to see the explain plan for a simple query.   My question is :  How
> is  the cost  calculated?

In the case you're looking at, it's basically one random index page
fetch plus one random heap page fetch (hence 8.0), plus assorted CPU
costs making up the other 0.27 cost units.

The argument for charging only for the index leaf-page fetch, and not
upper levels of the index btree, is basically that all but the leaf
level are likely to be in cache.  This is pretty handwavy I know, but
the costs seem to come out reasonably in line with reality that way.

> The result returned for  path->path.total_cost  is    86698968.    And
> 86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this
> is still a little odd.

Your debugger isn't doing you any favors ... that field is a double.

> In the above case,    can I say that  the cost formula for index scan is
> in-- the cost_index function ?

cost_index is only responsible for the heap-access part of the charges.
The index-access part is in btcostestimate and genericcostestimate in
utils/adt/selfuncs.c.

            regards, tom lane

Re: How is execution plan cost calculated for index scan

From

高健

Date:

09 November 2012, 01:59:10

Hi Jeff

Thank you very much.

>I determined this by changing each cost parameter and running explain,

>to see how much each one changed the cost estimate (after verifying

>the overall plan did not change).

your method is so smart!

Jian Gao

2012/11/9 Jeff Janes <jeff.janes@gmail.com>

On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:
> Hi all:
>
>
>
> I want to see the explain plan for a simple query. My question is : How
> is the cost calculated?
>
>
>
> The cost parameter is:
>
>
>
> random_page_cost = 4
>
> seq_page_cost = 1
>
> cpu_tuple_cost =0.01
>
> cpu_operator_cost =0.0025

The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
cpu_index_tuple_cost + 100* cpu_operator_cost.

I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).

I was surprised the multiplier for cpu_operator_cost was that high.

The two random_page_costs are one for the index leaf page and one for
the table page. Higher pages in the index are assumed to be cached
and thus not charged for IO.

...

> Firstly, database need to search for 9 index pages by sequential to find
> the index entry. For each index page in memory, every “index tuple” need
> to be scanned.

That is not how indexes are traversed.

Cheers,

Jeff