Thread: How is execution plan cost calculated for index scan

How is execution plan cost calculated for index scan

From
高健
Date:

Hi all:

 

I  want to see the explain plan for a simple query.   My question is :  How is  the cost  calculated?

 

The cost parameter is:

                                                                               

 random_page_cost    = 4                                            

 seq_page_cost          = 1

 cpu_tuple_cost          =0.01

 cpu_operator_cost     =0.0025                                   

 

And the table and its index physical  situation are as following:

 

postgres=# select relpages, reltuples  from pg_class where relname = 'pg_proc';                                                                                                                                              

 relpages | reltuples                                                                                                      

----------+-----------                                                                                                         

       62 |      2490                                                                                                                 

postgres=# select relpages, reltuples  from pg_class where relname = 'pg_proc_oid_index';                                                                                                                                      

 relpages | reltuples                                                                                                      

----------+-----------                                                                                                         

        9 |      2490                                                                                                                  

 

The explain plan is:

postgres=# explain SELECT * FROM pg_proc where oid=1;                                                                                          

                                    QUERY PLAN                                                                                                                                 

-----------------------------------------------------------------------------------                                                                                     

 Index Scan using pg_proc_oid_index on pg_proc  (cost=0.00..8.27 rows=1 width=548)                                                                                                                                   

   Index Cond: (oid = 1::oid)                                                                                         

(2 rows)                                                                                                                              

 

I think in the worst situation ,

Firstly, database need to search for 9  index pages by sequential  to find the index entry.  For each index page in memory, every  “index tuple” need to be scanned.

Then , using the key entry, it need to make a random page read  for the real data into memory, then scan the data tuple is scanned until the reall one is found

(or just directly locate to the data block after read the data page into memory )

 

So at least the evaluated max cost should be bigger  than  9 index pages *   seq_page_cost  , so it should be bigger than  9. Here I haven't added the random page read cost for data.

But what I got is max is 8.27. How is  the result of  8.27 be calculated?  

 

Furthermore, I tried to find the logic in source code, I think it might be  costsize.c  in  src/backend/optimizer/,  by debugging it, I found  that:

 

When I use [ explain SELECT * FROM pg_proc where oid=1;] , I can found that  cost_index function is called.

The result returned for  path->path.total_cost  is    86698968.    And 86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this is still a little odd.

 In the above case,    can I say that  the cost formula for index scan is in-- the cost_index function ?


Thanks in advance 

Re: How is execution plan cost calculated for index scan

From
Jeff Janes
Date:
On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:
> Hi all:
>
>
>
> I  want to see the explain plan for a simple query.   My question is :  How
> is  the cost  calculated?
>
>
>
> The cost parameter is:
>
>
>
>  random_page_cost    = 4
>
>  seq_page_cost          = 1
>
>  cpu_tuple_cost          =0.01
>
>  cpu_operator_cost     =0.0025

The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
cpu_index_tuple_cost + 100* cpu_operator_cost.

I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).

I was surprised the multiplier for cpu_operator_cost was that high.

The two random_page_costs are one for the index leaf page and one for
the table page.  Higher pages in the index are assumed to be cached
and thus not charged for IO.

...

> Firstly, database need to search for 9  index pages by sequential  to find
> the index entry.  For each index page in memory, every  “index tuple” need
> to be scanned.

That is not how indexes are traversed.

Cheers,

Jeff


Re: How is execution plan cost calculated for index scan

From
Tom Lane
Date:
=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes:
> I  want to see the explain plan for a simple query.   My question is :  How
> is  the cost  calculated?

In the case you're looking at, it's basically one random index page
fetch plus one random heap page fetch (hence 8.0), plus assorted CPU
costs making up the other 0.27 cost units.

The argument for charging only for the index leaf-page fetch, and not
upper levels of the index btree, is basically that all but the leaf
level are likely to be in cache.  This is pretty handwavy I know, but
the costs seem to come out reasonably in line with reality that way.

> The result returned for  path->path.total_cost  is    86698968.    And
> 86698968/1024/1024 = 82.68258 . If devided by 10 , is near 8.27. but this
> is still a little odd.

Your debugger isn't doing you any favors ... that field is a double.

> In the above case,    can I say that  the cost formula for index scan is
> in-- the cost_index function ?

cost_index is only responsible for the heap-access part of the charges.
The index-access part is in btcostestimate and genericcostestimate in
utils/adt/selfuncs.c.

            regards, tom lane


Re: How is execution plan cost calculated for index scan

From
高健
Date:
Hi Jeff

Thank you very much.

>I determined this by changing each cost parameter and running explain,
>to see how much each one changed the cost estimate (after verifying
>the overall plan did not change).

your method is so smart!

Jian Gao

2012/11/9 Jeff Janes <jeff.janes@gmail.com>
On Wed, Nov 7, 2012 at 11:17 PM, 高健 <luckyjackgao@gmail.com> wrote:
> Hi all:
>
>
>
> I  want to see the explain plan for a simple query.   My question is :  How
> is  the cost  calculated?
>
>
>
> The cost parameter is:
>
>
>
>  random_page_cost    = 4
>
>  seq_page_cost          = 1
>
>  cpu_tuple_cost          =0.01
>
>  cpu_operator_cost     =0.0025

The cost is estimates as 2*random_page_cost + cpu_tuple_cost +
cpu_index_tuple_cost + 100* cpu_operator_cost.

I determined this by changing each cost parameter and running explain,
to see how much each one changed the cost estimate (after verifying
the overall plan did not change).

I was surprised the multiplier for cpu_operator_cost was that high.

The two random_page_costs are one for the index leaf page and one for
the table page.  Higher pages in the index are assumed to be cached
and thus not charged for IO.

...

> Firstly, database need to search for 9  index pages by sequential  to find
> the index entry.  For each index page in memory, every  “index tuple” need
> to be scanned.

That is not how indexes are traversed.

Cheers,

Jeff