Thread: Understanding explain costs

Understanding explain costs

From
David Link
Date:
Hi,
Trying to understand the planner estimate costs  ... one index scan
seems to be much more expensive then another.  Here are the facts:

  tiger=# \dbk_inv
             Table "bk_inv"
   Attribute |     Type     | Modifier
  -----------+--------------+----------
   store     | varchar(5)   |
   isbn      | varchar(10)  | not null
   qty       | numeric(5,0) |
   week      | numeric(6,0) |
  Indices: bk_inv_isbn_idx,
           bk_inv_store_idx

  tiger=# \dbk_inv_isbn_idx
   Index "bk_inv_isbn_idx"
   Attribute |    Type
  -----------+-------------
   isbn      | varchar(10)
  btree

  tiger=# \dbk_inv_store_idx
  Index "bk_inv_store_idx"
   Attribute |    Type
  -----------+------------
   store     | varchar(5)
  btree

There are about 50,000,000 rows and you can see a full scan on the table
is expensive:

  tiger=# explain select * from bk_inv;
  NOTICE:  QUERY PLAN:

  Seq Scan on bk_inv  (cost=0.00..999623.77 rows=46790877 width=48)


However using the isbn index costs are much less:

  tiger=# explain select * from bk_inv where isbn = 'foo';
  NOTICE:  QUERY PLAN:

  Index Scan using bk_inv_isbn_idx on bk_inv  (cost=0.00..53.13
  rows=13 width=48)


However, however using the store index costs are still rather high  -
why is that!!??

  tiger=# explain select * from bk_inv where store = 'foo';
  NOTICE:  QUERY PLAN:

  Index Scan using bk_inv_store_idx on bk_inv  (cost=0.00..53456.09
  rows=13488 width=48)

Incidently the store index is slightly smaller than the isbn index ...

[postgres@roma tiger]$ ls -l bk_inv*
-rw-------    1 postgres postgres 1073741824 Oct 10 14:28 bk_inv
-rw-------    1 postgres postgres 1073741824 Oct 10 10:15 bk_inv.1
-rw-------    1 postgres postgres 1073741824 Oct 10 10:17 bk_inv.2
-rw-------    1 postgres postgres 1073741824 Oct 10 10:19 bk_inv.3
-rw-------    1 postgres postgres 60841984 Oct 11 15:51 bk_inv.4
-rw-------    1 postgres postgres 1073741824 Oct 10 13:37
bk_inv_isbn_idx
-rw-------    1 postgres postgres 566288384 Oct 10 14:31
bk_inv_isbn_idx.1
-rw-------    1 postgres postgres 1073741824 Oct 11 13:13
bk_inv_store_idx
-rw-------    1 postgres postgres 65921024 Oct 11 13:13
bk_inv_store_idx.1


Am I missing certain fundamentals about the planner/executor?

Thanks, David Link

Re: Understanding explain costs

From
David Link
Date:
Tom Lane wrote:
>
> David Link <dlink@soundscan.com> writes:
> >   tiger=# explain select * from bk_inv where isbn = 'foo';
> >   NOTICE:  QUERY PLAN:
>
> >   Index Scan using bk_inv_isbn_idx on bk_inv  (cost=0.00..53.13
> >   rows=13 width=48)
>
> > However, however using the store index costs are still rather high  -
> > why is that!!??
>
> >   tiger=# explain select * from bk_inv where store = 'foo';
> >   NOTICE:  QUERY PLAN:
>
> >   Index Scan using bk_inv_store_idx on bk_inv  (cost=0.00..53456.09
> >   rows=13488 width=48)
>
> Note the difference in estimated numbe.r of rows retrieved; that's where
> the cost difference comes from.  You may care to read
>
> http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/performance-tips.html
>
>                         regards, tom lane

Yes I read that Tom, thanks.  It seems to be the only thing out there on
performance tuning for Postgres that I have been able to find.

Thanks for the explaination.  That makes sense.  There are far fewer
unique isbns then there are vendors in my table.

--David

Re: Understanding explain costs

From
Tom Lane
Date:
David Link <dlink@soundscan.com> writes:
>   tiger=# explain select * from bk_inv where isbn = 'foo';
>   NOTICE:  QUERY PLAN:

>   Index Scan using bk_inv_isbn_idx on bk_inv  (cost=0.00..53.13
>   rows=13 width=48)

> However, however using the store index costs are still rather high  -
> why is that!!??

>   tiger=# explain select * from bk_inv where store = 'foo';
>   NOTICE:  QUERY PLAN:

>   Index Scan using bk_inv_store_idx on bk_inv  (cost=0.00..53456.09
>   rows=13488 width=48)

Note the difference in estimated number of rows retrieved; that's where
the cost difference comes from.  You may care to read

http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/performance-tips.html

            regards, tom lane