What's the difference? - Mailing list pgsql-bugs

From Victor Wagner
Subject What's the difference?
Date
Msg-id Pine.LNX.4.44.0204051207000.21521-100000@banquet.lan.ice.ru
Whole thread Raw
Responses Re: What's the difference?
List pgsql-bugs
As far as I understand, following three queries are exactly equivalent:

 select item.item_id, item.title, author.email
      from item, author, item_link
      where
        item.item_id=author.item_id and item_link.active=item.item_id and
        item_link.linktype_id='MODERATES' and
        item_link.passive='bob_news';

  select item.item_id, item.title, author.email
     from item natural join author, item_link
     where item.item_id=item_link.active and
         item_link.linktype_id='MODERATES'
         and item_link.passive='bob_news';

  select item,item_id, item.title, author.email
      from item inner join author on (item.item_id = author.item_id),
           item_link
      where item.item_id=item_link.active and
          item_link.linktype_id='MODERATES'
          and item_link.passive='bob_news';

However, first query takes 0.004 seconds to execute
with following execution plan:
Nested Loop  (cost=0.00..17.63 rows=1 width=68)
  ->  Nested Loop  (cost=0.00..11.67 rows=1 width=37)
        ->  Index Scan using active_linked on item_link  (cost=0.00..6.01
                rows=1  width=10)
        ->  Index Scan using item_key on item  (cost=0.00..5.65 rows=1
                  width=27)
  ->  Index Scan using author_key on author  (cost=0.00..5.95 rows=1
                width=31)

And second two  - about 1.5 seconds with following plan
Nested Loop  (cost=97.34..10078.92 rows=1 width=68)
  ->  Index Scan using active_linked on item_link  (cost=0.00..6.01 rows=1
             width=10)
  ->  Materialize  (cost=10025.58..10025.58 rows=3787 width=58)
        ->  Hash Join  (cost=97.34..10025.58 rows=3787 width=58)
              ->  Seq Scan on item  (cost=0.00..8250.76 rows=108676
                     width=27)
              ->  Hash  (cost=87.87..87.87 rows=3787 width=31)
                    ->  Seq Scan on author  (cost=0.00..87.87 rows=3787
                       width=31

I've tried to reorder tables in the FROM clause, putting ITEM_LINK first,
but it makes no difference.

Additional information about database:

Cardinality of tables:
  item: 108941
  autor:  3791
  item_link: 132031

Primary key of author and item tables consists of one field item_id,
and there are no other identically named tables.

Primary key of item_link consists of fields active, passive, linktype_id
and there are several supplementary indices. Index active_linked was
created by following command:
create UNIQUE index ACTIVE_LINKED on ITEM_LINK(PASSIVE,LINKTYPE_ID,ACTIVE);

All key fields are VARCHAR(20), database created with encoding WIN, but
these fields contain ascii characters only.

PostgreSQL version 7.2.0.

BTW, if I use  item left outer join author, execution plan is simular
with second one and execution time is about 4 seconds. I feel that
it is possible to use execution plan simular with first one,
and make outer join queries fly (as it happens in Oracle).




--
Victor Wagner            vitus@ice.ru
Chief Technical Officer        Office:7-(095)-748-53-88
Communiware.Net         Home: 7-(095)-135-46-61
http://www.communiware.net      http://www.ice.ru/~vitus

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bug #629: "for ... select ... loop" repeat first 9 rows
Next
From: Ron Mayer
Date:
Subject: Re: Inconsistant use of index.