Re: Seq scan on join, not on subselect? analyze this - Mailing list pgsql-sql

From Helio Campos Mello de Andrade
Subject Re: Seq scan on join, not on subselect? analyze this
Date
Msg-id 29e3942f0811100328s438d9595h61ae4ad714fe08c5@mail.gmail.com
Whole thread Raw
In response to Seq scan on join, not on subselect? analyze this  (Bryce Nesbitt <bryce2@obviously.com>)
List pgsql-sql
Bryce,

 - I think that the difference between the two queries has to do with the way postgresql execute them.

   In the first the SGDB does:
         1º Creates a temporary table with "m" X "n" rows where the "m" and "n" are the number of the rows in the tables been joined.
         2º Take only the rows that has the same "work_key"
         3º It restricts using the where clause.

        OBS: Maybe It use the where clause first on the tables just to minimize the "m" and "n". I not sure about that. Still it creates and "m" X "n" temporary table with lots of bad rows.

   In the second query the SGDB:
      1º Select in "article_words" only the rows that correspond with the restriction to that "context_key". It results in a much smaller number of rows. "k" <<< "n".
      2º It uses "k-results" and look for the for the rows where "word_key" is in the group created by the INNER Query.

That's why you have the difference between the query's "Total runtime".

Regards

--
Helio Campos Mello de Andrade



On Sun, Nov 2, 2008 at 3:51 PM, Bryce Nesbitt <bryce2@obviously.com> wrote:
I'm a bit confused why the query planner is not restricting my join, and
not using the index.  Two explain analyze statements follow.
Why is the second so much better?

lyell5=> select version();
PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)

lyell5=> explain analyze select * from article_words join words using
(word_key) where context_key=535462;
+------------------------------------------------------------------------------------------------------------------------------------------------+
|
QUERY
PLAN                                                                   |
+------------------------------------------------------------------------------------------------------------------------------------------------+
| Hash Join  (cost=192092.90..276920.93 rows=45327 width=17) (actual
time=6020.932..60084.817 rows=777
loops=1)                                  |
|   Hash Cond: (article_words.word_key =
words.word_key)
|
|   ->  Index Scan using article_word_idx on article_words
(cost=0.00..55960.50 rows=45327 width=8) (actual time=0.031..0.547
rows=777 loops=1) |
|         Index Cond: (context_key =
535462)
|
|   ->  Hash  (cost=93819.62..93819.62 rows=5653462 width=13) (actual
time=6020.605..6020.605 rows=5651551 loops=1)                              |
|         ->  Seq Scan on words  (cost=0.00..93819.62 rows=5653462
width=13) (actual time=0.006..2010.962 rows=5651551
loops=1)                  |
| Total runtime: 60085.616
ms
|
+------------------------------------------------------------------------------------------------------------------------------------------------+


lyell5=> explain analyze select * from words where word_key in (select
word_key from article_words where context_key=535462);
+------------------------------------------------------------------------------------------------------------------------------------------------------+
|
QUERY
PLAN                                                                      |
+------------------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop  (cost=56073.81..56091.41 rows=2 width=13) (actual
time=0.808..4.723 rows=777
loops=1)                                                   |
|   ->  HashAggregate  (cost=56073.81..56073.83 rows=2 width=4) (actual
time=0.795..1.072 rows=777
loops=1)                                            |
|         ->  Index Scan using article_word_idx on article_words
(cost=0.00..55960.50 rows=45327 width=4) (actual time=0.030..0.344
rows=777 loops=1) |
|               Index Cond: (context_key =
535462)
|
|   ->  Index Scan using words_pkey on words  (cost=0.00..8.78 rows=1
width=13) (actual time=0.003..0.004 rows=1
loops=777)                            |
|         Index Cond: (words.word_key =
article_words.word_key)
|
| Total runtime: 4.936
ms
|
+------------------------------------------------------------------------------------------------------------------------------------------------------+


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

pgsql-sql by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Query optimizing
Next
From: prakash
Date:
Subject: [PERFORM] Can we activate WAL runtime?