Re: Seq scan on join, not on subselect? analyze this - Mailing list pgsql-sql
From | Helio Campos Mello de Andrade |
---|---|
Subject | Re: Seq scan on join, not on subselect? analyze this |
Date | |
Msg-id | 29e3942f0811100328s438d9595h61ae4ad714fe08c5@mail.gmail.com Whole thread Raw |
In response to | Seq scan on join, not on subselect? analyze this (Bryce Nesbitt <bryce2@obviously.com>) |
List | pgsql-sql |
Bryce,<br /><br /> - I think that the difference between the two queries has to do with the way postgresql execute them.<br/><br /> In the first the SGDB does:<br /> 1º Creates a temporary table with "m" X "n" rows where the"m" and "n" are the number of the rows in the tables been joined.<br /> 2º Take only the rows that has the same"work_key"<br /> 3º It restricts using the where clause.<br /><br /> OBS: Maybe It use the where clausefirst on the tables just to minimize the "m" and "n". I not sure about that. Still it creates and "m" X "n" temporarytable with lots of bad rows.<br /><br /> In the second query the SGDB:<br /> 1º Select in "article_words"only the rows that correspond with the restriction to that "context_key". It results in a much smaller numberof rows. "k" <<< "n".<br /> 2º It uses "k-results" and look for the for the rows where "word_key" isin the group created by the INNER Query.<br /><br />That's why you have the difference between the query's "Total runtime".<br/><br />Regards<br /><br />--<br />Helio Campos Mello de Andrade<br /><br /><br /><br /><div class="gmail_quote">OnSun, Nov 2, 2008 at 3:51 PM, Bryce Nesbitt <span dir="ltr"><<a href="mailto:bryce2@obviously.com">bryce2@obviously.com</a>></span>wrote:<br /><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I'm a bit confused why thequery planner is not restricting my join, and<br /> not using the index. Two explain analyze statements follow.<br />Why is the second so much better?<br /><br /> lyell5=> select version();<br /> PostgreSQL 8.3.4 on x86_64-pc-linux-gnu,compiled by GCC cc (GCC) 4.1.2<br /> 20061115 (prerelease) (Debian 4.1.1-21)<br /><br /> lyell5=>explain analyze select * from article_words join words using<br /> (word_key) where context_key=535462;<br /> +------------------------------------------------------------------------------------------------------------------------------------------------+<br />|<br /> QUERY<br /> PLAN |<br /> +------------------------------------------------------------------------------------------------------------------------------------------------+<br />| Hash Join (cost=192092.90..276920.93 rows=45327 width=17) (actual<br /> time=6020.932..60084.817 rows=777<br /> loops=1) |<br /> | Hash Cond: (article_words.word_key =<br /> words.word_key)<br /> |<br/> | -> Index Scan using article_word_idx on article_words<br /> (cost=0.00..55960.50 rows=45327 width=8) (actualtime=0.031..0.547<br /> rows=777 loops=1) |<br /> | Index Cond: (context_key =<br /> 535462)<br /> |<br />| -> Hash (cost=93819.62..93819.62 rows=5653462 width=13) (actual<br /> time=6020.605..6020.605 rows=5651551 loops=1) |<br /> | -> Seq Scan on words (cost=0.00..93819.62 rows=5653462<br />width=13) (actual time=0.006..2010.962 rows=5651551<br /> loops=1) |<br /> | Total runtime: 60085.616<br/> ms<br /> |<br /> +------------------------------------------------------------------------------------------------------------------------------------------------+<br /><br/><br /> lyell5=> explain analyze select * from words where word_key in (select<br /> word_key from article_wordswhere context_key=535462);<br /> +------------------------------------------------------------------------------------------------------------------------------------------------------+<br />|<br /> QUERY<br /> PLAN |<br /> +------------------------------------------------------------------------------------------------------------------------------------------------------+<br />| Nested Loop (cost=56073.81..56091.41 rows=2 width=13) (actual<br /> time=0.808..4.723 rows=777<br /> loops=1) |<br /> | -> HashAggregate (cost=56073.81..56073.83 rows=2 width=4) (actual<br/> time=0.795..1.072 rows=777<br /> loops=1) |<br /> | -> Index Scan using article_word_idx on article_words<br /> (cost=0.00..55960.50 rows=45327 width=4) (actual time=0.030..0.344<br/> rows=777 loops=1) |<br /> | Index Cond: (context_key =<br /> 535462)<br /> |<br /> | -> Index Scan using words_pkey on words (cost=0.00..8.78 rows=1<br /> width=13) (actual time=0.003..0.004 rows=1<br/> loops=777) |<br /> | Index Cond: (words.word_key =<br /> article_words.word_key)<br/> |<br /> | Total runtime: 4.936<br /> ms<br /> |<br /> +------------------------------------------------------------------------------------------------------------------------------------------------------+<br /><fontcolor="#888888"><br /><br /> --<br /> Sent via pgsql-sql mailing list (<a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a>)<br/> To make changes to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-sql"target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br /></font></blockquote></div>