Greg Stark <gsstark@mit.edu> wrote:
> On Thu, Mar 31, 2011 at 11:33 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Greg Stark <gsstark@mit.edu> wrote:
>>
>>> your query does require reading all the data.
>>
>> Huh? It requires reading all the data from at least *one* of the
>> tables.
>
> The query he posted a plan for was:
>
> EXPLAIN ANALYZE select oi.id from order_items oi inner join
> clients_orders co on oi.order_id = co.id;
>
> And the plan for it looks like it's optimal to me:
>
> 'Hash Join  (cost=780.55..1908023.16 rows=1027457 width=4) (actual time=63.506..85607.003 rows=33768 loops=1)'
> '  Hash Cond: (oi.order_id = co.id)'
> '  ->  Seq Scan on order_items oi  (cost=0.00..1558536.52 rows=33843152 width=8) (actual time=0.005..69718.563 rows=33909137 loops=1)'
> '  ->  Hash  (cost=480.80..480.80 rows=23980 width=4) (actual time=13.072..13.072 rows=23980 loops=1)'
> '        Buckets: 4096  Batches: 1  Memory Usage: 844kB'
> '        ->  Seq Scan on clients_orders co  (cost=0.00..480.80 rows=23980 width=4) (actual time=0.006..6.570 rows=23980 loops=1)'
> 'Total runtime: 85613.391 ms'

It may or may not be optimal, but the assertion that all 33.9
*million* order_items rows must be read to pick out the needed 33.8
*thousand* is just plain incorrect. Personally, I wouldn't be
shocked if using an index to cut the number of tuples accessed by
three orders of magnitude turned out to be faster.
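
For what it's worth, here is a minimal sketch of how to test that
theory, assuming order_items.order_id is not already indexed (the
index name below is made up; adjust to taste):

-- Hypothetical index so the planner can probe order_items
-- instead of scanning all 33.9 million rows.
CREATE INDEX order_items_order_id_idx ON order_items (order_id);

-- Discourage sequential scans for this session only, to coax
-- the planner into the index-driven plan for comparison.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT oi.id
FROM order_items oi
INNER JOIN clients_orders co ON oi.order_id = co.id;

RESET enable_seqscan;

Note that enable_seqscan = off is only a costing penalty, not a hard
prohibition, so the planner can still fall back to the seq scan if
no usable index is found.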
-Kevin