Re: Help with a seq scan on multi-million row table - Mailing list pgsql-sql

From
Subject Re: Help with a seq scan on multi-million row table
Date
Msg-id 20060511013410.83818.qmail@web50307.mail.yahoo.com
Whole thread Raw
In response to Re: Help with a seq scan on multi-million row table  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Help with a seq scan on multi-million row table
List pgsql-sql
Aha!  set hashjoin=off did the trick.
The PG version is: 8.0.3

NB: I removed that redundant "DISTINCT" after the SELECT.

EXPLAIN ANALYZE select userurltag0_.tag as x0_0_, COUNT(*) as x1_0_ from user_url_tag userurltag0_, user_url userurl1_
where(((userurl1_.user_id=1 ))AND((userurltag0_.user_url_id=userurl1_.id ))) group by  userurltag0_.tag order by
count(*)DESC;                                                                                 QUERY PLAN 
 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Sort
(cost=155766.79..155774.81 rows=3207 width=10) (actual time=2387.756..2396.578 rows=2546 loops=1)  Sort Key: count(*)
-> HashAggregate  (cost=155572.02..155580.03 rows=3207 width=10) (actual time=2365.643..2376.626 rows=2546 loops=1)
  ->  Nested Loop  (cost=0.00..155552.68 rows=3867 width=10) (actual time=0.135..2222.028 rows=8544 loops=1)
 ->  Index Scan using ix_user_url_user_id_url_id on user_url userurl1_  (cost=0.00..2798.12 rows=963 width=4) (actual
time=0.067..9.744rows=1666 loops=1)                    Index Cond: (user_id = 1)              ->  Index Scan using
ix_user_url_tag_user_url_idon user_url_tag userurltag0_  (cost=0.00..157.34 rows=103 width=14) (actual
time=1.223..1.281rows=5 loops=1666)                    Index Cond: (userurltag0_.user_url_id = "outer".id)Total
runtime:2405.691 ms
 
(9 rows)


Are you still interested in other "its second-choice join type"?  If you are, please tell me what join types those are,
thisis a bit beyond me. :(
 

Is there a way to force PG to use the index automatically?  This query is executed from something called Hibernate, and
I'mnot sure if that will let me set enable_hashjoin=off through its API...
 

Thanks,
Otis


----- Original Message ----
From: Tom Lane <tgl@sss.pgh.pa.us>
To: ogjunk-pgjedan@yahoo.com
Cc: pgsql-sql@postgresql.org
Sent: Wednesday, May 10, 2006 8:27:01 PM
Subject: Re: [SQL] Help with a seq scan on multi-million row table 

<ogjunk-pgjedan@yahoo.com> writes:
>                ->  Hash Join  (cost=2797.65..140758.50 rows=3790 width=10) (actual time=248.530..380635.132 rows=8544
loops=1) 
 
>                      Hash Cond: ("outer".user_url_id = "inner".id)  
>                      ->  Seq Scan on user_url_tag userurltag0_  (cost=0.00..106650.30 rows=6254530 width=14) (actual
time=0.017..212256.630rows=6259553 loops=1)  
 
>                      ->  Hash  (cost=2795.24..2795.24 rows=962 width=4) (actual time=199.840..199.840 rows=0 loops=1)

 
>                            ->  Index Scan using ix_user_url_user_id_url_id on user_url userurl1_  (cost=0.00..2795.24
rows=962width=4) (actual time=0.048..193.707 rows=1666 loops=1)  
 
>                                  Index Cond: (user_id = 1)  

Hm, I'm not sure why it's choosing that join plan.  A nestloop indexscan
wouldn't be terribly cheap, but just counting on my fingers it seems
like it ought to come in at less than 100000 cost units.  What do you
get if you set enable_hashjoin off?  (Then try disabling its
second-choice join type too --- I'm interested to see EXPLAIN ANALYZE
output for all three join types.)

What PG version is this exactly?
           regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend





pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Help with a seq scan on multi-million row table
Next
From: Tom Lane
Date:
Subject: Re: Help with a seq scan on multi-million row table