Help with a seq scan on multi-million row table - Mailing list pgsql-sql
Subject:   Help with a seq scan on multi-million row table
Msg-id:    20060510181359.58201.qmail@web50306.mail.yahoo.com
List:      pgsql-sql
Responses:
  Re: Help with a seq scan on multi-million row table
  Re: Help with a seq scan on multi-million row table
Hello,

I have a little 2-table JOIN, GROUP BY, ORDER BY query that does a sequential
scan on a multi-million row table. I _thought_ I had all the appropriate
indices, but apparently I do not. I was wondering if anyone can spot a way I
can speed up this query. The query currently takes... *gulp*: 381119.201 ms :(

There are only 2 tables in the game: user_url and user_url_tag. The latter has
FKs pointing to the former. The sequential scan happens on the latter -
user_url_tag:

EXPLAIN ANALYZE
select DISTINCT userurltag0_.tag as x0_0_, COUNT(*) as x1_0_
from user_url_tag userurltag0_, user_url userurl1_
WHERE (((userurl1_.user_id=1)) AND ((userurltag0_.user_url_id=userurl1_.id)))
GROUP BY userurltag0_.tag
ORDER BY count(*) DESC;

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Unique  (cost=140972.22..140996.28 rows=3207 width=10) (actual time=381082.868..381110.094 rows=2546 loops=1)
   ->  Sort  (cost=140972.22..140980.24 rows=3207 width=10) (actual time=381082.858..381091.733 rows=2546 loops=1)
         Sort Key: count(*), userurltag0_.tag
         ->  HashAggregate  (cost=140777.45..140785.46 rows=3207 width=10) (actual time=381032.844..381064.068 rows=2546 loops=1)
               ->  Hash Join  (cost=2797.65..140758.50 rows=3790 width=10) (actual time=248.530..380635.132 rows=8544 loops=1)
                     Hash Cond: ("outer".user_url_id = "inner".id)
                     ->  Seq Scan on user_url_tag userurltag0_  (cost=0.00..106650.30 rows=6254530 width=14) (actual time=0.017..212256.630 rows=6259553 loops=1)
                     ->  Hash  (cost=2795.24..2795.24 rows=962 width=4) (actual time=199.840..199.840 rows=0 loops=1)
                           ->  Index Scan using ix_user_url_user_id_url_id on user_url userurl1_  (cost=0.00..2795.24 rows=962 width=4) (actual time=0.048..193.707 rows=1666 loops=1)
                                 Index Cond: (user_id = 1)
 Total runtime: 381119.201 ms
(11 rows)

This is what the two tables look like (extra columns removed):

                             Table "public.user_url_tag"
   Column    |         Type          |                          Modifiers
-------------+-----------------------+---------------------------------------------------------------
 id          | integer               | not null default nextval('public.user_url_tag_id_seq'::text)
 user_url_id | integer               |
 tag         | character varying(64) |
Indexes:
    "pk_user_url_tag_id" PRIMARY KEY, btree (id)
    "ix_user_url_tag_tag" btree (tag)
    "ix_user_url_tag_user_url_id" btree (user_url_id)
Foreign-key constraints:
    "fk_user_url_tag_user_url_id" FOREIGN KEY (user_url_id) REFERENCES user_url(id)

                          Table "public.user_url"
 Column  |  Type   |                         Modifiers
---------+---------+-----------------------------------------------------------
 id      | integer | not null default nextval('public.user_url_id_seq'::text)
 user_id | integer |
 url_id  | integer |
Indexes:
    "pk_user_url_id" PRIMARY KEY, btree (id)
    "ix_user_url_url_id_user_id" UNIQUE, btree (url_id, user_id)
    "ix_user_url_user_id_url_id" UNIQUE, btree (user_id, url_id)

Does anyone see a way to speed up this s-l-o-w query? I cache DB results, but
I'd love to get rid of that sequential scan.

Thanks,
Otis
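Two things worth trying, sketched below under the assumption that the schema shown above is complete (the composite index name is illustrative, not part of the original schema): the DISTINCT is redundant, since GROUP BY userurltag0_.tag already returns each tag once, and a composite index on (user_url_id, tag) would give the planner a narrower path into user_url_tag than the existing single-column index. Whether the planner actually prefers it over the hash join can be checked by temporarily disabling sequential scans for the session.

    -- DISTINCT dropped: GROUP BY tag already yields one row per tag.
    SELECT userurltag0_.tag AS x0_0_,
           COUNT(*)         AS x1_0_
    FROM   user_url_tag userurltag0_
    JOIN   user_url     userurl1_ ON userurltag0_.user_url_id = userurl1_.id
    WHERE  userurl1_.user_id = 1
    GROUP  BY userurltag0_.tag
    ORDER  BY COUNT(*) DESC;

    -- Hypothetical composite index covering the join column and the tag.
    CREATE INDEX ix_user_url_tag_user_url_id_tag
        ON user_url_tag (user_url_id, tag);

    -- For a one-off test only: steer the planner away from the seq scan
    -- and compare the EXPLAIN ANALYZE output for the query above.
    SET enable_seqscan = off;
    -- re-run the EXPLAIN ANALYZE here, then restore the default:
    SET enable_seqscan = on;

With only 8544 of the ~6.2 million user_url_tag rows actually matching user_id = 1, an index-driven nested loop should at least be worth comparing against the 380-second hash join, but only the measured plans can confirm which is faster.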