[8.1.4] Help optimizing query - Mailing list pgsql-performance
From | Hayes |
---|---|
Subject | [8.1.4] Help optimizing query |
Date | |
Msg-id | aperez-2D52F6.16115426082006@news.hub.org Whole thread Raw |
Responses |
Re: [8.1.4] Help optimizing query
|
List | pgsql-performance |
Hi all, I need help making the below query go faster. There are about 5 of these being done in one report. Some use joins instead of subselects but they all go about the same speed. I'm not much of a SQL person so I just threw indices on everything involved but it still takes about at least 19sec and sometimes as much as 60s. I'd be happy to get it to about 5s. Other info: random_page_cost is 4 as I/O on this box as kinda slow. It's a converted desktop with a single IDE drive. shared_buffers is 12500 and effective_page_cache is 100M. I upped the statistics on entry_date and session_id to 1000. I analyzed the tables after modifying the statistics. The actual deployment platform is a lot beefier but I'd like these queries to at least be tolerable on this machine. I can see that the estimate for the GroupAggregate is off, if I'm interpreting things correctly, but I don't know what to do about it. tia, arturo Query: SELECT subscription_id, to_char(sum(session_length), 'HH24:MI:SS') as session_length, sum(hits) as hits, 2006 as theYear, 2 as theQuarter, sum(count) as count FROM ( SELECT subscription_id, count(distinct session_id) as count, age(MAX(entry_date),MIN(entry_date)) as session_length, COUNT(action) as hits FROM extended_user JOIN user_tracking USING (user_id) WHERE subscription_id > 0 AND EXTRACT(year from entry_date) = 2006 AND EXTRACT(quarter from entry_date) = 2 GROUP BY session_id, subscription_id ) as session_stuff WHERE subscription_id > 0 GROUP BY subscription_id ORDER BY subscription_id; Sort (cost=123305.88..123306.38 rows=200 width=36) (actual time=75039.706..75040.500 rows=258 loops=1) Sort Key: session_stuff.subscription_id -> HashAggregate (cost=123294.24..123298.24 rows=200 width=36) (actual time=75036.487..75038.360 rows=258 loops=1) -> GroupAggregate (cost=108839.34..118475.94 rows=240915 width=72) (actual time=68016.583..74702.710 rows=38369 loops=1) -> Sort (cost=108839.34..109441.63 rows=240915 width=72) (actual time=67978.193..68982.962 rows=245727 loops=1) Sort Key: user_tracking.session_id, extended_user.subscription_id -> Hash Join (cost=7746.59..75492.37 rows=240915 width=72) (actual time=16944.487..50737.230 rows=245727 loops=1) Hash Cond: ("outer".user_id = "inner".user_id) -> Bitmap Heap Scan on user_tracking (cost=7524.10..68644.10 rows=240950 width=72) (actual time=16843.695..48306.383 rows=258923 loops=1) Recheck Cond: ((date_part('quarter'::text, entry_date) = 2::double precision) AND (date_part('year'::text, entry_date) = 2006::double precision)) -> BitmapAnd (cost=7524.10..7524.10 rows=240950 width=0) (actual time=16779.178..16779.178 rows=0 loops=1) -> Bitmap Index Scan on user_tracking_quarter_idx (cost=0.00..3331.51 rows=533288 width=0) (actual time=9079.545..9079.545 rows=533492 loops=1) Index Cond: (date_part('quarter'::text, entry_date) = 2::double precision) -> Bitmap Index Scan on user_tracking_year_idx (cost=0.00..4192.34 rows=671239 width=0) (actual time=7685.906..7685.906 rows=671787 loops=1) Index Cond: (date_part('year'::text, entry_date) = 2006::double precision) -> Hash (cost=206.42..206.42 rows=6428 width=8) (actual time=100.754..100.754 rows=6411 loops=1) -> Seq Scan on extended_user (cost=0.00..206.42 rows=6428 width=8) (actual time=0.020..28.873 rows=6411 loops=1) Filter: ((subscription_id > 0) AND (subscription_id > 0)) Total runtime: 75069.453 ms Tables: This one has about 6-7k rows. Table "public.extended_user" Column | Type | Modifiers -------------------+--------------------------+----------- create_date | timestamp with time zone | not null email | character varying(99) | first_name | character varying(99) | not null last_name | character varying(99) | not null license_agreement | boolean | not null license_date | timestamp with time zone | password | character varying(32) | not null subscription_id | integer | not null user_id | integer | not null user_name | character varying(99) | not null Indexes: "extended_user_pkey" PRIMARY KEY, btree (user_id) "extended_user_subscription_id_idx" btree (subscription_id) "extended_user_subscription_idx" btree (subscription_id) Foreign-key constraints: "extended_user_subscription_id_fkey" FOREIGN KEY (subscription_id) REFERENCES subscription(subscription_id) DEFERRABLE INITIALLY DEFERRED This one has about 2k rows. Table "public.subscription" Column | Type | Modifiers ------------------+--------------------------+----------- allow_printing | boolean | not null company_id | character varying(50) | not null company_name | character varying(100) | not null end_date | timestamp with time zone | licenses | integer | not null pass_through_key | character varying(50) | start_date | timestamp with time zone | not null subscription_id | integer | not null Indexes: "subscription_pkey" PRIMARY KEY, btree (subscription_id) This one has about 1.4M rows. It's kind of a log of pages visited. Table "public.user_tracking" Column | Type | Modifiers ------------------+-----------------------------+------------------------ -------------------------------------------------- action | character varying(255) | not null entry_date | timestamp without time zone | not null note | text | report_id | integer | session_id | character varying(255) | not null user_id | integer | user_tracking_id | integer | not null default nextval('user_tracking_user_tracking_id_seq'::regclass) Indexes: "user_tracking_pkey" PRIMARY KEY, btree (user_tracking_id) "user_tracking_entry_date_idx" btree (entry_date) "user_tracking_month_idx" btree (date_part('month'::text, entry_date)) "user_tracking_quarter_idx" btree (date_part('quarter'::text, entry_date)) "user_tracking_report_id_idx" btree (report_id) "user_tracking_session_idx" btree (session_id) "user_tracking_user_id_idx" btree (user_id) "user_tracking_year_idx" btree (date_part('year'::text, entry_date)) Foreign-key constraints: "user_tracking_report_id_fkey" FOREIGN KEY (report_id) REFERENCES article(article_id) DEFERRABLE INITIALLY DEFERRED "user_tracking_user_id_fkey" FOREIGN KEY (user_id) REFERENCES extended_user(user_id) DEFERRABLE INITIALLY DEFERRED
pgsql-performance by date: