how to speed up these queries ? - Mailing list pgsql-sql

From Dracula 007
Subject how to speed up these queries ?
Date
Msg-id 4226EF8A.3020403@atlas.cz
Whole thread Raw
Responses Re: how to speed up these queries ?  (Sean Davis <sdavis2@mail.nih.gov>)
List pgsql-sql
Hello,
   I have two "large" tables - "sessions" (about 1.500.000 rows) and
"actions" (about 4.000.000 rows), and the "actions" table is connected
to the "sessions" (it contains a key from it). The simplified structure
of these tables is

sessions (session_id int4,visitor_id int4,session_ip inet,session_date timestamp
)

actions (action_id int4,session_id int4, -- foreign key, references sessions(session_id)action_date
timestamp,action_yearint2,action_month int2,action_day int2
 
)

I run SQL queries like
   SELECT      COUNT(actions.session_id) AS sessions_count,      COUNT(DISTINCT visitor_id) AS visitors_count,
COUNT(DISTINCTsession_ip) AS ips_count   FROM actions LEFT JOIN sessions USING (session_id)   GROUP BY action_year,
action_month,action_day
 

but it's really really slow. I've tried to use different indexes on
different columns, but no matter what I've tried I can't get it faster. 
The explain analyze of the query is

-------------------------------------------------- Aggregate  (cost=347276.05..347276.05 rows=1 width=23) (actual 
time=210060.349..210060.350 rows=1 loops=1)   ->  Hash Left Join  (cost=59337.55..305075.27 rows=4220077 width=23) 
(actualtime=24202.338..119924.254 rows=4220077 loops=1)         Hash Cond: ("outer".session_id = "inner".session_id)
    ->  Seq Scan on actions  (cost=0.00..114714.77 rows=4220077 
 
width=8) (actual time=7539.653..44585.023 rows=4220077 loops=1)         ->  Hash  (cost=47650.64..47650.64 rows=1484764
width=19)
 
(actual time=16628.790..16628.790 rows=0 loops=1)               ->  Seq Scan on sessions  (cost=0.00..47650.64 
rows=1484764 width=19) (actual time=0.041..13378.667 rows=1484764 loops=1) Total runtime: 210061.073 ms
--------------------------------------------------

As you can see it runs for about 4 mins, which is not too fast. Is there 
some way to speed up such queries?


pgsql-sql by date:

Previous
From: Bret Hughes
Date:
Subject: Re: definative way to place secs from epoc into timestamp
Next
From: Markus Schaber
Date:
Subject: Re: Multiples schemas