Well, this ran much better. However, I'm not sure whether that's because
of set enable_nestloop = 0, or because I'm executing the query twice in
a row, so the second run may be hitting data already cached in shared
buffers or the OS page cache. I will try this setting in my code when
this process runs later today and see what the result is.
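In case it helps isolate the two effects: SET LOCAL scopes the setting
to a single transaction, and plain EXPLAIN (without ANALYZE) shows
whether the plan changes without executing anything, so caching can't
skew the comparison. A sketch, using the same query as below:

    BEGIN;
    SET LOCAL enable_nestloop = off;  -- reverts automatically at transaction end
    EXPLAIN                           -- plan only; nothing is executed or cached
    select o.emailcampaignid, count(memberid) opencnt
      from openactivity o, ecr_sents s
     where s.emailcampaignid = o.emailcampaignid
     group by o.emailcampaignid;
    ROLLBACK;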
Thanks!
pg_dw=# explain analyze CREATE TABLE ecr_opens with (FILLFACTOR=100)
pg_dw-# as
pg_dw-# select o.emailcampaignid, count(memberid) opencnt
pg_dw-# from openactivity o,ecr_sents s
pg_dw-# where s.emailcampaignid = o.emailcampaignid
pg_dw-# group by o.emailcampaignid;
                                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=4391163.81..4391288.05 rows=9939 width=12) (actual time=167254.751..167254.937 rows=472 loops=1)
   ->  Hash Join  (cost=14.78..4344767.23 rows=9279316 width=12) (actual time=0.300..164577.131 rows=8586466 loops=1)
         Hash Cond: (o.emailcampaignid = s.emailcampaignid)
         ->  Seq Scan on openactivity o  (cost=0.00..3529930.67 rows=192540967 width=12) (actual time=0.011..124351.878 rows=192542480 loops=1)
         ->  Hash  (cost=8.79..8.79 rows=479 width=4) (actual time=0.253..0.253 rows=479 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 17kB
               ->  Seq Scan on ecr_sents s  (cost=0.00..8.79 rows=479 width=4) (actual time=0.010..0.121 rows=479 loops=1)
 Total runtime: 167279.950 ms
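To tell cached pages from disk reads directly, if this server is on 9.0
or newer, EXPLAIN's BUFFERS option breaks out shared-buffer hits versus
reads per plan node. A sketch over the same query:

    EXPLAIN (ANALYZE, BUFFERS)
    select o.emailcampaignid, count(memberid) opencnt
      from openactivity o, ecr_sents s
     where s.emailcampaignid = o.emailcampaignid
     group by o.emailcampaignid;
    -- 'Buffers: shared hit=N' counts pages found in PostgreSQL's own cache;
    -- 'read=N' counts pages fetched from the OS page cache or from disk.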
On Wed, 2011-06-08 at 11:51 -0400, Stephen Frost wrote:
> * Tony Capobianco (tcapobianco@prospectiv.com) wrote:
> > HashAggregate  (cost=4391163.81..4391288.05 rows=9939 width=12)
> >   ->  Hash Join  (cost=14.78..4344767.23 rows=9279316 width=12)
> >         Hash Cond: (o.emailcampaignid = s.emailcampaignid)
> >         ->  Seq Scan on openactivity o  (cost=0.00..3529930.67 rows=192540967 width=12)
> >         ->  Hash  (cost=8.79..8.79 rows=479 width=4)
> >               ->  Seq Scan on ecr_sents s  (cost=0.00..8.79 rows=479 width=4)
> >
> > Yikes. Two sequential scans.
>
> Err, isn't that more-or-less exactly what you want here? The smaller
> table is going to be hashed and then you'll traverse the bigger table
> and bounce each row off the hash table. Have you tried actually running
> this and seeing how long it takes? The bigger table doesn't look to be
> *that* big; if your i/o subsystem is decent and you've got a lot of
> memory available for kernel caching, it should be quick.
>
> Thanks,
>
> Stephen
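
On Stephen's point about memory: the "Buckets: 1024  Batches: 1  Memory
Usage: 17kB" line above shows the hash of the 479-row ecr_sents fits
comfortably in work_mem in a single batch, so the join itself isn't
memory-starved. If a hash build ever spills to disk (Batches > 1),
raising work_mem for the session is the usual fix. A sketch, with a
hypothetical value:

    SHOW work_mem;           -- current per-operation sort/hash memory budget
    SET work_mem = '256MB';  -- hypothetical; only worth raising if Batches > 1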