Re: slow group by query - Mailing list pgsql-sql

From Stephan Szabo
Subject Re: slow group by query
Date
Msg-id 20021119065715.V65980-100000@megazone23.bigpanda.com
Whole thread Raw
In response to slow group by query  (Ellen Cyran <ellen@urban.csuohio.edu>)
Responses Re: slow group by query  (Ellen Cyran <ellen@urban.csuohio.edu>)
List pgsql-sql
On Mon, 18 Nov 2002, Ellen Cyran wrote:

> Is there any way to make this query faster?  I have indexes on year,
> msa_code, and sic.  I've also tried it with
> an index on the combined group by columns.  I've made both sort_mem and
> shared_buffers bigger, but still this query
> takes 40 seconds when I select 4 msa_codes and 7 minutes when I select 40
> msa_codes.  Would it just be better
> to transpose the table to begin with and avoid the group by all together?
>
> SELECT b.msa_code, b.sic, b.own, b.ind_div,
>   (select emp from tbl_bls_msa as bls
> where bls.year = '1975' and bls.msa_code = b.msa_code and bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1975,
>   (select emp from tbl_bls_msa as bls
> where bls.year = '1976' and bls.msa_code = b.msa_code and  bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1976,
>   (select emp from tbl_bls_msa as bls
> where bls.year = '1977' and bls.msa_code = b.msa_code  and  bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1977,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1978' and bls.msa_code = b.msa_code and  bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div ) AS emp1978,
> (select emp from tbl_bls_msa as bls
> where bls.year = '1990' and bls.msa_code = b.msa_code  and  bls.sic = b.sic
> and bls.own = b.own and bls.ind_div = b.ind_div)  AS emp1990
>   FROM tbl_bls_msa AS b
> where msa_code in ('1680','1640','0040','0120','0080')
> GROUP BY b.msa_code, b.sic, b.ind_div, b.own ;

Have you tried doing the subqueries in from?  Right now you're running
each subquery once for each output row AFAICS.

Maybe something like (doing only one year for example - and using a
postgres extension), would work...

select distinct on (b.msa_code, b.sic, b.own, b.ind_div)b.msa_code, b.sic, b.own, b.ind_div, y1975.emp1975 from
tbl_bls_msa as b left outer join (select emp from tbl_bls_mas as blswhere bls.year='1975') y1975 on
(b.msa_code=y1975.msa_codeandb.sic=y1975.sic and b.own=y1975.own and b.ind_div=y1975.ind_div)
 
where msa_code in ('1680', '1640', '0040', '0120', '0080');




pgsql-sql by date:

Previous
From: Tomasz Myrta
Date:
Subject: Re:
Next
From: Tom Lane
Date:
Subject: Re: slow group by query