Re: Subselect performance question - Mailing list pgsql-sql

From Josh Berkus
Subject Re: Subselect performance question
Date
Msg-id web-1392296@davinci.ethosmedia.com
Whole thread Raw
In response to Re: Subselect performance question  (Manfred Koizar <mkoi-pg@aon.at>)
List pgsql-sql
Manfred,

> I don't know, if it matters, but your queries are not equivalent.  If
> for a client_id there is no row with case_status = 'ACT',
> active_count
> is ...

Thanks.  Would have noticed that, and done a COALESCE.

> should match your first query and it doesn't use sub-selects at all.
> I haven't tested it though, so watch out for typos ...
> My personal hard-and-fast rule is "avoid sub-selects, use joins",
>  but
> YMMV.

Yes, usually.  There's other reasons in how I'm going to use the view,
though, that make me want to use a subselect.

Actually, somebody reminded me that if I have a fixed list of
categories, the fastest appoach performance-wise is to use a crosstab
identity martix.  What I did is to create and populate a table as
follows:

case_status_crosstab
status_group     act_count stl_count dis_count oth_count
---------------- --------- --------- --------- ---------
Closed\ Cases            0         0         0         0
Active\ Cases            1         0         0         0
Settled\ Cases           0         1         0         0
Dismissed\ Cases         0         0         1         0
Other\ Cases             0         0         0         1
Special\ Cases           0         0         0         1

Then I can count them as follows:

CREATE VIEW bv_clients AS
SELECT client_id, client_name, short_name, ljo.description as
office_location,contact_name, phone_no, fax_no, status_label,address_1, address_2, address_3, address_4, address_5,
address_6,city,state, zip_code, comments,active_count, settled_count, dismissed_count, other_count
 
FROM clients JOIN status ON (clients.status = status.status AND
status.relation = 'clients')
LEFT OUTER JOIN text_list_values ljo ON (clients.jw_office =
ljo.list_value and ljo.list_name ILIKE 'office location')
LEFT OUTER JOIN (SELECT client_id, sum(act_count) as active_count,
sum(stl_count) as settled_count,sum(dis_count) as dismissed_count, sum(oth_count) as other_count
FROM case_clients cc JOIN text_list_values tvl ON (cc.case_status =
tvl.list_value and list_name ilike 'case status')JOIN case_status_crosstab cst ON tvl.rollup1 = cst.status_group
GROUP BY client_id) as c_count USING (client_id);

This actually runs pretty fast, once everything is indexed.

-Josh Berkus


pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Subselect performance question
Next
From:
Date:
Subject: INserting images in postgres