Re: Subselect performance question - Mailing list pgsql-sql

From Tom Lane
Subject Re: Subselect performance question
Date
Msg-id 18063.1020372973@sss.pgh.pa.us
Whole thread Raw
In response to Subselect performance question  ("Josh Berkus" <josh@agliodbs.com>)
List pgsql-sql
"Josh Berkus" <josh@agliodbs.com> writes:
> Under those circumstances, is there a hard-and-fast rule for which
> query I should use?  Most of the time, users will only look at one
> record at a time from the main data set.

> SELECT client_id, client_name, 
>     (SELECT count(*) FROM case_clients WHERE client_id = clients.client_id
> AND case_status = 'ACT') as active_count,
>     (SELECT count(*) FROM case_clients WHERE client_id = clients.client_id
> AND case_status = 'STL') as settled_count,
> FROM clients;

> OR:

> SELECT client_id, client_name,  active_count, settled_count,
> FROM clients 
>     LEFT OUTER JOIN (SELECT client_id, count(*) as active_count FROM
> case_clients WHERE case_status = 'ACT' GROUP BY client_id) ac ON
> ac.client_id = clients.client_id
>    LEFT OUTER JOIN (SELECT client_id, count(*) as settled_count FROM
> case_clients WHERE case_status = 'STL' GROUP BY client_id) sc ON
> sc.client_id = clients.client_id;

The second one looks like a sure loser to me, because of the GROUP BYs.

If you were expecting queries to retrieve many different client_ids,
it *might* be better to use the second form.  But I think for a small
number of client_ids the first will be quicker.
        regards, tom lane


pgsql-sql by date:

Previous
From: Manfred Koizar
Date:
Subject: Re: Subselect performance question
Next
From: "Josh Berkus"
Date:
Subject: Re: Subselect performance question