Thread: Is there a more elegant way to write this query?...

Is there a more elegant way to write this query?...

From
"Nick Fankhauser"
Date:
Hi-

I'm suffering from a performance problem, but when I look at my query, I'm
not convinced that there isn't a better way to handle this in SQL. -So I'm
seeking advice here before I go to the performance list.

I have three tables- case, actor and actor_case_assignment.

As the names imply, actor_case_assignment contains records that assign an
actor to a case. Actors such as attorneys or judges may have many cases,
while the average actor (we hope) only has one.

What I'm trying to do is link these tables to get back a single row per
actor that shows the actor's name, the number of cases that actor is
assigned to, and if they only have one case, I want the number for that
case. This means I have to do some grouping to get the case count, but I'm
then forced to use an aggregate function like max on the other fields. I
hope there's a better way. Any suggestions?

Here's what I'm using right now:
 select   actor.actor_id,   max(actor.actor_full_name),   max(case_data.case_public_id),   max(case_data.case_id),
count(case_data.case_id)as case_count from   actor,   actor_case_assignment,   case_data where
actor.actor_full_name_uppercaselike upper('martin%')   and actor.actor_id = actor_case_assignment.actor_id   and
case_data.case_id= actor_case_assignment.case_id group by   actor.actor_id order by   max(actor.actor_full_name),
case_countdesc, limit   1000;
 


Thanks!   -Nick

---------------------------------------------------------------------
Nick Fankhauser
   nickf@doxpop.com  Phone 1.765.965.7363  Fax 1.765.962.9788
doxpop - Court records at your fingertips - http://www.doxpop.com/




Re: Is there a more elegant way to write this query?...

From
Eric Soroos
Date:
On Nov 10, 2003, at 1:02 PM, Nick Fankhauser wrote:

> Hi-
>
> I'm suffering from a performance problem, but when I look at my query, 
> I'm
> not convinced that there isn't a better way to handle this in SQL. -So 
> I'm
> seeking advice here before I go to the performance list.
>

An explain analyze would help.

> What I'm trying to do is link these tables to get back a single row per
> actor that shows the actor's name, the number of cases that actor is
> assigned to, and if they only have one case, I want the number for that
> case. This means I have to do some grouping to get the case count, but 
> I'm
> then forced to use an aggregate function like max on the other fields. 
> I
> hope there's a better way. Any suggestions?

How about:
selectactor.actor_full_name,actor.actor_id,s1.ctCases,s1.case_id,case_data.case_public_id
fromactor inner join ( select actor_id, count(*) as ctCases, max(case_id) 
as case_id                    from actor_case_assignment group by actor_id) as s1            on (actor.actor_id =
s1.actor_id)    left outer join case_data using (s1.case_id=case_data.case_id)
 
limit 1000;

If you don't need the public_id, then you don't even need to join in 
the case data table.

eric



Re: Is there a more elegant way to write this query?...

From
"Nick Fankhauser"
Date:
Hi Eric-

Thanks for your suggestion.

> An explain analyze would help.

I'll do that (and move the question to the performance list) when I get to
the performance question, but at this point, I'm just seeking some help in
looking at this from a different angle- I couldn't figure out how to achieve
the result I was after without using max(), which felt like a kludge. Your
solution gave me the start I needed.

-Nick


> How about:
> select
>     actor.actor_full_name,
>     actor.actor_id,
>     s1.ctCases,
>     s1.case_id,
>     case_data.case_public_id
> from
>     actor inner join ( select actor_id, count(*) as ctCases,
> max(case_id)
> as case_id
>                         from
> actor_case_assignment group by actor_id) as s1
>                 on (actor.actor_id = s1.actor_id)
>          left outer join case_data using
> (s1.case_id=case_data.case_id)
> limit 1000;




Re: Is there a more elegant way to write this query?...

From
Terence Kearns
Date:
Nick Fankhauser wrote:
> Hi-
> 
> I'm suffering from a performance problem, but when I look at my query, I'm
> not convinced that there isn't a better way to handle this in SQL. -So I'm
> seeking advice here before I go to the performance list.
> 
> I have three tables- case, actor and actor_case_assignment.
> 
> As the names imply, actor_case_assignment contains records that assign an
> actor to a case. Actors such as attorneys or judges may have many cases,
> while the average actor (we hope) only has one.
> 
> What I'm trying to do is link these tables to get back a single row per
> actor that shows the actor's name, the number of cases that actor is
> assigned to, and if they only have one case, I want the number for that
> case. This means I have to do some grouping to get the case count, but I'm
> then forced to use an aggregate function like max on the other fields. I
> hope there's a better way. Any suggestions?
> 
> Here's what I'm using right now:
> 
>   select
>     actor.actor_id,
>     max(actor.actor_full_name),
>     max(case_data.case_public_id),
>     max(case_data.case_id),
>     count(case_data.case_id) as case_count
>   from
>     actor,
>     actor_case_assignment,
>     case_data
>   where
>     actor.actor_full_name_uppercase like upper('martin%')
>     and actor.actor_id = actor_case_assignment.actor_id
>     and case_data.case_id = actor_case_assignment.case_id
>   group by
>     actor.actor_id
>   order by
>     max(actor.actor_full_name),
>     case_count desc,
>   limit
>     1000;
> 
> 
> Thanks!
>     -Nick
> 
> ---------------------------------------------------------------------
> Nick Fankhauser
> 
>     nickf@doxpop.com  Phone 1.765.965.7363  Fax 1.765.962.9788
> doxpop - Court records at your fingertips - http://www.doxpop.com/
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

selectcount(as.case_id) as case_count,max(as.case_id) as max_case_id,a.actor_full_name
fromactors a,actor_case_assignment as
wherea.case_id = actor_case_assignment.case_idand a.actor_full_name_uppercase like upper('martin%')    -- this field
is(shouldbe) indexed
 
group bya.actor_full_name
order bya.actor_full_name_uppercase -- this field is(should be) indexed


This query will not pick up any actors that a NOT assigned to a case, 
but you said in your criteria that all actors are assigned to at least 
one case. This query relys on that assumption and eliminates the expense 
of an outer join.

Also, you are always getting max_case_id but it is obvious that this 
will be the relevent case where case_count == 1 you can simply ignore 
max_case_id when case_count != 1

try running this query with explain to compare against the others.
Also, try adding an index to the actor_full_name_uppercase column since 
you are using it in an order by clause and searching on it. indexes will 
incur slight performance expenses when inserting and updating.

cheers.