Thread: Is there a more elegant way to write this query?...
Hi- I'm suffering from a performance problem, but when I look at my query, I'm not convinced that there isn't a better way to handle this in SQL. -So I'm seeking advice here before I go to the performance list. I have three tables- case, actor and actor_case_assignment. As the names imply, actor_case_assignment contains records that assign an actor to a case. Actors such as attorneys or judges may have many cases, while the average actor (we hope) only has one. What I'm trying to do is link these tables to get back a single row per actor that shows the actor's name, the number of cases that actor is assigned to, and if they only have one case, I want the number for that case. This means I have to do some grouping to get the case count, but I'm then forced to use an aggregate function like max on the other fields. I hope there's a better way. Any suggestions? Here's what I'm using right now: select actor.actor_id, max(actor.actor_full_name), max(case_data.case_public_id), max(case_data.case_id), count(case_data.case_id)as case_count from actor, actor_case_assignment, case_data where actor.actor_full_name_uppercaselike upper('martin%') and actor.actor_id = actor_case_assignment.actor_id and case_data.case_id= actor_case_assignment.case_id group by actor.actor_id order by max(actor.actor_full_name), case_countdesc, limit 1000; Thanks! -Nick --------------------------------------------------------------------- Nick Fankhauser nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788 doxpop - Court records at your fingertips - http://www.doxpop.com/
On Nov 10, 2003, at 1:02 PM, Nick Fankhauser wrote: > Hi- > > I'm suffering from a performance problem, but when I look at my query, > I'm > not convinced that there isn't a better way to handle this in SQL. -So > I'm > seeking advice here before I go to the performance list. > An explain analyze would help. > What I'm trying to do is link these tables to get back a single row per > actor that shows the actor's name, the number of cases that actor is > assigned to, and if they only have one case, I want the number for that > case. This means I have to do some grouping to get the case count, but > I'm > then forced to use an aggregate function like max on the other fields. > I > hope there's a better way. Any suggestions? How about: selectactor.actor_full_name,actor.actor_id,s1.ctCases,s1.case_id,case_data.case_public_id fromactor inner join ( select actor_id, count(*) as ctCases, max(case_id) as case_id from actor_case_assignment group by actor_id) as s1 on (actor.actor_id = s1.actor_id) left outer join case_data using (s1.case_id=case_data.case_id) limit 1000; If you don't need the public_id, then you don't even need to join in the case data table. eric
Hi Eric- Thanks for your suggestion. > An explain analyze would help. I'll do that (and move the question to the performance list) when I get to the performance question, but at this point, I'm just seeking some help in looking at this from a different angle- I couldn't figure out how to achieve the result I was after without using max(), which felt like a kludge. Your solution gave me the start I needed. -Nick > How about: > select > actor.actor_full_name, > actor.actor_id, > s1.ctCases, > s1.case_id, > case_data.case_public_id > from > actor inner join ( select actor_id, count(*) as ctCases, > max(case_id) > as case_id > from > actor_case_assignment group by actor_id) as s1 > on (actor.actor_id = s1.actor_id) > left outer join case_data using > (s1.case_id=case_data.case_id) > limit 1000;
Nick Fankhauser wrote: > Hi- > > I'm suffering from a performance problem, but when I look at my query, I'm > not convinced that there isn't a better way to handle this in SQL. -So I'm > seeking advice here before I go to the performance list. > > I have three tables- case, actor and actor_case_assignment. > > As the names imply, actor_case_assignment contains records that assign an > actor to a case. Actors such as attorneys or judges may have many cases, > while the average actor (we hope) only has one. > > What I'm trying to do is link these tables to get back a single row per > actor that shows the actor's name, the number of cases that actor is > assigned to, and if they only have one case, I want the number for that > case. This means I have to do some grouping to get the case count, but I'm > then forced to use an aggregate function like max on the other fields. I > hope there's a better way. Any suggestions? > > Here's what I'm using right now: > > select > actor.actor_id, > max(actor.actor_full_name), > max(case_data.case_public_id), > max(case_data.case_id), > count(case_data.case_id) as case_count > from > actor, > actor_case_assignment, > case_data > where > actor.actor_full_name_uppercase like upper('martin%') > and actor.actor_id = actor_case_assignment.actor_id > and case_data.case_id = actor_case_assignment.case_id > group by > actor.actor_id > order by > max(actor.actor_full_name), > case_count desc, > limit > 1000; > > > Thanks! > -Nick > > --------------------------------------------------------------------- > Nick Fankhauser > > nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788 > doxpop - Court records at your fingertips - http://www.doxpop.com/ > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster selectcount(as.case_id) as case_count,max(as.case_id) as max_case_id,a.actor_full_name fromactors a,actor_case_assignment as wherea.case_id = actor_case_assignment.case_idand a.actor_full_name_uppercase like upper('martin%') -- this field is(shouldbe) indexed group bya.actor_full_name order bya.actor_full_name_uppercase -- this field is(should be) indexed This query will not pick up any actors that a NOT assigned to a case, but you said in your criteria that all actors are assigned to at least one case. This query relys on that assumption and eliminates the expense of an outer join. Also, you are always getting max_case_id but it is obvious that this will be the relevent case where case_count == 1 you can simply ignore max_case_id when case_count != 1 try running this query with explain to compare against the others. Also, try adding an index to the actor_full_name_uppercase column since you are using it in an order by clause and searching on it. indexes will incur slight performance expenses when inserting and updating. cheers.