Seeking help with a query that takes too long - Mailing list pgsql-performance
From | Nick Fankhauser |
---|---|
Subject | Seeking help with a query that takes too long |
Date | |
Msg-id | NEBBLAAHGLEEPCGOBHDGKEIJJGAA.nickf@ontko.com Whole thread Raw |
Responses |
Re: Seeking help with a query that takes too long
|
List | pgsql-performance |
[I originally posted this using the wrong E-Mail account, so a double posting may occur if the first message gets released by the moderator later- sorry!] Hi- I have a query that I'm trying to speed up. I haven't been able to come up with any workable ideas for speeding it up, so I'm seeking some input from the list. I'm using version 7.3.2 I have three tables: case_data (1,947,386 rows) actor (3,385,669 rows) actor_case_assignment (8,668,650 rows) As the names imply, actor_case_assignment contains records that assign an actor to a case. Actors such as attorneys or judges may have many cases, while the average actor (we hope) only has one. What I'm trying to do is link these tables to get back a single row per actor that shows the actor's name, the number of cases that actor is assigned to, and if they only have one case, I want the public_id for that case. This means I have to do a group by to get the case count, but I'm then forced to use an aggregate function like max on the other fields. All of the fields ending in "_id" have unique indexes, and actor_full_name_uppercase is indexed. An analyze is done every night & the database is fairly stable in it's composition. Here's the select: select actor.actor_id, max(actor.actor_full_name), max(case_data.case_public_id), max(case_data.case_id), count(case_data.case_id) as case_count from actor, actor_case_assignment, case_data where actor.actor_full_name_uppercase like upper('sanders%') and actor.actor_id = actor_case_assignment.actor_id and case_data.case_id = actor_case_assignment.case_id group by actor.actor_id order by max(actor.actor_full_name), case_count desc limit 1000; Here's the explain analyze: QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- -------------------------------------------------- Limit (cost=2214.71..2214.72 rows=1 width=115) (actual time=120034.61..120035.67 rows=1000 loops=1) -> Sort (cost=2214.71..2214.72 rows=1 width=115) (actual time=120034.60..120034.98 rows=1001 loops=1) Sort Key: max((actor.actor_full_name)::text), count(case_data.case_id) -> Aggregate (cost=2214.67..2214.70 rows=1 width=115) (actual time=119962.80..120011.49 rows=3456 loops=1) -> Group (cost=2214.67..2214.68 rows=2 width=115) (actual time=119962.76..119987.04 rows=5879 loops=1) -> Sort (cost=2214.67..2214.68 rows=2 width=115) (actual time=119962.74..119965.09 rows=5879 loops=1) Sort Key: actor.actor_id -> Nested Loop (cost=0.00..2214.66 rows=2 width=115) (actual time=59.05..119929.71 rows=5879 loops=1) -> Nested Loop (cost=0.00..2205.26 rows=3 width=76) (actual time=51.46..66089.04 rows=5882 loops=1) -> Index Scan using actor_full_name_uppercase on actor (cost=0.00..6.01 rows=1 width=42) (actual time=37.62..677.44 rows=3501 loops=1) Index Cond: ((actor_full_name_uppercase >= 'SANDERS'::character varying) AND (actor_full_name_uppercase < 'SANDERT'::character varying)) Filter: (actor_full_name_uppercase ~~ 'SANDERS%'::text) -> Index Scan using actor_case_assignment_actor_id on actor_case_assignment (cost=0.00..2165.93 rows=2666 width=34) (actual time=16.37..18.67 rows=2 loops=3501) Index Cond: ("outer".actor_id = actor_case_assignment.actor_id) -> Index Scan using case_data_case_id on case_data (cost=0.00..3.66 rows=1 width=39) (actual time=9.14..9.15 rows=1 loops=5882) Index Cond: (case_data.case_id = "outer".case_id) Total runtime: 120038.60 msec (17 rows) Any ideas? Thanks! -Nick --------------------------------------------------------------------- Nick Fankhauser nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788 doxpop - Court records at your fingertips - http://www.doxpop.com/
pgsql-performance by date: