Suggestions wanted for 7.2.4 query - Mailing list pgsql-performance

From Josh Berkus
Subject Suggestions wanted for 7.2.4 query
Date
Msg-id 200305031928.52906.josh@agliodbs.com
Whole thread Raw
Responses Re: Suggestions wanted for 7.2.4 query
List pgsql-performance
Folks,

I have a common query on a production database that's running a little too
slow (3-6 seconds).     I can currently drop the time to 0.8 seconds by
setting enable_seqscan = false; the main reason is the planner poorly
deciding to use a seq scan for the hash join between "events" and "cases",
mostly due to a dramatically bad estimate of the number of rows required from
"cases".

Suggestions on how to get Postgres to use cases_pkey instead of a seq scan on
cases without messing with global query settings in the database which might
make other queries run slower?  (And yes, a VACUUM FULL ANALYZE was
involved).

The View:
create view sv_events as
select events.event_id, events.status, status_label, status.rollup as rstatus,
events.etype_id, type_name,
    event_cats.ecat_id, cat_name, events.event_date, events.event_name,
    jw_date_format(events.event_date, events.event_tz, events.duration) as
show_date,
    cases.case_name || '(' || cases.docket || ')' as event_case,
    events.case_id, cases.case_name, cases.docket, NULL::VARCHAR as tgroup_name,
    events.location_id, location_name, locations.zip_code, locations.address,
locations.state_code, locations.city,
    lu.user_name as lock_name, lu.email as lock_email, lu.user_id AS lock_user
FROM status, locations, event_types, event_cats, cases,
    events LEFT OUTER JOIN lock_users lu ON events.event_id = lock_record
WHERE events.status <> 0
    AND (events.status = status.status AND status.relation = 'events')
    AND events.location_id = locations.location_id
    AND event_types.etype_id = events.etype_id
    AND event_cats.ecat_id = event_types.ecat_id
    AND events.case_id = cases.case_id;

The Query:
SELECT sv_events.*, FALSE AS fuzzy_team FROM sv_events WHERE  EXISTS ( SELECT
event_id FROM event_days
 WHERE event_days.event_id = sv_events.event_id AND  (event_day BETWEEN
('2003-04-08'::TIMESTAMP WITHOUT TIME ZONE)
  AND ('2003-06-17 23:59'::TIMESTAMP WITHOUT TIME ZONE) ) );


The Explain:
jwnet_test=> \i perform.sql
psql:perform.sql:9: NOTICE:  QUERY PLAN:

Limit  (cost=199572.58..199572.58 rows=10 width=368) (actual
time=3239.95..3239.96 rows=10 loops=1)
  ->  Sort  (cost=199572.58..199572.58 rows=33575 width=368) (actual
time=3239.92..3239.93 rows=41 loops=1)
        ->  Hash Join  (cost=6576.62..191013.53 rows=33575 width=368) (actual
time=513.49..3220.38 rows=1790 loops=1)
              ->  Hash Join  (cost=6574.72..189924.26 rows=14837 width=350)
(actual time=509.20..3063.85 rows=1790 loops=1)
                    ->  Hash Join  (cost=38.81..180804.32 rows=14837
width=304) (actual time=16.38..452.80 rows=1919 loops=1)
                          ->  Hash Join  (cost=33.92..180539.78 rows=14837
width=252) (actual time=15.68..428.38 rows=1919 loops=1)
                                ->  Hash Join  (cost=22.17..180231.28
rows=14837 width=155) (actual time=13.98..406.61 rows=1919 loops=1)
                                      ->  Seq Scan on events
(cost=0.00..179874.82 rows=14837 width=67) (actual time=0.27..382.47
rows=1919 loops=1)
                                            SubPlan
                                              ->  Index Scan using
event_days_pk on event_days  (cost=0.00..6.01 rows=1 width=4) (actual
time=0.01..0.01 rows=0 loops=29734)
                                      ->  Hash  (cost=21.99..21.99 rows=72
width=83) (actual time=13.66..13.66 rows=0 loops=1)
                                            ->  Subquery Scan lu
(cost=12.61..21.99 rows=72 width=83) (actual time=13.64..13.65 rows=1
loops=1)
                                                  ->  Hash Join
(cost=12.61..21.99 rows=72 width=83) (actual time=13.63..13.64 rows=1
loops=1)
                                                        ->  Seq Scan on
edit_locks  (cost=0.00..7.94 rows=72 width=26) (actual time=12.82..12.83
rows=1 loops=1)
                                                        ->  Hash
(cost=6.50..6.50 rows=150 width=57) (actual time=0.71..0.71 rows=0 loops=1)
                                                              ->  Seq Scan on
users  (cost=0.00..6.50 rows=150 width=57) (actual time=0.01..0.47 rows=150
loops=1)
                                ->  Hash  (cost=11.00..11.00 rows=300
width=97) (actual time=1.66..1.66 rows=0 loops=1)
                                      ->  Seq Scan on locations
(cost=0.00..11.00 rows=300 width=97) (actual time=0.01..1.11 rows=300
loops=1)
                          ->  Hash  (cost=4.75..4.75 rows=56 width=52) (actual
time=0.60..0.60 rows=0 loops=1)
                                ->  Hash Join  (cost=1.21..4.75 rows=56
width=52) (actual time=0.17..0.51 rows=56 loops=1)
                                      ->  Seq Scan on event_types
(cost=0.00..2.56 rows=56 width=31) (actual time=0.01..0.15 rows=56 loops=1)
                                      ->  Hash  (cost=1.17..1.17 rows=17
width=21) (actual time=0.07..0.07 rows=0 loops=1)
                                            ->  Seq Scan on event_cats
(cost=0.00..1.17 rows=17 width=21) (actual time=0.01..0.05 rows=17 loops=1)
                    ->  Hash  (cost=3800.07..3800.07 rows=112107 width=46)
(actual time=491.84..491.84 rows=0 loops=1)
                          ->  Seq Scan on cases  (cost=0.00..3800.07
rows=112107 width=46) (actual time=0.01..277.20 rows=112107 loops=1)
              ->  Hash  (cost=1.88..1.88 rows=10 width=18) (actual
time=0.12..0.12 rows=0 loops=1)
                    ->  Seq Scan on status  (cost=0.00..1.88 rows=10 width=18)
(actual time=0.03..0.11 rows=10 loops=1)
Total runtime: 3241.09 msec

The Index Scan:
jwnet_test=> set enable_seqscan = false;
SET VARIABLE
jwnet_test=> \i perform.sql
psql:perform.sql:9: NOTICE:  QUERY PLAN:

Limit  (cost=252608.52..252608.52 rows=10 width=368) (actual
time=740.62..740.64 rows=10 loops=1)
  ->  Sort  (cost=252608.52..252608.52 rows=33469 width=368) (actual
time=740.60..740.61 rows=41 loops=1)
        ->  Hash Join  (cost=86.85..244083.21 rows=33469 width=368) (actual
time=20.93..720.70 rows=1790 loops=1)
              ->  Hash Join  (cost=80.75..242992.18 rows=14812 width=350)
(actual time=16.69..554.62 rows=1790 loops=1)
                    ->  Nested Loop  (cost=49.20..242664.38 rows=14812
width=253) (actual time=14.56..519.42 rows=1790 loops=1)
                          ->  Hash Join  (cost=49.20..158631.12 rows=14812
width=207) (actual time=14.40..459.91 rows=1919 loops=1)
                                ->  Hash Join  (cost=32.78..158355.48
rows=14812 width=155) (actual time=13.59..442.08 rows=1919 loops=1)
                                      ->  Index Scan using idx_events_status
on events  (cost=0.00..157988.97 rows=14812 width=67) (actual
time=0.08..416.67 rows=1919 loops=1)
                                            SubPlan
                                              ->  Index Scan using
event_days_pk on event_days  (cost=0.00..5.26 rows=1 width=4) (actual
time=0.01..0.01 rows=0 loops=29734)
                                      ->  Hash  (cost=32.60..32.60 rows=72
width=83) (actual time=13.47..13.47 rows=0 loops=1)
                                            ->  Subquery Scan lu
(cost=0.00..32.60 rows=72 width=83) (actual time=1.60..13.46 rows=1 loops=1)
                                                  ->  Merge Join
(cost=0.00..32.60 rows=72 width=83) (actual time=1.59..13.45 rows=1 loops=1)
                                                        ->  Index Scan using
users_pkey on users  (cost=0.00..19.63 rows=150 width=57) (actual
time=0.09..0.12 rows=3 loops=1)
                                                        ->  Index Scan using
edit_locks_user_id on edit_locks  (cost=0.00..11.51 rows=72 width=26) (actual
time=1.43..13.28 rows=1 loops=1)
                                ->  Hash  (cost=16.28..16.28 rows=56 width=52)
(actual time=0.77..0.77 rows=0 loops=1)
                                      ->  Hash Join  (cost=5.67..16.28 rows=56
width=52) (actual time=0.29..0.68 rows=56 loops=1)
                                            ->  Index Scan using
event_types_pkey on event_types  (cost=0.00..9.63 rows=56 width=31) (actual
time=0.08..0.28 rows=56 loops=1)
                                            ->  Hash  (cost=5.63..5.63 rows=17
width=21) (actual time=0.15..0.15 rows=0 loops=1)
                                                  ->  Index Scan using
event_cats_pkey on event_cats  (cost=0.00..5.63 rows=17 width=21) (actual
time=0.08..0.13 rows=17 loops=1)
                          ->  Index Scan using cases_pkey on cases
(cost=0.00..5.66 rows=1 width=46) (actual time=0.02..0.02 rows=1 loops=1919)
                    ->  Hash  (cost=30.80..30.80 rows=300 width=97) (actual
time=2.07..2.07 rows=0 loops=1)
                          ->  Index Scan using locations_pkey on locations
(cost=0.00..30.80 rows=300 width=97) (actual time=0.09..1.61 rows=300
loops=1)
              ->  Hash  (cost=6.07..6.07 rows=10 width=18) (actual
time=0.08..0.08 rows=0 loops=1)
                    ->  Index Scan using status_relation on status
(cost=0.00..6.07 rows=10 width=18) (actual time=0.03..0.06 rows=10 loops=1)
Total runtime: 741.72 msec



--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


pgsql-performance by date:

Previous
From: Rod Taylor
Date:
Subject: Re: NOT IN doesn't use index? (fwd)
Next
From: Christopher Kings-Lynne
Date:
Subject: Re: NOT IN doesn't use index? (fwd)