Re: Perplexing Query Performance - Mailing list pgsql-general
From | mdklatt@ou.edu (Michael Klatt) |
---|---|
Subject | Re: Perplexing Query Performance |
Date | |
Msg-id | 2cb75565.0204160757.64a18bcc@posting.google.com Whole thread Raw |
In response to | Re: Perplexing Query Performance (mdklatt@ou.edu (Michael Klatt)) |
List | pgsql-general |
mdklatt@ou.edu (Michael Klatt) wrote in message news:<2cb75565.0204151456.193bde9@posting.google.com>... > mdklatt@ou.edu (Michael Klatt) wrote in message news:<2cb75565.0204051614.4ad76291@posting.google.com>... > Consider this query: > > select * > from daily_rainfall as p, sites as s > where p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and > s.longitude >= -180 and s.terrain in ('A', 'I', 'L'); > > Almost any combination of 'A', 'I', and/or 'L' consistently executes > in the same time, but if it's ONLY 'A' PostgreSQL seems to go off into > Never Never Land. There is no problem if 'I' or 'L' is used alone, > just 'A'. > Here's some additional information to help diagnose the problem. This is a "normal" query that executes in a reasonable amount of time. SRDC=> explain select * from daily_rainfall as p, sites as s where p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and s.longitude >= -180 and s.longitude <= 180 and s.terrain in ('A', 'L', 'I'); NOTICE: QUERY PLAN: Nested Loop (cost=99752.84..183759.80 rows=752 width=100) -> Subquery Scan p (cost=99752.84..104595.25 rows=38739 width=44) -> Aggregate (cost=99752.84..104595.25 rows=38739 width=44) -> Group (cost=99752.84..101689.80 rows=387392 width=44) -> Sort (cost=99752.84..99752.84 rows=387392 width=44) -> Seq Scan on rainfall (cost=0.00..35636.20 rows=387392 width=44) -> Index Scan using sites_pkey on sites s (cost=0.00..2.03 rows=1 width=56) EXPLAIN This is the query that keeps going and going and going.... SRDC=> explain select * from daily_rainfall as p, sites as s where p.site = s.ident and s.latitude >= -90 and s.latitude <= 90 and s.longitude >= -180 and s.longitude <= 180 and s.terrain in ('A'); NOTICE: QUERY PLAN: Nested Loop (cost=99752.84..105094.23 rows=89 width=100) -> Seq Scan on sites s (cost=0.00..14.74 rows=1 width=56) -> Subquery Scan p (cost=99752.84..104595.25 rows=38739 width=44) -> Aggregate (cost=99752.84..104595.25 rows=38739 width=44) -> Group (cost=99752.84..101689.80 rows=387392 width=44) -> Sort (cost=99752.84..99752.84 rows=387392 width=44) -> Seq Scan on rainfall (cost=0.00..35636.20 rows=387392 width=44) EXPLAIN One difference I see is that the first query does an index scan on table 'sites', whereas the second query does not. The only index on 'sites' is the primary key index with on attribute 'ident'. Would performance improve if I added an index on 'terrain'? Attributes 'latitude' and 'longitude' are always used as selection criteria so should I add indexes to them as well?
pgsql-general by date: