We recently upgraded from PostgreSQL 7.4 to 8.1, and a few queries now
have performance problems and run for very long times. The common
factor seems to be that 8.1 chooses a nested loop join because it
estimates there will be only a single row. In reality there are
thousands of rows, and the nested loop version takes much longer.
Even with the bad plan, the test query below runs quickly. The real
query is much more complicated, and we have had to kill it after it had
been running for 24 hours.
SELECT
    MAX(titles.name) AS title_name,
    MAX(providers.short_name) AS provider_short_name,
    SUM(x.xtns) AS xtns,
    SUM(x.rev) AS rev
FROM xtns_by_mso_title_wk x
INNER JOIN providers providers ON x.provider_no = providers.provider_no
INNER JOIN titles titles ON x.title_no = titles.title_no
WHERE x.mso_no = 50
  AND x.week BETWEEN '20060423 00:00:00' AND '20060423 00:00:00'
GROUP BY x.title_no, x.provider_no
The EXPLAIN ANALYZE looks like:
GroupAggregate  (cost=11.63..11.67 rows=1 width=61) (actual time=1440.550..1467.602 rows=3459 loops=1)
  ->  Sort  (cost=11.63..11.64 rows=1 width=61) (actual time=1440.515..1446.634 rows=3934 loops=1)
        Sort Key: x.title_no, x.provider_no
        ->  Nested Loop  (cost=0.00..11.62 rows=1 width=61) (actual time=7.900..1422.686 rows=3934 loops=1)
              ->  Nested Loop  (cost=0.00..7.38 rows=1 width=49) (actual time=7.877..1373.392 rows=3934 loops=1)
                    ->  Index Scan using unq_xtns_by_mso_title_wk on xtns_by_mso_title_wk x  (cost=0.00..4.12 rows=1 width=26) (actual time=7.827..1297.681 rows=3934 loops=1)
                          Index Cond: ((week >= '2006-04-23 00:00:00'::timestamp without time zone) AND (week <= '2006-04-23 00:00:00'::timestamp without time zone) AND (mso_no = 50))
                    ->  Index Scan using pk_titles on titles  (cost=0.00..3.25 rows=1 width=27) (actual time=0.010..0.012 rows=1 loops=3934)
                          Index Cond: ("outer".title_no = titles.title_no)
              ->  Index Scan using pk_providers on providers  (cost=0.00..4.23 rows=1 width=16) (actual time=0.004..0.005 rows=1 loops=3934)
                    Index Cond: ("outer".provider_no = providers.provider_no)
If the query searches over multiple weeks (week BETWEEN '20060417
00:00:00' AND '20060423 00:00:00'), the row estimate is much better and
the planner uses a hash join:
GroupAggregate  (cost=7848.20..7878.48 rows=156 width=61) (actual time=117.761..145.910 rows=3459 loops=1)
  ->  Sort  (cost=7848.20..7852.08 rows=1552 width=61) (actual time=117.735..123.823 rows=3934 loops=1)
        Sort Key: x.title_no, x.provider_no
        ->  Hash Join  (cost=5.95..7765.94 rows=1552 width=61) (actual time=6.539..102.825 rows=3934 loops=1)
              Hash Cond: ("outer".provider_no = "inner".provider_no)
              ->  Nested Loop  (cost=0.00..7736.71 rows=1552 width=49) (actual time=5.117..86.980 rows=3934 loops=1)
                    ->  Index Scan using idx_xtns_by_mso_ti_wk_wk_mso_t on xtns_by_mso_title_wk x  (cost=0.00..2677.04 rows=1552 width=26) (actual time=5.085..18.065 rows=3934 loops=1)
                          Index Cond: ((week >= '2006-04-17 00:00:00'::timestamp without time zone) AND (week <= '2006-04-23 00:00:00'::timestamp without time zone) AND (mso_no = 50))
                    ->  Index Scan using pk_titles on titles  (cost=0.00..3.25 rows=1 width=27) (actual time=0.006..0.010 rows=1 loops=3934)
                          Index Cond: ("outer".title_no = titles.title_no)
              ->  Hash  (cost=5.16..5.16 rows=316 width=16) (actual time=1.356..1.356 rows=325 loops=1)
                    ->  Seq Scan on providers  (cost=0.00..5.16 rows=316 width=16) (actual time=0.008..0.691 rows=325 loops=1)
If the week range is replaced by an equality test (week = '20060423
00:00:00'), it also uses a hash join. Unfortunately, the queries are
automatically generated, and changing them to use an equality test
could be problematic:
GroupAggregate  (cost=7828.75..7859.32 rows=157 width=61) (actual time=98.330..125.370 rows=3459 loops=1)
  ->  Sort  (cost=7828.75..7832.67 rows=1567 width=61) (actual time=98.303..104.055 rows=3934 loops=1)
        Sort Key: x.title_no, x.provider_no
        ->  Hash Join  (cost=5.95..7745.60 rows=1567 width=61) (actual time=1.785..83.830 rows=3934 loops=1)
              Hash Cond: ("outer".provider_no = "inner".provider_no)
              ->  Nested Loop  (cost=0.00..7716.14 rows=1567 width=49) (actual time=0.170..68.338 rows=3934 loops=1)
                    ->  Index Scan using idx_xtns_by_mso_ti_wk_wk_mso_t on xtns_by_mso_title_wk x  (cost=0.00..2607.56 rows=1567 width=26) (actual time=0.138..11.993 rows=3934 loops=1)
                          Index Cond: ((week = '2006-04-23 00:00:00'::timestamp without time zone) AND (mso_no = 50))
                    ->  Index Scan using pk_titles on titles  (cost=0.00..3.25 rows=1 width=27) (actual time=0.006..0.008 rows=1 loops=3934)
                          Index Cond: ("outer".title_no = titles.title_no)
              ->  Hash  (cost=5.16..5.16 rows=316 width=16) (actual time=1.565..1.565 rows=325 loops=1)
                    ->  Seq Scan on providers  (cost=0.00..5.16 rows=316 width=16) (actual time=0.008..0.677 rows=325 loops=1)
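
Since the generated SQL itself is hard to change, one possible
workaround (sketch only; we have not tested it) would be turning off
nested loops for just the session that runs these reports, so the
planner falls back to the hash join plan:

-- Sketch: per-session planner override, assuming we can wrap the
-- generated query in SET/RESET statements.
SET enable_nestloop = off;
-- ... run the generated query here ...
RESET enable_nestloop;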
Does anyone have suggestions for things to try? The most worrying part
is that when the statistics are off, the planner can pick a
pathological plan.
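
For reference, the direction I assume is most directly aimed at the
statistics problem is raising the per-column statistics target on the
filtered columns and re-analyzing, so the estimates are less likely to
collapse to a single row (a sketch only; the value 1000 is an arbitrary
example):

-- Sketch: raise the statistics target for the WHERE-clause columns and
-- refresh the statistics. The value 1000 is an arbitrary example.
ALTER TABLE xtns_by_mso_title_wk ALTER COLUMN week SET STATISTICS 1000;
ALTER TABLE xtns_by_mso_title_wk ALTER COLUMN mso_no SET STATISTICS 1000;
ANALYZE xtns_by_mso_title_wk;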
- Ian