extending ORDER BY make query much slower - Mailing list pgsql-sql

From Dan Langille
Subject extending ORDER BY make query much slower
Date
Msg-id 3E6F45D8.15284.5994716@localhost
Whole thread Raw
Responses Re: extending ORDER BY make query much slower
Re: extending ORDER BY make query much slower
List pgsql-sql
NOTE: I've also put the analyse output at 
http://www.freshports.org/tmp/faster.txt in case it's easier to read 
than the format below.

I've found that adding another field to the ORDER BY clause, times go 
through the roof.

This query is pretty fast.

explain analyse SELECT element.id,        commit_log.id,        commit_log.message_id,        commit_log.committer,
  commit_log.commit_date,        commit_log.description,        commit_log.encoding_losses,
commit_log.message_subject,       element.name,        commit_log_elements.revision_name,
element_pathname(element.id)AS element_pathname   FROM commit_log, commit_log_elements, element  WHERE
commit_log_elements.commit_log_id= commit_log.id    AND commit_log_elements.element_id    = element.id
 
ORDER BY commit_log.commit_date DESC
LIMIT 100;

Limit  (cost=0.00..468.79 rows=100 width=261) (actual 
time=1.80..78.10 rows=100 loops=1)  ->  Nested Loop  (cost=0.00..1248045.68 rows=266228 width=261) 
(actual time=1.78..77.19 rows=101 loops=1)        ->  Nested Loop  (cost=0.00..307745.31 rows=266228 
width=244) (actual time=0.36..5.41 rows=101 loops=1)              ->  Index Scan Backward using commit_log_commit_date

on commit_log  (cost=0.00..34305.03 rows=70543 width=227) (actual 
time=0.30..1.17 rows=32 loops=1)              ->  Index Scan using commit_log_elements_clid on 
commit_log_elements  (cost=0.00..3.46 rows=33 width=17) (actual 
time=0.03..0.07 rows=3 loops=32)                    Index Cond: (commit_log_elements.commit_log_id = 
"outer".id)        ->  Index Scan using element_pkey on element  
(cost=0.00..3.52 rows=1 width=17) (actual time=0.08..0.09 rows=1 
loops=101)              Index Cond: ("outer".element_id = element.id)Total runtime: 79.52 msec

If I add another item to the ORDER BY, the times sky rocket.  I 
haven't let the query run to completion, but here's the analyse 
output:

explain analyse SELECT element.id,        commit_log.id AS my_commit_log_id,        commit_log.message_id,
to_char(commit_log.commit_date- SystemTimeAdjust(), 'DD Mon 
 
YYYY')  AS commit_date,        to_char(commit_log.commit_date - SystemTimeAdjust(), 
'HH24:MI')      AS commit_time,        commit_log.committer,        commit_log.description,
commit_log.encoding_losses,       commit_log.message_subject,        element.name,
commit_log_elements.revision_name,       element_pathname(element.id) AS element_pathname   FROM commit_log,
commit_log_elements,element  WHERE commit_log_elements.commit_log_id = commit_log.id    AND
commit_log_elements.element_id   = element.id
 
ORDER BY commit_log.commit_date DESC,         my_commit_log_id
LIMIT 100;


explain  SELECT element.id,        commit_log.id AS my_commit_log_id,        commit_log.message_id,
commit_log.committer,       commit_log.description,        commit_log.encoding_losses,
commit_log.message_subject,       element.name,        commit_log_elements.revision_name,
element_pathname(element.id)AS element_pathname   FROM commit_log, commit_log_elements, element  WHERE
commit_log_elements.commit_log_id= commit_log.id    AND commit_log_elements.element_id    = element.id
 
ORDER BY commit_log.commit_date DESC,         my_commit_log_id
LIMIT 100;

                                            QUERY PLANLimit  (cost=196855.28..196855.53 rows=100 width=261)  ->  Sort
(cost=196855.28..197520.85rows=266228 width=261)        Sort Key: commit_log.commit_date, commit_log.id        ->  Hash
Join (cost=9478.25..38465.07 rows=266228 
 
width=261)              Hash Cond: ("outer".commit_log_id = "inner".id)              ->  Hash Join
(cost=3503.91..19098.46rows=266228 
 
width=34)                    Hash Cond: ("outer".element_id = "inner".id)                    ->  Seq Scan on
commit_log_elements 
 
(cost=0.00..4703.28 rows=266228 width=17)                    ->  Hash  (cost=2463.73..2463.73 rows=132073 
width=17)                          ->  Seq Scan on element  
(cost=0.00..2463.73 rows=132073 width=17)              ->  Hash  (cost=3133.43..3133.43 rows=70543 width=227)
        ->  Seq Scan on commit_log  (cost=0.00..3133.43 
 
rows=70543 width=227)
(12 rows)

If I restructure the query, I can get times pretty close to the 
original query:

explain analyse SELECT * FROM ( SELECT element.id,        commit_log.id AS my_commit_log_id,
commit_log.message_id,       commit_log.commit_date,        commit_log.committer,        commit_log.description,
commit_log.encoding_losses,       commit_log.message_subject,        element.name,
commit_log_elements.revision_name,       element_pathname(element.id) AS element_pathname   FROM commit_log,
commit_log_elements,element  WHERE commit_log_elements.commit_log_id = commit_log.id    AND
commit_log_elements.element_id   = element.id
 
ORDER BY commit_log.commit_date DESC 
LIMIT 100) AS TEMP
order by commit_date, my_commit_log_id;

Sort  (cost=472.11..472.36 rows=100 width=261) (actual 
time=80.01..80.40 rows=100 loops=1)  Sort Key: commit_date, my_commit_log_id  ->  Subquery Scan "temp"
(cost=0.00..468.79rows=100 width=261) 
 
(actual time=1.66..78.64 rows=100 loops=1)        ->  Limit  (cost=0.00..468.79 rows=100 width=261) (actual 
time=1.63..76.75 rows=100 loops=1)              ->  Nested Loop  (cost=0.00..1248045.68 rows=266228 
width=261) (actual time=1.62..75.85 rows=101 loops=1)                    ->  Nested Loop  (cost=0.00..307745.31 
rows=266228 width=244) (actual time=0.34..5.38 rows=101 loops=1)                          ->  Index Scan Backward using

commit_log_commit_date on commit_log  (cost=0.00..34305.03 rows=70543 
width=227) (actual time=0.27..1.17 rows=32 loops=1)                          ->  Index Scan using 
commit_log_elements_clid on commit_log_elements  (cost=0.00..3.46 
rows=33 width=17) (actual time=0.03..0.07 rows=3 loops=32)                                Index Cond: 
(commit_log_elements.commit_log_id = "outer".id)                    ->  Index Scan using element_pkey on element  
(cost=0.00..3.52 rows=1 width=17) (actual time=0.08..0.09 rows=1 
loops=101)                          Index Cond: ("outer".element_id = 
element.id)Total runtime: 81.04 msec
(12 rows)

Any comments?
-- 
Dan Langille : http://www.langille.org/



pgsql-sql by date:

Previous
From: "Chad Thompson"
Date:
Subject: Re: filtering out doubles
Next
From: mila
Date:
Subject: Re: LEFT JOIN and missing values