Thread: ORDER BY Optimization

ORDER BY Optimization

From
Derek Buttineau|Compu-SOLVE
Date:
Good Day,

I'm hoping this is the right place to send this.  I have a query that's 
causing me some grief trying to optimize it.  The query cost is fine 
until I order the data set.  Mind you, after it's been run and cached, 
subsequent calls to it are near-instant.  The query in question is:

select mr.*, m.*
  from maillog_received mr
  JOIN maillog m ON mr.maillog_id = m.id
 WHERE mr.subscription = 89 and m.spam = 1
 ORDER BY m.msg_date desc
 limit 10

The structure of the tables involved in the query is as follows:
                                    Table "public.maillog_received"   Column    |          Type          |
             
 
Modifiers
--------------+------------------------+------------------------------------------------------------------id
|integer                | not null default 
 
nextval('public.maillog_received_id_seq'::text)maillog_id   | bigint                 | not nullsubscription | integer
            | not nulllocal_part   | character varying(255) | not nulldomain       | character varying(255) | not null
 
Indexes:   "maillog_received_pkey" PRIMARY KEY, btree (id)   "maillog_received_subscription_idx" btree (subscription)
"maillog_received_subscription_maillog_id_idx"btree (subscription, 
 
maillog_id)
Foreign-key constraints:   "$1" FOREIGN KEY (subscription) REFERENCES subscriptions(id) ON 
DELETE CASCADE   "$2" FOREIGN KEY (maillog_id) REFERENCES maillog(id) ON DELETE CASCADE
Triggers:   checkit BEFORE INSERT OR UPDATE ON maillog_received FOR EACH ROW 
EXECUTE PROCEDURE check_sub()
                                        Table "public.maillog"    Column      |            Type             |
            
 
Modifiers
-----------------+-----------------------------+---------------------------------------------------------id
| bigint                      | not null default 
 
nextval('public.maillog_id_seq'::text)message_id      | character varying(16)       |msg_date        | timestamp
withouttime zone | not nullfrom_local_part | character varying(255)      | not nullfrom_domain     | character
varying(255)     | not nullfrom_ip         | character varying(128)      |from_host       | character varying(255)
|subject        | character varying(255)      |virus           | integer                     | not null default 0spam
        | integer                     | not null default 0list            | integer                     | not null
default0bad_recipient   | integer                     | not null default 0bad_sender      | integer
|not null default 0bad_relay       | integer                     | not null default 0bad_file        | integer
          | not null default 0bad_mime        | integer                     | not null default 0sascore         |
numeric(7,2)               |sareport        | text                        |vscanreport     | text
|contentreport   | text                        |bypassed        | integer                     | not null default
0delivered      | integer                     | not null default 1complete        | integer                     | not
nulldefault 1
 
Indexes:   "maillog_pkey" PRIMARY KEY, btree (id)   "maillog_msg_date_idx" btree (msg_date)

EXPLAIN ANALYZE gives me the following:
Limit  (cost=31402.85..31402.87 rows=10 width=306) (actual time=87454.203..87454.334 rows=10 loops=1)
  ->  Sort  (cost=31402.85..31405.06 rows=886 width=306) (actual time=87454.187..87454.240 rows=10 loops=1)
        Sort Key: m.msg_date
        ->  Nested Loop  (cost=0.00..31359.47 rows=886 width=306) (actual time=4.740..86430.468 rows=26308 loops=1)
              ->  Index Scan using maillog_received_subscription_idx on maillog_received mr  (cost=0.00..17789.73 rows=4479 width=43) (actual time=0.030..33554.061 rows=65508 loops=1)
                    Index Cond: (subscription = 89)
              ->  Index Scan using maillog_pkey on maillog m  (cost=0.00..3.02 rows=1 width=263) (actual time=0.776..0.780 rows=0 loops=65508)
                    Index Cond: ("outer".maillog_id = m.id)
                    Filter: (spam = 1)
Total runtime: 87478.068 ms

Now, there is a lot of data in these tables, at least a few million 
records, but I'd hoped for a bit better performance. :)  Another odd 
thing I'll mention: if I take the database schema to a second server 
(both running PostgreSQL 8.0.2 on FreeBSD 5.3), I get a much different 
(and, to my eye, much more efficient) query plan, though with 
substantially less data (about 500,000 records in the table):
Limit  (cost=0.00..6482.60 rows=10 width=311) (actual time=25.340..26.885 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..1175943.99 rows=1814 width=311) (actual time=25.337..26.867 rows=10 loops=1)
        ->  Index Scan Backward using maillog_msg_date_idx on maillog m  (cost=0.00..869203.93 rows=51395 width=270) (actual time=25.156..26.050 rows=48 loops=1)
              Filter: (spam = 1)
        ->  Index Scan using maillog_received_subscription_maillog_id_idx on maillog_received mr  (cost=0.00..5.96 rows=1 width=41) (actual time=0.011..0.012 rows=0 loops=48)
              Index Cond: ((mr.subscription = 89) AND (mr.maillog_id = "outer".id))
Total runtime: 27.016 ms

Any suggestions?

-- 
Regards,

Derek Buttineau
Internet Systems Developer
Compu-SOLVE Internet Services
Compu-SOLVE Technologies Inc.

705.725.1212 x255



Re: ORDER BY Optimization

From
Rosser Schwarz
Date:
while you weren't looking, Derek Buttineau|Compu-SOLVE wrote:

> I'm hoping this is the right place to send this.

The PostgreSQL Performance list, pgsql-performance@postgresql.org
would be more appropriate. I'm copying my followup there, as well.

As for your query, almost all the time is actually spent in the
nestloop, not the sort.  Compare:

>   ->  Sort  (cost=31402.85..31405.06 rows=886 width=306) (actual
> time=87454.187..87454.240 rows=10 loops=1)

vs.

>          ->  Nested Loop  (cost=0.00..31359.47 rows=886 width=306)
> (actual time=4.740..86430.468 rows=26308 loops=1)

That's roughly a second in the sort itself versus 80-odd seconds in the
nestloop beneath it.

It seems to me a merge join might be more appropriate here than a
nestloop. What's your work_mem set at?  Off-the-cuff numbers show the
dataset weighing in the sub-ten mbyte range.

Provided it's not already at least that big, and you don't want to up
it permanently, try saying:

SET work_mem = 10240; -- 10 mbytes

immediately before running this query (uncached, of course) and see
what happens.
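
If the plan doesn't budge, another quick diagnostic (blunt, and for
testing only, not for production) is to disable nested loops for the
session and see whether the merge-join alternative the planner is
avoiding is actually faster:

SET enable_nestloop = off;  -- diagnostic only: steer the planner away from nestloops
EXPLAIN ANALYZE
select mr.*, m.*
  from maillog_received mr
  JOIN maillog m ON mr.maillog_id = m.id
 WHERE mr.subscription = 89 and m.spam = 1
 ORDER BY m.msg_date desc limit 10;
SET enable_nestloop = on;   -- restore the default afterward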

Also, your row-count estimates look pretty off-base.  When were these
tables last VACUUMed or ANALYZEd?
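If it's been a while, something like:

VACUUM ANALYZE maillog_received;
VACUUM ANALYZE maillog;

immediately beforehand should at least rule stale statistics out.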

/rls

--
:wq

Re: ORDER BY Optimization

From
Derek Buttineau|Compu-SOLVE
Date:
Thanks for the response :)

>That's 50-ish ms versus 80-odd seconds.
>
>It seems to me a merge join might be more appropriate here than a
>nestloop. What's your work_mem set at?  Off-the-cuff numbers show the
>dataset weighing in the sub-ten mbyte range.
>
>Provided it's not already at least that big, and you don't want to up
>it permanently, try saying:
>
>SET work_mem = 10240; -- 10 mbytes
>
>
It's currently set at 16MB. I've also tried upping sort_mem, without
any noticeable impact on the uncached query. :(

>immediately before running this query (uncached, of course) and see
>what happens.
>
>Also, your row-count estimates look pretty off-base.  When were these
>tables last VACUUMed or ANALYZEd?
>
>
I'm not entirely sure what's up with the row-count estimates; the tables
are updated quite frequently (and VACUUM is also run quite frequently).
However, I had just run a VACUUM ANALYZE on both databases before running
the EXPLAIN.
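
If the estimates are just too coarse, I suppose the next thing to try is
raising the statistics target on the skewed columns and re-analyzing,
something along the lines of (the target of 200 is just a guess on my
part, untested so far):

ALTER TABLE maillog_received ALTER COLUMN subscription SET STATISTICS 200;
ANALYZE maillog_received;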

I'm also still baffled by the difference in plans between the two
servers. On the one that uses the index for the sort, I get, for
comparison, a nested loop of:

Nested Loop  (cost=0.00..1175943.99 rows=1814 width=311) (actual
time=25.337..26.867 rows=10 loops=1)

The plan the "live" server is using looks fairly inefficient by comparison.
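
If nothing else turns up, I may try giving the live server an index it
can walk in msg_date order while filtering on spam, e.g. a partial index
along these lines (the index name is mine, and I haven't tried this yet):

CREATE INDEX maillog_spam_msg_date_idx ON maillog (msg_date) WHERE spam = 1;

in the hope that it picks the backward index scan the second server uses.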

Derek