enable_sort optimization problem - Mailing list pgsql-general

From Dave E Martin
Subject enable_sort optimization problem
Date
Msg-id 42956DB0.2020605@dave.to
Whole thread Raw
Responses Re: enable_sort optimization problem
Re: enable_sort optimization problem
List pgsql-general
(8.0.1 on debian/linux 2.6.11 kernel)

I have noticed that if I set enable_sort=false in the .conf file, my
queries are running faster. I had a query which if I did a limit 20, ran
in 6 milliseconds, but if I changed it to limit 21, it took around 19
seconds (or 19000 milliseconds). It also took longer if I did limit 19
offset 2. (I don't know what it is about the 21st record). In any case,
I noticed that in the analysis, the long version was doing a sort and
the quick version was not, so I tried the enable_sort=false setting, and
now things are generally running faster.

I HAVE done analyzes, and vacuums, and vacuum analyzes.  In prior
experimenting with this, there were even some seq_scans, which turned
into index_scans when I set enable_seqscan=false, and became moderately
faster.

I am using 8.0.1, and below are the two query plans, first the
enable_sort=true version, then the enable_sort=false version, note the
VAST difference in speed. What is the problem, and how can I convince
the query optimizer to do the right thing (short of enable_sort=false)?

from the config file:
# - Planner Cost Constants -

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)


With enable_sort=true: (the default)
opennms=# explain analyze SELECT OUTAGES.*, NODE.NODELABEL,
IPINTERFACE.IPHOSTNAME, SERVICE.SERVICENAME, NOTIFICATIONS.NOTIFYID,
NOTIFICATIONS.ANSWEREDBY FROM OUTAGES JOIN NODE USING(NODEID) JOIN
IPINTERFACE ON OUTAGES.NODEID=IPINTERFACE.NODEID AND
OUTAGES.IPADDR=IPINTERFACE.IPADDR  LEFT OUTER JOIN SERVICE ON
OUTAGES.SERVICEID=SERVICE.SERVICEID LEFT OUTER JOIN NOTIFICATIONS ON
SVCLOSTEVENTID=NOTIFICATIONS.EVENTID ORDER BY OUTAGEID DESC LIMIT 29
OFFSET 2;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=127422.75..127422.82 rows=29 width=126) (actual
time=40858.260..40858.608 rows=29 loops=1)
   ->  Sort  (cost=127422.74..127423.14 rows=161 width=126) (actual
time=40858.230..40858.367 rows=31 loops=1)
         Sort Key: outages.outageid
         ->  Nested Loop Left Join  (cost=1047.62..127416.84 rows=161
width=126) (actual time=2448.671..38122.920 rows=134579 loops=1)
               ->  Hash Left Join  (cost=1047.62..126533.18 rows=161
width=102) (actual time=2448.616..23963.962 rows=132250 loops=1)
                     Hash Cond: ("outer".serviceid = "inner".serviceid)
                     ->  Hash Join  (cost=1044.17..126527.32 rows=161
width=92) (actual time=1996.055..21178.268 rows=132250 loops=1)
                           Hash Cond: (("outer".nodeid = "inner".nodeid)
AND (("outer".ipaddr)::text = ("inner".ipaddr)::text))
                           ->  Index Scan using outages_serviceid_idx on
outages  (cost=0.00..123497.86 rows=132245 width=52) (actual
time=81.413..15330.873 rows=132250 loops=1)
                           ->  Hash  (cost=1037.31..1037.31 rows=1372
width=64) (actual time=1914.558..1914.558 rows=0 loops=1)
                                 ->  Merge Join  (cost=0.00..1037.31
rows=1372 width=64) (actual time=150.436..1906.441 rows=1372 loops=1)
                                       Merge Cond: ("outer".nodeid =
"inner".nodeid)
                                       ->  Index Scan using pk_nodeid on
node  (cost=0.00..229.55 rows=1043 width=24) (actual time=0.022..623.197
rows=1043 loops=1)
                                       ->  Index Scan using
ipinterface_nodeid_idx on ipinterface  (cost=0.00..788.01 rows=1372
width=40) (actual time=150.369..1259.366 rows=1372 loops=1)
                     ->  Hash  (cost=3.36..3.36 rows=33 width=14)
(actual time=452.519..452.519 rows=0 loops=1)
                           ->  Index Scan using pk_serviceid on service
(cost=0.00..3.36 rows=33 width=14) (actual time=452.193..452.369 rows=33
loops=1)
               ->  Index Scan using notifications_eventid_idx on
notifications  (cost=0.00..5.46 rows=2 width=28) (actual
time=0.078..0.079 rows=0 loops=132250)
                     Index Cond: ("outer".svclosteventid =
notifications.eventid)
 Total runtime: 40868.130 ms
(19 rows)


And the enable_sort=false version:
opennms=# explain analyze SELECT OUTAGES.*, NODE.NODELABEL,
IPINTERFACE.IPHOSTNAME, SERVICE.SERVICENAME, NOTIFICATIONS.NOTIFYID,
NOTIFICATIONS.ANSWEREDBY FROM OUTAGES JOIN NODE USING(NODEID) JOIN
IPINTERFACE ON OUTAGES.NODEID=IPINTERFACE.NODEID AND
OUTAGES.IPADDR=IPINTERFACE.IPADDR  LEFT OUTER JOIN SERVICE ON
OUTAGES.SERVICEID=SERVICE.SERVICEID LEFT OUTER JOIN NOTIFICATIONS ON
SVCLOSTEVENTID=NOTIFICATIONS.EVENTID ORDER BY OUTAGEID DESC LIMIT 29
OFFSET 2;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12600.25..195303.93 rows=29 width=126) (actual
time=211.185..300.336 rows=29 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..1014320.39 rows=161 width=126)
(actual time=147.328..300.072 rows=31 loops=1)
         ->  Nested Loop Left Join  (cost=0.00..1013436.73 rows=161
width=102) (actual time=147.264..298.671 rows=31 loops=1)
               ->  Nested Loop  (cost=0.00..1012949.54 rows=161
width=92) (actual time=137.286..287.138 rows=31 loops=1)
                     ->  Nested Loop  (cost=0.00..1012208.81 rows=159
width=76) (actual time=126.277..254.709 rows=31 loops=1)
                           ->  Index Scan Backward using pk_outageid on
outages  (cost=0.00..252480.62 rows=132245 width=52) (actual
time=77.021..77.484 rows=31 loops=1)
                           ->  Index Scan using
ipinterface_nodeid_ipaddr_ismanaged_idx on ipinterface  (cost=0.00..5.73
rows=1 width=40) (actual time=5.304..5.686 rows=1 loops=31)
                                 Index Cond: (("outer".nodeid =
ipinterface.nodeid) AND (("outer".ipaddr)::text =
(ipinterface.ipaddr)::text))
                     ->  Index Scan using pk_nodeid on node
(cost=0.00..4.65 rows=1 width=24) (actual time=1.010..1.015 rows=1 loops=31)
                           Index Cond: ("outer".nodeid = node.nodeid)
               ->  Index Scan using pk_serviceid on service
(cost=0.00..3.01 rows=1 width=14) (actual time=0.337..0.344 rows=1 loops=31)
                     Index Cond: ("outer".serviceid = service.serviceid)
         ->  Index Scan using notifications_eventid_idx on
notifications  (cost=0.00..5.46 rows=2 width=28) (actual
time=0.019..0.019 rows=0 loops=31)
               Index Cond: ("outer".svclosteventid = notifications.eventid)
 Total runtime: 300.856 ms
(15 rows)

I'm much happier with 300ms vs 40868ms. Am I doing any harm by leaving
enable_sort=false? (I assume it will still use sort if it has no
choice?) More importantly for the long term, why is the optimizer making
such a blunder? (I notice just now that if I remove the limit clause
completely, even the enable_sort=false version takes a very long time,
so it appears the optimizer isn't handling limit well).

p.s. My only relationship to opennms is as a user (one who is trying to
make some reports go faster).

With limit 50 offset 200 variation (with enable_sort=false): Total
runtime: 59.288 ms
With limit 50 offset 200 variation (with enable_sort=true): Total
runtime:  18764.598 ms
No limit or offset and enable_sort=true: Total runtime: 19510.191 ms
No limit or offset and enable_sort=false: Total runtime: 53530.251 ms

Hm, disabling sort seems to hinder non limited queries, so this seems to
be a catch-22.


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Modulus operator returns negative values / numeric division rounds up sometimes
Next
From: Richard Huxton
Date:
Subject: Re: enable_sort optimization problem