Thread: Why is explain horribly optimistic for sorts?
Hello all. We are logging our web server traffic to postgresql 7.0.3, and
that's working well. What's not working so well is retrieving our data in
reasonable times if I try to order it. When I run our queries through
explain, it *looks* like they will run in reasonable times, but in fact
they take several minutes. That's not so good. I'm wondering why explain
is so horribly wrong when it comes to sorts? For that matter, I'm
wondering why sorts take so incredibly long.

Some background.....

- We only have ~120,000 records.
- The relevant parts of the table are:

          Table "jennyann"
    Attribute  |    Type     | Modifier
  -------------+-------------+----------
   ClientHost  | text        |
   LogTime     | timestamp   |
   target      | text        |
   host        | text        |
  Indices: jennyan_host_key,
           jennyann_clienthost_key,
           jennyann_logtime_key,
           jennyann_target_key

- All indices are normal btrees.
- ClientHost is (for the most part) an IP address.

Here's what explain tells me:

    explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
    NOTICE:  QUERY PLAN:

    Sort  (cost=119.88..119.88 rows=2085 width=136)
      ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)

A cost of 119 seems pretty good, and usually takes just a couple seconds
for other queries I've made. Unfortunately, it's completely wrong. This
query takes several minutes to complete. If I drop the "order by" clause
then things run at reasonable speeds, but I rather need that clause there.

Help? Please?
On Sat, Mar 03, 2001 at 09:44:27AM -0800, Ben wrote:
> Hello all. We are logging our web server traffic to postgresql 7.0.3,
> and that's working well. What's not working so well is retrieving our
> data in reasonable times if I try to order it. When I run our queries
> through explain, it *looks* like they will run in reasonable times,
> but in fact they take several minutes. That's not so good. I'm
> wondering why explain is so horribly wrong when it comes to sorts?
> For that matter, I'm wondering why sorts take so incredibly long.

[snip]

> explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
> NOTICE:  QUERY PLAN:
>
> Sort  (cost=119.88..119.88 rows=2085 width=136)
>   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)
>
> A cost of 119 seems pretty good, and usually takes just a couple
> seconds for other queries I've made. Unfortunately, it's completely
> wrong. This query takes several minutes to complete. If I drop the
> "order by" clause then things run at reasonable speeds, but I rather
> need that clause there.

<guessing>
would this work?

    create temp table QRY as select * from jennyann where ...;
    create index ORD on qry("LogTime");
    select * from QRY order by "LogTime" limit 1000;

if the select is fast but the order is slow, maybe this'd help?
</guessing>

--
It is always hazardous to ask "Why?" in science, but it is often
interesting to do so just the same.
        -- Isaac Asimov, 'The Genetic Code'

will@serensoft.com
http://groups.yahoo.com/group/newbieDoc -- we need your brain!
http://www.dontUthink.com/ -- your brain needs us!
You VACUUM and VACUUM ANALYZE regularly, right?

-Mitch

----- Original Message -----
From: "Ben" <bench@silentmedia.com>
To: <pgsql-general@postgresql.org>
Sent: Saturday, March 03, 2001 12:44 PM
Subject: Why is explain horribly optimistic for sorts?

> Hello all. We are logging our web server traffic to postgresql 7.0.3,
> and that's working well. What's not working so well is retrieving our
> data in reasonable times if I try to order it. When I run our queries
> through explain, it *looks* like they will run in reasonable times,
> but in fact they take several minutes.

[snip]
Ben <bench@silentmedia.com> writes:
> Here's what explain tells me:

> explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
> NOTICE:  QUERY PLAN:

> Sort  (cost=119.88..119.88 rows=2085 width=136)
>   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)

How many records are actually getting sorted here --- ie, how many would
you get back if you didn't have the LIMIT?  It seems to be estimating
2085, but I suspect that must be wildly wrong ...

			regards, tom lane
nightly.

On Sat, 3 Mar 2001, Mitch Vincent wrote:

> You VACUUM and VACUUM ANALYZE regularly, right?
>
> -Mitch

[snip]
Yes, it is horribly wrong.

    select count(*) FROM jennyann where target like '/music/%'

gives me 93686 rows.

On Sat, 3 Mar 2001, Tom Lane wrote:

> How many records are actually getting sorted here --- ie, how many would
> you get back if you didn't have the LIMIT?  It seems to be estimating
> 2085, but I suspect that must be wildly wrong ...
>
> 			regards, tom lane
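[To spell out the arithmetic behind that answer: the planner estimated
2085 rows, but the LIKE actually matches about 78% of the ~120,000-row
table, an underestimate of roughly 45x. A quick check against the
jennyann table from this thread:]

    SELECT count(*) FROM jennyann;
    -- ~120000 rows in total
    SELECT count(*) FROM jennyann WHERE target LIKE '/music/%';
    -- 93686 rows match, i.e. about 78% of the table,
    -- versus the planner's estimate of 2085 (under 2%)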
While this does get me results faster, it still takes a couple minutes,
and I don't see why this should be the case.

On Sat, 3 Mar 2001, will trillich wrote:

> <guessing>
> would this work?
>
>     create temp table QRY as select * from jennyann where ...;
>     create index ORD on qry("LogTime");
>     select * from QRY order by "LogTime" limit 1000;
>
> if the select is fast but the order is slow, maybe this'd help?
> </guessing>

[snip]
After enabling the new likeplanning code, things only get worse... the
estimated cost for the same query is now:

    Sort  (cost=4.95..4.95 rows=1 width=136)
      ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=1 width=136)

So this doesn't really seem to help much. :)

Out of curiosity, why does it take so long to order data by a datetime
field?

On Sat, 3 Mar 2001, Tom Lane wrote:

> Well, that misestimation is the source of the problem, then, not any
> misestimation of the cost of a sort.
>
> 7.0 didn't have very good estimation rules for LIKE clauses, at least
> not by default.  Have you tried the new LIKE estimator (see
> contrib/likeplanning/README in the source tree)?  I'm not sure it will
> do any better, given that your data appears to be mightily nonuniform
> ;-), but it's worth a try.
>
> 			regards, tom lane
Ben <bench@silentmedia.com> writes:
> Yes, it is horribly wrong.
> select count(*) FROM jennyann where target like '/music/%'
> gives me 93686 rows.

Well, that misestimation is the source of the problem, then, not any
misestimation of the cost of a sort.

7.0 didn't have very good estimation rules for LIKE clauses, at least
not by default.  Have you tried the new LIKE estimator (see
contrib/likeplanning/README in the source tree)?  I'm not sure it will
do any better, given that your data appears to be mightily nonuniform
;-), but it's worth a try.

			regards, tom lane
Ben <bench@silentmedia.com> writes:
> After enabling the new likeplanning code, things only get worse... the
> estimated cost for the same query is now:

> Sort  (cost=4.95..4.95 rows=1 width=136)
>   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=1 width=136)

Yeah, I was afraid of that: the new code is smarter but also subject to
the same range-estimation problem that plagues us in cases like "WHERE a
BETWEEN b AND c".  The only stats we have for range estimation are the
column minimum and maximum values collected by VACUUM ANALYZE, so we
just do a linear interpolation.  If your data isn't fairly uniformly
distributed between the min and max then range estimates will be all
wet.  In this example, it seems that 80% of your target entries start
with "/music/", but I'll bet the min and max cover a much wider range.
So the estimator mistakenly thinks that LIKE '/music/%' will select a
fairly small proportion of rows, rather than 80%, and this leads the
planner to choose a plan that would be appropriate if that were true.

There has been talk of keeping more extensive stats to allow better
estimation in cases like this, but I don't foresee it happening for a
release or two yet...

> Out of curiosity, why does it take so long to order data by a datetime
> field?

AFAIK it shouldn't be materially different from sorting on a float8
field.  You're still blaming the wrong thing: this isn't the sorter's
fault, it's doing the best it can.  The reason this is so slow, I
believe, is that an indexscan is being used to select 80% of the data.
That results in a lot of useless thrashing to visit tuples in strict
order by target.  A simple sequential scan would've been much faster.

It's possible that an indexscan on logtime would work better, because it
could stop as soon as it had retrieved 1000 matching tuples.  This would
be particularly likely if the tuples are more or less in logtime order,
so that there's not a lot of disk thrashing to retrieve them.  Otherwise
a plain old sequential scan and sort is likely to be the fastest way.

Unfortunately the planner is never going to figure this out as long as
it's so far off about the selectivity of the LIKE.  You could try
forcing its hand with various combinations of

	SET enable_sort TO off;
	SET enable_indexscan TO off;

... just remember to turn these things back on for subsequent queries,
else you'll get really bad plans ...

			regards, tom lane
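[To make that concrete, a session testing the "plain old sequential
scan and sort" plan might look like this. This is a sketch using the
session switches Tom names above; the plan comment shows the expected
shape, not verbatim output, and the row estimate will still be wrong --
only the plan shape changes:]

    -- Take the index scan on target out of consideration, so the
    -- planner falls back to a sequential scan feeding the sort:
    SET enable_indexscan TO off;
    EXPLAIN SELECT * FROM jennyann
        WHERE target LIKE '/music/%' ORDER BY "LogTime" LIMIT 1000;
    -- expected plan shape:
    --   Sort
    --     ->  Seq Scan on jennyann
    SET enable_indexscan TO on;   -- restore for subsequent queries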
will trillich <will@serensoft.com> writes:
> create temp table QRY as select * from jennyann where ...;

I'm still learning SQL so I looked up the 'temp' keyword in the 7.0
docs.  They describe 'TEMP' in the 'CREATE TABLE' command but not in
the 'CREATE TABLE AS' command.

Well, I just tried it and it worked so I guess it's just an oversight
in the docs.

Regards,
--
matthew rice <matt@starnix.com>                          starnix inc.
tollfree: 1-87-pro-linux                   thornhill, ontario, canada
http://www.starnix.com             professional linux services & products
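[For reference, a minimal example of the syntax in question; the temp
table name is illustrative, and jennyann is the table from this thread:]

    -- TEMP is accepted by CREATE TABLE ... AS even though the 7.0
    -- docs only mention it under CREATE TABLE:
    CREATE TEMP TABLE music_hits AS
        SELECT * FROM jennyann WHERE target LIKE '/music/%';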
Matthew Rice <matt@starnix.com> writes:
> I'm still learning SQL so I looked up the 'temp' keyword in the 7.0
> docs.  They describe 'TEMP' in the 'CREATE TABLE' command but not in
> the 'CREATE TABLE AS' command.
> Well, I just tried it and it worked so I guess it's just an oversight
> in the docs.

Good point.  Fixed, thanks.

			regards, tom lane
On Sat, 3 Mar 2001, Tom Lane wrote:

> Yeah, I was afraid of that: the new code is smarter but also subject
> to the same range-estimation problem that plagues us in cases like
> "WHERE a BETWEEN b AND c".  The only stats we have for range
> estimation are the column minimum and maximum values collected by
> VACUUM ANALYZE, so we just do a linear interpolation.  If your data
> isn't fairly uniformly distributed between the min and max then range
> estimates will be all wet.  In this example, it seems that 80% of
> your target entries start with "/music/", but I'll bet the min and
> max cover a much wider range.  So the estimator mistakenly thinks
> that LIKE '/music/%' will select a fairly small proportion of rows,
> rather than 80%, and this leads the planner to choose a plan that
> would be appropriate if that were true.

Ah, I think I understand the problem here.

> > Out of curiosity, why does it take so long to order data by a
> > datetime field?
>
> AFAIK it shouldn't be materially different from sorting on a float8
> field.  You're still blaming the wrong thing: this isn't the sorter's
> fault, it's doing the best it can.  The reason this is so slow, I
> believe, is that an indexscan is being used to select 80% of the
> data.  That results in a lot of useless thrashing to visit tuples in
> strict order by target.  A simple sequential scan would've been much
> faster.

I think you're correct, because this did speed things up quite a
bit.... the query time dropped to about a minute.

> It's possible that an indexscan on logtime would work better, because
> it could stop as soon as it had retrieved 1000 matching tuples.  This
> would be particularly likely if the tuples are more or less in
> logtime order, so that there's not a lot of disk thrashing to
> retrieve them.  Otherwise a plain old sequential scan and sort is
> likely to be the fastest way.

And *this* sped things up even more.... not using the target key and
using the LogTime key instead dropped the time to about 30 seconds.

> Unfortunately the planner is never going to figure this out as long
> as it's so far off about the selectivity of the LIKE.  You could try
> forcing its hand with various combinations of
>
> 	SET enable_sort TO off;
> 	SET enable_indexscan TO off;
>
> ... just remember to turn these things back on for subsequent
> queries, else you'll get really bad plans ...

Having no notion of what's involved, it seems like it would be nice to
let the query optimizer select which index to use based on the query
and the indices available. Or, if it already does this, to figure out
exactly why it's not doing the right thing here.

Thanks for the help!
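[For completeness, a guess at how the 30-second plan Ben describes (an
index scan on "LogTime" that stops after 1000 rows) could be coaxed out
of the planner. This is an untested sketch: enable_seqscan is not
mentioned in the thread and is assumed here to exist alongside the
switches Tom named:]

    -- Penalize both the explicit sort and the seqscan, so the
    -- planner's remaining choice is the index scan on "LogTime",
    -- which delivers rows already in ORDER BY order and can stop
    -- as soon as the LIMIT is satisfied:
    SET enable_sort TO off;
    SET enable_seqscan TO off;
    SELECT * FROM jennyann
        WHERE target LIKE '/music/%' ORDER BY "LogTime" LIMIT 1000;
    SET enable_sort TO on;        -- restore defaults afterwards
    SET enable_seqscan TO on;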