Thread: Why is explain horribly optimistic for sorts?

Why is explain horribly optimistic for sorts?

From
Ben
Date:
Hello all. We are logging our web server traffic to PostgreSQL 7.0.3, and
that's working well. What's not working so well is retrieving our data in
reasonable times if I try to order it. When I run our queries through
explain, it *looks* like they will run in reasonable times, but in fact
they take several minutes. That's not so good. I'm wondering why explain
is so horribly wrong when it comes to sorts? For that matter, I'm
wondering why sorts take so incredibly long.

Some background.....

- We only have ~120,000 records.
- The relevant parts of the table are:

            Table "jennyann"
   Attribute    |    Type     | Modifier
----------------+-------------+----------
 ClientHost     | text        |
 LogTime        | timestamp   |
 target         | text        |
 host           | text        |
Indices: jennyan_host_key,
         jennyann_clienthost_key,
         jennyann_logtime_key,
         jennyann_target_key

- All indices are normal btrees.
- ClientHost is (for the most part) an IP address.

Here's what explain tells me:

explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
NOTICE:  QUERY PLAN:

Sort  (cost=119.88..119.88 rows=2085 width=136)
  ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)


A cost of 119 seems pretty good, and costs like that have usually meant
just a couple of seconds for other queries I've made. Unfortunately, it's
completely wrong. This query takes several minutes to complete. If I drop
the "order by" clause then things run at reasonable speeds, but I rather
need that clause there.

Help? Please?



Re: Why is explain horribly optimistic for sorts?

From
will trillich
Date:
On Sat, Mar 03, 2001 at 09:44:27AM -0800, Ben wrote:
> Hello all. We are logging our web server traffic to PostgreSQL 7.0.3, and
> that's working well. What's not working so well is retrieving our data in
> reasonable times if I try to order it. When I run our queries through
> explain, it *looks* like they will run in reasonable times, but in fact
> they take several minutes. That's not so good. I'm wondering why explain
> is so horribly wrong when it comes to sorts? For that matter, I'm
> wondering why sorts take so incredibly long.

[snip]

> explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
> NOTICE:  QUERY PLAN:
>
> Sort  (cost=119.88..119.88 rows=2085 width=136)
>   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)
>
> A cost of 119 seems pretty good, and costs like that have usually meant
> just a couple of seconds for other queries I've made. Unfortunately, it's
> completely wrong. This query takes several minutes to complete. If I drop
> the "order by" clause then things run at reasonable speeds, but I rather
> need that clause there.

<guessing>
would this work?

    create temp table QRY as select * from jennyann where ...;
    create index ORD on qry("LogTime");
    select * from QRY order by "LogTime" limit 1000;

if the select is fast but the order is slow, maybe this'd help?
</guessing>

--
It is always hazardous to ask "Why?" in science, but it is often
interesting to do so just the same.
        -- Isaac Asimov, 'The Genetic Code'

will@serensoft.com
http://groups.yahoo.com/group/newbieDoc -- we need your brain!
http://www.dontUthink.com/ -- your brain needs us!

Re: Why is explain horribly optimistic for sorts?

From
"Mitch Vincent"
Date:
You VACUUM and VACUUM ANALYZE regularly, right?
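
That is, something along these lines (an illustrative command, with the
table name taken from the thread):

    VACUUM ANALYZE jennyann;   -- refreshes the planner's row statistics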

-Mitch

----- Original Message -----
From: "Ben" <bench@silentmedia.com>
To: <pgsql-general@postgresql.org>
Sent: Saturday, March 03, 2001 12:44 PM
Subject: Why is explain horribly optimistic for sorts?


> [original message snipped -- see the start of the thread]


Re: Why is explain horribly optimistic for sorts?

From
Tom Lane
Date:
Ben <bench@silentmedia.com> writes:
> Here's what explain tells me:

> explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
> NOTICE:  QUERY PLAN:

> Sort  (cost=119.88..119.88 rows=2085 width=136)
>   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)

How many records are actually getting sorted here --- ie, how many would
you get back if you didn't have the LIMIT?  It seems to be estimating
2085, but I suspect that must be wildly wrong ...

            regards, tom lane

Re: Why is explain horribly optimistic for sorts?

From
Ben
Date:
Nightly.

On Sat, 3 Mar 2001, Mitch Vincent wrote:

> You VACUUM and VACUUM ANALYZE regularly, right?
>
> [snip]


Re: Why is explain horribly optimistic for sorts?

From
Ben
Date:
Yes, it is horribly wrong.

select count(*) FROM jennyann where target like '/music/%'

gives me 93686 rows.

On Sat, 3 Mar 2001, Tom Lane wrote:

> Ben <bench@silentmedia.com> writes:
> > Here's what explain tells me:
>
> > explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
> > NOTICE:  QUERY PLAN:
>
> > Sort  (cost=119.88..119.88 rows=2085 width=136)
> >   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=2085 width=136)
>
> How many records are actually getting sorted here --- ie, how many would
> you get back if you didn't have the LIMIT?  It seems to be estimating
> 2085, but I suspect that must be wildly wrong ...
>
>             regards, tom lane
>


Re: Why is explain horribly optimistic for sorts?

From
Ben
Date:
While this does get me results faster, it still takes a couple of minutes,
and I don't see why this should be the case.

On Sat, 3 Mar 2001, will trillich wrote:

> [snip]
>
> <guessing>
> would this work?
>
>     create temp table QRY as select * from jennyann where ...;
>     create index ORD on qry("LogTime");
>     select * from QRY order by "LogTime" limit 1000;
>
> if the select is fast but the order is slow, maybe this'd help?
> </guessing>


Re: Why is explain horribly optimistic for sorts?

From
Ben
Date:
After enabling the new likeplanning code, things only get worse... the
estimated cost for the same query is now:

Sort  (cost=4.95..4.95 rows=1 width=136)
  ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=1 width=136)

So this doesn't really seem to help much. :)

Out of curiosity, why does it take so long to order data by a datetime
field?

On Sat, 3 Mar 2001, Tom Lane wrote:

> Ben <bench@silentmedia.com> writes:
> > Yes, it is horribly wrong.
> > select count(*) FROM jennyann where target like '/music/%'
> > gives me 93686 rows.
>
> Well, that misestimation is the source of the problem, then, not any
> misestimation of the cost of a sort.
>
> 7.0 didn't have very good estimation rules for LIKE clauses, at least
> not by default.  Have you tried the new LIKE estimator (see
> contrib/likeplanning/README in the source tree)?  I'm not sure it will
> do any better, given that your data appears to be mightily nonuniform
> ;-), but it's worth a try.
>
>             regards, tom lane
>


Re: Why is explain horribly optimistic for sorts?

From
Tom Lane
Date:
Ben <bench@silentmedia.com> writes:
> Yes, it is horribly wrong.
> select count(*) FROM jennyann where target like '/music/%'
> gives me 93686 rows.

Well, that misestimation is the source of the problem, then, not any
misestimation of the cost of a sort.

7.0 didn't have very good estimation rules for LIKE clauses, at least
not by default.  Have you tried the new LIKE estimator (see
contrib/likeplanning/README in the source tree)?  I'm not sure it will
do any better, given that your data appears to be mightily nonuniform
;-), but it's worth a try.

            regards, tom lane

Re: Why is explain horribly optimistic for sorts?

From
Tom Lane
Date:
Ben <bench@silentmedia.com> writes:
> After enabling the new likeplanning code, things only get worse... the
> estimated cost for the same query is now:

> Sort  (cost=4.95..4.95 rows=1 width=136)
>   ->  Index Scan using jennyann_target_key on jennyann  (cost=0.00..4.94 rows=1 width=136)

Yeah, I was afraid of that: the new code is smarter but also subject to
the same range-estimation problem that plagues us in cases like "WHERE a
BETWEEN b AND c".  The only stats we have for range estimation are the
column minimum and maximum values collected by VACUUM ANALYZE, so we
just do a linear interpolation.  If your data isn't fairly uniformly
distributed between the min and max then range estimates will be all
wet.  In this example, it seems that 80% of your target entries start
with "/music/", but I'll bet the min and max cover a much wider range.
So the estimator mistakenly thinks that LIKE '/music/%' will select
a fairly small proportion of rows, rather than 80%, and this leads the
planner to choose a plan that would be appropriate if that were true.
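
For illustration (a made-up check, not something from Ben's report),
comparing the column extremes shows how wide the interpolation range
really is:

    -- how far apart are the smallest and largest target values?
    SELECT min(target), max(target) FROM jennyann;

If the extremes span far more than the '/music/...' neighborhood, a
linear interpolation between them will credit LIKE '/music/%' with only
a sliver of the table.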

There has been talk of keeping more extensive stats to allow better
estimation in cases like this, but I don't foresee it happening for
a release or two yet...

> Out of curiosity, why does it take so long to order data by a datetime
> field?

AFAIK it shouldn't be materially different from sorting on a float8
field.  You're still blaming the wrong thing: this isn't the sorter's
fault, it's doing the best it can.  The reason this is so slow, I
believe, is that an indexscan is being used to select 80% of the data.
That results in a lot of useless thrashing to visit tuples in strict
order by target. A simple sequential scan would've been much faster.

It's possible that an indexscan on logtime would work better, because it
could stop as soon as it had retrieved 1000 matching tuples.  This would
be particularly likely if the tuples are more or less in logtime order,
so that there's not a lot of disk thrashing to retrieve them.  Otherwise
a plain old sequential scan and sort is likely to be the fastest way.

Unfortunately the planner is never going to figure this out as long as
it's so far off about the selectivity of the LIKE.  You could try
forcing its hand with various combinations of
    SET enable_sort TO off;
    SET enable_indexscan TO off;
... just remember to turn these things back on for subsequent queries,
else you'll get really bad plans ...
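
For instance, one combination to try (a sketch only):

    SET enable_indexscan TO off;   -- steer the planner toward seqscan + sort
    SELECT * FROM jennyann
     WHERE target LIKE '/music/%' ORDER BY "LogTime" LIMIT 1000;
    SET enable_indexscan TO on;    -- restore the default afterwards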

            regards, tom lane

documentation bug (was Re: Why is explain horribly optimistic for sorts?)

From
Matthew Rice
Date:
will trillich <will@serensoft.com> writes:
>     create temp table QRY as select * from jennyann where ...;

I'm still learning SQL so I looked up the 'temp' keyword in the
7.0 docs.  They describe 'TEMP' in the 'CREATE TABLE' command but
not in the 'CREATE TABLE AS' command.

Well, I just tried it and it worked so I guess it's just an oversight
in the docs.
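
For the record, what I tried was along these lines (the WHERE clause here
is just an example, borrowed from earlier in the thread):

    -- TEMP with CREATE TABLE AS, per will's suggestion upthread:
    CREATE TEMP TABLE qry AS
        SELECT * FROM jennyann WHERE target LIKE '/music/%';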

Regards,
--
matthew rice <matt@starnix.com>                               starnix inc.
tollfree: 1-87-pro-linux                        thornhill, ontario, canada
http://www.starnix.com              professional linux services & products

Re: documentation bug (was Re: Why is explain horribly optimistic for sorts?)

From
Tom Lane
Date:
Matthew Rice <matt@starnix.com> writes:
> I'm still learning SQL so I looked up the 'temp' keyword in the
> 7.0 docs.  They describe 'TEMP' in the 'CREATE TABLE' command but
> not in the 'CREATE TABLE AS' command.

> Well, I just tried it and it worked so I guess it's just an oversight
> in the docs.

Good point.  Fixed, thanks.

            regards, tom lane

Re: Why is explain horribly optimistic for sorts?

From
Ben
Date:
On Sat, 3 Mar 2001, Tom Lane wrote:

> Yeah, I was afraid of that: the new code is smarter but also subject to
> the same range-estimation problem that plagues us in cases like "WHERE a
> BETWEEN b AND c".  The only stats we have for range estimation are the
> column minimum and maximum values collected by VACUUM ANALYZE, so we
> just do a linear interpolation.  If your data isn't fairly uniformly
> distributed between the min and max then range estimates will be all
> wet.  In this example, it seems that 80% of your target entries start
> with "/music/", but I'll bet the min and max cover a much wider range.
> So the estimator mistakenly thinks that LIKE '/music/%' will select
> a fairly small proportion of rows, rather than 80%, and this leads the
> planner to choose a plan that would be appropriate if that were true.

Ah, I think I understand the problem here.

> > Out of curiosity, why does it take so long to order data by a datetime
> > field?
>
> AFAIK it shouldn't be materially different from sorting on a float8
> field.  You're still blaming the wrong thing: this isn't the sorter's
> fault, it's doing the best it can.  The reason this is so slow, I
> believe, is that an indexscan is being used to select 80% of the data.
> That results in a lot of useless thrashing to visit tuples in strict
> order by target. A simple sequential scan would've been much faster.

I think you're correct, because this did speed things up quite a bit....
the query time dropped to about a minute.

> It's possible that an indexscan on logtime would work better, because it
> could stop as soon as it had retrieved 1000 matching tuples.  This would
> be particularly likely if the tuples are more or less in logtime order,
> so that there's not a lot of disk thrashing to retrieve them.  Otherwise
> a plain old sequential scan and sort is likely to be the fastest way.

And *this* sped things up even more: not using the target key and using
the LogTime key instead dropped the time to about 30 seconds.

> Unfortunately the planner is never going to figure this out as long as
> it's so far off about the selectivity of the LIKE.  You could try
> forcing its hand with various combinations of
>     SET enable_sort TO off;
>     SET enable_indexscan TO off;
> ... just remember to turn these things back on for subsequent queries,
> else you'll get really bad plans ...

Having no notion of what's involved, I think it would be nice to let the
query optimizer select which index to use based on the query and the
indices available. Or, if it already does this, to figure out exactly why
it's not doing the right thing here.

Thanks for the help!