Thread: Re: Cost of sort/order by not estimated by the query planner

Re: Cost of sort/order by not estimated by the query planner

From: Laurent Laborde
hummm.... Adding pgsql-perf :)

On Mon, Nov 30, 2009 at 5:54 PM, Laurent Laborde <kerdezixe@gmail.com> wrote:
> Friendly greetings !
> I use postgresql 8.3.6.
>
> Here is some info about the table I'm querying:
> -------------------------------------------------------------
> - select count(*) from _article : 17301610
> - select count(*) from _article WHERE (_article.bitfield && getbit(0)) : 6729
>
>
> Here are both request with problems :
> --------------------------------------------------
>
> QUERY 1 : Very fast !
> -------------
>
> explain SELECT *
> FROM   _article
> WHERE (_article.bitfield && getbit(0))
> ORDER BY _article.id ASC
> LIMIT 500;
>                                             QUERY PLAN
> -----------------------------------------------------------------------------------------------------
>  Limit  (cost=66114.13..66115.38 rows=500 width=1114)
>   ->  Sort  (cost=66114.13..66157.37 rows=17296 width=1114)
>         Sort Key: id
>         ->  Bitmap Heap Scan on _article  (cost=138.32..65252.29
> rows=17296 width=1114)
>               Recheck Cond: (bitfield && B'1'::bit varying)
>               ->  Bitmap Index Scan on idx_article_bitfield
> (cost=0.00..134.00 rows=17296 width=0)
>                     Index Cond: (bitfield && B'1'::bit varying)
>
>
>
>
> QUERY 2 : Endless... (more than 30 min; I stopped the query)
> -------------
>
> explain SELECT *
> FROM   _article
> WHERE (_article.bitfield && getbit(0))
> ORDER BY _article.id ASC
> LIMIT 5;
>                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..2042.87 rows=5 width=1114)
>   ->  Index Scan using _article_pkey on _article
> (cost=0.00..7066684.46 rows=17296 width=1114)
>         Filter: (bitfield && B'1'::bit varying)
> (3 rows)
>
>
> With LIMIT 5 and LIMIT 500, the query plans are different.
> PostgreSQL estimates that it can do a simple index scan to find only 5 rows.
> With more than LIMIT ~400 it estimates that it's faster to use a more
> complex plan.
> And that makes sense!
>
> The problem is in the ORDER BY, of course.
> If I remove the ORDER BY, the LIMIT 5 is faster (0.044 ms) and does an
> index scan.
> At LIMIT 500 (without ORDER BY) it still uses an index scan and is
> slightly slower.
> At LIMIT 5000 (without ORDER BY) it switches to a Bitmap Index Scan +
> Bitmap Heap Scan; that's slower but acceptable (5.275 ms).
>
> Why, with QUERY 2, doesn't PostgreSQL estimate the cost of the
> Sort/ORDER BY?
> Of course, ignoring the ORDER BY, both query plans are right and the
> choice between those different plans totally makes sense.
>
> But... if the planner were kind enough to consider the cost of the
> ORDER BY, it would certainly choose the Bitmap Index + Bitmap Heap
> Scan for LIMIT 5, and not an index scan on the pkey!
>
> I have set the statistics target to 1000 for _article.bitfield, just in
> case (and ran VACUUM ANALYZE); it doesn't change anything.
>
> Is that a bug? Any idea?
>
> Thank you :)
>
> --
> Laurent "ker2x" Laborde
> Sysadmin & DBA at http://www.over-blog.com/
>



--
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/

Re: Cost of sort/order by not estimated by the query planner

From: Greg Stark
On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde <kerdezixe@gmail.com> wrote:
>>                                             QUERY PLAN
>> -----------------------------------------------------------------------------------------------------
>>  Limit  (cost=66114.13..66115.38 rows=500 width=1114)
>>   ->  Sort  (cost=66114.13..66157.37 rows=17296 width=1114)
>>         Sort Key: id
>>         ->  Bitmap Heap Scan on _article  (cost=138.32..65252.29
>> rows=17296 width=1114)
>>               Recheck Cond: (bitfield && B'1'::bit varying)
>>               ->  Bitmap Index Scan on idx_article_bitfield
>> (cost=0.00..134.00 rows=17296 width=0)
>>                     Index Cond: (bitfield && B'1'::bit varying)


Uhm, what kind of index is idx_article_bitfield?



--
greg

Re: Cost of sort/order by not estimated by the query planner

From: Laurent Laborde
On Wed, Dec 2, 2009 at 1:42 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde <kerdezixe@gmail.com> wrote:
>>>                                             QUERY PLAN
>>> -----------------------------------------------------------------------------------------------------
>>>  Limit  (cost=66114.13..66115.38 rows=500 width=1114)
>>>   ->  Sort  (cost=66114.13..66157.37 rows=17296 width=1114)
>>>         Sort Key: id
>>>         ->  Bitmap Heap Scan on _article  (cost=138.32..65252.29
>>> rows=17296 width=1114)
>>>               Recheck Cond: (bitfield && B'1'::bit varying)
>>>               ->  Bitmap Index Scan on idx_article_bitfield
>>> (cost=0.00..134.00 rows=17296 width=0)
>>>                     Index Cond: (bitfield && B'1'::bit varying)
>
>
> Uhm, what kind of index is idx_article_bitfield?

Mmm, I forgot about that! It's a GIN index:
"idx_article_bitfield" gin (bitfield), tablespace "indexspace"


--
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/

Re: Cost of sort/order by not estimated by the query planner

From: Greg Stark
On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde <kerdezixe@gmail.com> wrote:
>>                                           QUERY PLAN
>> -------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.00..2042.87 rows=5 width=1114)
>>   ->  Index Scan using _article_pkey on _article
>> (cost=0.00..7066684.46 rows=17296 width=1114)
>>         Filter: (bitfield && B'1'::bit varying)
>

Ah, I missed this the first time around. It's scanning _article_pkey
here. I.e., it's scanning the table from the oldest to the newest
article, assuming that the values which satisfy that constraint are
evenly distributed, so it'll find five of them pretty quickly. In
reality there's a correlation between this bit being set and the value
of _article.id: all the rows with it set are towards the end.
Postgres doesn't have any statistics on how multiple columns are
related yet, so it can't know this.

If this is an important query you might try having an index on
<bitfield,id> or a partial index on "id where bitfield && B'1' ". The
latter sounds like what you really need.
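A sketch of both suggestions, with hypothetical index names. Note that the partial index can only be used if the planner can prove the query's WHERE clause implies the index predicate, which may require writing the condition exactly as it appears in the predicate rather than via getbit():

```sql
-- Composite index as suggested (whether it actually helps the &&
-- test depends on the index type's support for that operator):
CREATE INDEX idx_article_bitfield_id ON _article (bitfield, id);

-- Partial btree on id, restricted to rows matching the bit test;
-- "ORDER BY id LIMIT 5" over matching rows becomes a short ordered scan:
CREATE INDEX idx_article_bit0_id ON _article (id)
WHERE bitfield && B'1'::bit varying;
```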

--
greg

Re: Cost of sort/order by not estimated by the query planner

From: Laurent Laborde
On Wed, Dec 2, 2009 at 1:47 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Wed, Dec 2, 2009 at 11:13 AM, Laurent Laborde <kerdezixe@gmail.com> wrote:
>>>                                           QUERY PLAN
>>> -------------------------------------------------------------------------------------------------
>>>  Limit  (cost=0.00..2042.87 rows=5 width=1114)
>>>   ->  Index Scan using _article_pkey on _article
>>> (cost=0.00..7066684.46 rows=17296 width=1114)
>>>         Filter: (bitfield && B'1'::bit varying)
>>
>
> Ah, I missed this the first time around. It's scanning _article_pkey
> here. Ie, it's scanning the table from the oldest to the newest
> article assuming that the values which satisfy that constraint are
> evenly distributed and it'll find five of them pretty quickly. In
> reality there's a correlation between this bit being set and the value
> of _article.id and all the ones with it set are towards the end.
> Postgres doesn't have any statistics on how multiple columns are
> related yet so it can't know this.
>
> If this is an important query you might try having an index on
> <bitfield,id> or a partial index on "id where bitfield && B'1' ". The
> latter sounds like what you really need

There are, indeed, lots of tricks and hacks.
Maybe my question was too confusing.

The question is: why is LIMIT 5 so much slower than LIMIT 500?

The problem is in the ORDER BY, not in finding enough data that
matches the filter.
Even if the data isn't evenly distributed, the queries without ORDER BY
are much faster, EVEN when using the pkey query plan.

without "order by" using the bitmap -> fast
without "order by" using the pkey index -> fast
with "order by" using the bitmap -> fast
with "order by" using the pkey index -> slowwwwwwwwwwwww

--
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/

Re: Cost of sort/order by not estimated by the query planner

From: Robert Haas
On Wed, Dec 2, 2009 at 8:01 AM, Laurent Laborde <kerdezixe@gmail.com> wrote:
> On Wed, Dec 2, 2009 at 1:47 PM, Greg Stark <gsstark@mit.edu> wrote:
>> [quoted EXPLAIN plan and index suggestions snipped]
>
> There is, indeed, a lot of tricks and hacks.
> Maybe my question was too confusing.
>
> The question is : why a limit 5 is much much slower than a limit 500 ?
>
> The problem is in the "order by" and not "finding enough the data that
> match the filter".
> Even if it's not evenly distributed, the queries without "order by"
> are much much faster, EVEN when using the "pkey query plan".
>
> without "order by" using the bitmap -> fast
> without "order by" using the pkey index -> fast
> with "order by" using the bitmap -> fast
> with "order by" using the pkey index -> slowwwwwwwwwwwww

I'm confused.  I think you've only shown us two query plans, so it's
hard to judge what's going on here in the two cases you haven't shown.
 Also, you haven't shown the EXPLAIN ANALYZE output, so it's a bit
tricky to judge what is really happening.

However... as a general rule, the usual reason why the planner makes
bad decisions with small LIMITs is that it overestimates the impact of
the startup cost.  If one plan has a startup cost of 1 and a run cost
of 100, and another plan has a startup cost of 0 and a run cost of
1000000, the planner will pick the latter plan if a sufficiently small
fraction of the rows are being fetched (less than a millionth of
them).  It's easy for the estimates to be off by enough to make this
a bad decision, especially when using operators that the planner
doesn't have good estimates for (&& may be one such).
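To make the break-even arithmetic explicit (a rough model, not the planner's exact formula): if fetching a fraction f of a plan's output costs about startup + f * run, then with the numbers above the cheaper-startup plan wins only when

```latex
\underbrace{0 + 10^{6} f}_{\text{plan 2}} \;<\; \underbrace{1 + 100 f}_{\text{plan 1}}
\quad\Longleftrightarrow\quad
f < \frac{1}{999\,900} \approx 10^{-6}
```

i.e. when less than about a millionth of the rows are being fetched, matching the figure above.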

...Robert

Re: Cost of sort/order by not estimated by the query planner

From: Laurent Laborde
On Wed, Dec 2, 2009 at 2:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I'm confused.  I think you've only shown us two query plans, so it's
> hard to judge what's going on here in the two cases you haven't shown.
>  Also, you haven't shown the EXPLAIN ANALYZE output, so it's a bit
> tricky to judge what is really happening.

I will provide all the EXPLAIN ANALYZE output.
But considering that the query with LIMIT 5 takes more than half an
hour (I don't know how much exactly), it will take some time.
See you soon :)

--
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/

Re: Cost of sort/order by not estimated by the query planner

From
Laurent Laborde
Date:
* without order by, limit 5 : 70ms
----------------------------------
 explain analyze SELECT *
FROM   _article
WHERE (_article.bitfield && getbit(0))
LIMIT 5;

QUERY PLAN  :
Limit  (cost=0.00..20.03 rows=5 width=1109) (actual
time=70.190..70.265 rows=5 loops=1)
   ->  Index Scan using idx_article_bitfield on _article
(cost=0.00..69290.99 rows=17298 width=1109) (actual
time=70.188..70.260 rows=5 loops=1)
         Index Cond: (bitfield && B'1'::bit varying)
 Total runtime: 70.406 ms
(4 rows)

* without order by, limit 500 (same plan as above) : 371ms
------------------------------------------------------------------
explain analyze SELECT *
FROM   _article
WHERE (_article.bitfield && getbit(0))
LIMIT 500;

QUERY PLAN:
 Limit  (cost=0.00..2002.86 rows=500 width=1109) (actual
time=0.087..371.257 rows=500 loops=1)
   ->  Index Scan using idx_article_bitfield on _article
(cost=0.00..69290.99 rows=17298 width=1109) (actual
time=0.086..371.075 rows=500 loops=1)
         Index Cond: (bitfield && B'1'::bit varying)
 Total runtime: 371.369 ms

* without order by, limit 5000 (query plan changed) : 1307ms
-------------------------------------------------------------------
 explain analyze SELECT *
FROM   _article
WHERE (_article.bitfield && getbit(0))
LIMIT 5000;

QUERY PLAN :
 Limit  (cost=138.34..18971.86 rows=5000 width=1109) (actual
time=53.782..1307.173 rows=5000 loops=1)
   ->  Bitmap Heap Scan on _article  (cost=138.34..65294.79 rows=17298
width=1109) (actual time=53.781..1305.565 rows=5000 loops=1)
         Recheck Cond: (bitfield && B'1'::bit varying)
         ->  Bitmap Index Scan on idx_article_bitfield
(cost=0.00..134.01 rows=17298 width=0) (actual time=53.606..53.606
rows=6743 loops=1)
               Index Cond: (bitfield && B'1'::bit varying)
 Total runtime: 1307.972 ms


So... *without* ORDER BY, with different limits and different query
plans, the queries are fast.

* with order by, limit 5 :
------------------------------
explain analyze SELECT *
FROM   _article
WHERE (_article.bitfield && getbit(0))
ORDER BY _article.id ASC
LIMIT 5;

QUERY PLAN :
Mmmm... the query has been running for 2 hours... waiting, waiting.


* with order by, limit 500 : 546ms
-------------------------------
explain analyze SELECT *
FROM   _article
WHERE (_article.bitfield && getbit(0))
ORDER BY _article.id ASC
LIMIT 500;
QUERY PLAN :
 Limit  (cost=66156.73..66157.98 rows=500 width=1109) (actual
time=545.671..545.900 rows=500 loops=1)
   ->  Sort  (cost=66156.73..66199.98 rows=17298 width=1109) (actual
time=545.670..545.766 rows=500 loops=1)
         Sort Key: id
         Sort Method:  top-N heapsort  Memory: 603kB
         ->  Bitmap Heap Scan on _article  (cost=138.34..65294.79
rows=17298 width=1109) (actual time=1.059..541.359 rows=6729 loops=1)
               Recheck Cond: (bitfield && B'1'::bit varying)
               ->  Bitmap Index Scan on idx_article_bitfield
(cost=0.00..134.01 rows=17298 width=0) (actual time=0.922..0.922
rows=6743 loops=1)
                     Index Cond: (bitfield && B'1'::bit varying)
 Total runtime: 546.163 ms


Now... with ORDER BY, different limits, different query plans: the
LIMIT 5 query is insanely *SLOW* (while the LIMIT 500 is super fast).

What I think: the query planner does not consider the time taken by
the ORDER BY... which is *much* slower!!


--
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/

Re: Cost of sort/order by not estimated by the query planner

From: Robert Haas
On Wed, Dec 2, 2009 at 10:32 AM, Laurent Laborde <kerdezixe@gmail.com> wrote:
> [EXPLAIN ANALYZE output for the four queries snipped; see previous message]
> Now... with ordery by, different limit, different query plan, the
> limit 5 query is insanly *SLOW* (while the limit 500 is super fast).
>
> What is think : The query planner do not consider the time taken by
> the order by... which is *much* slower !!

That is certainly not the case.  If the query planner did not consider
the time required to perform a sort, well, that would have been fixed
a lot sooner than now.  The real problem here is exactly what
I said upthread.   Without order-by, the query planner picks an
index-scan or a bitmap-index-scan and just runs it until it gets
enough rows to satisfy the LIMIT.  No problem.  With order-by, it has
to make a decision: should it fetch ALL the rows that satisfy the
bitfield condition, sort them by article ID, and then pick the top
five?  Or should it instead use the index on article ID to start
retrieving the lowest-numbered article IDs and hope to find 5 that
satisfy the bitfield condition before it goes through too many rows?

The answer depends on how frequently the bitfield condition will be
satisfied.  If most rows in the table satisfy the bitfield condition,
then the second plan is better; if very few do, the first plan is
better.  Somewhat more subtly, the plan also depends on the LIMIT.
The first plan requires almost the same amount of work for a small
limit as it does for a large one - you still have to find ALL the rows
that match the bitfield condition and sort them.  Then you return a
larger or smaller number of rows from the result of the sort depending
on the LIMIT.  But the amount of work that the second plan requires
varies dramatically depending on the LIMIT.  If the LIMIT is only
one-hundredth as large (5 instead of 500), then the second plan
figures to have to scan only one one-hundredth as many rows, so it
takes about a hundredth as much work for LIMIT 5 rather than LIMIT
500, whereas the cost of the first plan hasn't changed much.
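One way to see what the planner thinks each of the two plans costs for the same query is to toggle the standard planner switches per session. Disabling a scan type only discourages it by inflating its cost, so EXPLAIN will show the alternative plan with its own estimate:

```sql
-- Plan actually chosen at LIMIT 5 (pkey index scan):
EXPLAIN SELECT * FROM _article
WHERE bitfield && getbit(0)
ORDER BY id ASC LIMIT 5;

-- Discourage plain index scans for this session, forcing the
-- bitmap-scan-plus-sort alternative to surface with its cost:
SET enable_indexscan = off;
EXPLAIN SELECT * FROM _article
WHERE bitfield && getbit(0)
ORDER BY id ASC LIMIT 5;
RESET enable_indexscan;
```

Comparing the two top-level cost ranges shows how far apart the planner believes the plans are, before any LIMIT scaling.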

The exact break-even point between the two plans will vary depending
on what percentage of the rows in the table satisfy the bitmap
condition.  In your particular case, the break-even point is less than
one row, so the first plan is always better, but the planner doesn't
know that.  I don't think the planner has any statistics that can tell
it how many of the rows in the table satisfy (_article.bitfield &&
getbit(0)), and it's probably estimating high because the actual
selectivity based on the numbers you provided is quite low.  That
makes the second plan look appealing for small numbers of rows.  If
the rows that it needs are clustered at the "wrong end" of the
article-ID index, which the planner certainly has no way of knowing,
then things get really ugly.

I've sometimes thought that the planner should outright discard plans
whose total cost (ignoring the effect of LIMIT) is too many orders of
magnitude more than some other available plan.  But it's hard to know
where to put the cutoff, and there are cases where the planner makes
this kind of trade-off and gets it right which we don't want to break,
so it's not a simple problem.  The best solution I've found is to
avoid using expressions that depend on operators other than equality
whenever possible.  The planner is very smart about equality.  It's
somewhat smart about >, <, >=, <=, <>, and pretty stupid about most
other things.

...Robert

Re: Cost of sort/order by not estimated by the query planner

From: Tom Lane
Robert Haas <robertmhaas@gmail.com> writes:
> The exact break-even point between the two plans will vary depending
> on what percentage of the rows in the table satisfy the bitmap
> condition.

It's worse than that.  The planner is not too bad about understanding
the percentage-of-rows problem --- at least, assuming you are using
a condition it has statistics for, which it doesn't for bitvector &&.
But whether the indexscan plan is fast will also depend on where the
matching rows are in the index ordering.  If they're all towards the
end you can lose big, and the planner hasn't got stats to let it
predict that.  It just assumes the filter condition is uncorrelated
to the ordering condition.

My own advice would be to forget the bitmap field and see if you can't
use a collection of plain boolean columns instead.  You might still
lose if there's a correlation problem, but "bitfield && B'1'" is
absolutely positively guaranteed to produce stupid row estimates and
hence bad plan choices.
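A sketch of that approach; the column and index names here are made up, and migrating an existing 17M-row table this way needs care (locking, long UPDATE, bloat) that this snippet glosses over:

```sql
-- One plain boolean per interesting flag; ANALYZE then collects a
-- real fraction-true statistic the planner can use.
ALTER TABLE _article ADD COLUMN is_bit0 boolean NOT NULL DEFAULT false;
UPDATE _article SET is_bit0 = true WHERE bitfield && getbit(0);

-- Partial index matching the hot query:
CREATE INDEX idx_article_bit0_id ON _article (id) WHERE is_bit0;
ANALYZE _article;

-- Rewritten query, now with sane row estimates:
-- SELECT * FROM _article WHERE is_bit0 ORDER BY id ASC LIMIT 5;
```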

Or you could work on introducing a non-stupid selectivity estimator
for &&, but it's not a trivial project.

            regards, tom lane