Re: performance problem with LIMIT (order BY in DESC order). Wrong index used? - Mailing list pgsql-performance

From Tom Lane
Subject Re: performance problem with LIMIT (order BY in DESC order). Wrong index used?
Date
Msg-id 9713.1302619107@sss.pgh.pa.us
Whole thread Raw
In response to Re: performance problem with LIMIT (order BY in DESC order). Wrong index used?  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-performance
Claudio Freire <klaussfreire@gmail.com> writes:
> Did you try increasing the statistic targets?

> AFAIK, it looks a lot like the planner is missing stats, since it
> estimates the index query on idx_nfi_newsfeed will fetch 10k rows -
> instead of 25.

BTW, this is the right suggestion, but for the wrong reason.  You seem
to be looking at

Limit  (cost=0.00..980.09 rows=25 width=963) (actual time=48.592..4060.779 rows=25 loops=1)
  ->  Index Scan Backward using "IDX_NFI_DATETIME" on newsfeed_item  (cost=0.00..409365.16 rows=10442 width=963)
(actualtime=48.581..4060.542 rows=25 loops=1) 

Here, the actual row count is constrained to 25 because the LIMIT node
stops calling the indexscan node once it's got 25.  So this case proves
little about whether the planner's estimates are any good.  You need to
check the estimates in the unconstrained plan:

  ->  Bitmap Heap Scan on newsfeed_item  (cost=421.41..34450.72 rows=10442 width=963) (actual time=0.644..12.601
rows=477loops=1) 

Here we can see that there really are only 477 rows in the table that
satisfy the WHERE clause, versus an estimate of 10K.  So sure enough,
the statistics are bad, and an increase in stats target might help.
But you can't conclude that from an explain that involves LIMIT.

            regards, tom lane

pgsql-performance by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: Two servers - One Replicated - Same query
Next
From: "Kevin Grittner"
Date:
Subject: Re: Linux: more cores = less concurrency.