Re: planner favors seq scan too early - Mailing list pgsql-performance

From Richard Huxton
Subject Re: planner favors seq scan too early
Date
Msg-id 47BD3781.6000503@archonet.com
Whole thread Raw
In response to planner favors seq scan too early  ("Markus Bertheau" <mbertheau.pg@googlemail.com>)
List pgsql-performance
Markus Bertheau wrote:
>
> I'm getting a plan that uses a sequential scan on ext_feeder_item instead of
> several index scans, which slows down the query significantly:
>
> # explain analyze select fi.pub_date from ext_feeder_item fi where fi.feed_id
>   in (select id from ext_feeder_feed ff where ff.is_system) order by pub_date
>   desc;
>  Sort  (cost=298545.70..299196.46 rows=260303 width=8) (actual
> time=89299.623..89302.146 rows=807 loops=1)

> Using LIMIT in the subquery I can see that starting with 50 values for the in
> the planner starts to prefer the seq scan. Plan for 49:

>  Sort  (cost=277689.24..277918.39 rows=91660 width=8) (actual
> time=477.769..478.193 rows=137 loops=1)

> Note that the rows estimate for the index scan is way off. Increasing
> statistics target for ext_feeder_item.feed_id to 100 lets the planner favor the
> index scan up to LIMIT 150 for the subquery.
>
> Using enable_seqscan=false, I see that the index scan plan continues to
> outperform the seqscan plan even with limit 1500 in the subquery (1196 values
> actually returned from it):

>  Sort  (cost=100925142.27..100925986.74 rows=337787 width=8) (actual
> time=102.111..104.627 rows=807 loops=1)

> Why does the planner choose that way and what can I do to make it choose the
> better plan, preferably without specifying limit and a maybe unreasonably high
> statistics target for ext_feeder_item.feed_id?

Although the index scans are fast enough, the cost estimate is much more.

This suggests you need to tweak your planner cost settings:
http://www.postgresql.org/docs/8.3/static/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

I'd probably start with reducing random_page_cost if you have a
reasonable disk system and making sure effective_cache_size is
accurately set.

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: "Scott Marlowe"
Date:
Subject: Re: Question about shared_buffers and cpu usage
Next
From: Dave Cramer
Date:
Subject: Re: Question about shared_buffers and cpu usage