planner favors seq scan too early - Mailing list pgsql-performance

From Markus Bertheau
Subject planner favors seq scan too early
Date
Msg-id 684362e10802202304o2603105fl7272312e25cc7a19@mail.gmail.com
Whole thread Raw
Responses Re: planner favors seq scan too early  (Richard Huxton <dev@archonet.com>)
List pgsql-performance
Given the following query:

SELECT
    fi.pub_date
FROM
    ext_feeder_item fi
WHERE
    fi.feed_id IN (SELECT id FROM ext_feeder_feed ff
                   WHERE ff.is_system)
ORDER BY
    pub_date DESC;

I'm getting a plan that uses a sequential scan on ext_feeder_item instead of
several index scans, which slows down the query significantly:

# explain analyze select fi.pub_date from ext_feeder_item fi where fi.feed_id
  in (select id from ext_feeder_feed ff where ff.is_system) order by pub_date
  desc;
 Sort  (cost=298545.70..299196.46 rows=260303 width=8) (actual
time=89299.623..89302.146 rows=807 loops=1)
   Sort Key: fi.pub_date
   Sort Method:  quicksort  Memory: 48kB
   ->  Hash IN Join  (cost=392.39..271572.17 rows=260303 width=8)
(actual time=537.226..89294.837 rows=807 loops=1)
         Hash Cond: (fi.feed_id = ff.id)
         ->  Seq Scan on ext_feeder_item fi  (cost=0.00..261330.45
rows=1932345 width=16) (actual time=0.035..82766.295 rows=1926576
loops=1)
         ->  Hash  (cost=377.78..377.78 rows=1169 width=8) (actual
time=175.579..175.579 rows=1196 loops=1)
               ->  Seq Scan on ext_feeder_feed ff  (cost=0.00..377.78
rows=1169 width=8) (actual time=13.723..171.467 rows=1196 loops=1)
                     Filter: is_system
 Total runtime: 89304.787 ms

Using LIMIT in the subquery I can see that starting with 50 values for the in
the planner starts to prefer the seq scan. Plan for 49:

# explain analyze select fi.pub_date from ext_feeder_item fi where fi.feed_id
  in (select id from ext_feeder_feed ff where ff.is_system limit 49) order by
  pub_date desc;

 Sort  (cost=277689.24..277918.39 rows=91660 width=8) (actual
time=477.769..478.193 rows=137 loops=1)
   Sort Key: fi.pub_date
   Sort Method:  quicksort  Memory: 22kB
   ->  Nested Loop  (cost=16.45..268878.12 rows=91660 width=8) (actual
time=119.258..477.150 rows=137 loops=1)
         ->  HashAggregate  (cost=16.45..16.94 rows=49 width=8)
(actual time=0.791..0.965 rows=49 loops=1)
               ->  Limit  (cost=0.00..15.84 rows=49 width=8) (actual
time=0.023..0.613 rows=49 loops=1)
                     ->  Seq Scan on ext_feeder_feed ff
(cost=0.00..377.78 rows=1169 width=8) (actual time=0.016..0.310
rows=49 loops=1)
                           Filter: is_system
         ->  Index Scan using ext_feeder_item_feed_id_idx on
ext_feeder_item fi  (cost=0.00..5463.58 rows=1871 width=16) (actual
time=4.485..9.692 rows=3 loops=49)
               Index Cond: (fi.feed_id = ff.id)
 Total runtime: 478.709 ms

Note that the rows estimate for the index scan is way off. Increasing
statistics target for ext_feeder_item.feed_id to 100 lets the planner favor the
index scan up to LIMIT 150 for the subquery.

Using enable_seqscan=false, I see that the index scan plan continues to
outperform the seqscan plan even with limit 1500 in the subquery (1196 values
actually returned from it):

# explain analyze select fi.pub_date from ext_feeder_item fi where fi.feed_id
  in (select id from ext_feeder_feed ff where ff.is_system limit 1500) order by
  pub_date desc;

 Sort  (cost=100925142.27..100925986.74 rows=337787 width=8) (actual
time=102.111..104.627 rows=807 loops=1)
   Sort Key: fi.pub_date
   Sort Method:  quicksort  Memory: 48kB
   ->  Nested Loop  (cost=100000392.39..100889503.71 rows=337787
width=8) (actual time=30.411..98.187 rows=807 loops=1)
         ->  HashAggregate  (cost=100000392.39..100000394.39 rows=200
width=8) (actual time=30.337..35.329 rows=1196 loops=1)
               ->  Limit  (cost=100000000.00..100000377.78 rows=1169
width=8) (actual time=0.027..24.759 rows=1196 loops=1)
                     ->  Seq Scan on ext_feeder_feed ff
(cost=100000000.00..100000377.78 rows=1169 width=8) (actual
time=0.019..16.448 rows=1196 loops=1)
                           Filter: is_system
         ->  Index Scan using ext_feeder_item_feed_id_idx on
ext_feeder_item fi  (cost=0.00..4424.43 rows=1689 width=16) (actual
time=0.026..0.040 rows=1 loops=1196)
               Index Cond: (fi.feed_id = ff.id)
 Total runtime: 107.264 ms

Without limit though, the planner chooses a different plan that also doesn't
perform:

# explain analyze select fi.pub_date from ext_feeder_item fi where fi.feed_id
  in (select id from ext_feeder_feed ff where ff.is_system) order by pub_date
  desc;

 Sort  (cost=1134023.40..1134669.54 rows=258456 width=8) (actual
time=854348.350..854350.866 rows=807 loops=1)
   Sort Key: fi.pub_date
   Sort Method:  quicksort  Memory: 48kB
   ->  Hash IN Join  (cost=543.03..1107253.77 rows=258456 width=8)
(actual time=21.241..854343.544 rows=807 loops=1)
         Hash Cond: (fi.feed_id = ff.id)
         ->  Index Scan Backward using ext_feeder_item_pub_date_idx on
ext_feeder_item fi  (cost=0.00..1096931.31 rows=1918631 width=16)
(actual time=0.096..847635.097 rows=1926576 loops=1)
         ->  Hash  (cost=528.42..528.42 rows=1169 width=8) (actual
time=21.114..21.114 rows=1196 loops=1)
               ->  Index Scan using ext_feeder_feed_pkey on
ext_feeder_feed ff  (cost=0.00..528.42 rows=1169 width=8) (actual
time=0.066..16.042 rows=1196 loops=1)
                     Filter: is_system
 Total runtime: 854353.431 ms


Why does the planner choose that way and what can I do to make it choose the
better plan, preferably without specifying limit and a maybe unreasonably high
statistics target for ext_feeder_item.feed_id?

PostgreSQL 8.3, from a freshly loaded and analyzed dump.

Thanks

Markus Bertheau

pgsql-performance by date:

Previous
From: Chris
Date:
Subject: Re: 7 hrs for a pg_restore?
Next
From: "Scott Marlowe"
Date:
Subject: Re: Question about shared_buffers and cpu usage