Thread: Baffling sequential scan plan when index scan would be best

Baffling sequential scan plan when index scan would be best

From

"Jeffrey W. Baker"

Date:

21 April 2005, 00:51:11

I always thought I would not be the kind of person who writes to this
list asking why the planner is using a sequential scan.  I always looked
upon such people as newcomers who would eventually learn the mysterious
wonders of the Pg query execution planner.

But really, this plan is bizarre!  Why is it scanning sequentially for
ONE tuple as selected by the primary key?  I even increased stats to
1000 and disable seq_scan, but it still insists it cannot do an index
scan.

skunk=# \d items;
      Table "items"
   Column   |   Type   | Modifiers
------------+----------+-----------
 item       | bigint   | not null
[...]
Indexes:
    "items_pkey" primary key, btree (item)

skunk=# analyze verbose items;
INFO:  analyzing "items"
INFO:  "items": 80372 pages, 300000 rows sampled, 2660996 estimated total rows
ANALYZE

skunk=# explain analyze select * from items where item = 2143888;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on items  (cost=100000000.00..100113634.45 rows=1 width=115) (actual time=4034.564..8859.082 rows=1 loops=1)
   Filter: (item = 2143888)
 Total runtime: 8859.160 ms
(3 rows)

 enable_hashagg                 | on
 enable_hashjoin                | on
 enable_indexscan               | on
 enable_mergejoin               | on
 enable_nestloop                | on
 enable_seqscan                 | off <===
 enable_sort                    | on
 enable_tidscan                 | on

 random_page_cost               | 1
 cpu_index_tuple_cost           | 0.001
 cpu_operator_cost              | 0.0025
 cpu_tuple_cost                 | 0.01

What's even more baffling is the planner will use index scan for any
other indexed column, including columns for which the index is not
particularly selective, like item category or date:

skunk=# set enable_seqscan=on;
SET

skunk=# explain select * from items where category = 245;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Index Scan using items_cat_idx on items  (cost=0.00..69887.77 rows=125795 width=115)
   Index Cond: (category = 245)
(2 rows)

skunk=# explain select * from items where startdate = '2005-03-01';
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Index Scan using items_start_ids on items  (cost=0.00..1948.53 rows=30283 width=115)
   Index Cond: (startdate = '2005-03-01'::date)
(2 rows)

So it seems that an index scan returning a half-million tuples is OK,
but an index scan returning a single tuple is right out.  What?

-Confused in California

Re: Baffling sequential scan plan when index scan would

From

Stephan Szabo

Date:

21 April 2005, 01:07:06

On Wed, 20 Apr 2005, Jeffrey W. Baker wrote:

> I always thought I would not be the kind of person who writes to this
> list asking why the planner is using a sequential scan.  I always looked
> upon such people as newcomers who would eventually learn the mysterious
> wonders of the Pg query execution planner.
>
> But really, this plan is bizarre!  Why is it scanning sequentially for
> ONE tuple as selected by the primary key?  I even increased stats to
> 1000 and disable seq_scan, but it still insists it cannot do an index
> scan.

What version are you using?  If it's not 8.0.x, I'd guess this is the old
cannot use indexes from cross-integer type comparisons issue, and you may
want to explicitly cast or quote that 2143888.

Re: Baffling sequential scan plan when index scan would

From

Chris Smith

Date:

21 April 2005, 01:18:09

Hi Jeff,

Maybe because the key is a bigint - you're looking for an int (int4).

Try casting it to a bigint ?

Jeffrey W. Baker wrote:
> I always thought I would not be the kind of person who writes to this
> list asking why the planner is using a sequential scan.  I always looked
> upon such people as newcomers who would eventually learn the mysterious
> wonders of the Pg query execution planner.
>
> But really, this plan is bizarre!  Why is it scanning sequentially for
> ONE tuple as selected by the primary key?  I even increased stats to
> 1000 and disable seq_scan, but it still insists it cannot do an index
> scan.
>
> skunk=# \d items;
>       Table "items"
>    Column   |   Type   | Modifiers
> ------------+----------+-----------
>  item       | bigint   | not null
> [...]
> Indexes:
>     "items_pkey" primary key, btree (item)
>
> skunk=# analyze verbose items;
> INFO:  analyzing "items"
> INFO:  "items": 80372 pages, 300000 rows sampled, 2660996 estimated total rows
> ANALYZE
>
> skunk=# explain analyze select * from items where item = 2143888;
>                                                       QUERY PLAN
 
>
-----------------------------------------------------------------------------------------------------------------------
>  Seq Scan on items  (cost=100000000.00..100113634.45 rows=1 width=115) (actual time=4034.564..8859.082 rows=1
loops=1)
>    Filter: (item = 2143888)
>  Total runtime: 8859.160 ms
> (3 rows)
>
>  enable_hashagg                 | on
>  enable_hashjoin                | on
>  enable_indexscan               | on
>  enable_mergejoin               | on
>  enable_nestloop                | on
>  enable_seqscan                 | off <===
>  enable_sort                    | on
>  enable_tidscan                 | on
>
>  random_page_cost               | 1
>  cpu_index_tuple_cost           | 0.001
>  cpu_operator_cost              | 0.0025
>  cpu_tuple_cost                 | 0.01
>
> What's even more baffling is the planner will use index scan for any
> other indexed column, including columns for which the index is not
> particularly selective, like item category or date:
>
> skunk=# set enable_seqscan=on;
> SET
>
> skunk=# explain select * from items where category = 245;
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Index Scan using items_cat_idx on items  (cost=0.00..69887.77 rows=125795 width=115)
>    Index Cond: (category = 245)
> (2 rows)
>
> skunk=# explain select * from items where startdate = '2005-03-01';
>                                       QUERY PLAN
> --------------------------------------------------------------------------------------
>  Index Scan using items_start_ids on items  (cost=0.00..1948.53 rows=30283 width=115)
>    Index Cond: (startdate = '2005-03-01'::date)
> (2 rows)
>
> So it seems that an index scan returning a half-million tuples is OK,
> but an index scan returning a single tuple is right out.  What?
>
> -Confused in California
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

--
Regards,

Chris Smith

  Unit 2, 3 National Street, Rozelle, NSW 2039 Australia

Ph: +61 2 9555 5570
Fx: +61 2 9555 5571

email: info@interspire.com
web: http://www.interspire.com