question on scan of clustered index - Mailing list pgsql-general

From pwing@student.umass.edu
Subject question on scan of clustered index
Date
Msg-id 1184016750.4692a96e2cef9@mail-www.oit.umass.edu
Whole thread Raw
Responses Re: question on scan of clustered index
Re: question on scan of clustered index
List pgsql-general
Hello:

I am running the following query:
SELECT COUNT(*) FROM orders WHERE o_orderdate < date('1995-03-15');

Here are some stats for the orders relation:
select relname, relpages, reltuples from pg_class where relname = 'orders';
"orders";29278;1.49935e+06

For my query above, the reduction factor is about 50%; 700,000 out of 1.5
millions tuples satisfy the WHERE clause.

I have a clustered index defined as follows:
CREATE INDEX orders_orderdate_idx
  ON orders
  USING btree
  (o_orderdate);
ALTER TABLE orders CLUSTER ON orders_orderdate_idx;

I am running three ways:  sequential scan, bitmap index scan and index scan.
The I/O cost for the index scan is 24+ times more than the other two.  I do not
understand why this happens.  If I am using a clustered index, it is my
understanding that there should be no need to retrieve a single page multiple
times, as tuples are sorted.  Am I misunderstanding something?

Paula

Here are the results of explain analyze, and I/O results from
pg_statio_user_tables:

Aggregate  (cost=1470905.79..1470905.80 rows=1 width=0) (actual
time=9040.320..9040.321 rows=1 loops=1)
  ->  Index Scan using orders_orderdate_idx on orders  (cost=0.00..1469101.03
rows=721902 width=0) (actual time=0.098..8222.234 rows=727305 loops=1)
        Index Cond: (o_orderdate < '1995-03-15'::date)
Total runtime: 9040.375 ms

I/O cost:
Heap Blocks Read:  649966 (from disk)
Heap Blocks Hit:    70070 (from buffer)
Index Blocks Read:   1591
__________________________________

Aggregate  (cost=52033.65..52033.66 rows=1 width=0) (actual
time=2364.470..2364.471 rows=1 loops=1)
  ->  Bitmap Heap Scan on orders  (cost=11927.12..50228.90 rows=721902 width=0)
(actual time=338.547..1609.118 rows=727305 loops=1)
        Recheck Cond: (o_orderdate < '1995-03-15'::date)
        ->  Bitmap Index Scan on orders_orderdate_idx  (cost=0.00..11746.65
rows=721902 width=0) (actual time=329.249..329.249 rows=727305 loops=1)
              Index Cond: (o_orderdate < '1995-03-15'::date)
Total runtime: 2364.697 ms

I/O cost:
Heap Blocks Read:  29278
Index Blocks Read:  1591
__________________________________

Aggregate  (cost=49832.76..49832.77 rows=1 width=0) (actual
time=2215.752..2215.753 rows=1 loops=1)
  ->  Seq Scan on orders  (cost=0.00..48028.00 rows=721902 width=0) (actual
time=0.042..1458.734 rows=727305 loops=1)
        Filter: (o_orderdate < '1995-03-15'::date)
Total runtime: 2215.801 ms

I/O cost:
Heap Blocks Read:  29278

pgsql-general by date:

Previous
From: tabai
Date:
Subject: Limit number connections by IP
Next
From: Stephen Frost
Date:
Subject: Re: returns setof rec... simple exampe doesn't work