Re: Timestamp indexes aren't used for ">=" - Mailing list pgsql-admin

From Stephan Szabo
Subject Re: Timestamp indexes aren't used for ">="
Date
Msg-id 20011109145054.H59285-100000@megazone23.bigpanda.com
Whole thread Raw
In response to Timestamp indexes aren't used for ">="  ("Jeff Boes" <jboes@nexcerpt.com>)
List pgsql-admin
> (selecting by equality with timestamp value)
>
> explain select fetch_status, count(*)
> from stat_fetch where fetch_date = '2001-11-08 00:00:00-05'
> group by fetch_status;
>
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=8.05..8.06 rows=1 width=12)
>    ->  Group  (cost=8.05..8.05 rows=1 width=12)
>          ->  Sort  (cost=8.05..8.05 rows=1 width=12)
>                ->  Index Scan using ix_stat_fetch_3 on stat_fetch
> (cost=0.00..8.04 rows=1 width=12)
>
> EXPLAIN
>
> .....BUT.....
>
> (selecting by comparison ">=" to timestamp value)
>
> explain select fetch_status, count(*)
> from stat_fetch where fetch_date >= '2001-11-08 00:00:00-05'
> group by fetch_status;
>
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=12322.64..12522.06 rows=3988 width=12)
>    ->  Group  (cost=12322.64..12422.35 rows=39884 width=12)
>          ->  Sort  (cost=12322.64..12322.64 rows=39884 width=12)
>                ->  Seq Scan on stat_fetch  (cost=0.00..8917.33
> rows=39884 width=12)
>
> EXPLAIN
>
> .....AND YET.....
>
> set enable_seqscan to off;
> explain select fetch_status, count(*)
> from stat_fetch where fetch_date >= '2001-11-08 00:00:00-05'
> group by fetch_status;
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=38193.97..38393.39 rows=3988 width=12)
>    ->  Group  (cost=38193.97..38293.68 rows=39884 width=12)
>          ->  Sort  (cost=38193.97..38193.97 rows=39884 width=12)
>                ->  Index Scan using ix_stat_fetch_3 on stat_fetch
> (cost=0.00..34788.66 rows=39884 width=12)
>
> EXPLAIN
>
> Note the cost of the "Index" scan is actually a higher estimate
> than the sequential scan.

How many rows are in the table?  Have you run vacuum analyze?

It's estimating that about 40000 of them will match the condition, is
this a reasonable estimate?  If so, you're reading about 1/10
of the rows (assuming a small number of dead rows). Because the
tuple validity information is stored in the heap file, you need
to load the heap pages for those rows that match the index condition.

I think it tries to estimate the cost of:
 reading the index + reading the heap file for the matching rows
  (including the seeking necessary to move around to the correct
   page)
vs the cost of:
 reading the heap file sequentially



pgsql-admin by date:

Previous
From: "Jeff Boes"
Date:
Subject: Suppress certain messages from postmaster log?
Next
From: Tom Lane
Date:
Subject: Re: Timestamp indexes aren't used for ">="