Index scan vs. Seq scan on timestamps - Mailing list pgsql-general

From Per Jensen
Subject Index scan vs. Seq scan on timestamps
Date
Msg-id 41B4C699.20607@net-es.dk
Whole thread Raw
Responses Re: Index scan vs. Seq scan on timestamps  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
List pgsql-general
List,

PG version is 7.4.2


I log apache hits to a postgres server. The table layout is as follows:

apachelog=# \d accesslog
             Table "public.accesslog"
   Column  |           Type           | Modifiers
----------+--------------------------+-----------
  id       | integer                  | not null
  ip       | character varying(15)    |
  ident    | character varying(200)   |
  auth     | character varying(200)   |
  time     | timestamp with time zone |
  request  | character varying(200)   |
  code     | integer                  |
  bytes    | integer                  |
  referrer | character varying(200)   |
  agent    | character varying(200)   |
Indexes:
     "accesslog_pkey" primary key, btree (id)
     "accesslog_time_idx" btree ("time")

Number of rows: approx: 530.000

If I want to count hits 30 days back this query uses an efficient
index scan:

select count(*)
from accesslog
where time  between (current_timestamp - INTERVAL '30 d') and
current_timestamp;

Aggregate  (cost=8294.40..8294.40 rows=1 width=0)
    ->  Index Scan using accesslog_time_idx on accesslog
(cost=0.00..8287.97 rows=2569 width=0)
          Index Cond: (("time" >= (('now'::text)::timestamp(6) with time
zone - '30 days'::interval)) AND ("time" <= ('now'::text)::timestamp(6)
with time zone))
(3 rows)

while this query uses a less efficient seq scan:

select count(*)
from accesslog
where time  between (timeofday()::timestamp - INTERVAL '30 d') and
timeofday()::timestamp;

Aggregate  (cost=34966.56..34966.56 rows=1 width=0)
    ->  Seq Scan on accesslog  (cost=0.00..34823.86 rows=57077 width=0)
          Filter: (("time" >= (((timeofday())::timestamp without time
zone - '30 days'::interval))::timestamp with time zone) AND ("time" <=
((timeofday())::timestamp without time zone)::timestamp with time zone))
(3 rows)

Why does PG not use the index on the time column in the second select,
timeofday() has been cast to a timestamp after all.

Any insight much appreciated.

Regards
Per
--
----------------------------------------------------------------------
Per Jensen        http://www.net-es.dk/~pj         Linux rules!
----------------------------------------------------------------------

pgsql-general by date:

Previous
From: Andrew M
Date:
Subject: Re: SSL confirmation - (could not accept SSL connection:
Next
From: Greg Stark
Date:
Subject: Re: When to encrypt