Re: Why does a simple query not use an obvious index? - Mailing list pgsql-performance

From Guy Thornley
Subject Re: Why does a simple query not use an obvious index?
Date
Msg-id 20040830081959.GC3714@conker.esphion.com
Whole thread Raw
In response to Re: Why does a simple query not use an obvious index?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why does a simple query not use an obvious index?
Re: Why does a simple query not use an obvious index?
List pgsql-performance
On Sun, Aug 29, 2004 at 06:03:43PM -0400, Tom Lane wrote:
> >> select somefield from sometable where timestampfield > now()-'60
> >> seconds'::interval
>
> This is a FAQ, but since the archives don't seem to be up at the moment,
> here's the answer once again:
>
> The expression "now() - something" is not a constant, so the planner
> is faced with "timestampfield > unknownvalue".  Its default assumption
> about the number of rows that will match is much too high to make an
> indexscan look profitable (from memory, I think it guesses that about
> a third of the table will match...).


Ok; this explains some really wierd stuff I've been seeing.

However, I'm seeing breakage of the form mentioned by the original poster
even when the query uses a _constant_ timestamp: [Postgres 7.4.3]

        ntais# \d detect.stats
                                                 Table "detect.stats"
            Column    |           Type           |                          Modifiers
        --------------+--------------------------+-------------------------------------------------------------
         anomaly_id   | integer                  | not null
         at           | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
         resolution   | real                     | default 1.0
         values       | real[]                   |
         stat_type_id | integer                  | not null
        Indexes:
            "stats_pkey" primary key, btree (anomaly_id, stat_type_id, "at")
            "stats__ends_at" btree (stats__ends_at("at", resolution, "values"))
        Foreign-key constraints:
            "$1" FOREIGN KEY (anomaly_id) REFERENCES anomalies(anomaly_id) ON DELETE CASCADE
            "$2" FOREIGN KEY (stat_type_id) REFERENCES stat_types(stat_type_id)


        ntais=# SET enable_seqscan = on;
        SET
        ntais=# EXPLAIN ANALYZE
          SELECT anomaly_id, stat_type_id
          FROM detect.stats
          WHERE detect.stats__ends_at(at, resolution, values) > '2004-08-30 16:21:09+12'::timestamptz
          ORDER BY anomaly_id, stat_type_id
          ;

                                                                 QUERY PLAN
             

-----------------------------------------------------------------------------------------------------------------------------
         Sort  (cost=602473.59..608576.72 rows=2441254 width=8) (actual time=198577.407..198579.136 rows=6152 loops=1)
           Sort Key: anomaly_id, stat_type_id
           ->  Seq Scan on stats  (cost=0.00..248096.42 rows=2441254 width=8) (actual time=198299.685..198551.460
rows=6152loops=1) 
                 Filter: (stats__ends_at("at", resolution, "values") > '2004-08-30 16:21:09+12'::timestamp with time
zone)
         Total runtime: 198641.649 ms
        (5 rows)


        ntais=# EXPLAIN ANALYZE
          SELECT anomaly_id, stat_type_id
          FROM detect.stats
          WHERE detect.stats__ends_at(at, resolution, values) > '2004-08-30
          16:21:09+12'::timestamptz
          ORDER BY anomaly_id, stat_type_id
          ;

                                                                         QUERY PLAN
                            

--------------------------------------------------------------------------------------------------------------------------------------------
         Sort  (cost=10166043.26..10172146.40 rows=2441254 width=8) (actual time=44.710..46.661 rows=6934 loops=1)
           Sort Key: anomaly_id, stat_type_id
           ->  Index Scan using stats__ends_at on stats  (cost=0.00..9811666.09 rows=2441254 width=8) (actual
time=0.075..24.702rows=6934 loops=1) 
         Index Cond: (stats__ends_at("at", resolution, "values") > '2004-08-30 16:21:09+12'::timestamp with time zone)
         Total runtime: 50.354 ms
        (5 rows)


        ntais=# SELECT count(*) FROM detect.stats;
          count
        ---------
         7326151
        (1 row)

Ive done repeated ANALYZE's, both table-specific and database-wide, and get
the same result every time.

For us, a global 'enable_seqscan = off' in postgresql.conf is the way to go.
You occasionally see an odd plan while developing a query (eg: scanning an
index with no contraint to simply get ORDER BY). Usually thats a broken
query/index, and I simply fix it.


Guy Thornley

pgsql-performance by date:

Previous
From: Mr Pink
Date:
Subject: Re: Why does a simple query not use an obvious index?
Next
From: Tom Lane
Date:
Subject: Re: Why does a simple query not use an obvious index?