Thread: Type casting and indexes

Type casting and indexes

From
"David Olbersen"
Date:
I hope this hasn't been answered before, I've looked at the docs here:
  http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=index.html

Anyway, I've found a (bug|feature|standard?) with type casting and index usage.

I've got a table with a column that's a timestamp with time zone. This column is indexed. If I issue the "normal" query
of: 

    SELECT count(*) FROM foo WHERE bar > '2003-05-05':;timestamp

I get the following EXPLAIN ANALYZE output:

urldb=> explain select count(*) from foo where bar > '2003-05-05'::timestamp;
                             QUERY PLAN
------------------------------------------------------------------------
 Aggregate  (cost=89960.75..89960.75 rows=1 width=0) (actual time=
             56706.58..56706.58 rows=1 loops=1)
   ->  Seq Scan on urlinfo  (cost=0.00..87229.45 rows=1092521 width=0) (actual
        time=25.37..56537.86 rows=27490 loops=1)
         Filter: (ratedon > ('2003-05-05 00:00:00'::timestamp without time
                  zone)::timestamp with time zone)
 Total runtime: 56706.67 msec

So it seems that the type conversion is killing the use of the index, even though the type conversion has to happen for
thecondition to be tested. 

If I change this query slightly, by casting to timestamptz, I get the following EXPLAIN ANALYZE output:

                              QUERY PLAN
-------------------------------------------------------------------------
 Aggregate  (cost=38609.70..38609.70 rows=1 width=0) (actual time=547.58..547.58
             rows=1 loops=1)
   ->  Index Scan using urlinfo_on on urlinfo  (cost=0.00..38578.97 rows=12295
        width=0) (actual time=0.18..381.95 rows=27490 loops=1)
         Index Cond: (ratedon > '2003-05-05 00:00:00-07'::timestamp with time
                      zone)
 Total runtime: 548.17 msec

That's much better! Is this the way it's supposed to work?

--------------------------
David Olbersen
iGuard Engineer
11415 West Bernardo Court
San Diego, CA 92127
1-858-676-2277 x2152


Re: Type casting and indexes

From
Stephan Szabo
Date:
On Thu, 8 May 2003, David Olbersen wrote:

> Anyway, I've found a (bug|feature|standard?) with type casting and index usage.
>
> I've got a table with a column that's a timestamp with time zone. This
> column is indexed. If I issue the "normal" query of:
>
>     SELECT count(*) FROM foo WHERE bar > '2003-05-05':;timestamp
>
> I get the following EXPLAIN ANALYZE output:
>
> urldb=> explain select count(*) from foo where bar > '2003-05-05'::timestamp;
>                              QUERY PLAN
> ------------------------------------------------------------------------
>  Aggregate  (cost=89960.75..89960.75 rows=1 width=0) (actual time=
>              56706.58..56706.58 rows=1 loops=1)
>    ->  Seq Scan on urlinfo  (cost=0.00..87229.45 rows=1092521 width=0) (actual
>         time=25.37..56537.86 rows=27490 loops=1)
>          Filter: (ratedon > ('2003-05-05 00:00:00'::timestamp without time
>                   zone)::timestamp with time zone)
>  Total runtime: 56706.67 msec
>
> So it seems that the type conversion is killing the use of the index,
> even though the type conversion has to happen for the condition to be
> tested.

IIRC, timestamp->timestamptz is not considered to give a constant value
(ie, is not stable) probably since it depends on timezone settings which
could be changed (for example by a function) during the query, so for each
row the conversion from '2003-05-05 00:00:00'::timestamp without time zone
to a timestamp with time zone can potentially give a different answer.


Re: Type casting and indexes

From
Tom Lane
Date:
"David Olbersen" <DOlbersen@stbernard.com> writes:
> So it seems that the type conversion is killing the use of the index, even though the type conversion has to happen
forthe condition to be tested. 

Seems like I just answered this yesterday ;-)

Note the difference in the number of estimated rows in the two explains.
The reason is that the timestamptz conversion is not a constant and so
the planner can't get a good estimate of the number of rows that will
satisfy it.  (And the reason it's not a constant is that it depends on
SET TIMEZONE.)

Bottom line: declare the constant correctly.  Or at least don't
gratuitously cast it to the wrong thing.

            regards, tom lane


Re: Type casting and indexes

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> On Thu, 8 May 2003, David Olbersen wrote:
>> So it seems that the type conversion is killing the use of the index,
>> even though the type conversion has to happen for the condition to be
>> tested.

> IIRC, timestamp->timestamptz is not considered to give a constant value
> (ie, is not stable)

No: it is stable, but not immutable, because it depends on SET TIMEZONE.
(Our policy on those is if you change one mid-query, it's unspecified
whether the query will notice or not.)  So the query is potentially
indexable.

The problem here is that instead of seeing a constant, the planner sees
a nonconstant function invocation on the right side of '>', and so it
has to fall back to a default selectivity estimate instead of being able
to extract a reasonable estimate from pg_statistic.  The default
estimate is high enough to discourage an indexscan ...

            regards, tom lane