Thread: Date indexing

Date indexing

From
"Ian Cass"
Date:
Hi,

I've been trying to do date range index lookups. According to EXPLAIN, my
sql is using the appropriate index, but the result takes forever to return
(~10mins) even tho the dataset it's referencing is reasonably small (~5000
lines). Similar queries on a similar sized dataset on the same database that
are not referencing a date range, but a specific field (WHERE user_name =
'U12345' for example) take a matter of seconds to return.

The index is on a timestamp field, and the only way I could find to get it
to use this index was something like the following...

AND (messages.client_id = '89' AND messages.originator like '%' AND
messages.logtime >= timestamp '20/04/2002 00:00:00' AND messages.logtime <=
timestamp '20/04/2002 23:59:59')

Index is on logtime, client_id, originator
logtime = timezone
client id = int2
originator = text

I'll only ever be referencing this data to a granularity of 1 day, so is
there a way I can use a function to index this so that the date column in
the index is text containing just DD/MM/YYYY? I did try to index something
like..

create index messages_200204_ix3 on messages_200204
(to_char(logtime,'DD/MM/YYYY'), client_id, originator)

...but it didn't work. Gives me parse errors on '.

Are date range index lookups slow on Postgres?

Can forward more details if required.

--
Ian Cass



Re: Date indexing

From
Tom Lane
Date:
"Ian Cass" <ian.cass@mblox.com> writes:
> I'll only ever be referencing this data to a granularity of 1 day, so is
> there a way I can use a function to index this so that the date column in
> the index is text containing just DD/MM/YYYY?

Don't use text, use a date.

regression=# create table foo (f1 timestamp without time zone);
CREATE
regression=# create index fooi on foo(date(f1));
CREATE
regression=# explain select * from foo where date(f1) = current_date;                          QUERY PLAN
-----------------------------------------------------------------Index Scan using fooi on foo  (cost=0.00..17.09 rows=5
width=8) Index Cond: (date(f1) = date('now'::text))
 
(2 rows)

(This is with current devel sources, as you can possibly tell from the
EXPLAIN format, but should work fine in 7.2; less sure about 7.1.)

Note that a coercion from timestamp *with* timezone to date will
not be allowed as an index function, because it's dependent on
external information --- viz, your timezone.  So I had to use
timestamp without time zone in this example.

One might also wonder why you're not just storing the column as type
date rather than timestamp in the first place, if you're certain you
do not need finer resolution.
        regards, tom lane


Re: Date indexing

From
"Ian Cass"
Date:
> Don't use text, use a date.

Thanks, I'll try that.

> One might also wonder why you're not just storing the column as type
> date rather than timestamp in the first place, if you're certain you
> do not need finer resolution.

Because in other instances, I do need the extra resolution. This is a high
level overview report I'm trying to write.

--
Ian Cass



Re: Date indexing

From
"Ian Cass"
Date:
> This won't be able to make much use of the compound index, because you
> are not searching for exact match on the first column, and the other
> columns are not in your criteria at all ... Thus your index is only as
> good as being just on the timestamp alone.

This makes sense. I'll change the order of my indexing.

> 10 minutes still sounds pretty excessive though, even for a seq. scan
> on a 5000 rows table. Have you vacuum'ed it lately?
> Try running 'vacuum full' on that table, it might help a lot, if you
> were ding lots of updates to it.


No, the actual table contains more like 10mill lines, however, the dataset
I'd be referencing if the index were working as I intended would be a mere
5k rows.

Thanks for your help. I'll try this later today.

--
Ian Cass