Thread: Date indexing
Hi,

I've been trying to do date range index lookups. According to EXPLAIN, my
SQL is using the appropriate index, but the result takes forever to return
(~10 mins) even though the dataset it's referencing is reasonably small
(~5000 rows). Similar queries on a similarly sized dataset on the same
database that reference a specific field rather than a date range
(WHERE user_name = 'U12345', for example) take a matter of seconds to return.

The index is on a timestamp field, and the only way I could find to get the
query to use this index was something like the following...

    AND (messages.client_id = '89'
         AND messages.originator like '%'
         AND messages.logtime >= timestamp '20/04/2002 00:00:00'
         AND messages.logtime <= timestamp '20/04/2002 23:59:59')

The index is on (logtime, client_id, originator), where:

    logtime    = timestamptz
    client_id  = int2
    originator = text

I'll only ever be referencing this data to a granularity of 1 day, so is
there a way I can use a function to index this so that the date column in
the index is text containing just DD/MM/YYYY? I did try to create an index
something like...

    create index messages_200204_ix3 on messages_200204
        (to_char(logtime, 'DD/MM/YYYY'), client_id, originator);

...but it didn't work. It gives me parse errors on the '.

Are date range index lookups slow on Postgres? I can forward more details
if required.

--
Ian Cass
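A minimal sketch of the same one-day lookup written as a half-open range
(table and column names are taken from the post above; the count(*) select
list and the ISO date format are assumptions, not from the thread). A range
like this avoids the LIKE '%' placeholder and the DD/MM/YYYY ambiguity, and
can use an index whose leading column is client_id:

    -- Assumed query shape; only the WHERE conditions come from the post.
    select count(*)
      from messages
     where messages.client_id = '89'
       and messages.logtime >= timestamp '2002-04-20 00:00:00'
       and messages.logtime <  timestamp '2002-04-21 00:00:00';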
"Ian Cass" <ian.cass@mblox.com> writes: > I'll only ever be referencing this data to a granularity of 1 day, so is > there a way I can use a function to index this so that the date column in > the index is text containing just DD/MM/YYYY? Don't use text, use a date. regression=# create table foo (f1 timestamp without time zone); CREATE regression=# create index fooi on foo(date(f1)); CREATE regression=# explain select * from foo where date(f1) = current_date; QUERY PLAN -----------------------------------------------------------------Index Scan using fooi on foo (cost=0.00..17.09 rows=5 width=8) Index Cond: (date(f1) = date('now'::text)) (2 rows) (This is with current devel sources, as you can possibly tell from the EXPLAIN format, but should work fine in 7.2; less sure about 7.1.) Note that a coercion from timestamp *with* timezone to date will not be allowed as an index function, because it's dependent on external information --- viz, your timezone. So I had to use timestamp without time zone in this example. One might also wonder why you're not just storing the column as type date rather than timestamp in the first place, if you're certain you do not need finer resolution. regards, tom lane
> Don't use text, use a date.

Thanks, I'll try that.

> One might also wonder why you're not just storing the column as type
> date rather than timestamp in the first place, if you're certain you do
> not need finer resolution.

Because in other instances, I do need the extra resolution. This is a
high-level overview report I'm trying to write.

--
Ian Cass
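A small sketch of how one timestamp column can serve both needs once a
date() expression index exists (query shapes and names are assumptions,
not from the thread): the overview report works at day granularity through
the indexed expression, while detail reports still use the raw timestamps.

    -- Daily overview at day granularity.
    select date(logtime) as day, count(*)
      from messages
     group by date(logtime);

    -- Detail report: full timestamp resolution on the same column.
    select *
      from messages
     where logtime >= timestamp '2002-04-20 09:00:00'
       and logtime <  timestamp '2002-04-20 09:05:00';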
> This won't be able to make much use of the compound index, because you
> are not searching for exact match on the first column, and the other
> columns are not in your criteria at all ... Thus your index is only as
> good as being just on the timestamp alone.

This makes sense. I'll change the order of my indexing.

> 10 minutes still sounds pretty excessive though, even for a seq. scan
> on a 5000 rows table. Have you vacuum'ed it lately?
> Try running 'vacuum full' on that table, it might help a lot, if you
> were doing lots of updates to it.

No, the actual table contains more like 10 million rows; however, the
dataset I'd be referencing if the index were working as I intended would
be a mere 5k rows.

Thanks for your help. I'll try this later today.

--
Ian Cass
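A minimal sketch of the two suggestions in this reply (index and table
names are assumptions): an index that leads with the equality column and
puts the range column second, followed by a vacuum/analyze of the table.

    -- Equality column first, range column second.
    create index messages_client_logtime_ix on messages (client_id, logtime);

    -- Reclaim dead space and refresh planner statistics after heavy updates.
    vacuum full analyze messages;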