Thread: Optimizing timestamp queries? Inefficient Overlaps?
I have a table similar to this:
CREATE TABLE event_resources (
event_resource_id serial NOT NULL,
event_id integer NOT NULL,
resource_id integer NOT NULL,
start_date timestamptz NOT NULL,
end_date timestamptz NOT NULL,
CONSTRAINT event_resources_pkey PRIMARY KEY (event_resource_id)
);
Where the same resource can be added to an event multiple times.
Since the table spans a few years, any day queried should
return at most 0.1% of the table, and seems perfect for indexes. So I add these:
CREATE INDEX er_idx1 ON event_resources (start_date);
CREATE INDEX er_idx2 ON event_resources (end_date);
One query I need to perform is "All event resources that start or end
on a particular day". The first thing that comes to mind is this:
select *
from event_resources er
where er.start_date::date = $1::date or er.end_date::date = $1::date
This is very slow… Pg chooses a sequential scan. (I am running vacuum
and analyze) Shouldn't Pg be able to use an index here?
I've tried creating function indexes using cast, but Pg returns this error message:
ERROR: functions in index expression must be marked IMMUTABLE
Which I assume is related to timezones and daylight saving issues in converting
a timestamptz into a plain date.
This form strangely won't use an index either:
select *
from event_resources er
where (er.start_date, er.end_date) overlaps ($1::date, $1::date+1)
This is the only query form I've found that will use an index:
select *
from event_resources er
where (er.start_date >= $1::date and er.start_date < ($1::date+1))
or (er.end_date >= $1::date and er.end_date < ($1::date+1))
I know it's not exactly the same as the overlaps method, but since this works
I would expect OVERLAPS to work as well. I prefer overlaps because it's clean
and simple, self documenting.
Another (similar) query I need to perform is "All event resources that overlap a given
time range". Seems tailor-made for OVERLAPS:
select *
from event_resources er
where (er.start_date, er.end_date) overlaps ($1::timestamptz, $2::timestamptz)
Again… can't get this to use an index. I have to use this again:
select *
from event_resources er
where (er.start_date >= $1::timestamptz and er.start_date < $2::timestamptz)
or (er.end_date >= $1::timestamptz and er.end_date < $2::timestamptz)
What am I doing wrong? This is Pg 8.1.2 on RHEL 4.
"Adam Rich" <adam.r@sbcglobal.net> writes: > I have a table similar to this: > CREATE TABLE event_resources ( > event_resource_id serial NOT NULL, > event_id integer NOT NULL, > resource_id integer NOT NULL, > start_date timestamptz NOT NULL, > end_date timestamptz NOT NULL, > CONSTRAINT event_resources_pkey PRIMARY KEY (event_resource_id) > ); > CREATE INDEX er_idx1 ON event_resources (start_date); > CREATE INDEX er_idx2 ON event_resources (end_date); > select * > from event_resources er > where er.start_date::date = $1::date or er.end_date::date = $1::date > This is very slow. Pg chooses a sequential scan. (I am running vacuum > and analyze) Shouldn't Pg be able to use an index here? No, unless you were to create the indexes on start_date::date and end_date::date ... > I've tried creating function indexes using cast, but Pg returns this > error message: > ERROR: functions in index expression must be marked IMMUTABLE ... which you can't do because the cast from timestamptz to date is dependent on the current timezone setting. If the start and end are really intended to be accurate only to the day, as the column names seem to suggest, why didn't you declare them as date to start with? > select * > from event_resources er > where (er.start_date >= $1::date and er.start_date < ($1::date+1)) > or (er.end_date >= $1::date and er.end_date < ($1::date+1)) > I know it's not exactly the same as the overlaps method, but since this > works I would expect OVERLAPS to work as well. Sorry, but no -- read the SQL spec for OVERLAPS sometime. It's not even close to being the same, and with all the weird special cases for nulls, it's just about unoptimizable :-( regards, tom lane