Re: How to create a multi-column index with 2 dates using 'gist'? - Mailing list pgsql-general

From Fred Janon
Subject Re: How to create a multi-column index with 2 dates using 'gist'?
Date
Msg-id 2fd0c7810908250439p6d6e06b2xe868dbb079622f9c@mail.gmail.com
Whole thread Raw
In response to Re: How to create a multi-column index with 2 dates using 'gist'?  (Sam Mason <sam@samason.me.uk>)
Responses Re: How to create a multi-column index with 2 dates using 'gist'?
List pgsql-general
Thanks Sam. I looked at the gist documentation and although it would be fun, I don't have the time at the moment to explore that avenue (and scratching my head!). I also think it would require a lot of work testing to validate the code and that the gist index is better than the B-tree one. So I am following your advice using a B-tree index for now.

Basically I have an events table representing events with a duration (startdate, enddate). I was wondering if it would improve the performance if I was creating a separate table (indexed as you suggested) with the date ranges (startdate, enddate) and point to that from my events table. That would eliminate the duplicate ranges, costing a join to find the events within a date range, but maybe improving the search performance for events that overlap a certain date range. Any feedback on that?

Thanks

Fred

On Tue, Aug 25, 2009 at 18:52, Sam Mason <sam@samason.me.uk> wrote:
On Mon, Aug 24, 2009 at 05:24:59PM +0800, Fred Janon wrote:
> I am using 8.3 and pgAdmin III. I have a couple of tables using 2 DATE
> columns like 'startdate' and 'enddate' (just date, not interested in time in
> these columns). I have some queries (some using OVERLAPS) involving both
> 'startdate' and 'enddate' columns. I tried to create a multi column index
> using pgAdmin and it comes back with this error:
>
> ERROR: data type date has no default operator class for access method "gist"
> HINT: You must specify an operator class for the index or define a default
> operator class for the data type.

I've not had the opportunity to try doing this, but it would seem to
require hacking some C code to get this working.  Have a look here:

 http://www.postgresql.org/docs/current/static/gist.html

> I search the pdf docs and online without finding what an "operator class"
> for DATE would be. Would a multi-column index help in that case (OVERLAPS
> and dates comparison) anyway? Or should I just define an index for each of
> the dates?

An operator class bundles together various bits of code so that the
index knows which functions to call when it needs to compare things.

If you were creating an GiST index over a pair of dates to support
an "overlaps" operator you'd have to define a set of functions that
implement the various checks needed.


Depending on your data you may be easier with just a multi-column index
and using normal comparisons, I can't see how OVERLAPS could use indexes
as it does some strange things with NULL values.  The cases a B-Tree
index would win over GiST (this is an educated guess) is when few of the
ranges overlap within a table.  If that's the case then I'd do:

 CREATE INDEX tbl_start_end_idx ON tbl (startdate,enddate);

to create the btree index (they're the default, so nothing else is
needed) and then write queries as:

 SELECT r.range, t.*
 FROM tbl t, ranges r
 WHERE t.startdate <= r.rangeend
   AND t.enddate   >= r.rangestart;

if there are lots of overlapping ranges in the table then this is going
to do badly and you may need to start thinking about writing some C code
to get a GiST index going.

--
 Sam  http://samason.me.uk/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Sam Mason
Date:
Subject: Re: How to create a multi-column index with 2 dates using 'gist'?
Next
From: Sam Mason
Date:
Subject: Re: How to create a multi-column index with 2 dates using 'gist'?