Thread: Constraining overlapping date ranges

Constraining overlapping date ranges

From

"McGehee, Robert"

Date:

21 December 2010, 14:10:41

PostgreSQLers,
I'm hoping for some help creating a constraint/key on a table such that there are no overlapping ranges of dates for
anyid.  

Specifically: Using PostgreSQL 9.0.1, I'm creating a name-value pair table as such this:

CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);

For a given id, I'd like to enforce that there is only one valid value on a given date. For instance, this would be
acceptable:

id    start_date    stop_date    value
2    2010-11-01    2010-12-01    3
2    2010-12-02    2010-12-15    4
3    2010-10-15    2010-12-15    -3

But this would not: (notice start_date of line 2 is before stop_date of line 1).
id    start_date    stop_date    value
2    2010-11-01    2010-12-01    3
2    2010-11-30    2010-12-15    4
3    2010-10-15    2010-12-15    -3

I'd also appreciate it if anyone can provide any indexing hints on this table to optimize queries like:
SELECT value FROM tbl WHERE id=2 and '2010-12-02' BETWEEN start_date AND stop_date;

Thanks in advance, and sorry if I overlooked any obvious documentation!

Robert McGehee

Re: Constraining overlapping date ranges

From

Richard Broersma

Date:

21 December 2010, 14:18:17

On Tue, Dec 21, 2010 at 7:49 AM, McGehee, Robert
<Robert.McGehee@geodecapital.com> wrote:
> PostgreSQLers,
> I'm hoping for some help creating a constraint/key on a table such that there are no overlapping ranges of dates for
anyid. 

 There is something you can try, but it is not exactly what you want
since it is based on timestamps rather than dates:

http://temporal.projects.postgresql.org/


--
Regards,
Richard Broersma Jr.

Re: Constraining overlapping date ranges

From

Vincent Veyron

Date:

22 December 2010, 06:58:51

Le mardi 21 décembre 2010 à 10:49 -0500, McGehee, Robert a écrit :
> PostgreSQLers,
> I'm hoping for some help creating a constraint/key on a table such that there are no overlapping ranges of dates for
anyid.  
>
> Specifically: Using PostgreSQL 9.0.1, I'm creating a name-value pair table as such this:
>
> CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);
>
> For a given id, I'd like to enforce that there is only one valid value on a given date. For instance, this would be
acceptable:
>
> id    start_date    stop_date    value
> 2    2010-11-01    2010-12-01    3
> 2    2010-12-02    2010-12-15    4
> 3    2010-10-15    2010-12-15    -3
>
> But this would not: (notice start_date of line 2 is before stop_date of line 1).
> id    start_date    stop_date    value
> 2    2010-11-01    2010-12-01    3
> 2    2010-11-30    2010-12-15    4
> 3    2010-10-15    2010-12-15    -3
>

You could use a rule, as explained here:

http://www.postgresql.org/docs/8.4/static/rules-update.html

In your case, something like :

create table bad (like tbl);

CREATE RULE no_overlap AS ON INSERT to tbl WHERE EXISTS (SELECT 1 from
tbl t1 WHERE t1.start_date between NEW.start_date and NEW.stop_date or
t1.stop_date between NEW.start_date and NEW.stop_date AND t1.id=NEW.id)
DO INSTEAD INSERT INTO bad VALUES
(NEW.id,NEW.start_date,NEW.stop_date,NEW.value);

Then have your app check if the new record went into bad, for instance.



--
Vincent Veyron
http://marica.fr/
Progiciel de gestion des dossiers de contentieux et d'assurance pour le service juridique

Re: Constraining overlapping date ranges

From

Vincent Veyron

Date:

22 December 2010, 06:58:52

Le mardi 21 décembre 2010 à 10:49 -0500, McGehee, Robert a écrit :
> PostgreSQLers,
> I'm hoping for some help creating a constraint/key on a table such that there are no overlapping ranges of dates for
anyid.  
>
> Specifically: Using PostgreSQL 9.0.1, I'm creating a name-value pair table as such this:
>
> CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);
>
> For a given id, I'd like to enforce that there is only one valid value on a given date. For instance, this would be
acceptable:
>
> id    start_date    stop_date    value
> 2    2010-11-01    2010-12-01    3
> 2    2010-12-02    2010-12-15    4
> 3    2010-10-15    2010-12-15    -3
>
> But this would not: (notice start_date of line 2 is before stop_date of line 1).
> id    start_date    stop_date    value
> 2    2010-11-01    2010-12-01    3
> 2    2010-11-30    2010-12-15    4
> 3    2010-10-15    2010-12-15    -3
>

You could use a rule, as explained here:

http://www.postgresql.org/docs/8.4/static/rules-update.html

In your case, something like :

create table bad (like tbl);

CREATE RULE no_overlap AS ON INSERT to tbl WHERE EXISTS (SELECT 1 from
tbl t1 WHERE t1.start_date between NEW.start_date and NEW.stop_date or
t1.stop_date between NEW.start_date and NEW.stop_date AND t1.id=NEW.id)
DO INSTEAD INSERT INTO bad VALUES
(NEW.id,NEW.start_date,NEW.stop_date,NEW.value);

Then have your app check if the new record went into bad, for instance.



--
Vincent Veyron
http://marica.fr/
Progiciel de gestion des dossiers de contentieux et d'assurance pour le service juridique

Re: Constraining overlapping date ranges

From

Filip Rembiałkowski

Date:

22 December 2010, 09:28:32

2010/12/21 McGehee, Robert <Robert.McGehee@geodecapital.com>:
> PostgreSQLers,
> I'm hoping for some help creating a constraint/key on a table such that there are no overlapping ranges of dates for
anyid. 
>
> Specifically: Using PostgreSQL 9.0.1, I'm creating a name-value pair table as such this:
>
> CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);
>
> For a given id, I'd like to enforce that there is only one valid value on a given date. For instance, this would be
acceptable:
>
> id      start_date      stop_date       value
> 2       2010-11-01      2010-12-01      3
> 2       2010-12-02      2010-12-15      4
> 3       2010-10-15      2010-12-15      -3
>
> But this would not: (notice start_date of line 2 is before stop_date of line 1).
> id      start_date      stop_date       value
> 2       2010-11-01      2010-12-01      3
> 2       2010-11-30      2010-12-15      4
> 3       2010-10-15      2010-12-15      -3
>
> I'd also appreciate it if anyone can provide any indexing hints on this table to optimize queries like:
> SELECT value FROM tbl WHERE id=2 and '2010-12-02' BETWEEN start_date AND stop_date;
>
> Thanks in advance, and sorry if I overlooked any obvious documentation!
>


No one has mentioned exclusionn constraints yet... :-)

I have combined it with period contrib (see
http://pgfoundry.org/projects/temporal/) to do what you want.
Note: you don't have to use this contrib; equivalently, you can CAST
date periods to boxes, just make sure the function is injective.

-- index requires immutable function, and for some unknown reason
(anybody?) point and period constructors are not immutable...
CREATE OR REPLACE FUNCTION f_point(integer) RETURNS point LANGUAGE sql AS
'SELECT point($1,$1)' IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION f_period(date,date) RETURNS period LANGUAGE sql AS
'SELECT period($1,$2)' IMMUTABLE STRICT;

CREATE TABLE tbl (
    id integer NOT NULL,
    start_date date NOT NULL,
    end_date date,
    CONSTRAINT tbl_exclude_overlaps EXCLUDE USING gist ( f_point(id)
WITH ~=, f_period(start_date,end_date) WITH && )
);

INSERT INTO tbl SELECT 1, '2010-01-01', '2010-12-31';
INSERT 0 1
INSERT INTO tbl SELECT 1, '2011-01-01', '2011-03-31';
INSERT 0 1
INSERT INTO tbl SELECT 2, '2010-01-01', '2010-12-21';
INSERT 0 1
INSERT INTO tbl SELECT 2, '2010-12-22', '2010-12-31';
INSERT 0 1
INSERT INTO tbl SELECT 2, '2010-12-25', '2010-12-26';
ERROR:  conflicting key value violates exclusion constraint
"tbl_exclude_overlaps"
DETAIL:  Key (f_point(id), f_period(start_date, end_date))=((2,2),
[2010-12-25 00:00:00+01, 2010-12-26 00:00:00+01)) conflicts with
existing key (f_point(id), f_period(start_date, end_date))=((2,2),
[2010-12-22 00:00:00+01, 2010-12-31 00:00:00+01)).


greets,
Filip

Re: Constraining overlapping date ranges

From

Thomas Kellerer

Date:

22 December 2010, 09:42:38

Filip Rembiałkowski, 22.12.2010 14:28:
> INSERT INTO tbl SELECT 1, '2010-01-01', '2010-12-31';
> INSERT 0 1

I'm curious why you use this syntax as you have fixed values and could use the "standard" VALUES construct without
problems:

INSERT INTO tbl VALUES (1, '2010-01-01', '2010-12-31');

Regards
Thomas

Re: Constraining overlapping date ranges

From

Filip Rembiałkowski

Date:

22 December 2010, 10:04:14

2010/12/22 Thomas Kellerer <spam_eater@gmx.net>:

> I'm curious why you use this syntax as you have fixed values and could use
> the "standard" VALUES construct without problems:
>
> INSERT INTO tbl VALUES (1, '2010-01-01', '2010-12-31');

no particular reason; just two keystrokes less :-)

The SQL code is postgres-specific anyway.

Re: Constraining overlapping date ranges

From

"McGehee, Robert"

Date:

22 December 2010, 11:58:43

Filip,
The period type + exclusion constraint was exactly the avenue I was taking, and I was also perplexed that period was
notdefined as immutable. Your solution, to force period() to be immutable, seemed like a reasonable one (though it
didn'twork, see below).
 

I tried implementing this on my existing table:
ALTER TABLE tbl ADD EXCLUDE USING gist
    (f_point(id) WITH ~=, f_period(start_date, stop_date) WITH &&);

and the index correctly identified all of the overlapping periods in my table by failing with details on the improper
key.HOWEVER, after fixing all of the offending data, the index still failed to create:
 
ERROR:  failed to re-find tuple within index "tbl_f_point_f_period_excl"
HINT:  This may be because of a non-immutable index expression.

(What does this mean?) The index seems to work on an empty table (per your example), but not on my populated table, and
theHINT seems to indicate that period() is not, in fact, immutable, which makes me nervous about using this solution.
Hopefullysomeone knows what's going on here.
 

So I think my other options are to use period data columns (rather than start_date and end_date), then no coercion is
neededin the EXCLUDE clause; try to CAST the date periods to boxes (as you suggested); use a rule, per Vincent's
suggestion;or not check for overlap as it may be uncommon.
 

Also, I found Jeff Davis's summary of the exclusion constraint helpful in case anyone's interested:
http://thoughts.j-davis.com/2010/09/25/exclusion-constraints-are-generalized-sql-unique/

Thanks, Robert

PS. I don't think the f_point function is necessary. Something like:
... EXCLUDE USING gist (id WITH =, f_period(start_date, end_date) WITH &&)
seems equivalent to your suggestion:
... EXCLUDE USING gist (f_point(id) WITH ~=, f_period(start_date, end_date) WITH &&)

-----Original Message-----
From: Filip Rembiałkowski [mailto:filip.rembialkowski@gmail.com] 
Sent: Wednesday, December 22, 2010 8:28 AM
To: McGehee, Robert
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Constraining overlapping date ranges

2010/12/21 McGehee, Robert <Robert.McGehee@geodecapital.com>:
> PostgreSQLers,
> I'm hoping for some help creating a constraint/key on a table such that there are no overlapping ranges of dates for
anyid.
 
>
> Specifically: Using PostgreSQL 9.0.1, I'm creating a name-value pair table as such this:
>
> CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value REAL);
>
> For a given id, I'd like to enforce that there is only one valid value on a given date. For instance, this would be
acceptable:
>
> id      start_date      stop_date       value
> 2       2010-11-01      2010-12-01      3
> 2       2010-12-02      2010-12-15      4
> 3       2010-10-15      2010-12-15      -3
>
> But this would not: (notice start_date of line 2 is before stop_date of line 1).
> id      start_date      stop_date       value
> 2       2010-11-01      2010-12-01      3
> 2       2010-11-30      2010-12-15      4
> 3       2010-10-15      2010-12-15      -3
>
> I'd also appreciate it if anyone can provide any indexing hints on this table to optimize queries like:
> SELECT value FROM tbl WHERE id=2 and '2010-12-02' BETWEEN start_date AND stop_date;
>
> Thanks in advance, and sorry if I overlooked any obvious documentation!
>


No one has mentioned exclusionn constraints yet... :-)

I have combined it with period contrib (see
http://pgfoundry.org/projects/temporal/) to do what you want.
Note: you don't have to use this contrib; equivalently, you can CAST
date periods to boxes, just make sure the function is injective.

-- index requires immutable function, and for some unknown reason
(anybody?) point and period constructors are not immutable...
CREATE OR REPLACE FUNCTION f_point(integer) RETURNS point LANGUAGE sql AS
'SELECT point($1,$1)' IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION f_period(date,date) RETURNS period LANGUAGE sql AS
'SELECT period($1,$2)' IMMUTABLE STRICT;

CREATE TABLE tbl (
    id integer NOT NULL,
    start_date date NOT NULL,
    end_date date,
    CONSTRAINT tbl_exclude_overlaps EXCLUDE USING gist ( f_point(id)
WITH ~=, f_period(start_date,end_date) WITH && )
);

INSERT INTO tbl SELECT 1, '2010-01-01', '2010-12-31';
INSERT 0 1
INSERT INTO tbl SELECT 1, '2011-01-01', '2011-03-31';
INSERT 0 1
INSERT INTO tbl SELECT 2, '2010-01-01', '2010-12-21';
INSERT 0 1
INSERT INTO tbl SELECT 2, '2010-12-22', '2010-12-31';
INSERT 0 1
INSERT INTO tbl SELECT 2, '2010-12-25', '2010-12-26';
ERROR:  conflicting key value violates exclusion constraint
"tbl_exclude_overlaps"
DETAIL:  Key (f_point(id), f_period(start_date, end_date))=((2,2),
[2010-12-25 00:00:00+01, 2010-12-26 00:00:00+01)) conflicts with
existing key (f_point(id), f_period(start_date, end_date))=((2,2),
[2010-12-22 00:00:00+01, 2010-12-31 00:00:00+01)).


greets,
Filip

Re: Constraining overlapping date ranges

From

Filip Rembiałkowski

Date:

25 December 2010, 13:45:06

that's really interesting; looks like a bug if point constructor is
really mutable? can you prepare a test case with non-overlapping
ranges which does not satisfy exclusion constraint?

regarding your last comment about unnecessary f_point function: I
tried "id WITH =" but postgres complained about lack of proper gist
opclass

2010/12/22, McGehee, Robert <Robert.McGehee@geodecapital.com>:
> Filip,
> The period type + exclusion constraint was exactly the avenue I was taking,
> and I was also perplexed that period was not defined as immutable. Your
> solution, to force period() to be immutable, seemed like a reasonable one
> (though it didn't work, see below).
>
> I tried implementing this on my existing table:
> ALTER TABLE tbl ADD EXCLUDE USING gist
>     (f_point(id) WITH ~=, f_period(start_date, stop_date) WITH &&);
>
> and the index correctly identified all of the overlapping periods in my
> table by failing with details on the improper key. HOWEVER, after fixing all
> of the offending data, the index still failed to create:
> ERROR:  failed to re-find tuple within index "tbl_f_point_f_period_excl"
> HINT:  This may be because of a non-immutable index expression.
>
> (What does this mean?) The index seems to work on an empty table (per your
> example), but not on my populated table, and the HINT seems to indicate that
> period() is not, in fact, immutable, which makes me nervous about using this
> solution. Hopefully someone knows what's going on here.
>
> So I think my other options are to use period data columns (rather than
> start_date and end_date), then no coercion is needed in the EXCLUDE clause;
> try to CAST the date periods to boxes (as you suggested); use a rule, per
> Vincent's suggestion; or not check for overlap as it may be uncommon.
>
> Also, I found Jeff Davis's summary of the exclusion constraint helpful in
> case anyone's interested:
> http://thoughts.j-davis.com/2010/09/25/exclusion-constraints-are-generalized-sql-unique/
>
> Thanks, Robert
>
> PS. I don't think the f_point function is necessary. Something like:
> ... EXCLUDE USING gist (id WITH =, f_period(start_date, end_date) WITH &&)
> seems equivalent to your suggestion:
> ... EXCLUDE USING gist (f_point(id) WITH ~=, f_period(start_date, end_date)
> WITH &&)
>
> -----Original Message-----
> From: Filip Rembiałkowski [mailto:filip.rembialkowski@gmail.com]
> Sent: Wednesday, December 22, 2010 8:28 AM
> To: McGehee, Robert
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Constraining overlapping date ranges
>
> 2010/12/21 McGehee, Robert <Robert.McGehee@geodecapital.com>:
>> PostgreSQLers,
>> I'm hoping for some help creating a constraint/key on a table such that
>> there are no overlapping ranges of dates for any id.
>>
>> Specifically: Using PostgreSQL 9.0.1, I'm creating a name-value pair table
>> as such this:
>>
>> CREATE TABLE tbl (id INTEGER, start_date DATE, stop_date DATE, value
>> REAL);
>>
>> For a given id, I'd like to enforce that there is only one valid value on
>> a given date. For instance, this would be acceptable:
>>
>> id      start_date      stop_date       value
>> 2       2010-11-01      2010-12-01      3
>> 2       2010-12-02      2010-12-15      4
>> 3       2010-10-15      2010-12-15      -3
>>
>> But this would not: (notice start_date of line 2 is before stop_date of
>> line 1).
>> id      start_date      stop_date       value
>> 2       2010-11-01      2010-12-01      3
>> 2       2010-11-30      2010-12-15      4
>> 3       2010-10-15      2010-12-15      -3
>>
>> I'd also appreciate it if anyone can provide any indexing hints on this
>> table to optimize queries like:
>> SELECT value FROM tbl WHERE id=2 and '2010-12-02' BETWEEN start_date AND
>> stop_date;
>>
>> Thanks in advance, and sorry if I overlooked any obvious documentation!
>>
>
>
> No one has mentioned exclusionn constraints yet... :-)
>
> I have combined it with period contrib (see
> http://pgfoundry.org/projects/temporal/) to do what you want.
> Note: you don't have to use this contrib; equivalently, you can CAST
> date periods to boxes, just make sure the function is injective.
>
> -- index requires immutable function, and for some unknown reason
> (anybody?) point and period constructors are not immutable...
> CREATE OR REPLACE FUNCTION f_point(integer) RETURNS point LANGUAGE sql AS
> 'SELECT point($1,$1)' IMMUTABLE STRICT;
> CREATE OR REPLACE FUNCTION f_period(date,date) RETURNS period LANGUAGE sql
> AS
> 'SELECT period($1,$2)' IMMUTABLE STRICT;
>
> CREATE TABLE tbl (
>     id integer NOT NULL,
>     start_date date NOT NULL,
>     end_date date,
>     CONSTRAINT tbl_exclude_overlaps EXCLUDE USING gist ( f_point(id)
> WITH ~=, f_period(start_date,end_date) WITH && )
> );
>
> INSERT INTO tbl SELECT 1, '2010-01-01', '2010-12-31';
> INSERT 0 1
> INSERT INTO tbl SELECT 1, '2011-01-01', '2011-03-31';
> INSERT 0 1
> INSERT INTO tbl SELECT 2, '2010-01-01', '2010-12-21';
> INSERT 0 1
> INSERT INTO tbl SELECT 2, '2010-12-22', '2010-12-31';
> INSERT 0 1
> INSERT INTO tbl SELECT 2, '2010-12-25', '2010-12-26';
> ERROR:  conflicting key value violates exclusion constraint
> "tbl_exclude_overlaps"
> DETAIL:  Key (f_point(id), f_period(start_date, end_date))=((2,2),
> [2010-12-25 00:00:00+01, 2010-12-26 00:00:00+01)) conflicts with
> existing key (f_point(id), f_period(start_date, end_date))=((2,2),
> [2010-12-22 00:00:00+01, 2010-12-31 00:00:00+01)).
>
>
> greets,
> Filip
>