Thread: Need help writing exclusion constraint
I have a table like this: create table event( destination_id integer not null references destination (destination_id), starts timestamp, ends timestamp ); I want to make sure that no two rows **with the same destination_id** overlap in time. I'm not sure how to write this exclusion constraint. I know how to make the constraint to prevent any two rows from overlapping, but I want to allow rows to overlap as long as they don't have the same destination_id. Thanks in advance. Matt
Matthew Wilson writes: > I have a table like this: > > create table event( > > destination_id integer not null references destination > (destination_id), > > starts timestamp, > ends timestamp > ); > > I want to make sure that no two rows **with the same destination_id** > overlap in time. > > I'm not sure how to write this exclusion constraint. I know how to make > the constraint to prevent any two rows from overlapping, but I want to > allow rows to overlap as long as they don't have the same > destination_id. Constraint expressions can only be simple boolean expressions, so can refer only to the column(s) of the current row you're inserting/updating, so to refer to other records (which you'll need to do to compare destination_ids) you need to create a function...something along the lines of this: CREATE OR REPLACE FUNCTION overlap_at_dest(dest integer, s timestamp, e timestamp) returns boolean as $_$ DECLARE c bigint; BEGIN select count(*) into c from event where (destination_id = dest) and ((starts, ends) overlaps (s,e)); return c = 0; END; $_$ LANGUAGE plpgsql; Then alter your table: ALTER TABLE event ADD CONSTRAINT event_overlap CHECK(overlap_at_dest(destination_id, starts, ends)); Cheers, Dan Popowich
Dne 15.1.2011 21:07, Daniel Popowich napsal(a): > CREATE OR REPLACE FUNCTION overlap_at_dest(dest integer, > s timestamp, > e timestamp) > returns boolean as $_$ > > DECLARE > c bigint; > BEGIN > select count(*) into c from event > where (destination_id = dest) > and ((starts, ends) overlaps (s,e)); > return c = 0; > END; > > $_$ LANGUAGE plpgsql; > > > > Then alter your table: > > ALTER TABLE event ADD CONSTRAINT event_overlap > CHECK(overlap_at_dest(destination_id, starts, ends)); There's a race condition - if there are two concurrent sessions, both inserting rows for the same destination_id, this trigger won't work I guess as the session does not see the rows inserted by the other one (this is due to the READ COMMITED isolation level). One way to fix this is locking - in this case you have to make sure that two sessions modifying the same destination_id will synchronize properly. The easiest way to od that is to lock the same row in some table - e.g. if you have a "destinations" table lock the row with the same destination_id. So the function should look something like this CREATE OR REPLACE FUNCTION overlap_at_dest(dest integer, s timestamp, e timestamp) returns boolean as $_$ DECLARE c bigint; BEGIN PERFORM * FROM destinations WHERE destination_id = dest FOR UPDATE; select count(*) into c from event where (destination_id = dest) and ((starts, ends) overlaps (s,e)); return c = 0; END; $_$ LANGUAGE plpgsql; Or something like that. If there's no suitable table, you can use advisory locks - just replace the PERFORM with pg_advisory_lock(dest); regards Tomas
Matthew Wilson <matt@tplus1.com> wrote: > I have a table like this: > > create table event( > > destination_id integer not null references destination > (destination_id), > > starts timestamp, > ends timestamp > ); > > I want to make sure that no two rows **with the same destination_id** > overlap in time. > > I'm not sure how to write this exclusion constraint. I know how to make > the constraint to prevent any two rows from overlapping, but I want to > allow rows to overlap as long as they don't have the same > destination_id. If I where you, i would use the temporal data typ ;-) It's a new extension for PostgreSQL, see here: http://temporal.projects.postgresql.org/reference.html, http://thoughts.j-davis.com/2010/09/25/exclusion-constraints-are-generalized-sql-unique/ http://www.slideshare.net/pgconf/not-just-unique-exclusion-constraints Regards, Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Sat, 2011-01-15 at 19:17 +0000, Matthew Wilson wrote: > create table event( > > destination_id integer not null references destination > (destination_id), > > starts timestamp, > ends timestamp > ); > > I want to make sure that no two rows **with the same destination_id** > overlap in time. First, you need to have some notion of "overlaps", so you need to combine the "starts" and "ends" into a single value. I recommend trying the PERIOD datatype (as Andreas suggests). They don't have to be in the same column necessarily (you could use a functional index that combines the values), but typically it would be helpful anyway. If you use the PERIOD datatype, the "overlaps" operator is "&&". So, assuming that the combined start/end is called "during", the exclusion constraint might look something like: EXCLUDE USING gist (destination_id WITH =, during WITH &&) You'll need to install the contrib module "btree_gist" first, so that "=" is indexable over integers using GiST. What's the above constraint says is: "rows R1 and R2 conflict if R1.destination_id = R2.destination_id AND R1.during && R2.during", and it will prevent R1 and R2 from both existing at the same time in your table. This method will be safe from race conditions. Hope this helps. Also, for more detailed examples that happen to be very similar to your problem, see: http://thoughts.j-davis.com/2009/11/08/temporal-keys-part-2/ http://thoughts.j-davis.com/2010/09/25/exclusion-constraints-are-generalized-sql-unique/ Regards, Jeff Davis
On Sat, 2011-01-15 at 15:07 -0500, Daniel Popowich wrote: > Constraint expressions can only be simple boolean expressions, so can > refer only to the column(s) of the current row you're > inserting/updating, Exclusion Constraints are a new feature in 9.0: http://www.postgresql.org/docs/9.0/static/ddl-constraints.html#DDL-CONSTRAINTS-EXCLUSION http://www.postgresql.org/docs/9.0/static/sql-createtable.html#SQL-CREATETABLE-EXCLUDE They allow you to constrain across rows, much like UNIQUE (in fact, the constraints that can be expressed by an exclusion constraint are a superset of the constraints that can be expressed by UNIQUE). > so to refer to other records (which you'll need to > do to compare destination_ids) you need to create a > function...something along the lines of this: > ... > ALTER TABLE event ADD CONSTRAINT event_overlap > CHECK(overlap_at_dest(destination_id, starts, ends)); As Tomas said, that's an unsafe thing to do. I do not recommend using a table-reading function in a check constraint. Regards, Jeff Davis
On Sat, 2011-01-15 at 21:32 +0100, Tomas Vondra wrote: > > ALTER TABLE event ADD CONSTRAINT event_overlap > > CHECK(overlap_at_dest(destination_id, starts, ends)); > > There's a race condition ... > One way to fix this is locking I do not recommend locking. In fact, the primary reason that exclusion constraints exist is to prevent unnecessary locking for problems exactly like this. I included some links in my other reply that demonstrate how to avoid that excessive locking while still being safe from race conditions. Regards, Jeff Davis
Jeff Davis writes: > On Sat, 2011-01-15 at 21:32 +0100, Tomas Vondra wrote: > > > ALTER TABLE event ADD CONSTRAINT event_overlap > > > CHECK(overlap_at_dest(destination_id, starts, ends)); > > > > There's a race condition > > ... > > > One way to fix this is locking > > I do not recommend locking. In fact, the primary reason that exclusion > constraints exist is to prevent unnecessary locking for problems exactly > like this. > > I included some links in my other reply that demonstrate how to avoid > that excessive locking while still being safe from race conditions. I totally understand the issues of race conditions. My original reply didn't address the issue...should have. Of course race conditions are only an issue for concurrent sessions...that depends on the total application architecture. Anyway...Jeff, all your answers depend on using new features in 9.0. What would you recommend for folk still using 8.4? Without 9.0 exclusion constraints, what else can you do besides using functions in check constraints (or triggers) with appropriate locking (at some level of the overall application architecture). Dan
On Wed, 2011-01-19 at 10:15 -0500, Daniel Popowich wrote: > Anyway...Jeff, all your answers depend on using new features in 9.0. > What would you recommend for folk still using 8.4? Without 9.0 > exclusion constraints, what else can you do besides using functions in > check constraints (or triggers) with appropriate locking (at some > level of the overall application architecture). There are several approaches, but all of them leave something to be desired, of course. I break the alternative solutions into 4 categories: 1. Full table lock -- instead of using a CHECK constraint, use a trigger and acquire a full table lock. The obvious problem here is the contention over that lock, so transactions will need to be kept short. Performance with this approach will not be very good, but perhaps that's OK in some situations. 2. What I call "quantization". That is, choose a size, say one hour, and assume that "1:00" really means "1:00 - 2:00". Then you can use a UNIQUE index. You have to align everything on the hour exactly (you can't do 1:30-2:30, for instance), and longer reservations require multiple entries. Choosing an appropriate chunk size is difficult, because if it's too big then it makes the application overly strict (and imposes inconveniences on your organization); but if you choose a size that is too small, it requires many entries for a single reservation (if you choose one minute, then a one-hour reservation requires 60 rows). These drawbacks are acceptable for some organizations, but not all. 3. Complex procedural code can be used. For instance, you might have a separate table (call it "my_locks") of rows that exist just for row-level locks. One row would represent 1:00 - 2:00, another 2:00 - 3:00, etc. And when you go to insert into the main table (call it "reservations"), you take a row-level lock on every row in my_locks that overlaps with the time you are inserting. So, if you are inserting 10:30 - 11:30 into the reservations table, you would take a lock on the rows 10:00 - 11:00 and 11:00 - 12:00 in your my_locks table. This effectively partitions your lock space, so that a reservation for 1:30 - 2:30 won't have to wait for a reservation between 10:30 and 11:30. There are other ideas along these lines as well, this is just an example of how adding complexity can help. Be careful though, the complexity explodes pretty quickly, and there are a lot of hidden performance traps. 4. You can work outside of the transactional system. Record both reservations, and check later for conflicts. The problem here is what to do when you find one. If you want to undo one reservation, and it was part of a larger transaction, you have to figure out how to undo the whole transaction. And you need to keep a log of which transactions need to be checked, so that if there is a crash you don't lose track and leave the conflicting reservations in there. As you can see, none of these are ideal. But, if you run into a specific problem, you can usually pick one of these approaches and make it work with careful determination. Exclusion constraints are much easier, however ;) Regards, Jeff Davis