Thread: Deferrable Unique Constraints
I noticed that implementing deferrable unique constraints is on the
TODO list.  I don't think it's been said yet, but currently you can
implement a deferrable unique constraint by using a deferrable
constraint trigger together with a procedural language like plpgsql.
If you need an index on a column, you can use a regular index instead
of a unique index.  Yes, I noticed that getting rid of constraint
triggers is also on the TODO list.  Below is an example.

George Essig

------------------------------------------

create table t (x integer, y integer);

create index t_x_in on t (x);

-- Create a trigger function to test for duplicate values of x.
-- Table t column x unique insert update trigger function.
create or replace function t_x_un_ins_up_tr() RETURNS "trigger" AS '
declare
    invalid integer;
begin
    -- Not absolutely necessary, but avoids a query if the new and old
    -- values of x are the same.
    if TG_OP = ''UPDATE'' then
        if new.x = old.x then
            return new;
        end if;
    end if;

    -- If 2 or more rows have the same value of x, set invalid to 1.
    select 1 into invalid from t where x = new.x offset 1 limit 1;

    -- If found, raise exception.
    if FOUND then
        raise EXCEPTION ''Violation of unique constraint on column x in table t by new row: x %, y %'', new.x, new.y;
    end if;

    return new;
end;
' LANGUAGE plpgsql;

-- Create a deferrable constraint trigger that executes the trigger function.
-- This runs at transaction commit time for every row that was inserted or updated.
create constraint trigger t_x_un_ins_up_tr
    after insert or update on t
    deferrable initially deferred
    for each row
    execute procedure t_x_un_ins_up_tr();

-- Begin a transaction.
-- Insert duplicate values of x successfully.
-- Violation of constraint when transaction is committed.
test=# begin;
BEGIN
test=# insert into t (x, y) values (1,1);
INSERT 30332079 1
test=# insert into t (x, y) values (1,2);
INSERT 30332080 1
test=# commit;
ERROR:  Violation of unique constraint on column x in table t by new row: x 1, y 1
test=# select * from t;
 x | y
---+---
(0 rows)

-- Begin a transaction.
-- Insert duplicate values of x successfully.
-- Update one of the duplicate values to another value.
-- Commit transaction successfully.

test=# begin;
BEGIN
test=# insert into t (x, y) values (1,1);
INSERT 30332083 1
test=# insert into t (x, y) values (1,2);
INSERT 30332084 1
test=# update t set x = 2 where y = 2;
UPDATE 1
test=# commit;
COMMIT
test=# select * from t;
 x | y
---+---
 1 | 1
 2 | 2
(2 rows)
George Essig <george_essig@yahoo.com> writes:

> I noticed that implementing deferrable unique constraints is on the
> TODO list. I don't think it's been said yet, but currently you can
> implement a deferrable unique constraint by using a deferrable
> constraint trigger together with a procedural language like plpgsql.

You have a race condition.  Two transactions can insert conflicting
records, and if they commit at about the same time, neither will see
the other's uncommitted record.

Off the top of my head it seems the way to go about doing this would
be to simply not insert the records in the index until commit time.
This doesn't actually sound so hard; is there any problem with this
approach?

You could almost implement this with a deferred trigger, a boolean
column, and a partial unique index.  However, I don't think deferred
constraint triggers can modify the record data.

The unfortunate bit here is that even if this worked, the trigger
setting the boolean flag which puts the record into the index would
create a new copy of the record.  Since it's modifying a record
inserted by the same transaction, it could in theory just modify it in
place; I don't think any attempt is made to do that, though.  In any
case a real native implementation wouldn't really need the flag, so
this problem wouldn't come up.

--
greg
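[The race Greg describes can be illustrated with a toy model -- plain
Python, nothing here is PostgreSQL code.  Each transaction's deferred
trigger check runs under that transaction's own snapshot, so it sees
committed rows plus its own writes but not the other session's
uncommitted row, and both concurrent inserters pass the check.]

```python
# Toy MVCC model of the race: a deferred check sees committed rows
# plus the checking transaction's own uncommitted inserts.

committed = []  # rows visible to every snapshot

class Session:
    def __init__(self):
        self.own = []  # this transaction's uncommitted inserts

    def insert(self, x):
        self.own.append(x)

    def deferred_check(self, x):
        # the trigger's SELECT: rows with this key that this
        # transaction can see
        return (committed + self.own).count(x) <= 1

a, b = Session(), Session()
a.insert(1)
b.insert(1)

# At commit time both checks run before either commit is visible:
ok_a = a.deferred_check(1)
ok_b = b.deferred_check(1)
assert ok_a and ok_b            # both pass -- neither sees the other

committed.extend(a.own)
committed.extend(b.own)
assert committed.count(1) == 2  # the "unique" column now holds a dup
```

[A real unique index avoids this because the second inserter finds the
first's index entry and blocks on it; the trigger's SELECT never sees
the uncommitted row at all.]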
Greg Stark <gsstark@mit.edu> writes:

> Off the top of my head it seems the way to go about doing this would
> be to simply not insert the records in the index until commit time.
> This doesn't actually sound so hard, is there any problem with this
> approach?

Yeah:

	begin;
	insert into foo (key, ...) values (33, ...);
	select * from foo where key = 33;
	...

If the SELECT uses an indexscan it will fail to find the just-inserted
row.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Off the top of my head it seems the way to go about doing this
> > would be to simply not insert the records in the index until
> > commit time. This doesn't actually sound so hard, is there any
> > problem with this approach?
>
> Yeah:
> 	begin;
> 	insert into foo (key, ...) values (33, ...);
> 	select * from foo where key = 33;
> 	...
>
> If the SELECT uses an indexscan it will fail to find the
> just-inserted row.

Well, presumably you would need a non-unique index created for query
execution purposes.  The unique index would be purely for enforcing
the constraint.

--
greg
On Wed, 2005-01-26 at 15:48 -0500, Greg Stark wrote:

> Well presumably you would need a non-unique index created for query
> execution purposes. The unique index would be purely for enforcing
> the constraint.

Yuck.

You could perhaps relax the uniqueness of the index during the
transaction itself, and keep around some backend-local indication of
which index entries have been inserted.  Then at transaction commit
you'd need to re-check the inserted index entries to verify that they
are unique.  It would be nice to just keep a pin on the leaf page that
we inserted into, although we'd need to take care to follow subsequent
page splits (could we use the existing L & Y techniques to do this?).

Needless to say, it would be pretty ugly...

-Neil
Neil Conway <neilc@samurai.com> writes:

> You could perhaps relax the uniqueness of the index during the
> transaction itself, and keep around some backend-local indication of
> which index entries have been inserted. Then at transaction commit
> you'd need to re-check the inserted index entries to verify that
> they are unique.

Yeah, what I've been visualizing is a list of "tentative duplicates"
--- that is, you do the immediate unique check same as now, and if it
passes (which hopefully is most of the time) then you're in the clear.
Otherwise you log the apparent duplicate key value to be rechecked at
commit.

> It would be nice to just keep a pin on the leaf page that we
> inserted into, although we'd need to take care to follow subsequent
> page splits (could we use the existing L & Y techniques to do this?).

I do not believe we can do that without risking deadlocks.  It'll be
safer just to repeat the search for each key value that's of concern.

			regards, tom lane
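[The tentative-duplicates bookkeeping can be sketched in miniature --
hypothetical Python, with a dict of key-to-owning-xids standing in for
the btree; none of these names exist in the backend.  The immediate
check runs at insert time, an apparent conflict is merely logged, and
the search is repeated for each suspect key at commit.]

```python
# Miniature sketch of the "tentative duplicates" idea: insert never
# errors immediately; suspect keys are re-searched at commit.

class Xact:
    def __init__(self, xid):
        self.xid = xid
        self.suspects = []  # keys that looked duplicated at insert time

class Index:
    def __init__(self):
        self.entries = {}   # key -> set of xids holding an entry

    def insert(self, xact, key):
        owners = self.entries.setdefault(key, set())
        if owners - {xact.xid}:
            # immediate unique check failed: don't error, just log the
            # key value to be rechecked at commit
            xact.suspects.append(key)
        owners.add(xact.xid)

    def commit(self, xact):
        # repeat the search for each suspect key (no leaf-page pins)
        for key in xact.suspects:
            if self.entries.get(key, set()) - {xact.xid}:
                raise ValueError("duplicate key %r at commit" % key)

idx = Index()
a, b = Xact(1), Xact(2)
idx.insert(a, 33)   # immediate check passes: a has no suspects
idx.insert(b, 33)   # apparent duplicate: logged, not an error
idx.commit(a)       # a commits fine; b's commit would now fail
```

[The sketch treats a's entry as already committed; in the real thing
b's recheck would also have to wait to see whether a commits or
aborts, as Tom notes downthread.]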
On Thu, Jan 27, 2005 at 03:31:29PM +1100, Neil Conway wrote:

> You could perhaps relax the uniqueness of the index during the
> transaction itself, and keep around some backend-local indication of
> which index entries have been inserted. Then at transaction commit
> you'd need to re-check the inserted index entries to verify that
> they are unique. It would be nice to just keep a pin on the leaf
> page that we inserted into, although we'd need to take care to
> follow subsequent page splits (could we use the existing L & Y
> techniques to do this?).

Maybe we can do something like:

1. Use a boolean-returning unique insertion.  If it fails, it returns
   false and doesn't ereport(ERROR); if it works, it inserts and
   returns true.

2. The caller checks the return value.  If false, it records the
   insertion attempt in a should-check-later list.

3. At transaction end, the unique insertion is tried again for the
   items on the list.  If it fails, the transaction is aborted.

It's only a SMOC (simple matter of code), nothing difficult AFAICS.
Disk-spilling logic included, because it'd probably be needed.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
If you don't know where you're going, you'll probably end up somewhere
else.
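[The three steps can be sketched as follows -- toy Python, where a
dict plays the unique index and unique_insert is the hypothetical
boolean-returning insertion, not a real backend function.]

```python
# Toy sketch of the three steps above.

def unique_insert(index, key, xid):
    """Step 1: on conflict return False instead of raising; on
    success insert the entry and return True."""
    if key in index:
        return False
    index[key] = xid
    return True

def insert_row(index, pending, key, xid):
    # Step 2: the caller records failed attempts for later rechecking.
    if not unique_insert(index, key, xid):
        pending.append((key, xid))

def commit(index, pending):
    # Step 3: retry each pending insertion; any failure aborts.
    for key, xid in pending:
        if not unique_insert(index, key, xid):
            raise ValueError("unique violation on %r at commit" % key)

index, pending = {}, []
insert_row(index, pending, 33, 1)   # succeeds immediately
insert_row(index, pending, 33, 2)   # conflicts: goes on the list
del index[33]; index[44] = 1        # first row's key updated away
commit(index, pending)              # retry now succeeds
```

[Note the difference from Tom's variant upthread: here the
conflicting entry is not inserted at all until the commit-time retry,
which is why the mid-transaction update of the first row lets the
second insertion through.]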
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Yeah, what I've been visualizing is a list of "tentative duplicates"
> --- that is, you do the immediate unique check same as now, and if
> it passes (which hopefully is most of the time) then you're in the
> clear.

I don't see how you're in the clear.  If session A does an insert and
doesn't see a duplicate and doesn't commit, but then B does an insert,
sees the duplicate from A, marks his tentative, and then commits,
shouldn't B's commit succeed?  Then when A commits, shouldn't his
fail?  So A still has to recheck even if there was no sign of a
duplicate when he inserted.

Unless there's some way for B to indicate to A that his insert has
become tentative, I think you have to resign yourself to checking all
deferred unique constraints, not just ones that seem suspect.

--
greg
Greg Stark <gsstark@mit.edu> writes:

> I don't see how you're in the clear. If session A does an insert and
> it doesn't see a duplicate and doesn't commit, but then B does an
> insert and sees the duplicate from A and marks his tentative, and
> then commits, shouldn't B's commit succeed?

No.  B, being the second to get there, has to wait to see if A commits
or not.  This is true already and it wouldn't change.  We would,
however, postpone the wait until B's commit time.

			regards, tom lane
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

> It's only a SMOC, nothing difficult AFAICS. Disk-spilling logic
> included, because it'd probably be needed.

The question of disk-spilling is really the only part that seems very
hard.  It would be useful to see if we could solve the problem of
spilling pending-trigger-event lists at the same time.  Common
infrastructure possible, perhaps?

			regards, tom lane