Re: Working around spurious unique constraint errors due to SERIALIZABLE bug - Mailing list pgsql-general
From: Craig Ringer
Subject: Re: Working around spurious unique constraint errors due to SERIALIZABLE bug
Msg-id: 1247853042.9349.199.camel@ayaki
In response to: Re: Working around spurious unique constraint errors due to SERIALIZABLE bug (Florian Weimer <fweimer@bfk.de>)
List: pgsql-general
On Thu, 2009-07-16 at 14:13 +0000, Florian Weimer wrote:

> The drawback is that some of the side effects of the INSERT occur
> before the constraint check fails, so it seems to me that I still need
> to perform the select.

I was about to foolishly suggest, instead of:

    SELECT 1 FROM x WHERE a = 4;
    IF NOT FOUND THEN
        INSERT INTO x (a, b) VALUES (4, 10);
    END IF;

trying:

    INSERT INTO x (a, b)
    SELECT 4, 10
    WHERE NOT EXISTS (SELECT 1 FROM x WHERE a = 4);

... but then realised I couldn't come up with any justification for how it'd help (after all, the WHERE clause still has to be evaluated before the INSERT can proceed, there's still no predicate locking, and the statements can be evaluated concurrently), so I thought I'd test it.

The test program, attached, demonstrates what I should've known in the first place: in SERIALIZABLE isolation, the above is *guaranteed* to fail every time there's a conflict, because concurrent transactions cannot see changes committed by the others. So is a SELECT test followed by a separate INSERT, by the way.

In READ COMMITTED you get away with it a lot of the time, because the statement can see other transactions' committed changes, so the subquery often matches - but it's a race, and eventually you'll hit a situation where the subquery for two concurrent transactions is evaluated before either's INSERT is issued, or at least committed. In my test program I've managed as many as 1283 steps before two racing READ COMMITTED transactions collide - and that's in a program designed to synchronize each transaction before each INSERT for maximum collision potential. With six racing transactions I've rarely seen more than three steps without a collision.

( I've attached the test program in case it's of any interest. It's a Python test controller that spawns slave processes which it synchronises using Pg's advisory locking. It ensures that the slaves all start each INSERT attempt together, and all finish before starting the next attempt.
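If it helps to see the shape of the race without a database, here's a toy simulation - a plain dict standing in for the table, threads standing in for transactions, and a barrier forcing both "subqueries" to run before either "INSERT". It's only an illustration of the check-then-insert hazard, not how PostgreSQL behaves internally:

```python
import threading

table = {}                      # stands in for table x, "unique" on the key
lock = threading.Lock()
barrier = threading.Barrier(2)  # both workers finish their check before either inserts
violations = []

def insert_if_absent(key, value):
    # the "WHERE NOT EXISTS" check: each worker sees the table before any insert
    exists = key in table
    barrier.wait()              # synchronize, like the test program's advisory-lock barrier
    if not exists:
        with lock:
            if key in table:    # the "unique index" catches the losing worker
                violations.append(key)
            else:
                table[key] = value

workers = [threading.Thread(target=insert_if_absent, args=(4, 10)) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(table)       # {4: 10}
print(violations)  # [4] - the second writer hits the "constraint"
```

Both workers evaluate their check against the empty table, so both decide to insert; only the lock (playing the role of Pg's lower-level unique-index synchronization) stops the duplicate.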
Signals are used for failure notification, cleanup, etc. )

Anyway, the point is that you're actually worse off in this particular situation thanks to your use of SERIALIZABLE isolation. However, READ COMMITTED just gives you a race you're likely to win most of the time, instead of a guaranteed failure whenever there's a race, so it's not really a solution.

Given that, it seems to me you'll have to rely on Pg's internal lower-level synchronization around unique indexes. Try the insert and see if it fails, then ROLLBACK TO SAVEPOINT (or use a PL/PgSQL exception block). As you noted, this does mean that certain side effects may occur, including:

- advancement of sequences due to nextval(...) calls
- triggers that've done work that can't be rolled back, eg dblink calls, external file writes, inter-process communication, etc.

If you really can't afford the INSERT side effects and can't redesign your code to be tolerant of them, you can always lock the table before an INSERT. If you can't afford to lock the table due to its impact on performance, you can potentially use Pg's advisory locking mechanism to protect your inserts. Eg (PL/PgSQL):

    PERFORM pg_advisory_lock(4);
    SELECT 1 FROM x WHERE a = 4;
    IF NOT FOUND THEN
        INSERT INTO x (a, b) VALUES (4, 10);
    END IF;
    PERFORM pg_advisory_unlock(4);

(You might want to use the two-argument form of the advisory locking calls if your IDs are INTEGER-sized rather than INT8, and use the table oid for the first argument.)

If every possible INSERTer ensures it holds the lock on the id of interest before inserting, you'll be fine. Yes, it's ugly, but it preserves concurrent insert performance while eliminating failed INSERTs. A possible way to ensure that every possible INSERTer does do the right thing is to drop the INSERT privilege on the table and then use a SECURITY DEFINER function that checks the caller's rights and does the INSERT.

Also: is this really a phantom read?
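For what it's worth, the exception-block variant might look something like the sketch below (the table, columns, and function name just follow the toy example above; `unique_violation` is the standard PL/PgSQL condition name for a unique-constraint failure):

```sql
-- Hypothetical wrapper: rely on the unique index on x(a) rather than a
-- prior SELECT. The implicit savepoint around the EXCEPTION block undoes
-- the failed INSERT's row, though - as noted above - sequence advancement
-- and other non-transactional side effects are NOT undone.
CREATE OR REPLACE FUNCTION insert_x(p_a integer, p_b integer) RETURNS void AS $$
BEGIN
    INSERT INTO x (a, b) VALUES (p_a, p_b);
EXCEPTION
    WHEN unique_violation THEN
        NULL;  -- row already present; treat as success
END;
$$ LANGUAGE plpgsql;
```

The same function body is also a natural place for the advisory-lock approach, and making it SECURITY DEFINER (with INSERT revoked on the table) is one way to force every INSERTer through it.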
Your issue is not that you read a record that then vanishes or no longer matches your filter criteria; rather, it's that a record is created that matches your criteria after you tested for it. Certainly that wouldn't be possible if the concurrent transactions were actually executed serially, but does the standard actually require that this be the case? If it does, then compliant implementations would have to do predicate locking. Ouch. Does anybody do that? It seems MS-SQL implements very limited predicate locking (enough to handle your issue), but not enough to tackle aggregates or anything complex.

--
Craig Ringer