Thread: BUG #13667: SSI violation...
The following bug has been logged on the website: Bug reference: 13667 Logged by: Sean Chittenden Email address: sean@chittenden.org PostgreSQL version: 9.4.4 Operating system: Linux/FreeBSD Description: A few weeks back I was pointed in the direction of a bug that appears to be an SSI violation. The repro case below is written using bash and psql. In a tight loop, we would expect 50% of these transactions to fail. Instead, periodically we're seeing both transaction commit successfully. The concern is that PostgreSQL thinks these transactions are commutable or the SIREAD lock isn't erected correctly. https://github.com/gfredericks/pg-serializability-bug The two concurrent transactions required to tickle this bug are: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; INSERT INTO things (id) VALUES ('A'); SELECT id FROM things; COMMIT; BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; INSERT INTO things (id) VALUES ('B'); SELECT id FROM things; COMMIT; Cheers. -sc
On Thursday, October 8, 2015 5:01 AM, "sean@chittenden.org" <sean@chittenden.org> wrote: > A few weeks back I was pointed in the direction of a bug that appears to be > an SSI violation. The repro case below is written using bash and psql. > > In a tight loop, we would expect 50% of these transactions to fail. > Instead, periodically we're seeing both transaction commit successfully. > The concern is that PostgreSQL thinks these transactions are commutable or > the SIREAD lock isn't erected correctly. > > https://github.com/gfredericks/pg-serializability-bug > > The two concurrent transactions required to tickle this bug are: > > BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; > INSERT INTO things (id) VALUES ('A'); > SELECT id FROM things; > COMMIT; > > BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; > INSERT INTO things (id) VALUES ('B'); > SELECT id FROM things; > COMMIT; I agree this is a bug, and it behave like a race condition with a very narrow window of time for the second process to hit. It has proven very hard to find the cause. So far every attempt to instrument the code to catch the bug in progress has distorted the timings enough to prevent the bug from showing. Next time I take a run at this I think I need to insert delays at suspicious points in the code to try to provoke it at greater volume. If anyone else wants to take a shot at that and show me what they found that does that, it would expedite getting a fix. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Friday, October 9, 2015 4:49 PM, Kevin Grittner <kgrittn@ymail.com> wrote: > I agree this is a bug, and it behave like a race condition with a > very narrow window of time for the second process to hit. It has > proven very hard to find the cause. Thanks to Thomas Munro joining me in a 2.5 day marathon hunt for this bug, we have found it and squashed it with the attached patch. The issue in heap_insert() in heapam.c was the bug that was actually causing the failures in the specific tests we had, but other locations doing insert, update, or delete had similar bugs, and we found a few stray issues in predicate.c that are also fixed here. The basic problem was that in the initial implementation we tried a little too hard to optimize, at the expense of correctness. When we were going to insert, for example, we checked for an existing predicate lock on the relation first, to avoid the effort of inserting the tuple, with WAL-logging, if we were just going to roll back the transaction anyway because we found a rw-conflict that completed a "dangerous structure". The problem is, it allowed this sequence to create an undetected serialization anomaly: T1 ---------- T2 ---------- lock read checklocks lock read insert At this point things are hosed, regardless of timings past this. The window in the test was small because T2 has to acquire its SIReadLock and scan the whole table before T1 manages to insert a tuple, but with enough contention T1 can stall long enough to see this happen. When T2 checks locks prior to its insert it will detect the rw-conflict from T1 to T2, but the rw-conflict in the other direction won't be detected, so no dangerous structure is formed and nothing is rolled back. Moving the check past the insert would fix the problem, but the reason it was put in front is enshrined in a comment at each problem location; for example: * We're about to do the actual insert -- but check for conflict first, to * avoid possibly having to roll back work we've just done. These checks are about as close to free as you can get if the transaction doing the check is not serializable; it doesn't even need to take out a LW lock to determine there is nothing to be done. The reason given in the comment still has merit for serializable transactions; even for them the check is orders of magnitude cheaper than a WAL logged tuple insert. It only requires an occasional serialization failure detection there to come out ahead. So rather than move the existing check, we added a recheck after. Barring objections I will push this tomorrow, including back-patching it to all supported branches. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Kevin Grittner <kgrittn@ymail.com> writes: > Thanks to Thomas Munro joining me in a 2.5 day marathon hunt for > this bug, we have found it and squashed it with the attached patch. > ... > These checks are about as close to free as you can get if the > transaction doing the check is not serializable; it doesn't even > need to take out a LW lock to determine there is nothing to be > done. The reason given in the comment still has merit for > serializable transactions; even for them the check is orders of > magnitude cheaper than a WAL logged tuple insert. It only requires > an occasional serialization failure detection there to come out > ahead. So rather than move the existing check, we added a recheck > after. > Barring objections I will push this tomorrow, including > back-patching it to all supported branches. I'm okay with the substance of the patch, but that's a pretty miserable excuse for fixing the comments. Both the initial checks and the rechecks ought to have at least a couple of sentences recapping the logic you gave us here. regards, tom lane
On Friday, October 30, 2015 12:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'm okay with the substance of the patch, but that's a pretty > miserable excuse for fixing the comments. Further testing and study of the patch suggested further enhancements, attached, which I have pushed to all supported branches (along with more comments) -- so the fix will be in the next set of releases. Thanks to those who reported the bug, and I apologize that it took so long to track down. Thanks also to Thomas Munro for working with me in tracking it down! -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company