Thread: BUG #13667: SSI violation...

BUG #13667: SSI violation...

From
sean@chittenden.org
Date:
The following bug has been logged on the website:

Bug reference:      13667
Logged by:          Sean Chittenden
Email address:      sean@chittenden.org
PostgreSQL version: 9.4.4
Operating system:   Linux/FreeBSD
Description:

A few weeks back I was pointed in the direction of a bug that appears to be
an SSI violation.  The repro case below is written using bash and psql.

In a tight loop, we would expect 50% of these transactions to fail.
Instead, periodically we're seeing both transaction commit successfully.
The concern is that PostgreSQL thinks these transactions are commutable or
the SIREAD lock isn't erected correctly.

https://github.com/gfredericks/pg-serializability-bug

The two concurrent transactions required to tickle this bug are:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO things (id) VALUES ('A');
SELECT id FROM things;
COMMIT;

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO things (id) VALUES ('B');
SELECT id FROM things;
COMMIT;

Cheers.  -sc

Re: BUG #13667: SSI violation...

From
Kevin Grittner
Date:
On Thursday, October 8, 2015 5:01 AM, "sean@chittenden.org" <sean@chittenden.org> wrote:

> A few weeks back I was pointed in the direction of a bug that appears to be
> an SSI violation.  The repro case below is written using bash and psql.
>
> In a tight loop, we would expect 50% of these transactions to fail.
> Instead, periodically we're seeing both transaction commit successfully.
> The concern is that PostgreSQL thinks these transactions are commutable or
> the SIREAD lock isn't erected correctly.
>
> https://github.com/gfredericks/pg-serializability-bug
>
> The two concurrent transactions required to tickle this bug are:
>
> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> INSERT INTO things (id) VALUES ('A');
> SELECT id FROM things;
> COMMIT;
>
> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> INSERT INTO things (id) VALUES ('B');
> SELECT id FROM things;
> COMMIT;

I agree this is a bug, and it behave like a race condition with a
very narrow window of time for the second process to hit.  It has
proven very hard to find the cause.  So far every attempt to
instrument the code to catch the bug in progress has distorted the
timings enough to prevent the bug from showing.  Next time I take a
run at this I think I need to insert delays at suspicious points in
the code to try to provoke it at greater volume.  If anyone else
wants to take a shot at that and show me what they found that does
that, it would expedite getting a fix.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #13667: SSI violation...

From
Kevin Grittner
Date:
On Friday, October 9, 2015 4:49 PM, Kevin Grittner <kgrittn@ymail.com> wrote:

> I agree this is a bug, and it behave like a race condition with a
> very narrow window of time for the second process to hit.  It has
> proven very hard to find the cause.

Thanks to Thomas Munro joining me in a 2.5 day marathon hunt for
this bug, we have found it and squashed it with the attached patch.
The issue in heap_insert() in heapam.c was the bug that was
actually causing the failures in the specific tests we had, but
other locations doing insert, update, or delete had similar bugs,
and we found a few stray issues in predicate.c that are also fixed
here.

The basic problem was that in the initial implementation we tried a
little too hard to optimize, at the expense of correctness.  When
we were going to insert, for example, we checked for an existing
predicate lock on the relation first, to avoid the effort of
inserting the tuple, with WAL-logging, if we were just going to
roll back the transaction anyway because we found a rw-conflict
that completed a "dangerous structure".  The problem is, it allowed
this sequence to create an undetected serialization anomaly:

  T1
  ----------
                T2
                ----------
  lock
  read
  checklocks
                lock
                read
  insert

At this point things are hosed, regardless of timings past this.
The window in the test was small because T2 has to acquire its
SIReadLock and scan the whole table before T1 manages to insert a
tuple, but with enough contention T1 can stall long enough to see
this happen.  When T2 checks locks prior to its insert it will
detect the rw-conflict from T1 to T2, but the rw-conflict in the
other direction won't be detected, so no dangerous structure is
formed and nothing is rolled back.

Moving the check past the insert would fix the problem, but the
reason it was put in front is enshrined in a comment at each
problem location; for example:

 * We're about to do the actual insert -- but check for conflict first, to
 * avoid possibly having to roll back work we've just done.

These checks are about as close to free as you can get if the
transaction doing the check is not serializable; it doesn't even
need to take out a LW lock to determine there is nothing to be
done.  The reason given in the comment still has merit for
serializable transactions; even for them the check is orders of
magnitude cheaper than a WAL logged tuple insert.  It only requires
an occasional serialization failure detection there to come out
ahead.  So rather than move the existing check, we added a recheck
after.

Barring objections I will push this tomorrow, including
back-patching it to all supported branches.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

Re: BUG #13667: SSI violation...

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Thanks to Thomas Munro joining me in a 2.5 day marathon hunt for
> this bug, we have found it and squashed it with the attached patch.
> ...
> These checks are about as close to free as you can get if the
> transaction doing the check is not serializable; it doesn't even
> need to take out a LW lock to determine there is nothing to be
> done.  The reason given in the comment still has merit for
> serializable transactions; even for them the check is orders of
> magnitude cheaper than a WAL logged tuple insert.  It only requires
> an occasional serialization failure detection there to come out
> ahead.  So rather than move the existing check, we added a recheck
> after.

> Barring objections I will push this tomorrow, including
> back-patching it to all supported branches.

I'm okay with the substance of the patch, but that's a pretty miserable
excuse for fixing the comments.  Both the initial checks and the rechecks
ought to have at least a couple of sentences recapping the logic you gave
us here.

            regards, tom lane

Re: BUG #13667: SSI violation...

From
Kevin Grittner
Date:
On Friday, October 30, 2015 12:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I'm okay with the substance of the patch, but that's a pretty
> miserable excuse for fixing the comments.

Further testing and study of the patch suggested further
enhancements, attached, which I have pushed to all supported
branches (along with more comments) -- so the fix will be in the
next set of releases.

Thanks to those who reported the bug, and I apologize that it took
so long to track down.  Thanks also to Thomas Munro for working
with me in tracking it down!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment