Thread: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
levertond@googlemail.com
Date:
The following bug has been logged on the website: Bug reference: 8273 Logged by: David Leverton Email address: levertond@googlemail.com PostgreSQL version: Unsupported/Unknown Operating system: RHEL 5 x86_64 Description: The following test case causes a backend assertion failure in 9.3 beta2: START TRANSACTION ISOLATION LEVEL SERIALIZABLE; CREATE TABLE testing( Â x INTEGER PRIMARY KEY ); INSERT INTO testing VALUES(1); SELECT * FROM testing WHERE x = 1 FOR UPDATE; SAVEPOINT test; UPDATE testing SET x = 2 WHERE x = 1; ROLLBACK TO test; UPDATE testing SET x = 3 WHERE x = 1; ROLLBACK; TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File: "predicate.c", Line: 3936) Postgres was installed using the RPMs from http://yum.pgrpms.org/, and is using a near-default configuration (changes to port number, data directory and pg_hba.conf, and some roles created, but nothing likely to influence this bug). The full backtrace is as follows: #0Â 0x0000003d66a30265 in raise () from /lib64/libc.so.6 #1Â 0x0000003d66a31d10 in abort () from /lib64/libc.so.6 #2Â 0x000000000074af8d in ExceptionalCondition ( Â Â Â conditionName=<value optimized out>, errorType=<value optimized out>, Â Â Â fileName=<value optimized out>, lineNumber=<value optimized out>) Â Â Â at assert.c:54 #3Â 0x00000000006784bf in CheckForSerializableConflictOut (visible=1 '\001', Â Â Â relation=<value optimized out>, tuple=<value optimized out>, Â Â Â buffer=<value optimized out>, snapshot=<value optimized out>) Â Â Â at predicate.c:3936 #4Â 0x000000000049208e in heap_hot_search_buffer (tid=0x2abfdac, Â Â Â relation=0x2ad4b49eba78, buffer=260, snapshot=0x29cb3b0, Â Â Â heapTuple=0x2abfda8, all_dead=0x7fff510be26f "\001X\375\253\002", Â Â Â first_call=1 '\001') at heapam.c:1730 #5Â 0x000000000049ad1a in index_fetch_heap (scan=0x2abfd58) at indexam.c:529 #6Â 0x000000000049afb3 in index_getnext (scan=0x2abfd58, Â Â Â direction=ForwardScanDirection) at indexam.c:612 #7Â 0x00000000005adafb in IndexNext (node=0x2abe9d0) at nodeIndexscan.c:78 #8Â 0x00000000005a1cf8 in ExecScanFetch (node=0x2abe9d0, Â Â Â accessMtd=0x5adab0 <IndexNext>, recheckMtd=0x5ada60 <IndexRecheck>) Â Â Â at execScan.c:82 #9Â ExecScan (node=0x2abe9d0, accessMtd=0x5adab0 <IndexNext>, Â Â Â recheckMtd=0x5ada60 <IndexRecheck>) at execScan.c:167 #10 0x000000000059a94e in ExecProcNode (node=0x2abe9d0) at execProcnode.c:404 #11 0x00000000005b1253 in ExecModifyTable (node=0x2abe6c0) Â Â Â at nodeModifyTable.c:918 #12 0x000000000059a90c in ExecProcNode (node=0x2abe6c0) at execProcnode.c:377 #13 0x0000000000599b4d in ExecutePlan (queryDesc=0x2a98408, direction=12114, Â Â Â count=0) at execMain.c:1470 #14 standard_ExecutorRun (queryDesc=0x2a98408, direction=12114, count=0) Â Â Â at execMain.c:306 #15 0x0000000000683e5f in ProcessQuery (plan=0x2ab32c0, Â Â Â sourceText=0x2a67128 "UPDATE testing SET x = 3 WHERE x = 1;", Â Â Â params=<value optimized out>, dest=0x2ab33b8, Â Â Â completionTag=0x7fff510be700 "") at pquery.c:185 #16 0x00000000006840d7 in PortalRunMulti (portal=0x29d12a8, Â Â Â isTopLevel=<value optimized out>, dest=0x2ab33b8, altdest=0x2ab33b8, Â Â Â completionTag=0x7fff510be700 "") at pquery.c:1279 #17 0x0000000000684cf2 in PortalRun (portal=0x29d12a8, Â Â Â count=9223372036854775807, isTopLevel=1 '\001', dest=0x2ab33b8, Â Â Â altdest=0x2ab33b8, completionTag=0x7fff510be700 "") at pquery.c:816 #18 0x000000000068105d in exec_simple_query ( Â Â Â query_string=0x2a67128 "UPDATE testing SET x = 3 WHERE x = 1;") Â Â Â at postgres.c:1048 #19 0x000000000068255e in PostgresMain (argc=<value optimized out>, Â Â Â argv=<value optimized out>, dbname=0x29d6088 "postgres", Â Â Â username=<value optimized out>) at postgres.c:3985 #20 0x00000000006355b6 in ServerLoop () at postmaster.c:3987 #21 0x0000000000638a77 in PostmasterMain (argc=5, argv=0x29b0b80) Â Â Â at postmaster.c:1246 #22 0x00000000005ce293 in main (argc=5, argv=<value optimized out>) Â Â Â at main.c:196
Re: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
Jeff Janes
Date:
On Mon, Jul 1, 2013 at 3:43 AM, <levertond@googlemail.com> wrote: > The following bug has been logged on the website: > > Bug reference: 8273 > Logged by: David Leverton > Email address: levertond@googlemail.com > PostgreSQL version: Unsupported/Unknown > Operating system: RHEL 5 x86_64 > Description: > > The following test case causes a backend assertion failure in 9.3 beta2: > > > START TRANSACTION ISOLATION LEVEL SERIALIZABLE; > CREATE TABLE testing( > x INTEGER PRIMARY KEY > ); > INSERT INTO testing VALUES(1); > SELECT * FROM testing WHERE x = 1 FOR UPDATE; > SAVEPOINT test; > UPDATE testing SET x = 2 WHERE x = 1; > ROLLBACK TO test; > UPDATE testing SET x = 3 WHERE x = 1; > ROLLBACK; > > > TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File: > "predicate.c", Line: 3936) > > > Postgres was installed using the RPMs from http://yum.pgrpms.org/, and is > using a near-default configuration (changes to port number, data directory > and pg_hba.conf, and some roles created, but nothing likely to influence > this bug). > I can confirm this compiling from source on CentOS 6.4. The bug was introduced in commit: 0ac5ad5... Improve concurrency of foreign key locking. I don't know what more to look into on this, so I'm cc Alvaro, the patch author. Cheers, Jeff
Re: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
Alvaro Herrera
Date:
Jeff Janes escribió: > The bug was introduced in commit: 0ac5ad5... Improve concurrency of foreign > key locking. > > I don't know what more to look into on this, so I'm cc Alvaro, the patch > author. Thanks, will look. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
Alvaro Herrera
Date:
levertond@googlemail.com wrote: > START TRANSACTION ISOLATION LEVEL SERIALIZABLE; > CREATE TABLE testing( > x INTEGER PRIMARY KEY > ); > INSERT INTO testing VALUES(1); > SELECT * FROM testing WHERE x = 1 FOR UPDATE; > SAVEPOINT test; > UPDATE testing SET x = 2 WHERE x = 1; > ROLLBACK TO test; > UPDATE testing SET x = 3 WHERE x = 1; > ROLLBACK; Thanks for the test case. This one-liner patch fixes it: diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c index 55563ea..fa9c9d7 100644 --- a/src/backend/utils/time/tqual.c +++ b/src/backend/utils/time/tqual.c @@ -1287,7 +1287,7 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin, { if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */ return HEAPTUPLE_INSERT_IN_PROGRESS; - if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) + if (HeapTupleHeaderIsOnlyLocked(tuple)) return HEAPTUPLE_INSERT_IN_PROGRESS; /* inserted and then deleted by same xact */ return HEAPTUPLE_DELETE_IN_PROGRESS; That is, I was taking a shortcut here by checking only the infomask bits about locking, and not resolving the MultiXact to see if the updater (sub)transaction was aborted; but this was wrong, because a tuple which was created here and updated by an aborted subxact needs to be seen as in-progress insertion, not an in-progress delete. This causes trouble to the predicate.c code because it tries to obtain the update XID for the tuple, but since the update has aborted, that returned zero, causing the assert failure. Sadly, this has performance implications, because what previously was just an in-place check of bit flags has now become a function call. Perhaps it'd be smart to optimize it a bit so that we first check the flags, and only call the function if that fails. Before pushing this patch, I'm going to examine the other users of HEAP_XMAX_IS_LOCKED_ONLY to ensure they don't also need the same change. I think some of them were prepared for the possibility that there were aborted updaters; but this was a late addition so perhaps I missed others that aren't. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
Andres Freund
Date:
On 2013-07-19 13:46:44 -0400, Alvaro Herrera wrote: > levertond@googlemail.com wrote: > > > START TRANSACTION ISOLATION LEVEL SERIALIZABLE; > > CREATE TABLE testing( > > x INTEGER PRIMARY KEY > > ); > > INSERT INTO testing VALUES(1); > > SELECT * FROM testing WHERE x = 1 FOR UPDATE; > > SAVEPOINT test; > > UPDATE testing SET x = 2 WHERE x = 1; > > ROLLBACK TO test; > > UPDATE testing SET x = 3 WHERE x = 1; > > ROLLBACK; > > Thanks for the test case. This one-liner patch fixes it: > > diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c > index 55563ea..fa9c9d7 100644 > --- a/src/backend/utils/time/tqual.c > +++ b/src/backend/utils/time/tqual.c > @@ -1287,7 +1287,7 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin, > { > if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid */ > return HEAPTUPLE_INSERT_IN_PROGRESS; > - if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)) > + if (HeapTupleHeaderIsOnlyLocked(tuple)) > return HEAPTUPLE_INSERT_IN_PROGRESS; > /* inserted and then deleted by same xact */ > return HEAPTUPLE_DELETE_IN_PROGRESS; > > > That is, I was taking a shortcut here by checking only the infomask bits > about locking, and not resolving the MultiXact to see if the updater > (sub)transaction was aborted; but this was wrong, because a tuple which > was created here and updated by an aborted subxact needs to be seen as > in-progress insertion, not an in-progress delete. This causes trouble > to the predicate.c code because it tries to obtain the update XID for > the tuple, but since the update has aborted, that returned zero, causing > the assert failure. > Sadly, this has performance implications, because what previously was > just an in-place check of bit flags has now become a function call. Well, the impact imo primarily comes from actually resolving the multixact, not from the function call itself... But I don't think we need to overly worried. That path is only entered if xmin is in-progress, so that shouldn't have too big implications. If you're worried you could do a SetHintBits(HEAP_XMAX_INVALID) like it's done if xmin isn't in progress like it's done when xmin has committed. > Perhaps it'd be smart to optimize it a bit so that we first check the > flags, and only call the function if that fails. Sounds like a good idea to me. The duplicated amount of work by that should by fairly minimal. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
Alvaro Herrera
Date:
Andres Freund wrote: > On 2013-07-19 13:46:44 -0400, Alvaro Herrera wrote: > > Sadly, this has performance implications, because what previously was > > just an in-place check of bit flags has now become a function call. > > Well, the impact imo primarily comes from actually resolving the > multixact, not from the function call itself... But I don't think we > need to overly worried. That path is only entered if xmin is > in-progress, so that shouldn't have too big implications. What I am worried about is those code paths that previously executed all this only by going through straight code in the visibility routine, without calling any other functions, for the cases where there's no multixact involved at all. If we introduce function calls here, that would cause a performance regression for everybody. Of course, people who benefit from the new multixacts can be happy that they don't block waiting for a remote transaction, but this doesn't help people who don't benefit from multixacts. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: BUG #8273: Assertion failure in 9.3 beta2 with serializable and savepoints
From
Alvaro Herrera
Date:
Andres Freund wrote: > On 2013-07-19 13:46:44 -0400, Alvaro Herrera wrote: > > Sadly, this has performance implications, because what previously was > > just an in-place check of bit flags has now become a function call. > > Well, the impact imo primarily comes from actually resolving the > multixact, not from the function call itself... But I don't think we > need to overly worried. That path is only entered if xmin is > in-progress, so that shouldn't have too big implications. True. I have pushed it, with this tweak: > > Perhaps it'd be smart to optimize it a bit so that we first check the > > flags, and only call the function if that fails. > > Sounds like a good idea to me. The duplicated amount of work by that > should by fairly minimal. Thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services