Thread: BUG #5918: SummarizeOldestCommittedSxact assertion failure
The following bug has been logged online:

Bug reference:      5918
Logged by:          YAMAMOTO Takashi
Email address:      yamt@mwd.biglobe.ne.jp
PostgreSQL version: 9.1devel
Operating system:   NetBSD
Description:        SummarizeOldestCommittedSxact assertion failure
Details:

running 05d93c38a791836eeceaf8edb0ea8cb19cdf2760 with my patch
in BUG #5915 applied, i got the following assertion failure.
given that availableList is not empty and SxactGlobalXminCount == 0,
i guess it was raced with ReleasePredicateLocks.

(gdb) bt
#0  0xbbba4cc7 in _lwp_kill () from /usr/lib/libc.so.12
#1  0xbbba4c85 in raise (s=6) at /siro/nbsd/src/lib/libc/gen/raise.c:48
#2  0xbbba445a in abort () at /siro/nbsd/src/lib/libc/stdlib/abort.c:74
#3  0x083dbfa4 in ExceptionalCondition (
    conditionName=0x854d904 "!(!SHMQueueEmpty(FinishedSerializableTransactions))",
    errorType=0x854d360 "FailedAssertion", fileName=0x854d354 "predicate.c",
    lineNumber=1311) at assert.c:57
#4  0x082e8423 in SummarizeOldestCommittedSxact () at predicate.c:1311
#5  0x082e87ab in RegisterSerializableTransactionInt (snapshot=0x8596aa0)
    at predicate.c:1451
#6  0x082e86ca in RegisterSerializableTransaction (snapshot=0x8596aa0)
    at predicate.c:1415
#7  0x0840ebb2 in GetTransactionSnapshot () at snapmgr.c:138
#8  0x082f44df in exec_bind_message (input_message=0xbfbfe2c4)
    at postgres.c:1545
#9  0x082f7839 in PostgresMain (argc=2, argv=0xbb9196a4,
    username=0xbb9195f8 "takashi") at postgres.c:3944
#10 0x082a8359 in BackendRun (port=0xbb94f0f0) at postmaster.c:3593
#11 0x082a7a1d in BackendStartup (port=0xbb94f0f0) at postmaster.c:3278
#12 0x082a4c9d in ServerLoop () at postmaster.c:1452
#13 0x082a444c in PostmasterMain (argc=3, argv=0xbfbfe594)
    at postmaster.c:1113
#14 0x0822571c in main (argc=3, argv=0xbfbfe594) at main.c:199
(gdb) p PredXact
$3 = (PredXactList) 0xbb53fc80
(gdb) p *PredXact
$4 = {availableList = {prev = 0xbb5479d0, next = 0xbb5407e4},
  activeList = {prev = 0xbb548e4c, next = 0xbb53fcc0},
  SxactGlobalXmin = 0, SxactGlobalXminCount = 0, WritableSxactCount = 0,
  LastSxactCommitSeqNo = 4582775, CanPartialClearThrough = 4582775,
  HavePartialClearedThrough = 3576768, OldCommittedSxact = 0xbb53fcc8,
  element = 0xbb53fcc0}
(gdb)

On 08.03.2011 02:37, YAMAMOTO Takashi wrote:
> The following bug has been logged online:
>
> Bug reference:      5918
> Logged by:          YAMAMOTO Takashi
> Email address:      yamt@mwd.biglobe.ne.jp
> PostgreSQL version: 9.1devel
> Operating system:   NetBSD
> Description:        SummarizeOldestCommittedSxact assertion failure
> Details:
>
> running 05d93c38a791836eeceaf8edb0ea8cb19cdf2760 with my patch
> in BUG #5915 applied, i got the following assertion failure.
> given that availableList is not empty and SxactGlobalXminCount == 0,
> i guess it was raced with ReleasePredicateLocks.

Yeah, that's what it looks like. One backend calls
RegisterSerializableTransaction() while all the serializablexact slots
are in use. So it releases SerializableXactHashLock and calls
SummarizeOldestCommittedSxact(). Before SummarizeOldestCommittedSxact()
acquires SerializableFinishedListLock, another backend calls
ReleasePredicateLocks(false), triggering cleanup of old predicate locks,
and ClearOldPredicateLocks() clears all old locks. Now when
SummarizeOldestCommittedSxact() finally gets the lock, it sees that
there are no old transactions to summarize, and trips the assertion.

I think we need to just treat an empty list as normal in
SummarizeOldestCommittedSxact(), patch attached.

Thanks for yet another excellent bug report!

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

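The race described above is a check-then-act problem: the decision to
summarize is made under one lock, but the list is only examined later,
under a different lock. The following is a minimal sketch of the "treat
an empty list as normal" idea. It is not the attached patch and not
PostgreSQL code: FinishedList, FinishedNode, summarize_oldest() and
clear_all() are hypothetical stand-ins, and the lock is modeled by a
plain flag in a single thread, purely to show the control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in the finished-transactions list. */
typedef struct FinishedNode {
    struct FinishedNode *next;
    int xact_id;
} FinishedNode;

/* Hypothetical stand-in for the shared list plus the lock protecting it. */
typedef struct {
    FinishedNode *head;   /* NULL means the list is empty */
    bool locked;          /* models SerializableFinishedListLock */
} FinishedList;

static void list_lock(FinishedList *l)
{
    assert(!l->locked);   /* single-threaded model: no contention */
    l->locked = true;
}

static void list_unlock(FinishedList *l)
{
    assert(l->locked);
    l->locked = false;
}

/*
 * Summarize (here: just unlink) the oldest finished entry.
 * Returns true if an entry was handled, false if the list was empty.
 * An empty list is a normal outcome, not an assertion failure, because
 * another backend may have cleaned up before this lock was acquired.
 */
static bool summarize_oldest(FinishedList *list, int *summarized_id)
{
    list_lock(list);

    if (list->head == NULL) {
        /* Someone else already cleared the list; nothing to do. */
        list_unlock(list);
        return false;
    }

    FinishedNode *oldest = list->head;
    list->head = oldest->next;
    oldest->next = NULL;
    *summarized_id = oldest->xact_id;

    list_unlock(list);
    return true;
}

/* Models the concurrent cleanup that empties the list. */
static void clear_all(FinishedList *list)
{
    list_lock(list);
    list->head = NULL;
    list_unlock(list);
}
```

In this model, calling clear_all() between the caller's decision to
summarize and the call to summarize_oldest() reproduces the interleaving
from the report; the function then returns false instead of aborting.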
On Tue, Mar 08, 2011 at 01:22:20PM +0200, Heikki Linnakangas wrote:
> I think we need to just treat an empty list as normal in
> SummarizeOldestCommittedSxact(), patch attached.

I just hit the same assertion. Testing this patch now.

Dan

--
Dan R. K. Ports    MIT CSAIL    http://drkp.net/

Looks good -- with this patch I didn't hit any assertion failures or
other errors during an hour of stress testing with DBT-2.

Dan

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 08.03.2011 02:37, YAMAMOTO Takashi wrote:
>> i got the following assertion failure. given that availableList
>> is not empty and SxactGlobalXminCount == 0, i guess it was raced
>> with ReleasePredicateLocks.
>
> Yeah, that's what it looks like. One backend calls
> RegisterSerializableTransaction() while all the serializablexact
> slots are in use. So it releases SerializableXactHashLock and
> calls SummarizeOldestCommittedSxact(). Before
> SummarizeOldestCommittedSxact() acquires
> SerializableFinishedListLock, another backend calls
> ReleasePredicateLocks(false), triggering cleanup of old predicate
> locks, and ClearOldPredicateLocks() clears all old locks. Now when
> SummarizeOldestCommittedSxact() finally gets the lock, it sees
> that there are no old transactions to summarize, and trips the
> assertion.
>
> I think we need to just treat an empty list as normal in
> SummarizeOldestCommittedSxact(), patch attached.

Looks good. I suggest we get that one in before the alpha is cut.
Especially since Dan was able to hit that same assertion an hour
into DBT-2 testing, and didn't hit problems with this patch.

> Thanks for yet another excellent bug report!

Indeed! I'm quite curious about the testing environment which is
finding these, and very much appreciate all the work to help in
making this feature so solid before we even hit an alpha release.

-Kevin

On 08.03.2011 18:27, Kevin Grittner wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
>> I think we need to just treat an empty list as normal in
>> SummarizeOldestCommittedSxact(), patch attached.
>
> Looks good. I suggest we get that one in before the alpha is cut.
> Especially since Dan was able to hit that same assertion an hour
> into DBT-2 testing, and didn't hit problems with this patch.

Ok, committed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com