Re: [HACKERS] Deadlock in XLogInsert at AIX - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: [HACKERS] Deadlock in XLogInsert at AIX |
Date | |
Msg-id | 0184657d-cf64-10f6-9706-7329f23f5f89@iki.fi Whole thread Raw |
In response to | [HACKERS] Deadlock in XLogInsert at AIX (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Responses |
Re: [HACKERS] Deadlock in XLogInsert at AIX
|
List | pgsql-hackers |
On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote: > As I already mentioned, we built Postgres with LOCK_DEBUG , so we can > inspect lock owner. Backend is waiting for itself! > Now please look at two frames in this stack trace marked with red. > XLogInsertRecord is setting WALInsert locks at the beginning of the > function: > > if (isLogSwitch) > WALInsertLockAcquireExclusive(); > else > WALInsertLockAcquire(); > > WALInsertLockAcquire just selects random item from WALInsertLocks array > and exclusively locks: > > if (lockToTry == -1) > lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS; > MyLockNo = lockToTry; > immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE); > > Then, following the stack trace, AdvanceXLInsertBuffer calls > WaitXLogInsertionsToFinish: > > /* > * Now that we have an up-to-date LogwrtResult value, see if we > * still need to write it or if someone else already did. > */ > if (LogwrtResult.Write < OldPageRqstPtr) > { > /* > * Must acquire write lock. Release WALBufMappingLock > first, > * to make sure that all insertions that we need to > wait for > * can finish (up to this same position). Otherwise we risk > * deadlock. > */ > LWLockRelease(WALBufMappingLock); > > WaitXLogInsertionsToFinish(OldPageRqstPtr); > > LWLockAcquire(WALWriteLock, LW_EXCLUSIVE); > > > It releases WALBufMappingLock but not WAL insert locks! > Finally in WaitXLogInsertionsToFinish tries to wait for all locks: > > for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++) > { > XLogRecPtr insertingat = InvalidXLogRecPtr; > > do > { > /* > * See if this insertion is in progress. LWLockWait will > wait for > * the lock to be released, or for the 'value' to be set by a > * LWLockUpdateVar call. When a lock is initially > acquired, its > * value is 0 (InvalidXLogRecPtr), which means that we > don't know > * where it's inserting yet. We will have to wait for it. If > * it's a small insertion, the record will most likely fit > on the > * same page and the inserter will release the lock without > ever > * calling LWLockUpdateVar. But if it has to sleep, it will > * advertise the insertion point with LWLockUpdateVar before > * sleeping. > */ > if (LWLockWaitForVar(&WALInsertLocks[i].l.lock, > &WALInsertLocks[i].l.insertingAt, > insertingat, &insertingat)) > > And here we stuck! Interesting.. What should happen here is that for the backend's own insertion slot, the "insertingat" value should be greater than the requested flush point ('upto' variable). That's because before GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the backend's insertingat value, to the position that it wants to insert to. And AdvanceXLInsertBuffer() only calls WaitXLogInsertionsToFinish() with value smaller than what was passed as the 'upto' argument. > The comment to WaitXLogInsertionsToFinish says: > > * Note: When you are about to write out WAL, you must call this function > * *before* acquiring WALWriteLock, to avoid deadlocks. This function might > * need to wait for an insertion to finish (or at least advance to next > * uninitialized page), and the inserter might need to evict an old WAL > buffer > * to make room for a new one, which in turn requires WALWriteLock. > > Which contradicts to the observed stack trace. Not AFAICS. In the stack trace you showed, the backend is not holding WALWriteLock. It would only acquire it after the WaitXLogInsertionsToFinish() call finished. > I wonder if it is really synchronization bug in xlog.c or there is > something wrong in this stack trace and it can not happen in case of > normal work? Yeah, hard to tell. Something is clearly wrong.. This line in the stack trace is suspicious: > WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c" AdvanceXLInsertBuffer() should only ever call WaitXLogInsertionsToFinish() with an xlog position that points to a page bounary, but that upto value points to the middle of a page. Perhaps the value stored in the stack trace is not what the caller passed, but it was updated because it was past the 'reserveUpto' value? That would explain the "request to flush past end of generated WAL" notices you saw in the log. Now, why would that happen, I have no idea. If you can and want to provide me access to the system, I could have a look myself. I'd also like to see if the attached additional Assertions will fire. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: