Re: Lots of stuck queries after upgrade to 9.4 - Mailing list pgsql-general
| From | Andres Freund |
|---|---|
| Subject | Re: Lots of stuck queries after upgrade to 9.4 |
| Date | |
| Msg-id | 20150720120142.GB5520@alap3.anarazel.de |
| In response to | Re: Lots of stuck queries after upgrade to 9.4 (Andres Freund <andres@anarazel.de>) |
| Responses |
Re: Lots of stuck queries after upgrade to 9.4
Re: Lots of stuck queries after upgrade to 9.4 |
| List | pgsql-general |
Heikki,
On 2015-07-20 13:27:12 +0200, Andres Freund wrote:
> On 2015-07-20 13:22:42 +0200, Andres Freund wrote:
> > Hm. The problem seems to be the WaitXLogInsertionsToFinish() call in
> > XLogFlush().
>
> These are the relevant stack traces:
> db9lock/debuglog-commit.txt
> #2 0x00007f7405bd44f4 in LWLockWaitForVar (l=0x7f70f2ab6680, valptr=0x7f70f2ab66a0, oldval=<optimized out>, newval=0xffffffffffffffff) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:1011
> #3 0x00007f7405a0d3e6 in WaitXLogInsertionsToFinish (upto=121713318915952) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1755
> #4 0x00007f7405a0e1d3 in XLogFlush (record=121713318911056) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2849
>
> db9lock/debuglog-insert-8276.txt
> #1 0x00007f7405b77d91 in PGSemaphoreLock (sema=0x7f73ff6531d0, interruptOK=0 '\000') at pg_sema.c:421
> #2 0x00007f7405bd4849 in LWLockAcquireCommon (val=<optimized out>, valptr=<optimized out>, mode=<optimized out>, l=<optimized out>) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:626
> #3 LWLockAcquire (l=0x7f70ecaaa1a0, mode=LW_EXCLUSIVE) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/storage/lmgr/lwlock.c:467
> #4 0x00007f7405a0dcca in AdvanceXLInsertBuffer (upto=<optimized out>, opportunistic=<optimized out>) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:2161
> #5 0x00007f7405a0e301 in GetXLogBuffer (ptr=121713318928384) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1848
> #6 0x00007f7405a0e9c9 in CopyXLogRecordToWAL (EndPos=<optimized out>, StartPos=<optimized out>, rdata=0x7ffff1c21b90, isLogSwitch=<optimized out>, write_len=<optimized out>) at /tmp/buildd/postgresql-9.4-9.4.4/build/../src/backend/access/transam/xlog.c:1494
> #7 XLogInsert (rmid=<optimized out>, info=<optimized out>, rdata=<optimized out>) at /tmp/buildd/postgre
XLogFlush() has the following comment:
/*
* Re-check how far we can now flush the WAL. It's generally not
* safe to call WaitXLogInsertionsToFinish while holding
* WALWriteLock, because an in-progress insertion might need to
* also grab WALWriteLock to make progress. But we know that all
* the insertions up to insertpos have already finished, because
* that's what the earlier WaitXLogInsertionsToFinish() returned.
* We're only calling it again to allow insertpos to be moved
* further forward, not to actually wait for anyone.
*/
insertpos = WaitXLogInsertionsToFinish(insertpos);
but I don't think that's valid reasoning. WaitXLogInsertionsToFinish()
calls LWLockWaitForVar(oldval = InvalidXLogRecPtr), which will block if
there's an exclusive locker and some backend hasn't yet set
initializedUpto. Which seems like a possible state?