Re: "stuck spinlock" - Mailing list pgsql-hackers

From Tom Lane
Subject Re: "stuck spinlock"
Msg-id 2787.1386890339@sss.pgh.pa.us
In response to "stuck spinlock"  (Christophe Pettus <xof@thebuild.com>)
List pgsql-hackers
Christophe Pettus <xof@thebuild.com> writes:
> Immediately after an upgrade from 9.3.1 to 9.3.2, we have a client getting frequent (hourly) errors of the form:

> /var/lib/postgresql/9.3/main/pg_log/postgresql-2013-12-12_211710.csv:2013-12-12 21:40:10.328 UTC,"n","n",32376,"10.2.1.142:52451",52aa24eb.7e78,5,"SELECT",2013-12-12 21:04:43 UTC,9/7178,0,PANIC,XX000,"stuck spinlock (0x7f7df94672f4) detected at /tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/storage/buffer/bufmgr.c:1099",,,,,,"<redacted>"

> uname -a: Linux postgresql3-master 3.8.0-33-generic #48~precise1-Ubuntu SMP Thu Oct 24 16:28:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux.

> Generally, there's no core file (which is currently enabled), as the postmaster just normally exits the backend.

Hm, a PANIC really ought to result in a core file.  You sure you don't
have that disabled (perhaps via a ulimit setting)?
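
One way to check this, sketched in Python: what matters is the core limit the server processes actually inherited, which may differ from what "ulimit -c" reports in an interactive shell. On Linux that limit can be read from /proc/<pid>/limits. The function names here are made up for illustration, and the sketch assumes a Linux /proc filesystem and read access to the target process.

```python
import resource
from pathlib import Path

ROW = "Max core file size"


def parse_core_limit(limits_text):
    """Return (soft, hard) from the 'Max core file size' row of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith(ROW):
            # Remaining columns are: soft limit, hard limit, units.
            soft, hard = line[len(ROW):].split()[:2]
            return soft, hard
    return None


def core_limit_of(pid):
    """Core limit of a running process (Linux only; hypothetical helper)."""
    return parse_core_limit(Path(f"/proc/{pid}/limits").read_text())


def own_core_limit():
    """Soft and hard RLIMIT_CORE of the current process, in bytes."""
    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return fmt(soft), fmt(hard)
```

Calling core_limit_of() with the postmaster's PID (the first line of postmaster.pid in the data directory) shows the limit the backends run under. A soft limit of "0" means no core file will be written, whatever the shell reports.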

As for the root cause, it's hard to say.  The file/line number says it's
a buffer header lock that's stuck.  I rechecked all the places that lock
buffer headers, and all of them have very short code paths to the
corresponding unlock, so there's no obvious explanation for how this could
happen.
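
To make the failure mode concrete, here is a toy Python sketch of stuck-lock detection. It is not PostgreSQL's s_lock code: threading.Lock stands in for a test-and-set spinlock, and the names and retry budget are invented for illustration. The point is that the error is raised by the waiter, once it has exhausted its retries, and that the reported location is the waiter's call site rather than wherever the holder went astray.

```python
import threading
import time


class StuckSpinlock(RuntimeError):
    """Raised when a lock could not be acquired within the retry budget."""


def acquire_or_panic(lock, where, max_delays=50, delay_s=0.001):
    """Retry a non-blocking acquire, sleeping between attempts.

    'where' is the caller-supplied location label that ends up in the
    error message. The budget values are arbitrary illustration values.
    """
    for _ in range(max_delays):
        if lock.acquire(blocking=False):
            return
        time.sleep(delay_s)
    raise StuckSpinlock(f"stuck spinlock detected at {where}")


# Simulate a holder that never reaches its unlock.
header_lock = threading.Lock()
header_lock.acquire()

try:
    acquire_or_panic(header_lock, "demo.py:1", max_delays=5)
except StuckSpinlock as exc:
    message = str(exc)
    print(message)
```

With short lock-to-unlock paths, a waiter should get the lock within a few retries; running out of budget implies the holder kept the lock far longer than those paths would allow, or never released it.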
        regards, tom lane


