Re: Apparent deadlock 7.0.1 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Apparent deadlock 7.0.1
Date
Msg-id 14084.960432230@sss.pgh.pa.us
In response to Re: Apparent deadlock 7.0.1  (Michael Simms <grim@ewtoo.org>)
List pgsql-hackers
Michael Simms <grim@ewtoo.org> writes:
>>>> I have noticed a deadlock happening on 7.0.1 on updates.
>>>> The backends just lock, and take up as much CPU as they can. I kill
>>>> the postmaster, and the backends stay alive, using CPU at the highest
>>>> rate possible. The operations aren't that expensive, just a single line
>>>> of update.
>>>> Anyone else seen this? Anyone dealing with this?
>> 
>> News to me.  What sort of hardware are you running on?  It sort of
>> sounds like the spinlock code not working as it should --- and since
>> spinlocks are done with platform-dependent assembler, it matters...

> The hardware/software is:

> Linux kernel 2.2.15 (SMP kernel)
> Glibc  2.1.1
> Dual Intel PIII/500

Dual CPUs, huh?  I have heard of motherboards that have (misdesigned)
memory caching such that the two CPUs don't reliably see each other's
updates to a shared memory location.  Naturally that plays hell with the
spinlock code :-(.  It might be necessary to insert some kind of
cache-flushing instruction into the spinlock wait loop to ensure that
the CPUs see each other's changes to the lock.

This is all theory at this point, and a hole in the theory is that the
backends ought to give up with a "stuck spinlock" error after a minute
or two of not being able to grab the lock.  I assume you have let them
go at it for longer than that without seeing such an error?
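
For anyone following along who has not looked at this kind of code, here
is a minimal sketch of a test-and-set spinlock whose wait loop gives up
instead of spinning forever.  It is an illustration only, not the actual
s_lock() source: the demo_* names are hypothetical, it uses C11
<stdatomic.h> rather than platform-dependent assembler, and it bounds the
wait by an iteration count (max_spins) where the real code is described
above as giving up after a minute or two.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration; all demo_* names are invented for this sketch. */
typedef struct {
    atomic_flag held;           /* set while some process holds the lock */
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/*
 * Try to take the lock, retrying up to max_spins times.
 * Returns true once acquired, or false if every attempt found the lock
 * already held (the caller would then report a "stuck spinlock").
 */
static bool demo_spin_acquire(demo_spinlock *lock, unsigned long max_spins)
{
    for (unsigned long spins = 0; spins < max_spins; spins++) {
        /*
         * Atomic test-and-set: sets the flag and returns its previous
         * value.  A previous value of false means we now own the lock.
         * Acquire ordering makes the previous holder's writes visible.
         */
        if (!atomic_flag_test_and_set_explicit(&lock->held,
                                               memory_order_acquire))
            return true;
    }
    return false;               /* gave up: lock never became free */
}

/* Release the lock so that another CPU's test-and-set can succeed. */
static void demo_spin_release(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

If the two CPUs really did not see each other's stores to the flag, the
release would never become visible to the waiter and a loop like this
would burn CPU until its bound ran out, which is why the absence of a
"stuck spinlock" report is the interesting part.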

Anyway, the next step is to "kill -ABORT" some of the stuck processes
and get backtraces from their coredumps to see where they are stuck.
If you find they are inside s_lock() then it's definitely some kind of
spinlock problem.  If not...
        regards, tom lane

