Re: stuck spin lock with many concurrent users - Mailing list pgsql-hackers

From Hiroshi Inoue
Subject Re: stuck spin lock with many concurrent users
Date
Msg-id 3B3858B4.C02B53C6@tpf.co.jp
In response to Re: stuck spin lock with many concurrent users  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: stuck spin lock with many concurrent users  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
Tatsuo Ishii wrote:
> 
> > > Tatsuo Ishii <t-ishii@sra.co.jp> writes
> > > >>> How can I check it?
> > > >>
> > > >> The 'stuck' message should at least give you a code location...
> > >
> > > > FATAL: s_lock(0x2ac2d016) at spin.c:158, stuck spinlock. Aborting.
> > >
> > > Hmm, that's SpinAcquire, so it's one of the predefined spinlocks
> > > (and not, say, a buffer spinlock).  You could try adding some
> > > debug logging here, although the output would be voluminous.
> > > But what would really be useful is a stack trace for the stuck
> > > process.  Consider changing the s_lock code to abort() when it
> > > gets a stuck spinlock --- then you could gdb the coredump.
> >
> > Nice idea. I will try that.
> 
> It appeared that the deadlock checking timer seems to be the source of
> the problem. With the default settings, it checks deadlocks every 1
> second PER backend. 

IIRC, the deadlock check used to be called only once per backend.
That seems to have changed between 7.0 and 7.1.
Would it help to disable the timer at the beginning of
HandleDeadLock()?
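
A minimal sketch of that idea, outside PostgreSQL: a SIGALRM handler whose
first action is to disarm the repeating interval timer, so that the check
runs once instead of on every tick. All names here (handle_deadlock_sketch,
disable_timer, run_timer_demo, check_count) and the timer intervals are
made up for illustration; this is not the actual HandleDeadLock() code, and
whether calling setitimer() from a signal handler is acceptable on a given
platform should be verified separately.

```c
#define _XOPEN_SOURCE 700   /* expose sigaction, setitimer, nanosleep */

#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Counts how often the (hypothetical) deadlock check ran. */
static volatile sig_atomic_t check_count = 0;

/* Zero-initialized: an it_value of zero disarms the timer. */
static const struct itimerval timer_off;

/* Disarm the real-time interval timer so no further SIGALRM arrives. */
static void disable_timer(void)
{
    setitimer(ITIMER_REAL, &timer_off, NULL);
}

/* Stand-in for HandleDeadLock(): disable the timer first, then "check". */
static void handle_deadlock_sketch(int signo)
{
    (void) signo;
    disable_timer();
    check_count++;          /* the real deadlock check would go here */
}

/* Arm a repeating timer, wait a while, report how often the handler ran. */
static int run_timer_demo(void)
{
    struct sigaction sa;
    struct itimerval every;
    struct timespec nap = {0, 50 * 1000 * 1000};    /* 50 ms */
    int rc, i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_deadlock_sketch;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    memset(&every, 0, sizeof(every));
    every.it_value.tv_usec = 20000;     /* first expiry after 20 ms */
    every.it_interval.tv_usec = 50000;  /* then every 50 ms */
    check_count = 0;
    rc = setitimer(ITIMER_REAL, &every, NULL);
    assert(rc == 0);
    (void) rc;

    /* The first nap is cut short by SIGALRM; the rest run to completion. */
    for (i = 0; i < 6; i++)
        nanosleep(&nap, NULL);

    disable_timer();        /* harmless if the handler already disarmed it */
    return (int) check_count;
}
```

If the handler disarms the timer as intended, run_timer_demo() should
report a single invocation even though the timer was armed to repeat.
Without the disable_timer() call in the handler, the count would grow with
every tick during the wait.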

regards,
Hiroshi Inoue

