Re: stuck spinlock - Mailing list pgsql-hackers

From Tom Lane
Subject Re: stuck spinlock
Date
Msg-id 27555.983241067@sss.pgh.pa.us
In response to stuck spinlock  (Peter Schindler <pschindler@synchronicity.com>)
List pgsql-hackers
Peter Schindler <pschindler@synchronicity.com> writes:
> FATAL: s_lock(fcc01067) at xlog.c:2088, stuck spinlock. Aborting.

Judging from the line number, this is in CreateCheckPoint.  I'm
betting that your platform (Solaris 2.7, you said?) has the same odd
behavior that I discovered a couple of days ago on HPUX: a select() with
a timeout of tv_sec = 0, tv_usec = 1000000 doesn't delay 1 second like
a reasonable person would expect, but fails instantly with EINVAL.
This causes the spinlock timeout in CreateCheckPoint to effectively
be only a few microseconds rather than the intended ten minutes.
So, if the postmaster happens to fire off a checkpoint process while
some regular backend is doing something with the WAL, kaboom.
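
To illustrate the failure mode described above, here is a minimal sketch of
one way to avoid it: keep tv_usec below one million by carrying whole seconds
into tv_sec before calling select().  The helper names delay_to_timeval and
sleep_usec are hypothetical, made up for this example; this is not the actual
change made in the PostgreSQL source.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: split a non-negative microsecond count into a
 * struct timeval whose tv_usec is always in the range 0..999999.
 * Some platforms reject tv_usec >= 1000000 with EINVAL.
 */
static struct timeval
delay_to_timeval(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000L;    /* whole seconds */
    tv.tv_usec = usec % 1000000L;   /* leftover microseconds */
    assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000L);
    return tv;
}

/*
 * Hypothetical helper: sleep for roughly usec microseconds via select().
 * Returns select()'s result: 0 when the timeout expires, -1 on error
 * (errno is set, e.g. EINVAL or EINTR).  The caller should check it,
 * because a delay that fails instantly turns a timed retry loop into
 * a busy loop that exhausts its retry budget almost immediately.
 */
static int
sleep_usec(long usec)
{
    struct timeval tv = delay_to_timeval(usec);

    return select(0, NULL, NULL, NULL, &tv);
}
```

With this sketch, sleep_usec(1000000) passes tv_sec = 1, tv_usec = 0 to
select() instead of tv_sec = 0, tv_usec = 1000000.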

In short: please try the latest nightly snapshot (the fix went in after
beta5, unfortunately) and let me know if you still see a problem.
        regards, tom lane

