Re: hung backends stuck in spinlock heavy endless loop - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: hung backends stuck in spinlock heavy endless loop
Date
Msg-id CAHyXU0yeHAE=MsdC=0gnU=AWro19671a--0qLiypfSiNH=c77Q@mail.gmail.com
In response to Re: hung backends stuck in spinlock heavy endless loop  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: hung backends stuck in spinlock heavy endless loop
List pgsql-hackers
On Fri, Jan 16, 2015 at 8:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2015-01-16 08:05:07 -0600, Merlin Moncure wrote:
>> On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> > On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> >> Running this test on another set of hardware to verify -- if this
>> >> turns out to be a false alarm which it may very well be, I can only
>> >> offer my apologies!  I've never had a new drive fail like that, in
>> >> that manner.  I'll burn the other hardware in overnight and report
>> >> back.
>>
>> huh -- well, possibly not.  This is on a virtual machine attached to a
>> SAN.  It ran clean for several hours (this is 9.4 vanilla, asserts off,
>> checksums on), then started having issues:
>
> Damn.
>
> Is there any chance you can package this somehow so that others can run
> it locally? It looks hard to find the actual bug here without adding
> instrumentation to postgres.

FYI, a two-hour burn-in on my workstation on 9.3 ran with no issues.
An overnight run would probably be required to prove it, which would
rule out both hardware and pl/sh.  If proven, it's possible we may be
facing a regression, perhaps a serious one.

ISTM the next step is to bisect the problem down over the weekend in
order to narrow the search.  If that doesn't turn up anything
productive I'll look into taking other steps.

merlin


