Re: Cpu usage 100% on slave. s_lock problem. - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Cpu usage 100% on slave. s_lock problem.
Date
Msg-id CAHyXU0xzu3c9vNc82bt-UUvYiUZVmZCEonFLidjWZuYkZZRtHA@mail.gmail.com
In response to Re: Cpu usage 100% on slave. s_lock problem.  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Cpu usage 100% on slave. s_lock problem.
List pgsql-performance
On Tue, Sep 17, 2013 at 8:35 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-09-17 08:32:30 -0500, Merlin Moncure wrote:
>> On Tue, Sep 17, 2013 at 8:24 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > On 2013-09-17 08:18:54 -0500, Merlin Moncure wrote:
>> >> Do you think it's worth submitting the lock avoidance patch for formal review?
>> >
>> > You mean the bufmgr.c thing? Generally I think that that code needs a
>> > good bit of scalability work - there's a whole thread about it
>> > somewhere. But TBH the theories you've voiced about the issues you've
>> > seen haven't convinced me so far.
>>
>> er, no (but I share your skepticism -- my challenge right now is to
>> demonstrate measurable benefit which so far I've been unable to do).
>> I was talking about the patch on *this* thread which bypasses the
>> s_lock in RecoveryInProgress() :-).
>
> Ah, yes. Sorry, confused issues ;). Yes, I think that'd make sense.
>
>> > Quick question: Do you happen to have pg_locks output from back then
>> > around? We've recently found servers going into somewhat similar
>> > slowdowns because they exhausted the fastpath locks which made lwlocks
>> > far more expensive and made s_lock go up very high in the profile.
>>
>> I do. Unfortunately I don't have profile info. Not sure how useful
>> it is -- I'll send it off-list.
>
> Great.
>
> The primary thing I'd like to know is whether there are lots of
> non-fastpath locks...
>
> If you ever get into the situation I mistakenly referred to again, I'd
> strongly suggest recompiling postgres with -fno-omit-frame-pointer. That
> makes hierarchical profiles actually useful which can help tremendously
> with diagnosing issues like this...

We may get an opportunity to do that.  I'm curious enough about the
THP compaction issues that Kevin mentioned to maybe consider
cranking buffers again.  If I do that, it will be with strict
instructions to the site operators to catch a profile before taking
further action.

merlin
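
For anyone wanting to answer the "lots of non-fastpath locks" question
from saved pg_locks output, here is a minimal sketch, not anything
posted on the thread. It assumes the server is 9.2 or later, where
pg_locks has a boolean "fastpath" column, and that the rows from the
query below have already been fetched with whatever driver is at hand.
The function name, the row layout, and the sample numbers are all
hypothetical.

```python
from collections import Counter

# Query one could run on the affected server. The "fastpath" column of
# pg_locks is available from PostgreSQL 9.2 on.
FASTPATH_QUERY = """
SELECT locktype, mode, fastpath, count(*)
FROM pg_locks
GROUP BY locktype, mode, fastpath
ORDER BY count(*) DESC
"""


def summarize_fastpath(rows):
    """Summarize rows shaped like FASTPATH_QUERY's result.

    rows: iterable of (locktype, mode, fastpath, count) tuples.
    Returns totals for fastpath and non-fastpath locks plus the
    non-fastpath share of all locks.
    """
    totals = Counter()
    for _locktype, _mode, fastpath, count in rows:
        totals["fastpath" if fastpath else "non_fastpath"] += count
    total = totals["fastpath"] + totals["non_fastpath"]
    share = totals["non_fastpath"] / total if total else 0.0
    return {
        "fastpath": totals["fastpath"],
        "non_fastpath": totals["non_fastpath"],
        "non_fastpath_share": share,
    }


# Made-up sample rows, for illustration only.
sample = [
    ("relation", "AccessShareLock", True, 120),
    ("relation", "AccessShareLock", False, 60),
    ("virtualxid", "ExclusiveLock", True, 20),
]
summary = summarize_fastpath(sample)
print(summary)
```

A high non-fastpath share would be consistent with the fastpath slots
being exhausted, but by itself it does not prove that is what drove
s_lock up; a profile would still be needed for that.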

