Re: High SYS CPU - need advise - Mailing list pgsql-general

From Vlad
Subject Re: High SYS CPU - need advise
Msg-id CAKeSUqVuAOTs7vzEkdjDui-=BQPdPv9SD3NigJ4q+BD4C4m_DA@mail.gmail.com
In response to Re: High SYS CPU - need advise  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: High SYS CPU - need advise
List pgsql-general

We're looking for spikes in 'blk' which represents when lwlocks bump.
If you're not seeing any then this is suggesting a buffer pin related
issue -- this is also supported by the fact that raising shared
buffers didn't help.   If you're not seeing 'blk's, go ahead and
disable the stats macro.

Most blk values are 0, some are 1, and a few reach 100. I can't say that the proportion of blk 0 to non-zero blk is very different during stall times.
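
One way to put a number on that comparison is to tally the blk values from a stall window and from a quiet window and compare the shares. This is only a minimal sketch: it assumes the stats macro writes lines shaped like `PID 12345 lwlock 7: shacq 10 exacq 2 blk 1` to stderr, which may differ between builds, and `blk_summary` is a hypothetical helper name.

```python
import re
from collections import Counter

# ASSUMPTION: each stats line looks like
#   PID 12345 lwlock 7: shacq 10 exacq 2 blk 1
# Adjust the pattern if your build prints a different format.
LINE_RE = re.compile(r"PID (\d+) lwlock (\d+): shacq (\d+) exacq (\d+) blk (\d+)")


def blk_summary(lines):
    """Count lines with blk == 0 versus blk > 0 and record the largest blk."""
    counts = Counter()
    max_blk = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip anything that is not a stats line
        blk = int(match.group(5))
        counts["zero" if blk == 0 else "nonzero"] += 1
        max_blk = max(max_blk, blk)
    total = counts["zero"] + counts["nonzero"]
    return {
        "zero": counts["zero"],
        "nonzero": counts["nonzero"],
        # Share of stats lines that reported at least one block.
        "nonzero_share": counts["nonzero"] / total if total else 0.0,
        "max_blk": max_blk,
    }
```

Running it once over lines captured during a stall and once over lines from normal operation gives two `nonzero_share` values to compare directly.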

 
So, what we need to know now is:
*) What happens when you drastically *lower* shared buffers?   Say, to
64mb?  Note, you may experience higher load for unrelated reasons and
have to scuttle the test.  Also, if you have to crank higher to handle
internal server structures, do that.  This is a hail mary, but maybe
something interesting spits out.

lowering shared_buffers didn't help. 
 
*) How many specific query plans are needed to introduce the
condition?  Hopefully, it's not too many.  If so, let's start
gathering the plans.  If you have a lot of plans to sift through, one
thing we can attempt to eliminate noise is to tweak
log_min_duration_statement so that during stall times (only) it logs
offending queries that are unexpectedly blocking.

Unfortunately, there are quite a few query plans. Also, I don't think setting log_min_duration_statement will help us, because when the server hits a high load average it reacts slowly even to a key press, so even non-offending queries take a long time to execute. During a stall I see all sorts of queries running long, ranging from simple ones like
LOG:  duration: 1131.041 ms  statement: SELECT 'DBD::Pg ping test'
to complex ones joining multiple tables.
We are still going through all the logged queries to find the ones causing the problem; I'll report back if we find any clues.
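
For sifting through the logged queries, a rough sketch like the following could group the duration lines by statement text and rank them by total time. It assumes single-line entries in the same shape as the ping example above (`LOG:  duration: ... ms  statement: ...`); multi-line statements and other log line types are ignored, and `slowest_statements` is a hypothetical helper name. Since everything slows down during a stall, the ranking is only a starting point, not proof of which query is at fault.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   LOG:  duration: 1131.041 ms  statement: SELECT 'DBD::Pg ping test'
DURATION_RE = re.compile(r"LOG:\s+duration: ([\d.]+) ms\s+statement: (.*)")


def slowest_statements(lines, top=5):
    """Return (statement, count, total_ms) tuples, highest total time first."""
    stats = defaultdict(lambda: [0, 0.0])  # statement -> [count, total_ms]
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a single-line duration/statement entry
        duration_ms = float(match.group(1))
        statement = match.group(2).strip()
        stats[statement][0] += 1
        stats[statement][1] += duration_ms
    ranked = sorted(stats.items(), key=lambda item: item[1][1], reverse=True)
    return [(stmt, count, total) for stmt, (count, total) in ranked[:top]]
```

Feeding it only the log lines from a stall window, and separately the lines from a quiet window, would show whether any statement stands out in one but not the other.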
 

*) Approximately how big is your 'working set' -- the data your
queries are routinely hitting?

I *think* it's within the range of a few hundred MB.
 

*) Is the distribution of the *types* of queries uniform?  Or do you
have special processes that occur on intervals?

it's pretty uniform.
 

Thanks for your patience.


oh no, thank you for trying to help me to resolve this issue.

-- vlad
