Re: General performance/load issue - Mailing list pgsql-general

From Robert Treat
Subject Re: General performance/load issue
Date
Msg-id CABV9wwOyDFwXzuPjvyTOGUDmnmoMT-orzx0+oYiKRhEG7SZ+0Q@mail.gmail.com
Whole thread Raw
In response to Re: General performance/load issue  ("Tomas Vondra" <tv@fuzzy.cz>)
Responses Re: General performance/load issue  ("Tomas Vondra" <tv@fuzzy.cz>)
Re: General performance/load issue  (Gaëtan Allart <gaetan@nexylan.com>)
List pgsql-general
On Thu, Nov 24, 2011 at 9:09 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 24 Listopad 2011, 14:51, Gaëtan Allart wrote:
>> Hello everyone,
>>
>> I'm having some troubles with a Postgresql server.
>> We're using PG has a database backend for a very big website (lots of data
>> and much traffic).
>>
>> The issue : server suddenly (1H after restart) becomes slow (queries not
>> responding), load rises (>20 instead of 1), iowait rises (20 to 70%)
>>
>> Version : 9.0.5
>> Server : Dual Xeon X5650 (24  cores total)
>> Memory : 48 GB
>> Disks : SSD
>>
>>
>> Top when overloaded :
>
> Top is not the most useful tool here, I guess. Use "iotop" (will show you
> which processes are doing the I/O) and tools like vmstat / iostat.
>
>> Postgresql.conf :
>>
>> max_connections = 50
>> shared_buffers = 12G
>> temp_buffers = 40MB
>> work_mem = 128MB
>> maintenance_work_mem = 256MB
>> max_files_per_process = 8192
>> checkpoint_segments = 256
>> checkpoint_timeout = 30min
>> checkpoint_completion_target = 0.9
>
> Fine. Let's see the options that look suspicious.
>

I think you missed some suspicious settings... I'd recommend setting
shared buffers to 8gb, and I'd likely reduce checkpoint segements to
30 and set the checkpoint timeout back to 5 minutes. Everything about
the way this server is configured (including those vm settings) is
pushing it towards delaying the WAL/Buffer/Checkpoint as long as
possible, which matches with the idea of good performance initial
followed by a period of poor performance and heavy i/o.

On a side note, I'd guess your work_mem is probably too high. 50
(connections) x 128 (mb work mem) x 2 (sorts per query) = 12GB RAM,
which is 25% of total ram on the box. That doesn't necessarily mean
game over, but it seem like it wouldn't be that hard to get thrashing
being set up that way. YMMV.

Robert Treat
conjecture: xzilla.net
consulting: omniti.com

pgsql-general by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Reassign value of IN parameter in 9.1.1
Next
From: Phoenix Kiula
Date:
Subject: Re: Incremental backup with RSYNC or something?