Re: General performance/load issue - Mailing list pgsql-general

From Tomas Vondra
Subject Re: General performance/load issue
Msg-id ccc3da91f69e1407b12b529fb263812b.squirrel@sq.gransy.com
In response to General performance/load issue  (Gaëtan Allart <gaetan@nexylan.com>)
Responses Re: General performance/load issue  (Gaëtan Allart <gaetan@nexylan.com>)
Re: General performance/load issue  (Robert Treat <rob@xzilla.net>)
List pgsql-general
On 24 November 2011, 14:51, Gaëtan Allart wrote:
> Hello everyone,
>
> I'm having some trouble with a PostgreSQL server.
> We're using PG as a database backend for a very big website (lots of data
> and much traffic).
>
> The issue : server suddenly (1H after restart) becomes slow (queries not
> responding), load rises (>20 instead of 1), iowait rises (20 to 70%)
>
> Version : 9.0.5
> Server : Dual Xeon X5650 (24  cores total)
> Memory : 48 GB
> Disks : SSD
>
>
> Top when overloaded :

Top is not the most useful tool here, I guess. Use "iotop" (will show you
which processes are doing the I/O) and tools like vmstat / iostat.

> Postgresql.conf :
>
> max_connections = 50
> shared_buffers = 12G
> temp_buffers = 40MB
> work_mem = 128MB
> maintenance_work_mem = 256MB
> max_files_per_process = 8192
> checkpoint_segments = 256
> checkpoint_timeout = 30min
> checkpoint_completion_target = 0.9

Fine. Let's see the options that look suspicious.

> effective_cache_size = 12GB

Why have you set it like this? According to the "free" output you've
posted, the page cache holds about 38G, so why just 12G here? There are
legitimate reasons to set it lower than the actual cache, but I don't
think any of them applies here.
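
If you want to check the cache size without eyeballing "free", something
like the following rough sketch could do it. It only parses the text of
/proc/meminfo and reports the "Cached" field; the function names and the
sample numbers are made up for illustration and are not taken from your
server.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts, which this sketch does not distinguish.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_gib(meminfo_text):
    """Size of the kernel page cache (the "Cached" field) in GiB."""
    return parse_meminfo(meminfo_text)["Cached"] / (1024 * 1024)


# Illustrative input only, NOT the real values from the server in this thread.
SAMPLE = """MemTotal:       50331648 kB
MemFree:          524288 kB
Buffers:          262144 kB
Cached:         39845888 kB
"""

print(round(page_cache_gib(SAMPLE), 1))
```

On the real machine you would pass open("/proc/meminfo").read() instead of
SAMPLE and compare the result with effective_cache_size.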

> fsync = off

A really bad idea. With fsync off, a crash or power loss can leave the
database corrupted. I guess your data are worthless to you, right?

> seq_page_cost = 2.0
> random_page_cost = 2.0

Eh? First of all, what really matters is the ratio between those two
values, and it's a good habit to leave seq_page_cost = 1.0 and change just
the other cost values.

Plus random I/O is not as cheap as sequential I/O even on SSD drives,
so I'd recommend something like this:

seq_page_cost = 1.0
random_page_cost = 2.0 (or maybe 1.5)

Anyway this needs to be tested properly - watch the performance and tune
if needed.
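
To make the "relative value" point concrete, here is a tiny sketch that
rescales a pair of settings so that seq_page_cost becomes 1.0. The helper
name is hypothetical, and it deliberately ignores the cpu_* cost settings,
which live on the same scale, so treat it as arithmetic only, not as a
model of the planner.

```python
def relative_page_costs(seq_page_cost, random_page_cost):
    """Rescale both page costs so that seq_page_cost becomes 1.0.

    Assumption: only the two page costs are considered; the cpu_* cost
    parameters, which are expressed on the same scale, are ignored.
    """
    if seq_page_cost <= 0:
        raise ValueError("seq_page_cost must be positive")
    return 1.0, random_page_cost / seq_page_cost


# The posted configuration: random reads are costed the same as sequential.
print(relative_page_costs(2.0, 2.0))  # (1.0, 1.0)

# The suggested configuration: random reads cost twice as much.
print(relative_page_costs(1.0, 2.0))  # (1.0, 2.0)
```

So the posted 2.0 / 2.0 pair tells the planner that a random page read is
no more expensive than a sequential one, which is probably not what was
intended.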

> Did I do anything wrong? Any idea?

Not sure. My guess is you're getting bitten by a checkpoint. We need to
know a few more details.

1) What are the values of dirty_background_ratio and dirty_ratio (see the
/proc/sys/vm/ directory)?

2) Enable log_checkpoints in postgresql.conf and see how the checkpoints
correlate with the bad performance.

3) Check which processes are responsible for the I/O (use iotop).
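
For item 1, a rough sketch like the one below reads the two settings and
turns the percentages into bytes for a 48 GB box. The helper names are
hypothetical. Note two assumptions: the kernel applies the ratios to
dirtyable memory rather than total RAM, so the numbers are upper bounds,
and if dirty_bytes / dirty_background_bytes are set instead, the ratios
read as 0.

```python
import os


def read_vm_setting(name, base="/proc/sys/vm"):
    """Return the integer value of a vm sysctl, or None if it cannot be read."""
    try:
        with open(os.path.join(base, name)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def dirty_threshold_bytes(total_ram_bytes, percent):
    """Upper bound on dirty page cache allowed by a *_ratio setting."""
    return total_ram_bytes * percent // 100


RAM = 48 * 1024**3  # 48 GB, as stated in the original post

for name in ("dirty_background_ratio", "dirty_ratio"):
    pct = read_vm_setting(name)
    if pct is None:
        print(name, "is not readable on this system")
    else:
        print(name, "=", pct, "% -> at most",
              dirty_threshold_bytes(RAM, pct), "bytes of dirty data")
```

With 48 GB of RAM even a ratio of 10 allows several GB of dirty data to
pile up before the kernel starts writing it out, which is why these two
values matter for the checkpoint theory.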

Tomas

