Re: Tuning Tips for a new Server - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: Tuning Tips for a new Server
Msg-id 01b1e09ae6df37cd32543df0dee95bfa.squirrel@sq.gransy.com
In response to Re: Tuning Tips for a new Server  (Ogden <lists@darkstatic.com>)
Responses Re: Tuning Tips for a new Server  (Ogden <lists@darkstatic.com>)
Re: Tuning Tips for a new Server  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On 17 August 2011, 18:39, Ogden wrote:
>> Yes, but it greatly depends on the amount of WAL and your workload. If
>> you
>> need to write a lot of WAL data (e.g. during bulk loading), this may
>> significantly improve performance. It may also help when you have a
>> write-heavy workload (a lot of clients updating records, background
>> writer
>> etc.) as that usually means a lot of seeking (while WAL is written
>> sequentially).
>
> The database is about 200Gb so using /usr/local/pgsql/pg_xlog on a virtual
> disk with 100Gb should not be a problem with the disk space should it?

I think you've mentioned the database is on 6 drives, while the other
volume is on 2 drives, right? That makes the OS volume roughly 3x slower
(just a rough estimate). But if the database volume is heavily used, it
might still help to move the xlog directory to the OS disk. Check how
utilized the db volume is, and if it's saturated, try moving the xlog
directory.

The only way to find out is to actually try it with your workload.
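
For example, something along these lines (just a sketch - the mount point
/mnt/osdisk is a placeholder and I'm assuming the data directory is
/usr/local/pgsql/data, so adjust the paths to your setup):

    # stop the cluster before touching pg_xlog
    pg_ctl -D /usr/local/pgsql/data stop

    # move the WAL directory to the other volume, leave a symlink behind
    mv /usr/local/pgsql/data/pg_xlog /mnt/osdisk/pg_xlog
    ln -s /mnt/osdisk/pg_xlog /usr/local/pgsql/data/pg_xlog

    pg_ctl -D /usr/local/pgsql/data start

Then watch the I/O on both volumes (iostat or similar) under the usual load.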

>> What is the size of the database? So those are the new servers? What's
>> the difference compared to the old ones? What is the RAID controller, how
>> much write cache is there?
>
> I am sorry I overlooked specifying this. The database is about 200Gb and
> yes these are new servers which bring more power (RAM, CPU) over the last
> one. The RAID Controller is a Perc H700 and there is 512Mb write cache.
> The servers are Dells.

OK, sounds good although I don't have much experience with this controller.

>>> Currently, I have this for the current system which as 16Gb Ram:
>>>
>>> max_connections = 350
>>>
>>> work_mem = 32MB
>>> maintenance_work_mem = 512MB
>>> wal_buffers = 640kB
>>
>> Anyway if this worked fine for your workload, I don't think you need to
>> change those settings. I'd probably bump up the wal_buffers to 16MB - it
>> might help a bit, definitely won't hurt and it's so little memory it's
>> not
>> worth the effort I guess.
>
> So just increasing the wal_buffers is okay? I thought there would be more
> as the memory in the system is now 4 times as much. Perhaps shared_buffers
> too (down below).

Yes, I was just commenting on that particular piece of the config. Shared
buffers should be increased too.
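
So in postgresql.conf that would be something like

    wal_buffers = 16MB            # was 640kB

and see below for shared_buffers and effective_cache_size.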

>>> # This is what I was helped with before and made reporting queries
>>> blaze
>>> by
>>> seq_page_cost = 1.0
>>> random_page_cost = 3.0
>>> cpu_tuple_cost = 0.5
>>> effective_cache_size = 8192MB
>>
>> Are you sure the cpu_tuple_cost = 0.5 is correct? That seems a bit crazy
>> to me, as it says reading a page sequentially is just twice as expensive
>> as processing it. This value should be about 100x lower or something like
>> that.
>
> These settings are for the old server, keep in mind. It's a 16GB machine
> (the new one is 64Gb). The value for cpu_tuple_cost should be 0.005? How
> are the other ones?

The default values are like this:

seq_page_cost = 1.0
random_page_cost = 4.0
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.005
cpu_operator_cost = 0.0025

Increasing cpu_tuple_cost to 0.5 makes row processing way too expensive, I
guess - it tells the planner that processing just two rows costs as much as
reading an 8kB page from the disk. That penalizes plans that process a lot
of rows, e.g. sequential scans (and favors index scans etc.). Maybe it
makes sense in your case, I'm just wondering why you set it like that.
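
If there's no strong reason for that value, I'd go back to something close
to the defaults and check the reporting queries with EXPLAIN ANALYZE to see
whether the plans are still good, e.g.:

    seq_page_cost = 1.0
    random_page_cost = 3.0           # or the default 4.0
    cpu_tuple_cost = 0.01            # back to the default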

>> What are the checkpoint settings (segments, completion target). What
>> about
>> shared buffers?
>
>
> #checkpoint_segments = 3                # in logfile segments, min 1, 16MB each
> #checkpoint_timeout = 5min              # range 30s-1h
> checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0 - was 0.5
> #checkpoint_warning = 30s               # 0 disables

You need to bump checkpoint_segments up, e.g. to 64 or maybe even more.
This setting determines how many WAL segments may be filled before a
checkpoint has to happen. A checkpoint is the process of writing dirty
buffers from shared buffers to the disk, so it may be very I/O intensive.
Each segment is 16MB, so 3 segments is just 48MB of data, while 64 segments
is 1GB.

More checkpoint segments mean a longer recovery in case of a database
crash (because all the segments since the last checkpoint need to be
replayed), but they're essential for good write performance.

The completion target seems fine, but I'd consider increasing
checkpoint_timeout too.
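
Something like this might be a reasonable starting point (the 15min
timeout is just a guess on my part, tune it to your workload):

    checkpoint_segments = 64                # was 3, i.e. 1GB of WAL instead of 48MB
    checkpoint_timeout = 15min              # was the default 5min
    checkpoint_completion_target = 0.9      # you already have this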

> shared_buffers = 4096MB

The usual recommendation is about 25% of RAM for shared buffers; with 64GB
of RAM that is 16GB. And you should increase effective_cache_size too.
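
Roughly something like this (effective_cache_size is just a planner hint
about how much data fits in shared buffers plus the OS cache, it doesn't
allocate anything - the 48GB below is my rough guess at about 75% of RAM):

    shared_buffers = 16GB            # ~25% of 64GB
    effective_cache_size = 48GB      # rough estimate, adjust to your actual OS cache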

See this: http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Tomas

