Thread: What is the most optimal config parameters to keep stable write TPS ?..

Hi,

what may you suggest as the most optimal postgresql.conf to keep
writing as stable as possible?..

What I want is to avoid "throughput waves"  - I want to keep my
response times stable without any activity holes. I've tried to reduce
checkpoint timeout from 5min to 30sec - it helped, throughput is more
stable now, but instead of big waves I have now short waves anyway..

What is the best options combination here?..

Rgds,
-Dimitri

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
"Kevin Grittner"
Date:
Dimitri <dimitrik.fr@gmail.com> wrote:

> what may you suggest as the most optimal postgresql.conf to keep
> writing as stable as possible?..
>
> What I want is to avoid "throughput waves"  - I want to keep my
> response times stable without any activity holes. I've tried to
> reduce checkpoint timeout from 5min to 30sec - it helped, throughput
> is more stable now, but instead of big waves I have now short waves
> anyway..
>
> What is the best options combination here?..

What version of PostgreSQL?  What operating system?  What hardware?

The answers are going to depend on the answers to those questions.

It would also be good to show all lines from postgresql.conf which are
not commented out.

-Kevin

Hi Kevin,

PostgreSQL: 8.3.7 & 8.4
Server: Sun M5000 32cores
OS: Solaris 10

current postgresql.conf:

#================================
max_connections = 2000                  # (change requires restart)
effective_cache_size = 48000MB
shared_buffers = 12000MB
temp_buffers = 200MB
work_mem = 100MB                                # min 64kB
maintenance_work_mem = 600MB            # min 1MB

max_fsm_pages = 2048000
fsync = on                              # turns forced synchronization on or off
synchronous_commit = off                # immediate fsync at commit
wal_sync_method = fdatasync
wal_buffers = 2MB
wal_writer_delay = 400ms                # 1-10000 milliseconds

checkpoint_segments = 128
checkpoint_timeout = 30s

archive_mode = off
track_counts = on
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 4
autovacuum_naptime = 20                 # time between autovacuum runs
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.001

lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'

#================================

Rgds,
-Dimitri


On 5/11/09, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Dimitri <dimitrik.fr@gmail.com> wrote:
>
>> what may you suggest as the most optimal postgresql.conf to keep
>> writing as stable as possible?..
>>
>> What I want is to avoid "throughput waves"  - I want to keep my
>> response times stable without any activity holes. I've tried to
>> reduce checkpoint timeout from 5min to 30sec - it helped, throughput
>> is more stable now, but instead of big waves I have now short waves
>> anyway..
>>
>> What is the best options combination here?..
>
> What version of PostgreSQL?  What operating system?  What hardware?
>
> The answers are going to depend on the answers to those questions.
>
> It would also be good to show all lines from postgresql.conf which are
> not commented out.
>
> -Kevin
>

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
"Kevin Grittner"
Date:
Dimitri <dimitrik.fr@gmail.com> wrote:

> PostgreSQL: 8.3.7 & 8.4
> Server: Sun M5000 32cores
> OS: Solaris 10

Does that have a battery backed RAID controller?  If so, is it
configured for write-back?  These both help a lot with smoothing
checkpoint I/O gluts.

We've minimized problems by making the background writer more
aggressive.  8.3 and later does a better job in general, but we've
still had to go with:

bgwriter_lru_maxpages = 1000
bgwriter_lru_multiplier = 4.0

> shared_buffers = 12000MB

You might want to test with that set to something much lower, to see
what the checkpoint delays look like.  We've found it best to use a
small (256MB) setting, and leave caching to the OS; in our
environment, it seems to do a better job of scheduling the disk I/O.
YMMV, of course.

-Kevin

Thanks a lot, I'll try them all!

Yes, I have a good external storage box with battery backed cache enabled.
There are 64GB of RAM so I expected it'll help little bit to increase
a buffer cache, but ok, will see if with 256MB it'll be better :-)

What about "full_page_writes" ? seems it's "on"  by default. Does it
makes sense to put if off?..

Rgds,
-Dimitri




On 5/11/09, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Dimitri <dimitrik.fr@gmail.com> wrote:
>
>> PostgreSQL: 8.3.7 & 8.4
>> Server: Sun M5000 32cores
>> OS: Solaris 10
>
> Does that have a battery backed RAID controller?  If so, is it
> configured for write-back?  These both help a lot with smoothing
> checkpoint I/O gluts.
>
> We've minimized problems by making the background writer more
> aggressive.  8.3 and later does a better job in general, but we've
> still had to go with:
>
> bgwriter_lru_maxpages = 1000
> bgwriter_lru_multiplier = 4.0
>
>> shared_buffers = 12000MB
>
> You might want to test with that set to something much lower, to see
> what the checkpoint delays look like.  We've found it best to use a
> small (256MB) setting, and leave caching to the OS; in our
> environment, it seems to do a better job of scheduling the disk I/O.
> YMMV, of course.
>
> -Kevin
>

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
"Kevin Grittner"
Date:
Dimitri <dimitrik.fr@gmail.com> wrote:

> What about "full_page_writes" ? seems it's "on"  by default. Does it
> makes sense to put if off?..

It would probably help with performance, but the description is a
little disconcerting in terms of crash recovery.  We tried running
with it off for a while (a year or so back), but had problems with
corruption.  I think the specific cause of that has since been fixed,
it's left us a bit leery of the option.

Maybe someone else can speak to how safe (or not) the current
implementation of that option is.

-Kevin

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
Scott Marlowe
Date:
On Mon, May 11, 2009 at 10:31 AM, Dimitri <dimitrik.fr@gmail.com> wrote:
> Hi Kevin,
>
> PostgreSQL: 8.3.7 & 8.4
> Server: Sun M5000 32cores
> OS: Solaris 10
>
> current postgresql.conf:
>
> #================================
> max_connections = 2000                  # (change requires restart)
> effective_cache_size = 48000MB
> shared_buffers = 12000MB
> temp_buffers = 200MB
> work_mem = 100MB                                # min 64kB
> maintenance_work_mem = 600MB            # min 1MB
>
> max_fsm_pages = 2048000
> fsync = on                              # turns forced synchronization on or off
> synchronous_commit = off                # immediate fsync at commit
> wal_sync_method = fdatasync
> wal_buffers = 2MB
> wal_writer_delay = 400ms                # 1-10000 milliseconds
>
> checkpoint_segments = 128
> checkpoint_timeout = 30s

What's your checkpoint completion target set to? Crank that up a bit
ot 0.7, 0.8 etc and make the timeout more, not less.  That should
help.

Also, look into better hardware (RAID controller with battery backed
cache) and also putting pg_xlog on a separate RAID-1 set (or RAID-10
set if you've got a lot of drives under the postgres data set).

OK, it'll be better to avoid a such improvement :-)
Performance - yes, but not for any price :-)

Thank you!

Rgds,
-Dimitri

On 5/11/09, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Dimitri <dimitrik.fr@gmail.com> wrote:
>
>> What about "full_page_writes" ? seems it's "on"  by default. Does it
>> makes sense to put if off?..
>
> It would probably help with performance, but the description is a
> little disconcerting in terms of crash recovery.  We tried running
> with it off for a while (a year or so back), but had problems with
> corruption.  I think the specific cause of that has since been fixed,
> it's left us a bit leery of the option.
>
> Maybe someone else can speak to how safe (or not) the current
> implementation of that option is.
>
> -Kevin
>

Hi Scott,

good point - the current checkpoint completion target is a default
0.5, and it makes sense to set it to 0.8 to make writing more smooth,
great!

yes, data and xlog are separated, each one is sitting on an
independent storage LUN RAID1, and storage box is enough performant

Thank you!

Rgds,
-Dimitri


On 5/11/09, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Mon, May 11, 2009 at 10:31 AM, Dimitri <dimitrik.fr@gmail.com> wrote:
>> Hi Kevin,
>>
>> PostgreSQL: 8.3.7 & 8.4
>> Server: Sun M5000 32cores
>> OS: Solaris 10
>>
>> current postgresql.conf:
>>
>> #================================
>> max_connections = 2000                  # (change requires restart)
>> effective_cache_size = 48000MB
>> shared_buffers = 12000MB
>> temp_buffers = 200MB
>> work_mem = 100MB                                # min 64kB
>> maintenance_work_mem = 600MB            # min 1MB
>>
>> max_fsm_pages = 2048000
>> fsync = on                              # turns forced synchronization on
>> or off
>> synchronous_commit = off                # immediate fsync at commit
>> wal_sync_method = fdatasync
>> wal_buffers = 2MB
>> wal_writer_delay = 400ms                # 1-10000 milliseconds
>>
>> checkpoint_segments = 128
>> checkpoint_timeout = 30s
>
> What's your checkpoint completion target set to? Crank that up a bit
> ot 0.7, 0.8 etc and make the timeout more, not less.  That should
> help.
>
> Also, look into better hardware (RAID controller with battery backed
> cache) and also putting pg_xlog on a separate RAID-1 set (or RAID-10
> set if you've got a lot of drives under the postgres data set).
>

On Mon, 11 May 2009, Dimitri wrote:

> I've tried to reduce checkpoint timeout from 5min to 30sec - it helped,
> throughput is more stable now, but instead of big waves I have now short
> waves anyway..

Tuning for very tiny checkpoints all of the time is one approach here.
The other is to push up checkpoint_segments (done in your case),
checkpoint_timeout, and checkpoint_completion_target to as high as you
can, in order to spread the checkpoint period over as much time as
possible.  Reducing shared_buffers can also help in both cases, you've set
that to an extremely high value.

http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm is a
long discussion of just this topic, if you saw a serious change by
adjusting checkpoint_timeout than further experimentation in this area is
likely to help you out.

You might also want to look at the filesystem parameters you're using
under Solaris.  ZFS in particular can cache more writes than you may
expect, which can lead to that all getting pushed out at the very end of
checkpoint time.  That may very well be the source of your "waves", on a
system with 64GB of RAM for all we know *every* write you're doing between
checkpoints is being buffered until the fsyncs at the checkpoint end.
There were a couple of sessions at PG East last year that mentioned this
area, I put a summary of suggestions and links to more detail at
http://notemagnet.blogspot.com/2008/04/conference-east-08-and-solaris-notes.html

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
Scott Marlowe
Date:
On Mon, May 11, 2009 at 8:15 PM, Greg Smith <gsmith@gregsmith.com> wrote:

> http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm is a long
> discussion of just this topic, if you saw a serious change by adjusting
> checkpoint_timeout than further experimentation in this area is likely to
> help you out.

I highly recommend your article on the background writer.  Reading the
one on tuning the 8.2 bgw allowed me to make some changes to the
production servers at my last job that made a huge difference in
sustained tps on a logging server

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
Laurent Laborde
Date:
On Mon, May 11, 2009 at 6:31 PM, Dimitri <dimitrik.fr@gmail.com> wrote:
> Hi Kevin,
>
> PostgreSQL: 8.3.7 & 8.4
> Server: Sun M5000 32cores
> OS: Solaris 10
>
> current postgresql.conf:
>
> #================================
> max_connections = 2000                  # (change requires restart)

Are you sure about the 2000 connections ?
Why don't you use a pgbouncer or pgpool instead ?


--
F4FQM
Kerunix Flan
Laurent Laborde

It's just one of the test conditions - "what if there 2000 users?" - I
know I may use pgpool or others, but I also need to know the limits of
the database engine itself.. For the moment I'm limiting to 256
concurrent sessions, but config params are kept like for 2000 :-)

Rgds,
-Dimitri

On 5/12/09, Laurent Laborde <kerdezixe@gmail.com> wrote:
> On Mon, May 11, 2009 at 6:31 PM, Dimitri <dimitrik.fr@gmail.com> wrote:
>> Hi Kevin,
>>
>> PostgreSQL: 8.3.7 & 8.4
>> Server: Sun M5000 32cores
>> OS: Solaris 10
>>
>> current postgresql.conf:
>>
>> #================================
>> max_connections = 2000                  # (change requires restart)
>
> Are you sure about the 2000 connections ?
> Why don't you use a pgbouncer or pgpool instead ?
>
>
> --
> F4FQM
> Kerunix Flan
> Laurent Laborde
>

Re: What is the most optimal config parameters to keep stable write TPS ?..

From
bock@openit.de (Julian v. Bock)
Date:
Hi

>>>>> "D" == Dimitri  <dimitrik.fr@gmail.com> writes:

D> current postgresql.conf:

D> #================================
D> max_connections = 2000 # (change requires restart)
D> temp_buffers = 200MB

temp_buffers are kept per connection and not freed until the session
ends. If you use some kind of connection pooling this can eat up a lot
of ram that could be used for caching instead.

Regards,
Julian

Good point!  I missed it.. - will 20MB be enough?

Rgds,
-Dimitri

On 5/12/09, Julian v. Bock <bock@openit.de> wrote:
> Hi
>
>>>>>> "D" == Dimitri  <dimitrik.fr@gmail.com> writes:
>
> D> current postgresql.conf:
>
> D> #================================
> D> max_connections = 2000 # (change requires restart)
> D> temp_buffers = 200MB
>
> temp_buffers are kept per connection and not freed until the session
> ends. If you use some kind of connection pooling this can eat up a lot
> of ram that could be used for caching instead.
>
> Regards,
> Julian
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>