Thread: WAL tuning advice

WAL tuning advice

From
Joseph Marlin
Date:
I'm hoping someone can help me figure out how to better tune my WAL parameters. Right now I am seeing WAL activity that
isfaster than I think it should be - we average about 30 write queries a second at our peak, and yet we're getting a
newWAL file every 15-20 seconds.  


Our postgresql.conf parameters:
wal_level = hot_standby                 # minimal, archive, or hot_standby
checkpoint_segments = 10                # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min               # range 30s-1h
checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s               # 0 disables


Here is "select * from pg_stat_bgwriter;"

checkpoints_timed: 2081;
checkpoints_req: 18889;
buffers_checkpoint: 4320036;
buffers_clean: 416180117;
maxwritten_clean: 1580518;
buffers_backend: 77676446;
bueffers_backend_fsync: 0;
buffers_alloc: 4804432940;
stats_reset: "2012-06-24 23:13:10.716096-04"


From my previous research, I *think* that that data implies:
1) Due to high ratio of checkpoints_req to checkpoints_timed, I should increase checkpoints_segments, maybe from 10 to
15? 
2) There are only about 206 buffers (buffers_checkpoint/(checkpoints_timed+checkpoints_req)) for each checkpoint. I
thinkthat is far lower than it should be, but I'm not positive about that, and I don't know what would increase it?  
3) maxwitten_clean should be much lower than that?


Thanks for your help!

Re: WAL tuning advice

From
Tom Lane
Date:
Joseph Marlin <jmarlin@saucontech.com> writes:
> I'm hoping someone can help me figure out how to better tune my WAL parameters. Right now I am seeing WAL activity
thatis faster than I think it should be - we average about 30 write queries a second at our peak, and yet we're getting
anew WAL file every 15-20 seconds.  

You would be better off asking about this on pgsql-performance, but yes,
the first thing you should try is boosting checkpoint_segments and
checkpoint_timeout.  The longer the interval between checkpoints, the lower
the WAL volume will be, because of reduction of full-page-image updates.
The limiting factor of course is how long a replay you can stand after
a database crash.

> 1) Due to high ratio of checkpoints_req to checkpoints_timed, I should increase checkpoints_segments, maybe from 10
to15?  

Right now, at one segment every 15 seconds, "10" is forcing a checkpoint
every 150 seconds.  You probably want at least four or five times that
interval if you want to make a serious dent in the WAL volume.  And
checkpoint_timeout has to increase enough to allow that, of course.

            regards, tom lane