Tuning checkpoint_segments and checkpoint_timeout. - Mailing list pgsql-admin

From: Jesper Krogh
Subject: Tuning checkpoint_segments and checkpoint_timeout.
Msg-id: 500EFE5D.50000@krogh.cc
List: pgsql-admin
Hi.

I have a 1.8TB PG database on which we're doing "fairly heavy" batch
updates: say 2/3 of the database monthly, in a background batch process.
The system is working really well and performing well, but we're always
hunting for more speed (and a smaller volume of WAL).

So I tried to look into how frequently it does checkpoints during these
batch updates, since, as I understand it, the first time a page is modified
after a checkpoint a full-page image has to be written to the log, so less
frequent checkpoints should result in a smaller log volume from the system.
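
For reference, the checkpoint output below comes from having log_checkpoints
turned on in postgresql.conf; the checkpoint_warning line is just the default,
shown for context:

log_checkpoints = on                    # log each checkpoint start/complete
checkpoint_warning = 30s                # default; warn if xlog checkpoints come closer together than this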

I can cope with significant downtime to replay logs in the event of a crash
(an hour shouldn't be an issue, as long as it is rare).

So, this is the output from log_checkpoints this morning:

2012-07-22 07:00:16 CEST LOG:  checkpoint starting: xlog
2012-07-22 07:03:22 CEST LOG:  checkpoint complete: wrote 399040 buffers
(19.0%); 0 transaction log file(s) added, 0 removed, 256 recycled;
write=133.317 s, sync=52.373 s, total=185.793 s
2012-07-22 07:03:50 CEST LOG:  checkpoint starting: xlog
2012-07-22 07:06:39 CEST LOG:  checkpoint complete: wrote 369302 buffers
(17.6%); 0 transaction log file(s) added, 0 removed, 256 recycled;
write=120.177 s, sync=48.685 s, total=169.003 s
2012-07-22 07:07:20 CEST LOG:  checkpoint starting: xlog
2012-07-22 07:10:00 CEST LOG:  checkpoint complete: wrote 386087 buffers
(18.4%); 0 transaction log file(s) added, 0 removed, 256 recycled;
write=109.709 s, sync=50.065 s, total=159.886 s

I have checkpoint_segments set to 256 and checkpoint_timeout set to 1h ..

Every 3 minutes seemed really frequent, and often it didn't even get to
sleep before starting over again. So I bumped checkpoint_segments to 512
and set the bgwriter parameters more aggressively:
bgwriter_delay = 100ms                  # 10-10000ms between rounds
bgwriter_lru_maxpages = 500             # 0-1000 max buffers written/round
bgwriter_lru_multiplier = 4.0           # 0-10.0 multiplier on buffers scanned/round

and that did change it a bit:
2012-07-24 21:27:27 CEST LOG:  checkpoint starting: xlog
2012-07-24 21:31:57 CEST LOG:  checkpoint complete: wrote 318368 buffers
(15.2%); 0 transaction log file(s) added, 0 removed, 512 recycled;
write=149.180 s, sync=120.061 s, total=269.563 s
2012-07-24 21:34:39 CEST LOG:  checkpoint starting: xlog
2012-07-24 21:39:11 CEST LOG:  checkpoint complete: wrote 395183 buffers
(18.8%); 0 transaction log file(s) added, 0 removed, 512 recycled;
write=149.734 s, sync=121.448 s, total=271.633 s
2012-07-24 21:41:24 CEST LOG:  checkpoint starting: xlog
2012-07-24 21:46:13 CEST LOG:  checkpoint complete: wrote 352612 buffers
(16.8%); 0 transaction log file(s) added, 0 removed, 512 recycled;
write=157.306 s, sync=131.687 s, total=289.080 s

But this only seemed to be due to the increase in checkpoint_segments.
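
One way to separate the bgwriter effect from the effect of the larger
checkpoint_segments - this is just a sketch, I haven't collected the numbers
yet - would be to sample pg_stat_bgwriter before and after a batch run:

-- buffers written by checkpoints vs. the background writer vs. backends
SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_clean, buffers_backend
  FROM pg_stat_bgwriter;

If buffers_checkpoint dominates and checkpoints_req keeps climbing, the
checkpoints are still xlog-driven and the bgwriter settings shouldn't
matter much.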

Can I just crank checkpoint_segments further up? Or what is the downside
of doing that?
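
My rough understanding of the downsides - please correct me if this is off -
is disk space in pg_xlog and crash-recovery time. Back-of-the-envelope, with
16MB segments and assuming checkpoint_completion_target is still at its 0.5
default:

max pg_xlog ~= (2 + checkpoint_completion_target) * checkpoint_segments + 1 segments
            ~= (2 + 0.5) * 512 + 1  ~= 1281 segments ~= 20 GB   (current setting)
            ~= (2 + 0.5) * 1024 + 1 ~= 2561 segments ~= 40 GB   (if doubled again)

and crash recovery has to replay whatever was written since the last completed
checkpoint, so on the order of checkpoint_segments * 16MB (~8GB at 512), which
I already said I can live with.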

Hardware is 25x Intel 320 SSDs in RAID 50 with BBWC. The above load gives
a sustained write throughput of about 30MB/s and an IO-wait load of about
1-2 cores (not saturated in any way; queries still perform fine).

I also tried to get a feel for the log volume, and the above change didn't
have any huge effect; it is sitting at about 1 logfile/s.
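
(1 logfile/s with the default 16MB segments is roughly 16MB/s of WAL.) A more
direct way to measure it - sketched here, and pg_xlog_location_diff() needs
9.2; on older versions the two positions have to be subtracted by hand - is to
note the WAL insert position before and after a batch:

-- before the batch
SELECT pg_current_xlog_location();      -- say it returns '1A2B/3C4D5E60'
-- ... batch runs ...
-- bytes of WAL generated since, converted to MB
SELECT pg_xlog_location_diff(pg_current_xlog_location(), '1A2B/3C4D5E60')
       / (1024 * 1024) AS wal_mb;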

A big part of the update updates different parts of the dataset, so I
wouldn't expect those to hit the same page within a 3-minute period, but
another big part is about updating a big GIN index, and that would most
likely get some benefit if the checkpoint interval could be pushed higher.

Can the log volume be decreased by tuning in this area?

--
Jesper
