BUG #7801: Streaming failover checkpoints much slower than master, pg_xlog space problems during db load - Mailing list pgsql-bugs
From: briank@openroadtech.com
Subject: BUG #7801: Streaming failover checkpoints much slower than master, pg_xlog space problems during db load
Msg-id: E1TsemH-0004dK-KN@wrigleys.postgresql.org
Responses: Re: BUG #7801: Streaming failover checkpoints much slower than master, pg_xlog space problems during db load
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      7801
Logged by:          Brian Krznarich
Email address:      briank@openroadtech.com
PostgreSQL version: 9.2.2
Operating system:   RHEL6
Description:

I was told to resubmit this - the original submission (7777) may have been lost.

Simply stated, pg_xlog grows out of control on a streaming-replication backup server when there is a high volume of writes on the master server. This occurs only with checkpoint_completion_target > 0 and very large (e.g. 8GB) shared_buffers. pg_xlog on the master stays a fixed size (1.2GB for me).

Detail:

I have two new/fast servers using streaming replication, with 9.2.2 from the PostgreSQL yum repository. The servers are connected via a 10G network, so the failover receives data as fast as it is created.

It appears that to trigger this problem you need large shared_buffers (mine is set to 8GB) and checkpoint_completion_target > 0 (.8 or .9 will do). I have checkpoint_timeout=5min (though setting this to 30s *does not* alleviate the problem). The failover machine doesn't seem to be concerned with checkpoint_timeout or checkpoint_segments.

When doing large writes (like pg_restore) the failover machine's pg_xlog does not clear files. The reason appears to be that it is not checkpointing (verified by log_checkpoints=on and examining the log). If I run 'checkpoint;' manually on the failover in psql (hot_standby=on), it immediately cleans up pg_xlog (after a pause to actually perform the checkpoint).

To generate the data, you can just run

  pgbench -i -s1000

on the master server, then watch pg_xlog grow on the failover. This produces ~10GB of data, then creates indices. If I run du on the failover after the data is loaded, I see:

  # du -s base pg_xlog | sort -n
  10928276  pg_xlog
  13144536  base

I have a 10GB pg_xlog. The load completes in less than 5 minutes, which may be relevant here, since eventually the failover will perform a checkpoint.

On the primary server, pg_xlog caps at 1.2GB.

If checkpoint_completion_target=0, or shared_buffers=256MB, pg_xlog grows to around 1GB and stays there.

This guess may be off target, but it appears that the failover only checkpoints every approximately checkpoint_completion_target * checkpoint_timeout seconds. So, for a .9 completion target and the default 5min timeout, the failover will go for 4+ minutes without executing a single checkpoint, regardless of how much data arrives (verified with log_checkpoints=on). All the while the master server is spitting out 'checkpoints are occurring too frequently' messages, which I assume is to be expected during a reload.

This means that the failover machine needs 2x the database size to not crash during a database reload. We run VMs with limited disk allocation where this is not the case. We periodically need to dump/restore, and I'm concerned this simple operation will crash our failover machines. I can avoid the problem by setting checkpoint_completion_target=0, which I will do for now.

Here are settings that differ from the default initdb install:

  shared_buffers = 8GB
  work_mem = 64MB
  maintenance_work_mem = 256MB
  max_stack_depth = 8MB
  wal_level = hot_standby
  synchronous_commit = off
  wal_buffers = 64MB
  checkpoint_segments = 10
  checkpoint_completion_target = 0
  max_wal_senders = 2
  hot_standby = on
  effective_cache_size = 8GB
  log_checkpoints = on
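Condensed, the reproduction steps and the manual workaround described above look roughly like this; the host names, user, database name, and data directory path are placeholders, not my actual setup:

  # On the master: generate ~10GB of write traffic
  pgbench -i -s1000 -h master-host -U postgres postgres

  # On the failover: watch pg_xlog grow while the load runs
  # (repeat this while the load is in progress)
  cd /var/lib/pgsql/9.2/data && du -s base pg_xlog | sort -n

  # On the failover (hot_standby = on): force a checkpoint by hand;
  # pg_xlog is cleaned up once the checkpoint finishes
  psql -h standby-host -U postgres -c 'CHECKPOINT;'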