Thread: Postgres Replaying WAL slowly
Hello Everyone ...
We have six PG 9.1.12 installations: one master (Ubuntu 10.04), one Slony slave (Ubuntu 10.04), and four streaming replicas (two on Ubuntu 10.04 and two on RHEL 6.5 (Santiago), which are in a different datacenter). All the Ubuntu machines are in the same datacenter. The master ships WAL archives to the Slony slave.
This is the infrastructure description:
200 Mbit link between datacenters; ESX 5.5 on HP blade chassis, ProLiant Gen7 blades; Postgres servers dedicated to ESX hosts (no other VMs on those ESX hosts); 3PAR disk backends with 4 and 8 Gbit Fibre Channel connections; 10 Gbit Ethernet Virtual Connects on the HP chassis; Cisco fabric and network switches.
All Postgres installations are from Ubuntu/RHEL packages.
Everything worked fine until Thursday, when we had high load on the master; after that, every streaming replica lagged further and further behind the master, even at night and on the weekend when all server load is low. The Slony slave, however, is completely fine.
We thought it was a network issue, so we decided to copy the WAL files to local disk on one of the streaming replicas and replay them from there. After a PG restart it replays WAL at a good speed, about 3 seconds per WAL file, but as time goes on the speed decreases: we saw 30 seconds per WAL file, and the worst we got was 3 minutes to replay a single WAL file.
The rate of WAL produced by the master is normal, like usual. Also, on Thursday we had WAL files in pg_xlog on the streaming replica server, but no other WAL files.
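A quick way to quantify lag like this, as a sketch using the function names available in 9.1, is to compare WAL positions on the master and a standby:

-- On the master: current WAL write location
SELECT pg_current_xlog_location();

-- On a standby: last WAL location received from the master, and last replayed
SELECT pg_last_xlog_receive_location(), pg_last_xlog_replay_location();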
This is the configuration:
SELECT name, current_setting(name)
FROM pg_settings
WHERE source NOT IN ('default', 'override');
name | current_setting
--------------------------------+---------------------------------------------------
application_name | psql
archive_command | /var/lib/postgresql/scripts/wal_archive "%p" "%f"
archive_mode | on
checkpoint_completion_target | 0.7
checkpoint_segments | 30
client_encoding | UTF8
DateStyle | ISO, MDY
default_text_search_config | pg_catalog.english
effective_cache_size | 125GB
effective_io_concurrency | 3
external_pid_file | /var/run/postgresql/9.1-main.pid
hot_standby | on
hot_standby_feedback | on
lc_messages | en_US.UTF-8
lc_monetary | en_US.UTF-8
lc_numeric | en_US.UTF-8
lc_time | en_US.UTF-8
listen_addresses | *
log_checkpoints | on
log_connections | on
log_destination | csvlog
log_directory | pg_log
log_disconnections | on
log_filename | postgresql-%a.log
log_line_prefix | %t
log_lock_waits | on
log_rotation_age | 1d
log_rotation_size | 0
log_temp_files | 100kB
log_timezone | localtime
log_truncate_on_rotation | on
logging_collector | on
maintenance_work_mem | 1GB
max_connections | 750
max_locks_per_transaction | 900
max_pred_locks_per_transaction | 900
max_stack_depth | 2MB
max_wal_senders | 6
port | 5432
shared_buffers | 8GB
ssl | on
temp_buffers | 64MB
TimeZone | America/Chicago
unix_socket_directory | /var/run/postgresql
wal_keep_segments | 50
wal_level | hot_standby
work_mem | 256MB
(47 rows)
Thanks for any help.
Regards,
On 06/29/2014 11:14 AM, Soni M wrote:
> Everything worked fine until Thursday, when we had high load on the master; after that, every streaming replica lagged further and further behind the master, even at night and on the weekend when all server load is low. The Slony slave, however, is completely fine.

What does 'top' on the standby say? Is the startup process using 100% of (one) CPU replaying records, or is it waiting for I/O? How large is the database, does it fit in RAM? Any clues in the system or PostgreSQL logs?

- Heikki
top and sar say 100% CPU usage of one core, with no sign of I/O wait. The database is 1.5 TB in size. RAM on the master is 145 GB; on the slaves it differs, some have about 16 GB and others also have 145 GB.
There is nothing suspicious in the standby's Postgres log.
In the master's Postgres log:
WARNING,01000,"pgstat wait timeout",,,,,,,,,""
ERROR,57014,"canceling autovacuum task",,,,,"automatic vacuum of table ""consprod._consprod_replication.sl_event""",,,,""
ERROR,57014,"canceling statement due to statement timeout",,,,,,"
"PARSE",2014-06-26 00:39:35 CDT,91/0,0,ERROR,25P02,"current transaction is aborted, commands ignored until end of transaction block",,,,,,"select 1",,,""
"could not receive data from client: Connection reset by peer",,,,,,,,,""
The log files are big, anyway. If you can specify some pattern to look for in the logs, that would really help.
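With log_destination set to csvlog, one way to search the logs with arbitrary patterns is to load a log file into a table, a sketch following the csvlog table layout documented for 9.1 (the file path here is only an example):

CREATE TABLE postgres_log
(
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  application_name text,
  PRIMARY KEY (session_id, session_line_num)
);

-- Load one day's CSV log, then query it like any other table.
COPY postgres_log FROM '/var/lib/postgresql/9.1/main/pg_log/postgresql-Thu.csv' WITH csv;

SELECT log_time, error_severity, message
FROM postgres_log
WHERE error_severity IN ('ERROR', 'FATAL', 'PANIC')
ORDER BY log_time;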
Regards,
Here's some data from the standby (this is when replaying takes 44 seconds per WAL file):
2014-06-29 00:07:36.513 CDT,,,16682,,53af6f46.412a,44,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"restartpoint complete: wrote 63187 buffers (3.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=209.170 s, sync=0.482 s, total=209.667 s; sync files=644, longest=0.036 s, average=0.000 s",,,,,,,,,""
2014-06-29 00:07:36.513 CDT,,,16682,,53af6f46.412a,45,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"recovery restart point at 27CE/170056A8","last completed transaction was at log time 2014-06-27 13:39:00.542624-05",,,,,,,,""
2014-06-29 00:28:59.678 CDT,,,16682,,53af6f46.412a,47,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"restartpoint complete: wrote 70942 buffers (3.4%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=209.981 s, sync=0.493 s, total=210.486 s; sync files=723, longest=0.156 s, average=0.000 s",,,,,,,,,""
2014-06-29 00:28:59.678 CDT,,,16682,,53af6f46.412a,48,,2014-06-28 20:43:34 CDT,,0,LOG,00000,"recovery restart point at 27CE/35002678","last completed transaction was at log time 2014-06-27 13:42:05.121121-05",,,,,,,,""
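For tracking lag like this over time, a simple sketch using a function added in 9.1 reports how far behind the standby's replay is in wall-clock terms:

-- On the standby: age of the last replayed transaction
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;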
Regards,
On 06/29/2014 03:43 PM, Soni M wrote:
> top and sar say 100% CPU usage of one core, with no sign of I/O wait.

Hmm, I wonder what it's doing then... If you have "perf" installed on the system, you can do "perf top" to get a quick overview of where the CPU time is spent.

- Heikki
Here's what 'perf top' said on a streaming replica:
Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
95.97% postgres [.] 0x00000000002210f3
0.41% perf [.] 0x000000000005f225
0.39% libc-2.12.so [.] __strstr_sse2
0.22% libc-2.12.so [.] memchr
0.22% [kernel] [k] kallsyms_expand_symbol
0.18% perf [.] symbols__insert
0.18% [kernel] [k] format_decode
0.15% libc-2.12.so [.] __GI___strcmp_ssse3
0.13% [kernel] [k] string
0.12% [kernel] [k] number
0.12% [kernel] [k] vsnprintf
0.12% libc-2.12.so [.] _IO_vfscanf
0.11% perf [.] dso__find_symbol
0.11% [kernel] [k] _spin_unlock_irqrestore
0.10% perf [.] hex2u64
0.10% postgres [.] hash_search_with_hash_value
0.09% perf [.] rb_next
0.08% libc-2.12.so [.] memcpy
0.07% libc-2.12.so [.] __strchr_sse2
0.07% [kernel] [k] clear_page
0.06% [kernel] [k] strnlen
0.05% perf [.] perf_evsel__parse_sample
0.05% perf [.] rb_insert_color
0.05% [kernel] [k] pointer
Regards,
On 06/30/2014 05:46 PM, Soni M wrote:
> Here's what 'perf top' said on a streaming replica:
>
> Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
>  95.97% postgres [.] 0x00000000002210f3

Ok, so it's stuck doing something... Can you get a build with debug symbols installed, so that we could see the function name?

- Heikki
On Jun 30, 2014, at 9:14 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> Ok, so it's stuck doing something... Can you get a build with debug symbols installed, so that we could see the function name?

Looks like StandbyReleaseLocks:

Samples: 10K of event 'cpu-clock', Event count (approx.): 8507
 89.21% postgres [.] StandbyReleaseLocks
  0.89% libc-2.12.so [.] __strstr_sse2
  0.83% perf [.] 0x000000000005f1e5
  0.74% [kernel] [k] kallsyms_expand_symbol
  0.52% libc-2.12.so [.] memchr
  0.47% perf [.] symbols__insert
  0.47% [kernel] [k] format_decode
On 2014-06-30 19:14:24 +0300, Heikki Linnakangas wrote:
> Ok, so it's stuck doing something... Can you get a build with debug symbols installed, so that we could see the function name?

My guess is it's a spinlock, probably xlogctl->info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles...

What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name.

Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load?

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Soni: Do you use Hot Standby? Are there connections active while you
> have that problem? Any other processes with high cpu load?
It is

 96.62% postgres [.] StandbyReleaseLocks

as Jeff said. It runs for quite a long time, more than 5 minutes I think.

We also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the postgres process above is the only process consuming high CPU load.
Regards,
On Jun 30, 2014, at 10:29 AM, Soni M <diptatapa@gmail.com> wrote:
> It is
>
>  96.62% postgres [.] StandbyReleaseLocks
>
> as Jeff said. It runs for quite a long time, more than 5 minutes I think.
>
> We also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the postgres process above is the only process consuming high CPU load.
Compiling with -fno-omit-frame-pointer doesn't yield much more info:
76.24% postgres [.] StandbyReleaseLocks
2.64% libcrypto.so.1.0.1e [.] md5_block_asm_data_order
2.19% libcrypto.so.1.0.1e [.] RC4
2.17% postgres [.] RecordIsValid
1.20% [kernel] [k] copy_user_generic_unrolled
1.18% [kernel] [k] _spin_unlock_irqrestore
0.97% [vmxnet3] [k] vmxnet3_poll_rx_only
0.87% [kernel] [k] __do_softirq
0.77% [vmxnet3] [k] vmxnet3_xmit_frame
0.69% postgres [.] hash_search_with_hash_value
0.68% [kernel] [k] fin
However, this server started progressing through the WAL files quite a bit better before I finished compiling, so we'll leave it running with this version and see if there's more info available the next time it starts replaying slowly.
On 2014-06-30 11:34:52 -0700, Jeff Frost wrote:
> Compiling with -fno-omit-frame-pointer doesn't yield much more info:

You'd need to do perf record -ga instead of perf record -a to see additional information. But:

> 76.24% postgres [.] StandbyReleaseLocks

already is quite helpful.

What are you doing on that system? Is there anything requiring large amounts of access exclusive locks on the primary? Possibly large amounts of temporary relations?

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2014-06-30 20:34 GMT+02:00 Jeff Frost <jeff@pgexperts.com>:
> On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> My guess is it's a spinlock, probably xlogctl->info_lck via
>> RecoveryInProgress(). Unfortunately inline assembler doesn't always seem
>> to show up correctly in profiles...
For this kind of issue, SystemTap or DTrace can be useful:

http://postgres.cz/wiki/Monitorov%C3%A1n%C3%AD_lwlocku_pomoc%C3%AD_systemtapu

There you can identify which locking is the problem - please use Google Translate.
Regards
Pavel
On Jun 30, 2014, at 11:39 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> You'd need to do perf record -ga instead of perf record -a to see
> additional information.

Ah! That's right. Here's how that looks:

Samples: 473K of event 'cpu-clock', Event count (approx.): 473738
+ 68.42% init     [kernel.kallsyms]   [k] native_safe_halt
+ 26.07% postgres postgres            [.] StandbyReleaseLocks
+  2.82% swapper  [kernel.kallsyms]   [k] native_safe_halt
+  0.19% ssh      libcrypto.so.1.0.1e [.] md5_block_asm_data_order
+  0.19% postgres postgres            [.] RecordIsValid
+  0.16% ssh      libcrypto.so.1.0.1e [.] RC4
+  0.10% postgres postgres            [.] hash_search_with_hash_value
+  0.06% postgres [kernel.kallsyms]   [k] _spin_unlock_irqrestore
+  0.05% init     [vmxnet3]           [k] vmxnet3_poll_rx_only
+  0.04% postgres [kernel.kallsyms]   [k] copy_user_generic_unrolled
+  0.04% init     [kernel.kallsyms]   [k] finish_task_switch
+  0.04% init     [kernel.kallsyms]   [k] __do_softirq
+  0.04% ssh      [kernel.kallsyms]   [k] _spin_unlock_irqrestore
+  0.04% ssh      [vmxnet3]           [k] vmxnet3_xmit_frame
+  0.03% postgres postgres            [.] PinBuffer
+  0.03% init     [vmxnet3]           [k] vmxnet3_xmit_frame
+  0.03% ssh      [kernel.kallsyms]   [k] copy_user_generic_unrolled
+  0.03% postgres postgres            [.] XLogReadBufferExtended
+  0.03% ssh      ssh                 [.] 0x000000000002aa07
+  0.03% init     [kernel.kallsyms]   [k] _spin_unlock_irqrestore
+  0.03% ssh      [vmxnet3]           [k] vmxnet3_poll_rx_only
+  0.02% ssh      [kernel.kallsyms]   [k] __do_softirq
+  0.02% postgres libc-2.12.so        [.] _wordcopy_bwd_dest_aligned
+  0.02% postgres postgres            [.] mdnblocks
+  0.02% ssh      libcrypto.so.1.0.1e [.] 0x00000000000e25a1
+  0.02% scp      [kernel.kallsyms]   [k] copy_user_generic_unrolled
+  0.02% ssh      libc-2.12.so        [.] memcpy
+  0.02% postgres libc-2.12.so        [.] memcpy

> But:
>
>> 76.24% postgres [.] StandbyReleaseLocks
>
> already is quite helpful.
>
> What are you doing on that system? Is there anything requiring large
> amounts of access exclusive locks on the primary? Possibly large amounts
> of temporary relations?

The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas seemed able to keep up with that just fine. Hopefully Soni can answer whether that has increased significantly since May.
On Jun 30, 2014, at 12:17 PM, Jeff Frost <jeff@pgexperts.com> wrote:
>> What are you doing on that system? Is there anything requiring large
>> amounts of access exclusive locks on the primary? Possibly large amounts
>> of temporary relations?
>
> The last time we did a 100% logging run, the peak temp table creation was something like 120k/hr, but the replicas seemed able to keep up with that just fine.
Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode:
mode | count
--------------------------+-------
AccessExclusiveLock | 11
AccessShareLock | 2089
ExclusiveLock | 46
RowExclusiveLock | 81
RowShareLock | 17
ShareLock | 4
ShareUpdateExclusiveLock | 5
Seems to be relatively consistent. Of course, it's hard to say what it looked like back when the issue began.
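A summary like the one above can be produced with a simple aggregate over pg_locks; the exact query used isn't shown in the thread, but presumably something like:

SELECT mode, count(*)
FROM pg_locks
GROUP BY mode
ORDER BY count(*) DESC;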
Jeff Frost <jeff@pgexperts.com> writes:
> Sampling pg_locks on the primary shows ~50 locks with ExclusiveLock mode:

That's not too helpful if you don't pay attention to what the lock is on; it's likely that all the ExclusiveLocks are on transactions' own XIDs, which isn't relevant to the standby's behavior. The AccessExclusiveLocks are probably interesting though --- you should look to see what those are on.

regards, tom lane
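Tom's point can be checked directly by breaking the counts down by locktype as well (a sketch):

SELECT locktype, mode, count(*)
FROM pg_locks
GROUP BY locktype, mode
ORDER BY count(*) DESC;

-- ExclusiveLock rows whose locktype is 'transactionid' or 'virtualxid' are the
-- self-locks every transaction holds on its own XID; only locktype 'relation'
-- entries are of interest here.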
On Jun 30, 2014, at 12:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That's not too helpful if you don't pay attention to what the lock is on;
> it's likely that all the ExclusiveLocks are on transactions' own XIDs,
> which isn't relevant to the standby's behavior. The AccessExclusiveLocks
> are probably interesting though --- you should look to see what those
> are on.

You're right about the ExclusiveLocks. Here's how the AccessExclusiveLocks look:

 locktype | database |  relation  | page | tuple | virtualxid | transactionid | classid |   objid    | objsubid | virtualtransaction |  pid  |        mode         | granted
----------+----------+------------+------+-------+------------+---------------+---------+------------+----------+--------------------+-------+---------------------+---------
 relation |   111285 | 3245291551 |      |       |            |               |         |            |          | 233/170813         | 23509 | AccessExclusiveLock | t
 relation |   111285 | 3245292820 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292833 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245287874 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292836 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245292774 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292734 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292827 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245288540 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292773 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292775 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292743 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292751 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245288669 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292817 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245288657 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 object   |   111285 |            |      |       |            |               |    2615 | 1246019760 |        0 | 233/170813         | 23509 | AccessExclusiveLock | t
 relation |   111285 | 3245292746 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245287876 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
 relation |   111285 | 3245292739 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292826 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292825 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245292832 |      |       |            |               |         |            |          | 173/1723993        | 23407 | AccessExclusiveLock | t
 relation |   111285 | 3245292740 |      |       |            |               |         |            |          | 5/22498235         | 23427 | AccessExclusiveLock | t
 relation |   111285 | 3245287871 |      |       |            |               |         |            |          | 133/3818415        | 23348 | AccessExclusiveLock | t
(25 rows)

And if you go fishing in pg_class for any of the oids, you don't find anything:

SELECT s.procpid, s.query_start, n.nspname, c.relname, l.mode, l.granted, s.current_query
FROM pg_locks l, pg_class c, pg_stat_activity s, pg_namespace n
WHERE l.relation = c.oid
  AND l.pid = s.procpid
  AND c.relnamespace = n.oid
  AND l.mode = 'AccessExclusiveLock';

 procpid | query_start | nspname | relname | mode | granted | current_query
---------+-------------+---------+---------+------+---------+---------------
(0 rows)

Temp tables maybe?
On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost <jeff@pgexperts.com> wrote:
> And if you go fishing in pg_class for any of the oids, you don't find anything:
That is probably because you are connected to the wrong database. Once you connect to the database of interest, you don't even need to query pg_class; just cast the relation attribute to regclass:
SELECT relation::regclass, ...
FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());
Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira <matioli.matheus@gmail.com> wrote:
> That is probably because you are connected to the wrong database. Once you connect to the database of interest, you don't even need to query pg_class; just cast the relation attribute to regclass:
>
> SELECT relation::regclass, ...
> FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());
Yeah, I thought about that too, but I verified I am in the correct DB. Just for clarity's sake:
SELECT relation::regclass
FROM pg_locks WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database()) and mode = 'AccessExclusiveLock';
relation
------------
3245508214
3245508273
3245508272
3245508257
3245508469
3245508274
3245508373
3245508468
3245508210
3245508463
3245508205
3245508260
3245508265
3245508434
(16 rows)
On 2014-06-30 12:57:56 -0700, Jeff Frost wrote:
> Yeah, I thought about that too, but I verified I am in the correct DB. Just for clarity's sake:

So these are probably relations created in uncommitted transactions. Possibly ON COMMIT DROP temp tables?

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Jun 30, 2014, at 1:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> So these are probably relations created in uncommitted
> transactions. Possibly ON COMMIT DROP temp tables?
That would make sense. There are definitely quite a few of those being used.
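Andres's hypothesis can be tested from the primary with a sketch like the following: relations created by still-uncommitted transactions have no pg_class row visible to other sessions, so their lock entries fail to resolve.

SELECT (c.oid IS NULL) AS no_visible_pg_class_row, count(*)
FROM pg_locks l
LEFT JOIN pg_class c ON c.oid = l.relation
WHERE l.mode = 'AccessExclusiveLock'
  AND l.locktype = 'relation'
GROUP BY 1;

-- Locks whose relation has no visible pg_class row most likely belong to
-- uncommitted transactions, e.g. ON COMMIT DROP temp tables.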
Another item of note is that the system catalogs are quite bloated:
schemaname | tablename | tbloat | wastedmb | idxbloat | wastedidxmb
------------+--------------+--------+----------+----------+-------------
pg_catalog | pg_attribute | 3945 | 106.51 | 2770 | 611.28
pg_catalog | pg_class | 8940 | 45.26 | 4420 | 47.89
pg_catalog | pg_type | 4921 | 18.45 | 5850 | 81.16
pg_catalog | pg_depend | 933 | 10.23 | 11730 | 274.37
pg_catalog | pg_index | 3429 | 8.36 | 3920 | 24.24
pg_catalog | pg_shdepend | 983 | 2.67 | 9360 | 30.56
(6 rows)
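The raw on-disk sizes behind those bloat estimates can be cross-checked directly (a sketch):

SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname IN ('pg_attribute', 'pg_class', 'pg_type',
                  'pg_depend', 'pg_index', 'pg_shdepend')
ORDER BY pg_total_relation_size(oid) DESC;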
Would that cause the replica to spin on StandbyReleaseLocks?
Jeff Frost <jeff@pgexperts.com> writes:
> On Jun 30, 2014, at 1:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> So these are probably relations created in uncommitted
>> transactions. Possibly ON COMMIT DROP temp tables?

> That would make sense. There are definitely quite a few of those being used.

Uh-huh. I doubt that the mechanism that handles propagation of AccessExclusiveLocks to the standby is smart enough to ignore locks on temp tables :-(

> Another item of note is that the system catalogs are quite bloated:
> Would that cause the replica to spin on StandbyReleaseLocks?

AFAIK, no. It's an unsurprising consequence of heavy use of short-lived temp tables though.

So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more?

regards, tom lane
On Jun 30, 2014, at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Another item of note is that the system catalogs are quite bloated:
>> Would that cause the replica to spin on StandbyReleaseLocks?
>
> AFAIK, no. It's an unsurprising consequence of heavy use of short-lived
> temp tables though.

Yah, this has been an issue in the past, so we tend to cluster them regularly during off-hours to minimize the issue.

> So it seems like we have a candidate explanation. I'm a bit surprised
> that StandbyReleaseLocks would get this slow if there are only a dozen
> AccessExclusiveLocks in place at any one time, though. Perhaps that
> was a low point and there are often many more?

Entirely possible that it was a low point. We'll set up some monitoring to track the number of AccessExclusiveLocks and see how much variance there is throughout the day.
On Jun 30, 2014, at 1:46 PM, Jeff Frost <jeff@pgexperts.com> wrote:
>> So it seems like we have a candidate explanation. I'm a bit surprised
>> that StandbyReleaseLocks would get this slow if there are only a dozen
>> AccessExclusiveLocks in place at any one time, though. Perhaps that
>> was a low point and there are often many more?
>
> Entirely possible that it was a low point. We'll set up some monitoring to track the number of AccessExclusiveLocks and see how much variance there is throughout the day.
Since we turned on the monitoring for that, we had a peak of 13,550 AccessExclusiveLocks. So far most of the samples have been in the double digits, with that and two other outliers: 6,118 and 12,747.
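A minimal version of such monitoring, sketched here with a hypothetical table (Jeff describes his actual setup later in the thread), could be:

CREATE TABLE lock_samples (
    sample_time      timestamptz NOT NULL DEFAULT now(),
    access_exclusive integer     NOT NULL
);

-- Run once a minute, e.g. from cron:
INSERT INTO lock_samples (access_exclusive)
SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock';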
Jeff Frost <jeff@pgexperts.com> writes:
> Since we turned on the monitoring for that, we had a peak of 13,550
> AccessExclusiveLocks.

Ah ... that's more like a number I can believe something would have trouble coping with. Did you see a noticeable slowdown with this? Now that we've seen that number, of course it's possible there was an even higher peak occurring when you saw the trouble.

Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe it just takes awhile to handle that many locks. Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion?

regards, tom lane
On 2014-06-30 19:04:20 -0400, Tom Lane wrote:
> Jeff Frost <jeff@pgexperts.com> writes:
>> Since we turned on the monitoring for that, we had a peak of 13,550
>> AccessExclusiveLocks.

Any chance the workload also uses lots of subtransactions?

> Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe
> it just takes awhile to handle that many locks.

I don't think there's a O(n^2) in StandbyReleaseLocks() itself, but in combination with StandbyReleaseLockTree() it looks possibly bad. The latter will call StandbyReleaseLocks() for every xid/subxid, and each of the StandbyReleaseLocks() will then trawl the entire RecoveryLockList...

It'd probably be better to implement ReleaseLocksTree() by sorting the subxid list and bsearching that while iterating RecoveryLockList.

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ah ... that's more like a number I can believe something would have
> trouble coping with. Did you see a noticeable slowdown with this?
> Now that we've seen that number, of course it's possible there was an
> even higher peak occurring when you saw the trouble.
>
> Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe
> it just takes awhile to handle that many locks.
>
> Did you check whether the locks were all on temp tables of the
> ON COMMIT DROP persuasion?
Unfortunately not, because I went for a poor man's: SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock'
run in cron every minute.
That said, I'd bet it was mostly ON COMMIT DROP temp tables.
The unfortunate thing is I wouldn't know how to correlate that spike with the corresponding slowdown because the replica is about 5.5hrs lagged at the moment.
Hopefully it will get caught up tonight and we can see if there's a correlation tomorrow.
On Jun 30, 2014, at 4:57 PM, Jeff Frost <jeff@pgexperts.com> wrote:
> Hopefully it will get caught up tonight and we can see if there's a correlation tomorrow.
And indeed it did catch up overnight and the lag increased shortly after a correlating spike in AccessExclusiveLocks that were generated by temp table creation with on commit drop.
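For context, the implicated workload shape boils down to transactions doing something like this (a sketch with hypothetical names):

BEGIN;
-- Creating the table takes an AccessExclusiveLock, which is WAL-logged and
-- must later be released by StandbyReleaseLocks() on every hot standby.
CREATE TEMP TABLE scratch (id int, payload text) ON COMMIT DROP;
INSERT INTO scratch SELECT g, 'x' FROM generate_series(1, 100) g;
COMMIT;  -- the temp table is dropped, and its lock released, here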
Jeff Frost <jeff@pgexperts.com> writes:
>> On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Did you check whether the locks were all on temp tables of the
>>> ON COMMIT DROP persuasion?

> And indeed it did catch up overnight and the lag increased shortly after a correlating spike in AccessExclusiveLocks that were generated by temp table creation with on commit drop.

OK, so we have a pretty clear idea of where the problem is now.

It seems like there are three, not mutually exclusive, ways we might address this:

1. Local revisions inside StandbyReleaseLocks to make it perform better in the presence of many locks. This would only be likely to improve matters much if there's a fixable O(N^2) algorithmic issue; but there might well be one.

2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on the grounds that no standby should be touching them. I'm not entirely sure that that argument is bulletproof though; in particular, even though a standby couldn't access the table's data, it's possible that it would be interested in seeing consistent catalog entries.

3. Avoid WAL-logging AccessExclusiveLocks associated with new-in-transaction tables, temp or not, on the grounds that no standby could even see such tables until they're committed. We could go a bit further and not take out any locks on a new-in-transaction table in the first place, on the grounds that other transactions on the master can't see 'em either.

It sounded like Andres had taken a preliminary look at #1 and found a possible avenue for improvement, which I'd encourage him to pursue.

For both #2 and the conservative version of #3, the main implementation problem would be whether the lock WAL-logging code has cheap access to the necessary information. I suspect it doesn't.

The radical version of #3 might be pretty easy to do, at least to the extent of removing locks taken out during CREATE TABLE. I suspect there are some assertions or other consistency checks that would get unhappy if we manipulate relations without locks, though, so those would have to be taught about the exception. Also, we sometimes forget new-in-transaction status during relcache flush events; it's not clear if that would be a problem for this.

I don't plan to work on this myself, but perhaps someone with more motivation will want to run with these ideas.

regards, tom lane
On 2014-07-01 15:20:37 -0400, Tom Lane wrote:
> 1. Local revisions inside StandbyReleaseLocks to make it perform better in
> the presence of many locks. This would only be likely to improve matters
> much if there's a fixable O(N^2) algorithmic issue; but there might well
> be one.
>
> It sounded like Andres had taken a preliminary look at #1 and found a
> possible avenue for improvement, which I'd encourage him to pursue.

I don't have the resources to do this right now, but yes, I think we can relatively easily get rid of the O(num_locks * num_subtransactions) behaviour.

> 2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on
> the grounds that no standby should be touching them. I'm not entirely
> sure that that argument is bulletproof though; in particular, even though
> a standby couldn't access the table's data, it's possible that it would be
> interested in seeing consistent catalog entries.

Hm. We definitely perform checks surprisingly late for those. It's possible to do SELECT * FROM pg_temp_<nn>.whatever; without an error if there's no rows or if the rest of the plan doesn't do accesses to that table. The check prohibiting access is only in bufmgr.c... So yea, I don't think we can do this for at least < 9.4. And there it'll still be hard.

> For both #2 and the conservative version of #3, the main implementation
> problem would be whether the lock WAL-logging code has cheap access to
> the necessary information. I suspect it doesn't.

Not trivially. It's logged directly in LockAcquireExtended(). We could add the information into locktags as there's unused fields for relation locktags, but brrr.

> Also, we sometimes forget new-in-transaction
> status during relcache flush events; it's not clear if that would be a
> problem for this.

I think that hole is actually pluggable in newer releases - at least there's no code around that assumes rd_createSubid now is persistent, even across cache resets.

But I think more importantly it's probably quite possible to hit a similar problem without ON COMMIT DROP relations. Say DISCARD TEMP inside a transaction (with several subxacts) or so? So we probably really should fix the bad scaling.

Greetings,

Andres Freund

--
Andres Freund    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> But I think more importantly it's probably quite possible to hit a
> similar problem without ON COMMIT DROP relations. Say DISCARD TEMP
> inside a transaction (with several subxacts) or so? So we probably really
> should fix the bad scaling.

Well, my thought was that these approaches would address somewhat different sets of use-cases, and we might well want to do more than one. Even if StandbyReleaseLocks were zero-cost, not emitting the WAL in the first place is surely considerably cheaper yet.

regards, tom lane
On 1 July 2014 20:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't plan to work on this myself, but perhaps someone with more
> motivation will want to run with these ideas.

I was planning to work on improving performance of replication apply over the summer, mid July - Aug, so I'll add this to the list.

--
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 1 July 2014 20:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on
> the grounds that no standby should be touching them. I'm not entirely
> sure that that argument is bulletproof though; in particular, even though
> a standby couldn't access the table's data, it's possible that it would be
> interested in seeing consistent catalog entries.

Patch implements option 2 in the above.

Skipping the locks entirely seems like it opens a can of worms. Skipping the lock for temp tables is valid since locks don't need to exist on the standby. Any catalog entries for them will exist, but the rows will show them as temp and nobody would expect them to be valid outside of the original session.

Patch implements a special case that takes the lock normally, but skips WAL logging the lock info.

--
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services