Re: [ADMIN] High replication lag - Stream Replication - Mailing list pgsql-admin

From Shreeyansh Dba
Subject Re: [ADMIN] High replication lag - Stream Replication
Date
Msg-id CAGDYbUOP4AAxFAmdm1LpxJDJfGkCzCLoTnosqYksriL=77eSqA@mail.gmail.com
Whole thread Raw
In response to [ADMIN] High replication lag - Stream Replication  (Marcelo Kruger <marcelo.kruger@neoway.com.br>)
Responses Re: [ADMIN] High replication lag - Stream Replication  (Marcelo Kruger <marcelo.kruger@neoway.com.br>)
List pgsql-admin


On Wed, Nov 8, 2017 at 7:08 PM, Marcelo Kruger <marcelo.kruger@neoway.com.br> wrote:
Good afternoon,

I have two servers hosted on Azure with PostgreSQL 9.6 databases. One database is production, the other is standby.
For replication I use stream replication between servers. However the application of the archives in the standby database is slow, always generating replication lag.

The replication lag increases considerably when queries are performed on the StandBy database. I wonder if there is any setting that can be made to improve this performance.

Server Production
SO: CentOS Linux release 7.4.1708 (Core) 64Bits
CPU: Intel(R) Xeon(R) CPU E5-2698B v3 @ 2.00GHz - 15 Cores
Memory: 240GB
I/O: 17TB RAID0 (17 Disks (1TB for disk) SSD with read and write cache enabled)

Server StandBy
SO: CentOS Linux release 7.4.1708 (Core) 64Bits
CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz - 20 Cores
Memory: 140GB
I/O: 22TB RAID0 (22 Disks (1TB for disk) SSD with read and write cache enabled)

Configuration PostgreSQL Production

postgres.conf

listen_addresses = '*' 
port = 5433 
max_connections = 2000
superuser_reserved_connections = 3 
shared_buffers = 56GB 
work_mem = 29360kB
maintenance_work_mem = 4GB 
autovacuum_work_mem = 8GB 
max_stack_depth = 5MB 
dynamic_shared_memory_type = posix 
bgwriter_delay = 10ms 
bgwriter_lru_maxpages = 700 
bgwriter_lru_multiplier = 2.0 
fsync = off 
synchronous_commit = on 
full_page_writes = off 
wal_buffers = 1500MB 
wal_writer_delay = 1ms 
checkpoint_timeout = 10min 
max_wal_size = 4GB
min_wal_size = 2GB
checkpoint_completion_target = 0.9 
effective_cache_size = 168GB
default_statistics_target = 500
log_destination = 'csvlog' 
logging_collector = on 
log_directory = 'pg_log' 
log_filename = 'postgresql-%w.log' 
log_file_mode = 0640 
log_truncate_on_rotation = on 
log_rotation_age = 1d 
log_rotation_size = 600MB 
log_min_duration_statement = 0 
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h' 
log_lock_waits = on 
log_timezone = 'Brazil/East'
autovacuum = on 
log_autovacuum_min_duration = -1 
autovacuum_max_workers = 12 
autovacuum_naptime = 30s 
autovacuum_vacuum_threshold = 20 
autovacuum_analyze_threshold = 20 
autovacuum_vacuum_scale_factor = 0.2 
autovacuum_analyze_scale_factor = 0.1 
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = 800
vacuum_cost_delay = 20 
vacuum_cost_limit = 800
datestyle = 'iso, mdy'
timezone = 'Brazil/East'
lc_messages = 'en_US.UTF-8' 
lc_monetary = 'en_US.UTF-8' 
lc_numeric = 'en_US.UTF-8' 
lc_time = 'en_US.UTF-8' 
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 256
pgpool.pg_ctl = '/usr/pgsql-9.6/bin/pg_ctl'
effective_io_concurrency = 10
wal_level = hot_standby
max_wal_senders = 10
max_replication_slots = 3
wal_keep_segments = 38400

Configuration PostgreSQL StandBy

postgres.conf

listen_addresses = '*' 
port = 5433 
max_connections = 2000
superuser_reserved_connections = 3 
shared_buffers = 10GB 
work_mem = 183500kB
maintenance_work_mem = 4GB 
autovacuum_work_mem = 8GB 
max_stack_depth = 5MB 
dynamic_shared_memory_type = posix 
bgwriter_delay = 10ms 
bgwriter_lru_maxpages = 700 
bgwriter_lru_multiplier = 2.0 
fsync = off 
synchronous_commit = on 
full_page_writes = off 
wal_buffers = 16MB 
wal_writer_delay = 1ms 
checkpoint_timeout = 10min 
max_wal_size = 4GB
min_wal_size = 2GB
checkpoint_completion_target = 0.9 
effective_cache_size = 104GB
default_statistics_target = 500
log_destination = 'csvlog' 
logging_collector = on 
log_directory = 'pg_log' 
log_filename = 'postgresql-%w.log' 
log_file_mode = 0640 
log_truncate_on_rotation = on 
log_rotation_age = 1d 
log_rotation_size = 600MB 
log_min_duration_statement = 0 
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h' 
log_lock_waits = on 
log_timezone = 'Brazil/East'
autovacuum = on
log_autovacuum_min_duration = -1 
autovacuum_max_workers = 3
autovacuum_naptime = 30s 
autovacuum_vacuum_threshold = 20 
autovacuum_analyze_threshold = 20 
autovacuum_vacuum_scale_factor = 0.2 
autovacuum_analyze_scale_factor = 0.1 
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = 800
vacuum_cost_delay = 20 
vacuum_cost_limit = 800
datestyle = 'iso, mdy'
timezone = 'Brazil/East'
lc_messages = 'en_US.UTF-8' 
lc_monetary = 'en_US.UTF-8' 
lc_numeric = 'en_US.UTF-8' 
lc_time = 'en_US.UTF-8' 
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 256
pgpool.pg_ctl = '/usr/pgsql-9.6/bin/pg_ctl'
effective_io_concurrency = 10
max_worker_processes = 18
hot_standby = on
hot_standby_feedback = on

recovery.conf 

standby_mode       = 'on'
primary_slot_name  = 'replicacao'
primary_conninfo   = 'host=bdreplica00 port=5433 user=replicator password='
trigger_file       = '/var/lib/pgsql/9.6/data/master_db.conf'

--


Marcelo Krüger

Data Integration

Office: +55 48 3333-2030 | Mobile: +55 48 999679585

R. Patrício Farias, 131 - Itacorubi - Florianópolis - 3º andar | 
88034-132 | Santa Catarina, Brasil

São Paulo | Florianópolis | New York

Conteúdo confidencial. Caso você não seja o real destinatário deste e-mail por favor notifique o remetente e elimine esta mensagem. Privileged and confidential. If you are not the intended addressee of this e-mail please notify the sender and promptly delete this message.



Hi Marcelo Kruger,

From server configuration details provided, looks like a heavy trans DB & more logging enabled which might be resulting more write act on Master and Slave.
At our first glance, we see log_min_duration_statement is set to 0 which might be writing all the queries into the log resulting heavy writing activity that maybe cause for replication lag.


Hope this helps you.


--

pgsql-admin by date:

Previous
From: Douglas Reed
Date:
Subject: [ADMIN] Problems rebuilding slave using pg_basebackup
Next
From: Igor Neyman
Date:
Subject: Re: [ADMIN] How to measure replication lag in logical replicationsetup.