Thank you so much for sending the query. It was exactly what I needed. I just made 1 modification to beautify the transfer and replay lag and I can see the size in bytes.
SELECT application_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn)) AS transfer_lag, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) AS replay_lag FROM pg_stat_replication;
I am now using zabbix to constantly monitor them and notify myself if it breaches a certain threshold.
On Wed, 2021-04-14 at 17:50 -0400, Viral Shah wrote: > We have a PostgreSQL 10.12 cluster of servers in two different data centers. > Off lately, in the case of a large WAL generation, we are seeing replication > delay between the master and the standby server. These delays have off lately > been there for an unusually long time. I was wondering if we have any metric > that can calculate the amount (size) of WAL transfer left between master and > standby? > > PS: We have ensured we have upgraded our firewalls for better speed transfer. > > Any help on how to figure out the slowness in the WAL transfer would be much appreciated.
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn) AS transfer_lag, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag FROM pg_stat_replication;
If both are delayed, it might be that the network cannot cope.
If only the second number is delayed, you have replication conflicts with queries on the standby.