Re: Replication streaming issue - Mailing list pgsql-admin

From Keith
Subject Re: Replication streaming issue
Msg-id CAHw75vvt5fwKdrppSZTPscjep_491QDNNZwzFxyAhSaRHiRzcg@mail.gmail.com
In response to Replication streaming issue  (Mai Peng <maily.peng@webedia-group.com>)
List pgsql-admin


On Tue, Jul 23, 2019 at 9:39 AM Mai Peng <maily.peng@webedia-group.com> wrote:
Hello,

I’ve got a strange issue.
Here is the error in the PostgreSQL log: ERROR:  requested WAL segment 000000020000A01A0000004F has already been removed.
Everything is working fine: the lag is ok.
The check queries:
SELECT  client_addr,
        state,
        write_lag,
        flush_lag,
        replay_lag
FROM    pg_stat_replication
WHERE   application_name = 'walreceiver';

SELECT COALESCE(ROUND(EXTRACT(epoch FROM now() - pg_last_xact_replay_timestamp())),0) AS seconds;

How can I monitor this problem? Why is streaming replication still working?

Thank you in advance.
Mai
 

Are you sure replication is actually still working? Your first query only shows that a streaming replica is connected. Without further calculations on that data, it doesn't tell you whether the replica is actually replicating. Your second query can be misleading: pg_last_xact_replay_timestamp() returns NULL when no transactions have been replayed, and the COALESCE turns that NULL into zero. Also, if no writes are occurring on the primary, the query can give false positives about the replica being behind, since it only tells you when the last transaction was replayed. If no writes are happening, no WAL is replayed and the reported number keeps growing. It can still be useful as a general monitoring query, but it shouldn't be your only replication monitoring method.
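
As a rough sketch of the kind of further calculation meant here: the lag in bytes is the difference between the primary's current WAL position and the position the replica has replayed. This assumes PostgreSQL 10 or later, where the primary exposes pg_current_wal_lsn() and pg_stat_replication has a replay_lsn column; in SQL, pg_wal_lsn_diff() performs the same subtraction. The Python helper names below are hypothetical and only illustrate the arithmetic on LSN strings you have already fetched.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as 'A01A/4F000000' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits and the
    low 32 bits of a 64-bit WAL position, separated by a slash.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(primary_lsn: str, replica_replay_lsn: str) -> int:
    """Bytes of WAL the replica still has to replay (hypothetical helper).

    primary_lsn:        e.g. the result of pg_current_wal_lsn() on the primary
    replica_replay_lsn: e.g. replay_lsn from pg_stat_replication
    """
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_replay_lsn)


if __name__ == "__main__":
    # Illustrative values only, not taken from a real server.
    print(replay_lag_bytes("A01A/50000000", "A01A/4F000000"))
```

A byte lag that stays at zero while the primary is idle is fine; a byte lag that keeps growing while the replica still shows as connected is the case worth alerting on.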

Try creating a new object on your primary and see if it actually shows up on the replica. If it doesn't, your replica needs to be rebuilt before it can resume replication, unless you have the missing WAL files backed up somewhere else.
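
To judge how far back the missing WAL is, it can help to translate the segment name from the error into the WAL position where that segment starts, and compare it with the replica's replay position (pg_last_wal_replay_lsn() on PostgreSQL 10 or later). The sketch below assumes the default 16 MB WAL segment size; the function name is hypothetical, and a cluster initialized with a different segment size would need that size passed in.

```python
DEFAULT_WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def parse_wal_segment(name: str, segment_size: int = DEFAULT_WAL_SEGMENT_SIZE):
    """Split a 24-character WAL segment file name into (timeline, start LSN).

    The name is three 8-digit hexadecimal fields: timeline ID, the high
    32 bits of the WAL position, and the segment number within that range.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg_id = int(name[16:24], 16)
    # Absolute byte position of the first byte in this segment.
    start = (log_id << 32) + seg_id * segment_size
    # Format the position the way PostgreSQL prints LSNs: HIGH/LOW in hex.
    start_lsn = f"{start >> 32:X}/{start & 0xFFFFFFFF:X}"
    return timeline, start_lsn


if __name__ == "__main__":
    # Segment name taken from the error message quoted above.
    print(parse_wal_segment("000000020000A01A0000004F"))
```

For the segment in the error above, this gives timeline 2 and a starting position of A01A/4F000000. If the replica's replay position is at or before that point and the segment is not in an archive, the replica cannot catch up by streaming.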

I've written a post about better monitoring practices for replicas - https://www.keithf4.com/monitoring_streaming_slave_lag/

Keith
