Re: pg_stat_replication when standby is unreachable - Mailing list pgsql-hackers

From Abhishek Rai
Subject Re: pg_stat_replication when standby is unreachable
Date
Msg-id CA+sC4q6fx8StRu1nKLia9cLh7HeOAnhR=F9ay6qJ3fBbWZUQdQ@mail.gmail.com
In response to Re: pg_stat_replication when standby is unreachable  (Abhishek Rai <abhishekrai@gmail.com>)
List pgsql-hackers
I looked a bit more into the code and it appears to me that the following are true:

- A separate wal sender process is created on the primary side for each connected standby.
- The wal sender process terminates (walsender.c / WalSndLoop) when a write to the standby's socket fails with an error.
- If the standby machine is reachable but postgres is not running there any more, then the wal sender terminates almost immediately, probably because the standby machine sends a TCP RST to the wal sender.
- If the standby machine is unreachable, then the wal sender will keep trying to send wal data.  However, since the wal sender uses a non-blocking socket to talk to the standby, it does not block indefinitely on the send; it will time out and exit after "replication_timeout" (configured in postgresql.conf).
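
To make the two failure modes above concrete, here is a toy model in Python. It is not the walsender code, and the real logic in walsender.c may differ in detail; the function name send_with_timeout and its return values are made up for illustration. It only shows how a sender on a non-blocking socket can tell a hard socket error (peer gone, exit at once) from a peer that has silently stopped draining data (exit after a timeout).

```python
import select
import socket
import time


def send_with_timeout(sock, data, timeout):
    """Send all of data on a non-blocking socket.

    Returns "sent" if everything was written, "error" if the socket
    reported a hard error (for example the peer closed or reset the
    connection), and "timeout" if no progress was possible for
    `timeout` seconds.  Hypothetical helper, not PostgreSQL code.
    """
    sock.setblocking(False)
    view = memoryview(data)
    deadline = time.monotonic() + timeout
    while len(view) > 0:
        try:
            sent = sock.send(view)
            view = view[sent:]
            # Any progress restarts the timeout window.
            deadline = time.monotonic() + timeout
            continue
        except BlockingIOError:
            # Send buffer is full: the peer is not draining data.
            pass
        except OSError:
            # Broken pipe, connection reset, and similar hard errors.
            return "error"
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "timeout"
        # Wait until the socket is writable again or the deadline passes.
        select.select([], [sock], [], remaining)
    return "sent"
```

With a local socket pair, closing the peer before sending gives "error" right away, while a peer that stays open but never reads gives "timeout" once the send buffer fills up.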

So it seems like the wal sender should exit within replication_timeout or sooner, and its exit will be reflected in pg_stat_replication, where the standby's row goes away.  Therefore, I could just wait for up to replication_timeout before declaring the standby dead.
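
A minimal sketch of that waiting strategy, in Python. Everything here is an assumption for illustration: the function name wait_for_walsender_exit is invented, and is_connected stands for whatever check the monitoring side uses, for example a query such as SELECT 1 FROM pg_stat_replication WHERE client_addr = '<standby address>' that returns a row while the wal sender is alive. The clock and sleep parameters exist only so the loop can be exercised without real waiting.

```python
import time


def wait_for_walsender_exit(is_connected, replication_timeout,
                            poll_interval=1.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll until the standby's wal sender disappears or the timeout passes.

    is_connected: hypothetical callable returning True while the standby
    still has a row in pg_stat_replication.
    Returns True if the row went away within replication_timeout seconds
    (the standby can be treated as dead), False if it is still there.
    """
    deadline = clock() + replication_timeout
    while True:
        if not is_connected():
            return True
        if clock() >= deadline:
            return False
        # Never sleep past the deadline, so the last check lands on it.
        sleep(min(poll_interval, deadline - clock()))
```

A False result only means the wal sender was still present after the grace period; whether that proves the standby healthy is a separate question that this sketch does not answer.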

Thanks,
Abhishek
