Thread: Synchronous replication: Admin command for replication_timeout_action

From: "K, Niranjan (NSN - IN/Bangalore)"

Hi,

This is a proposal for an admin command or utility that can switch the
server to standalone mode when a connection failure is detected between
the primary and the standby server. Detection need not always wait for
replication_timeout to expire, because a cluster/heartbeat framework
might detect the connection failure sooner. An admin command that
abstracts the implementation details would let the server be taken to
standalone mode before replication_timeout expires.

Are there any suggestions from your side with respect to this?

regards,
Niranjan


Hi,

On Tue, May 5, 2009 at 2:37 AM, K, Niranjan (NSN - IN/Bangalore)
<niranjan.k@nsn.com> wrote:
> Hi,
>
> This is a proposal for an admin command or utility that can switch the
> server to standalone mode when a connection failure is detected between
> the primary and the standby server. Detection need not always wait for
> replication_timeout to expire, because a cluster/heartbeat framework
> might detect the connection failure sooner. An admin command that
> abstracts the implementation details would let the server be taken to
> standalone mode before replication_timeout expires.
>
> Are there any suggestions from your side with respect to this?

Yes. Since walsender is treated as a special backend, we can use
pg_terminate_backend() to terminate replication and let the server
run standalone. This feature is simple but very useful, so I'll address
it (my previous patch does not fully provide this yet).
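
As a rough illustration of the pid-based approach, here is a minimal
Python sketch. It assumes a DB-API 2.0 cursor (for example from psycopg2)
connected to the primary as a superuser, and a later PostgreSQL release
(9.2 or newer) that exposes walsender pids in the pg_stat_replication
view; neither is part of the patch discussed here. The helper name
terminate_walsenders is hypothetical. Whether the server then continues
standalone depends on how the patch handles a lost walsender.

```python
# Sketch only. Assumptions: `cursor` is a DB-API 2.0 cursor on the primary,
# and the server provides the pg_stat_replication view with a `pid` column
# (PostgreSQL 9.2+). terminate_walsenders is a hypothetical helper name.

def terminate_walsenders(cursor):
    """Call pg_terminate_backend() on every walsender and return the pids
    for which the function reported success."""
    cursor.execute(
        "SELECT pid, pg_terminate_backend(pid) FROM pg_stat_replication"
    )
    # Each row is (pid, boolean result of pg_terminate_backend).
    return [pid for pid, terminated in cursor.fetchall() if terminated]
```

The caller never types a pid by hand here, but the mechanism is still a
signal sent to a process id looked up a moment earlier.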

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


On Tue, 2009-05-26 at 11:06 +0900, Fujii Masao wrote:

> Yes. Since walsender is treated as a special backend, we can use
> pg_terminate_backend() to terminate replication and let the server
> run standalone. This feature is simple but very useful, so I'll address
> it (my previous patch does not fully provide this yet).

I think we need something better than that. We shouldn't be shooting at
pids in a production database: we may get it wrong and take something
else down instead.

We need a graceful termination of replication and an immediate one.
There may be other things we need to add later, so a specific command
will be better and allow us to produce messages like "replication isn't
running" if used inappropriately.
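
To make the shape of such a command concrete, here is a toy Python model,
not PostgreSQL code. All names (StopMode, ReplicationController,
ReplicationNotRunning) are hypothetical, and the meaning given to
"graceful" (send the WAL that is already queued before stopping) is an
assumption; the thread does not define it.

```python
from enum import Enum


class StopMode(Enum):
    GRACEFUL = "graceful"    # assumed: drain queued WAL, then stop
    IMMEDIATE = "immediate"  # stop right away, queued WAL is not sent


class ReplicationNotRunning(Exception):
    """Raised when the command is used while replication is not active."""


class ReplicationController:
    """Toy stand-in for the server-side state a dedicated command would see."""

    def __init__(self, running, pending_wal_records=0):
        self.running = running
        self.pending_wal_records = pending_wal_records
        self.sent_wal_records = 0

    def stop(self, mode):
        # A dedicated command can check state and give a clear message,
        # which a signal aimed at a pid cannot do.
        if not self.running:
            raise ReplicationNotRunning("replication isn't running")
        if mode is StopMode.GRACEFUL:
            # Assumption: graceful means sending what is already queued.
            self.sent_wal_records += self.pending_wal_records
            self.pending_wal_records = 0
        self.running = False
        return "standalone"
```

In this model, stop(StopMode.GRACEFUL) drains the queue before switching
to standalone, stop(StopMode.IMMEDIATE) leaves it unsent, and a second
call raises ReplicationNotRunning instead of touching an unrelated process.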

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support