Re: streaming replication breaks horribly if master crashes - Mailing list pgsql-hackers

From Rafael Martinez
Subject Re: streaming replication breaks horribly if master crashes
Date
Msg-id 4C19C89E.7050705@usit.uio.no
In response to Re: streaming replication breaks horribly if master crashes  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: streaming replication breaks horribly if master crashes
List pgsql-hackers
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Heikki Linnakangas wrote:

> 
> We're not talking about a timeout for promoting standby to master. The
> problem is that the standby doesn't notice that from the master's point
> of view, the connection has been broken. Whether it's because of a
> network error or because the master server crashed doesn't matter, the
> standby should reconnect in any case. TCP keepalives are a perfect fit,
> as long as you can tune the keepalive time short enough. Where "Short
> enough" is up to the admin to decide depending on the application.
> 
>

I tested this yesterday and could not get any reaction from the WAL
receiver, even with values far smaller than the defaults.

The default values in Linux for tcp_keepalive_time, tcp_keepalive_intvl
and tcp_keepalive_probes are 7200, 75 and 9. I reduced these values to
60, 3 and 3 and nothing happened: the connection still shows status
ESTABLISHED after 60+3*3 seconds.
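
As a rough sketch of the arithmetic above (not part of the original test),
the snippet below computes the expected detection time from the three
keepalive settings and shows how a client could enable keepalives on its
own socket. The function names are hypothetical. On Linux the /proc values
only apply to sockets that have SO_KEEPALIVE enabled; whether the WAL
receiver's connection does so is an assumption this sketch does not verify.

```python
import socket


def keepalive_detection_seconds(idle, interval, probes):
    """Idle time before the first probe plus the time for all unanswered probes."""
    return idle + interval * probes


def enable_keepalive(sock, idle, interval, probes):
    """Turn on keepalives for one socket and override the system-wide defaults.

    Hypothetical helper for illustration. The TCP_KEEP* options are
    platform specific, so each one is only set if the socket module has it.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # per-socket tcp_keepalive_time
        ("TCP_KEEPINTVL", interval),  # per-socket tcp_keepalive_intvl
        ("TCP_KEEPCNT", probes),      # per-socket tcp_keepalive_probes
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    print("Linux defaults:", keepalive_detection_seconds(7200, 75, 9), "seconds")
    print("Reduced values:", keepalive_detection_seconds(60, 3, 3), "seconds")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s, 60, 3, 3)
        print("SO_KEEPALIVE:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
```

With the reduced values a dead peer should be noticed after roughly 69
seconds, but only on a socket that actually has keepalives switched on.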

I did not restart the network after changing these values on the fly
via /proc. I wonder if this is why the connection did not die, even
with the new keepalive values, after it was broken. I
will check this later today.

regards,
- --
Rafael Martinez, <r.m.guerrero@usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAkwZyJ4ACgkQBhuKQurGihT3kgCgn4iQkZ8YKr/nAk5/QqpwYfnc
4lsAn2CKvgeeIOon+lWRHe908hbJ+zK6
=VymH
-----END PGP SIGNATURE-----