Re: BUG #5851: ROHS (read only hot standby) needs to be restarted manually in somecases. - Mailing list pgsql-bugs

From mark
Subject Re: BUG #5851: ROHS (read only hot standby) needs to be restarted manually in somecases.
Date
Msg-id AANLkTi=kg4NsYFSGKzKgoMF+RXy_QD4V51L_hWpFMfT4@mail.gmail.com
In response to Re: BUG #5851: ROHS (read only hot standby) needs to be restarted manually in somecases.  ("mark" <dvlhntr@gmail.com>)
Responses Re: BUG #5851: ROHS (read only hot standby) needs to be restarted manually in somecases.  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-bugs
On Sun, Jan 30, 2011 at 12:45 PM, mark <dvlhntr@gmail.com> wrote:
>
>
>> -----Original Message-----
>> From: Robert Haas [mailto:robertmhaas@gmail.com]
>> Sent: Sunday, January 30, 2011 12:19 PM
>> To: mark
>> Cc: pgsql-bugs@postgresql.org
>> Subject: Re: [BUGS] BUG #5851: ROHS (read only hot standby) needs to be
>> restarted manually in somecases.
>>
>> On Fri, Jan 28, 2011 at 1:03 PM, mark <dvlhntr@gmail.com> wrote:
>> > When showing the setting on the slave or master all tcp_keepalive
>> settings
>> > (idle, interval and count) are showing 0;
>> >
>> > The config file shows interval and count commented out, but idle in
>> the
>> > config file is set to 2100.
>> >
>> > Possible that "show tcp_keepalive_idle;" isn't reporting accurately ?
>> (or a
>> > value that high isn't be accepted?)
>> >
>> > I have reloaded configs and still seeing 0's
>> >
>> >
>> >
>> > I assume you would suggest I turn that number down... a lot.
>>
>> Yeah, the defaults are way too long for our purposes.  The way to get
>> this set correctly, I think, is to set it in the primary_conninfo
>> stream on the slave.  You end up with something like this:
>>
>> primary_conninfo='host=blahblah user=bob keepalives_idle=XX
>> keepalives_interval=XX keepalives_count=XX'
>>
> Thanks I will try this on Monday and will report back if it fixes the
> problem. (however since I can't reproduce the issue on demand it might be a
> waiting game. Might not know for a month or so tho)
>
> -Mark
>
>
>> I'm of the opinion that we really need an application-level keepalive
>> here, but the above is certainly a lot better than nothing.

my streaming replication woes continue.


I made those changes in the recovery.conf file, but streaming
replication still stays broken after any sort of network
interruption until someone manually comes along and fixes things by
restarting the standby or, if it's been too long, resynchronizing the
base.

I think it's a network interruption that is triggering the breakdown,
but I don't have anything to prove it.

wal_keep_segments is set to 250, which was supposed to give us a few
hours to fix the issue, but it seems we blew through that many last
night, so by the time someone got around to fixing it the standby was
too far behind.
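
For reference, a rough sketch of how much headroom wal_keep_segments = 250 buys. It assumes the default 16 MB WAL segment size (not confirmed in this thread), and the WAL generation rates are hypothetical placeholders, not measurements from this system. The function names are made up for illustration.

```python
# Assumption: default WAL segment size of 16 MB (not confirmed for this server).
WAL_SEGMENT_MB = 16


def wal_retention_mb(wal_keep_segments: int, segment_mb: int = WAL_SEGMENT_MB) -> int:
    """Minimum amount of WAL (in MB) the primary keeps around for standbys."""
    return wal_keep_segments * segment_mb


def wal_retention_hours(wal_keep_segments: int, wal_mb_per_hour: float,
                        segment_mb: int = WAL_SEGMENT_MB) -> float:
    """Hours of outage a standby can survive at a given WAL generation rate."""
    return wal_retention_mb(wal_keep_segments, segment_mb) / wal_mb_per_hour


if __name__ == "__main__":
    print(f"retained WAL: {wal_retention_mb(250)} MB")
    # Hypothetical WAL rates in MB/hour; replace with a measured value.
    for rate in (500.0, 1000.0, 4000.0):
        hours = wal_retention_hours(250, rate)
        print(f"{rate:.0f} MB/hour -> {hours:.1f} hours of headroom")
```

The window shrinks in proportion to write volume, so a busy night can use up 250 segments much faster than a quiet afternoon would.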


my #1 problem with this right now is I can't seem to reproduce on
demand with virtual machines in our development area.

this is the recovery.conf file, see any problems with it? maybe I
didn't get some of the syntax right?

[postgres@<redacted> data9.0]$ cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=<redacted> port=5432 user=postgres
keepalives_idle=30 keepalives_interval=30 keepalives_count=30'
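
As a sanity check on those values, here is a minimal sketch of how long TCP keepalive would take to give up on a dead idle connection. It assumes the usual Linux-style behaviour (first probe after keepalives_idle seconds of silence, then one probe every keepalives_interval seconds, connection dropped after keepalives_count unanswered probes). The function name is made up for illustration, and the result is an approximation, not a guarantee of what the walreceiver does.

```python
def keepalive_detection_seconds(idle: int, interval: int, count: int) -> int:
    """Approximate seconds before keepalive declares an idle peer dead.

    Assumes: wait `idle` seconds, then send `count` probes spaced
    `interval` seconds apart, all of which go unanswered.
    """
    return idle + interval * count


if __name__ == "__main__":
    # Values from the recovery.conf shown above.
    seconds = keepalive_detection_seconds(idle=30, interval=30, count=30)
    print(f"{seconds} s (~{seconds / 60:.1f} min)")  # 930 s (~15.5 min)
```

Under these assumptions a count of 30 dominates the total, so lowering keepalives_count shortens detection time far more than lowering keepalives_idle.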



thanks
..: Mark

p.s. looking forward to 9.1 where a standby can be started with
streaming from scratch. that sounds nice.

>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>
>
