Re: [BUGS] replication_timeout not effective - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [BUGS] replication_timeout not effective
Date
Msg-id 20130410144402.GD15043@awork2.anarazel.de
In response to Re: [BUGS] replication_timeout not effective  (Dang Minh Huong <kakalot49@gmail.com>)
Responses Re: [BUGS] replication_timeout not effective  (Dang Minh Huong <kakalot49@gmail.com>)
List pgsql-hackers
On 2013-04-10 23:37:44 +0900, Dang Minh Huong wrote:
> Thanks all,
> 
> (2013/04/10 22:55), Andres Freund wrote:
> >On 2013-04-10 22:38:07 +0900, Kyotaro HORIGUCHI wrote:
> >>Hello,
> >>
> >>On Wed, Apr 10, 2013 at 6:57 PM, Dang Minh Huong <kakalot49@gmail.com> wrote:
> >>>In 9.3, it sounds like replication_timeout is replaced by wal_sender_timeout.
> >>>So if it is solved in 9.3, I think there is a way to terminate it.
> >>>I hope it is fixed in 9.1 soon.
> >>Hmm. He said that,
> >>
> >>>But in my environment the sender process hangs (for several tens of minutes) if I turn off (by powering off)
> >>>the standby PC while *pg_basebackup* is executing.
> >>
> >>Does basebackup run only on 'replication connection' ?
> >>As far as I saw base backup uses 'base backup' connection in addition
> >>to 'streaming' connection. The former seems not under the control of
> >>wal_sender_timeout or replication_timeout and easily blocked at
> >>send(2) after sudden cut out of the network connection underneath.
> >>Although the latter indeed is terminated by them.
> >Yes, it's run via a walsender connection. The only "problem" is that it
> >doesn't check for those timeouts. I am not sure it would be a good thing
> >to do so to be honest. At least not using the same timeout as actual WAL
> >sending, that just has different characteristics.
> >On the other hand, hanging around that long isn't nice either...
> I tried setting max_wal_senders to 1, so while the walsender is hanging
> I cannot run pg_basebackup again (or start the standby DB).
> I increased it to 2, so the second attempt succeeds. But I'm afraid
> that when a third one occurs, the hanging walsender from the first
> will not yet have terminated...
> 
> I think not, but is there a way to terminate the hung walsender without
> restarting the PostgreSQL server or killing the walsender process?
> (Killing the walsender process can cause the DB server to crash,
> so I don't want to do it.)

Depending on where it's hanging, a normal SELECT
pg_terminate_backend(pid); might do it.

Otherwise you will have to wait for the operating system's TCP timeout.
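
A minimal sketch of the pg_terminate_backend() approach, assuming the hung walsender is visible in the pg_stat_replication view and that the session runs as a superuser. The pid 12345 is a placeholder, and the expectation that a base backup shows up with state 'backup' is an assumption to verify on the running version. Whether the process actually exits still depends on where it is blocked.

```sql
-- List the walsender processes the server currently knows about.
-- Note: the column is named "pid" on 9.2 and later; on 9.1 it is "procpid".
-- A walsender serving pg_basebackup is expected to report state = 'backup'
-- (assumption; check the output on your version).
SELECT pid, client_addr, state, backend_start
  FROM pg_stat_replication;

-- Ask the stuck walsender to exit. 12345 is a placeholder; substitute the
-- pid found above. Returns true if the signal was sent, which does not
-- guarantee that a process blocked in send(2) reacts immediately.
SELECT pg_terminate_backend(12345);
```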

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


