Re: Primary not sending to synchronous standby - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Primary not sending to synchronous standby
Date
Msg-id 20150223153844.GD30784@awork2.anarazel.de
In response to Primary not sending to synchronous standby  (Thom Brown <thom@linux.com>)
Responses Re: Primary not sending to synchronous standby  (Andres Freund <andres@2ndquadrant.com>)
Re: Primary not sending to synchronous standby  (Thom Brown <thom@linux.com>)
List pgsql-hackers
Hi,

On 2015-02-23 15:25:57 +0000, Thom Brown wrote:
> I've noticed that if the primary is started and then a base backup is
> immediately taken from it and started as a synchronous standby, it
> doesn't replicate and the primary hangs indefinitely when trying to run any
> WAL-generating statements.  It only recovers when either the primary is
> restarted (which has to use a fast shutdown otherwise it also hangs
> forever), or the standby is restarted.
> 
> Here's a way of reproducing it:
> ...
> Note that if you run the commands one by one, there isn't a problem.  If
> you run it as a script, the standby doesn't connect to the primary.  There
> aren't any errors reported by either the standby or the primary.  The
> primary's wal sender process reports the following:
> 
> wal sender process rep_user 127.0.0.1(45243) startup waiting for 0/3000158
> 
> Anyone know why this would be happening?  And if this could be a problem in
> other scenarios?

Given that a walsender normally doesn't wait for syncrep, I guess the
backend above had only just finished authentication. If you attach gdb
to the walsender, what's the backtrace?
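
In case it helps, here is a minimal sketch of grabbing that backtrace
non-interactively. It assumes gdb and pgrep are installed and that you
run it as a user allowed to attach to the server processes. The helper
names and the GDB override variable are hypothetical, and the pgrep
pattern is simply taken from the process title quoted above, so adjust
it if your release titles the process differently.

```shell
# Sketch only: these helper names are illustrative, not part of PostgreSQL.

# List PIDs whose process title matches the walsender title quoted above.
walsender_pids() {
    pgrep -f 'wal sender process'
}

# Attach gdb to every walsender found, print its backtrace, and detach.
# GDB can be overridden (hypothetical knob) to point at another gdb binary.
dump_walsender_backtraces() {
    for pid in $(walsender_pids); do
        echo "== walsender $pid =="
        "${GDB:-gdb}" -p "$pid" -batch -ex bt
    done
}
```

Calling dump_walsender_backtraces on the primary while the statement is
hung should print one backtrace per walsender. Debug symbols need to be
installed for the frames to be readable.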

We previously had discussions about that being rather annoying; I
unfortunately don't remember enough of the thread to reference it
here. If it really is this, I think we should add some more smarts about
only enabling syncrep once a backend is fully up and maybe even remove
it from more scenarios during commits generally (e.g. if no xid was
assigned and we just pruned something).

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


