Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Msg-id 20130117154225.GC19562@awork2.anarazel.de
In response to Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
List pgsql-hackers
On 2013-01-17 16:23:44 +0100, Andres Freund wrote:
> On 2013-01-17 17:18:14 +0200, Heikki Linnakangas wrote:
> > On 17.01.2013 15:05, Andres Freund wrote:
> > >On 2013-01-17 13:47:41 +0900, Michael Paquier wrote:
> > >>I think that bug has been introduced by commit 7fcbf6a.
> > >>Before splitting xlog reading as a separate facility things worked
> > >>correctly.
> > >>There are also no delay problems before this commit.
> > >
> > >Ok, my inkling proved to be correct, it's two related issues:
> > >
> > >a) The error handling in XLogReadRecord is inconsistent; it doesn't
> > >always reset the necessary things.
> > >
> > >b) ReadRecord: We cannot simply break out of the retry loop in
> > >ReadRecord; just removing the break seems correct.
> > >
> > >c) ReadRecord (minor): We should log an error even if errormsg is not
> > >set, otherwise we won't jump out far enough.
> > >
> > >I think at least a) and b) are the result of merges between the
> > >development work of different people, sorry for that.
> >
> > Got a patch?
>
> Yes, I am just testing some scenarios with it, will send it after that.

Ok, the attached patch seems to fix a) and b). c) above is bogus, as
explained in a comment in the patch.  I also noticed that the TLI check
didn't mark the last source as failed.
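
To illustrate the pattern behind a), b) and the TLI remark, here is a
minimal, self-contained sketch. It is not the attached patch and does not
use the real xlog.c or xlogreader.c code: Reader, Source, try_read,
reader_invalidate and read_record are all hypothetical names, and the
source order is only an assumption for the example. It shows three things:
every error path resets the reader state, a failed source is marked as
failed, and the retry loop moves on to the next source instead of breaking
out early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical WAL sources, tried in order; not PostgreSQL's actual enum. */
typedef enum { SRC_STREAM, SRC_ARCHIVE, SRC_PG_XLOG, SRC_COUNT } Source;

/* Hypothetical reader state that must be reset after any failed read. */
typedef struct
{
    int  buffered_bytes;        /* data buffered for the current record */
    bool failed[SRC_COUNT];     /* sources that already failed */
} Reader;

/* Single place that resets the state, so no error path can forget it. */
static void
reader_invalidate(Reader *r)
{
    r->buffered_bytes = 0;
}

/* Stand-in for one read attempt; available[] simulates which sources work. */
static bool
try_read(Reader *r, Source s, const bool available[SRC_COUNT])
{
    r->buffered_bytes = 42;     /* pretend some data was buffered */
    if (!available[s])
    {
        reader_invalidate(r);   /* reset state on every failure */
        r->failed[s] = true;    /* mark this source as failed */
        return false;
    }
    return true;
}

/* Returns the source that delivered a record, or SRC_COUNT if none did. */
static Source
read_record(Reader *r, const bool available[SRC_COUNT])
{
    assert(r != NULL);
    for (int s = 0; s < SRC_COUNT; s++)
    {
        if (r->failed[s])
            continue;           /* skip sources already marked as failed */
        if (try_read(r, (Source) s, available))
            return (Source) s;
        /* no break here: a failure just moves on to the next source */
    }
    return SRC_COUNT;
}
```

With only the last source available, read_record() returns SRC_PG_XLOG
and leaves the two earlier sources marked as failed. With no source
available it returns SRC_COUNT and the buffered state is reset to zero.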

Not a real issue, and it's independent of this patch, but I found that
when promoting from streaming replication the code first fails over to
archive recovery and only then to recovering from pg_xlog.  Is that
intended?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

