
Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Date	January 17, 2013 16:50:45
Msg-id	50F82BDB.1070905@vmware.com
In response to	Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
List	pgsql-hackers


On 17.01.2013 18:42, Andres Freund wrote:
> On 2013-01-17 18:33:42 +0200, Heikki Linnakangas wrote:
>> On 17.01.2013 17:42, Andres Freund wrote:
>>> Ok, the attached patch seems to fix a) and b). c) above is bogus, as
>>> explained in a comment in the patch.  I also noticed that the TLI check
>>> didn't mark the last source as failed.
>>
>> This looks fragile:
>>
>>>             /*
>>>              * We only end up here without a message when XLogPageRead() failed
>>>              * - in that case we already logged something.
>>>              * In StandbyMode that only happens if we have been triggered, so
>>>              * we shouldn't loop anymore in that case.
>>>              */
>>>             if (errormsg == NULL)
>>>                 break;
>>
>> I don't like relying on the presence of an error message to control logic
>> like that. Should we throw in an explicit CheckForStandbyTrigger() check in
>> the condition of that loop?
>
> I agree, I wasn't too happy about that either. But in some sense it's
> only propagating state from XLogPageRead(), which has already dealt with
> the error and decided it's OK.
> It's the solution closest to what happened in the old implementation;
> that's why I thought it would be halfway acceptable.
>
> Adding the CheckForStandbyTrigger() in the condition would mean
> promotion would happen before all the available records are processed
> and it would increase the amount of stat()s tremendously.
> So I don't really like that either.

I was thinking of the attached. As long as we call
CheckForStandbyTrigger() only after the "record == NULL" check, we won't
perform extra stat() calls on successful reads, only when we're polling
after reaching the end of valid WAL. That seems acceptable. If we want
to avoid even that, we could move the static 'triggered' variable out of
CheckForStandbyTrigger() and check it directly in the loop.
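
To make the ordering argument concrete, here is a minimal, self-contained simulation of such a loop. It is only a sketch and not the attached patch: SimState, sim_read_record(), sim_check_trigger() and sim_replay_loop() are hypothetical stand-ins rather than xlog.c code, and the simulated trigger check merely counts calls where the real function would stat() the trigger file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulation state; not a real PostgreSQL structure. */
typedef struct
{
    int records_available;    /* valid WAL records still readable */
    int polls_before_trigger; /* failed reads before the trigger appears */
    int trigger_checks;       /* each one stands for a stat() call */
    int records_replayed;     /* records applied so far */
} SimState;

/* Stand-in for reading a record: succeeds while valid WAL remains. */
static bool
sim_read_record(SimState *s)
{
    if (s->records_available > 0)
    {
        s->records_available--;
        return true;
    }
    return false;               /* corresponds to "record == NULL" */
}

/* Stand-in for CheckForStandbyTrigger(): counts every invocation. */
static bool
sim_check_trigger(SimState *s)
{
    s->trigger_checks++;
    return s->trigger_checks > s->polls_before_trigger;
}

/* Replay loop: the trigger is consulted only after a failed read. */
static void
sim_replay_loop(SimState *s)
{
    assert(s != NULL);

    for (;;)
    {
        if (sim_read_record(s))
        {
            s->records_replayed++;  /* successful read: no trigger check */
            continue;
        }

        /* End of valid WAL reached: only now pay for a trigger check. */
        if (sim_check_trigger(s))
            break;                  /* promotion requested, stop looping */

        /* Otherwise keep polling for more WAL. */
    }
}
```

With this ordering the number of trigger checks depends on how often the loop polls at the end of WAL, not on how many records were replayed, and all available records are processed before promotion is considered.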

- Heikki

Attachment: fix-error-handling-xlogreader-2.patch
