Re: Switching timeline over streaming replication - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Switching timeline over streaming replication
Date	December 19, 2012 15:27:12
Msg-id	50D1DCC8.7080609@vmware.com
In response to	Re: Switching timeline over streaming replication (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses	Re: Switching timeline over streaming replication Re: Switching timeline over streaming replication
List	pgsql-hackers

On 19.12.2012 15:55, Heikki Linnakangas wrote:
> On 19.12.2012 04:57, Josh Berkus wrote:
>> Heikki,
>>
>> I ran into an unexpected issue while testing. I just wanted to fire up
>> a chain of 5 replicas to see if I could connect them in a loop.
>> However, I ran into a weird issue when starting up "r3": it refused to
>> come out of "the database is starting up" mode until I did a write on
>> the master. Then it came up fine.
>>
>> master-->r1-->r2-->r3-->r4
>>
>> I tried doing the full replication sequence (basebackup, startup, test)
>> with it twice and got the exact same results each time.
>>
>> This is very strange because I did not encounter the same issues with r2
>> or r4. Nor have I seen this before in my tests.
>
> Ok.. I'm going to need some more details on how to reproduce this, I'm
> not seeing that when I set up four standbys.

Ok, I managed to reproduce this now. The problem seems to be a timing
issue that occurs when a standby switches to follow a new timeline. Four
is not a magic number here; it can happen with just one cascading
standby too.

When the timeline switch happens (say the standby changes its recovery
target timeline from 1 to 2 at WAL position 0/30002D8), the standby has
all the WAL up to that position. However, it only has that WAL in
file 000000010000000000000003, corresponding to timeline 1, and not in 
the file 000000020000000000000003, corresponding to the new timeline. 
When a cascaded standby connects, it requests to start streaming from 
point 0/3000000 at timeline 2 (we always start streaming from the 
beginning of a segment, to avoid leaving partially-filled segments in 
pg_xlog). The walsender in the 1st standby tries to read that from file 
000000020000000000000003, which does not exist yet.
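
To make the file names above concrete, here is a minimal sketch of how a timeline ID and a WAL position map to a segment file name. It assumes the default 16 MB segment size; the helper names (parse_lsn, wal_file_name, segment_start) are made up for illustration and are not PostgreSQL functions.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEG_SIZE = 16 * 1024 * 1024
# Number of segments per 4 GB "xlogid" (256 with 16 MB segments).
SEGS_PER_XLOGID = 0x100000000 // WAL_SEG_SIZE


def parse_lsn(text):
    """Convert a WAL position like '0/30002D8' to an integer byte offset."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn):
    """Format an integer byte offset back into the 'X/X' notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def wal_file_name(tli, lsn):
    """Name of the segment file holding position lsn on timeline tli."""
    segno = lsn // WAL_SEG_SIZE
    return "%08X%08X%08X" % (tli, segno // SEGS_PER_XLOGID, segno % SEGS_PER_XLOGID)


def segment_start(lsn):
    """Round a WAL position down to the beginning of its segment."""
    return lsn - (lsn % WAL_SEG_SIZE)


switch_point = parse_lsn("0/30002D8")
print(wal_file_name(1, switch_point))           # file the 1st standby has
print(wal_file_name(2, switch_point))           # file the walsender looks for
print(format_lsn(segment_start(switch_point)))  # where the cascaded standby starts
```

The same WAL position falls into segment 3 on both timelines, so only the timeline prefix of the file name differs. That is why the walsender ends up looking for a file that has not been created yet.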

The problem goes away after some time, after the 1st standby has 
streamed the contents of 000000020000000000000003 and written it to 
disk, and the cascaded standby reconnects. But it would be nice to avoid 
that situation. I'm not sure how to do that yet; we might need to track
the timeline we're currently receiving/sending more carefully. Or 
perhaps we need to copy the previous WAL segment to the new name when 
switching recovery target timeline, like we do when a server is 
promoted. I'll try to come up with something...
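
As a rough illustration of the second idea (copying the previous segment to the new name), here is a toy model using a temporary directory in place of pg_xlog. It is only a sketch under stated assumptions: 16 MB segments, hypothetical helper names (switch_timeline, can_serve), and a whole-file copy. It is not how the server implements anything.

```python
import os
import shutil
import tempfile


def wal_file_name(tli, segno):
    # Assumes 16 MB segments, i.e. 256 segments per 4 GB xlogid.
    return "%08X%08X%08X" % (tli, segno >> 8, segno & 0xFF)


def switch_timeline(wal_dir, old_tli, new_tli, segno):
    """Hypothetical helper: copy the segment containing the switch point
    to the name it would have on the new timeline, if it is not there yet."""
    src = os.path.join(wal_dir, wal_file_name(old_tli, segno))
    dst = os.path.join(wal_dir, wal_file_name(new_tli, segno))
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)
    return dst


def can_serve(wal_dir, tli, segno):
    """Hypothetical stand-in for the walsender's check: is the file there?"""
    return os.path.exists(os.path.join(wal_dir, wal_file_name(tli, segno)))


with tempfile.TemporaryDirectory() as wal_dir:
    # The 1st standby has segment 3 only under the timeline 1 name.
    with open(os.path.join(wal_dir, wal_file_name(1, 3)), "wb") as f:
        f.write(b"\0" * 1024)  # stand-in for real WAL contents

    before = can_serve(wal_dir, 2, 3)   # cascaded standby asks for TLI 2
    switch_timeline(wal_dir, 1, 2, 3)   # copy on timeline switch
    after = can_serve(wal_dir, 2, 3)

print(before, after)
```

In this model the request for segment 3 on timeline 2 fails before the copy and succeeds after it. The sketch ignores everything that makes the real problem hard, such as WAL that is still arriving in the copied segment.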

- Heikki
