Re: Switching timeline over streaming replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Switching timeline over streaming replication
Msg-id 00f801cd9a59$1fa7a620$5ef6f260$@kapila@huawei.com
In response to Switching timeline over streaming replication  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Switching timeline over streaming replication
List pgsql-hackers
On Tuesday, September 11, 2012 10:53 PM Heikki Linnakangas wrote:
> I've been working on the often-requested feature to handle timeline
> changes over streaming replication. At the moment, if you kill the
> master and promote a standby server, and you have another standby
> server that you'd like to keep following the new master server, you
> need a WAL archive in addition to streaming replication to make it
> cross the timeline change. Streaming replication will just error out.
> Having a WAL archive is usually a good idea in complex replication
> scenarios anyway, but it would be good to not require it.

Let me confirm my understanding of this feature:

This feature is for the case where standby-1, which is going to be promoted
to master, has archive mode 'on', since only in that case will its timeline
change.

If the above is right, then there are other similar scenarios where it can
be used:

Scenario-1 (1 Master, 1 Stand-by)
1. Master (archive_mode=on) goes down.
2. Master comes up again.
3. Stand-by tries to follow it.

In this scenario the stand-by also gives an error due to the timeline
mismatch, but your patch should fix it.


> Some parts of this patch are just refactoring that probably make sense
> regardless of the new functionality. For example, I split off the
> timeline history file related functions to a new file, timeline.c.
> That's not very much code, but it's fairly isolated, and xlog.c is
> massive, so I feel that anything that we can move off from xlog.c is a
> good thing. I also moved off the two functions RestoreArchivedFile()
> and ExecuteRecoveryCommand(), to a separate file. Those are also not
> much code, but are fairly isolated. If no-one objects to those changes,
> and the general direction this work is going to, I'm going to split off
> those refactorings to separate patches and commit them separately.
> 
> I also made the timeline history file a bit more detailed: instead of
> recording just the WAL segment where the timeline was changed, it now
> records the exact XLogRecPtr. That was required for the walsender to
> know the switchpoint, without having to parse the XLOG records (it
> reads and parses the history file, instead)
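
To make sure I have understood the switchpoint idea, here is a rough sketch
(not taken from the patch) of how a reader of the history file could map a
WAL position to a timeline once the exact position is recorded. The line
layout "<tli> <hi>/<lo>", the HistoryEntry struct, and the functions
parse_history_line() and timeline_of_position() are all my assumptions for
illustration; the real file format and walsender code may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical in-memory form of one history-file entry. */
typedef struct
{
    uint32_t tli;          /* timeline that ended at the switchpoint */
    uint64_t switchpoint;  /* WAL position where the next timeline begins */
} HistoryEntry;

/*
 * Parse one line of the assumed form "<tli> <hi>/<lo>", where hi and lo
 * are the hexadecimal halves of a 64-bit WAL position. Anything after
 * the position (for example a free-text reason) is ignored.
 * Returns 1 on success, 0 if the line is a comment, blank, or malformed.
 */
static int
parse_history_line(const char *line, HistoryEntry *out)
{
    unsigned int tli, hi, lo;

    assert(line != NULL && out != NULL);

    if (line[0] == '#' || line[0] == '\n' || line[0] == '\0')
        return 0;
    if (sscanf(line, "%u %X/%X", &tli, &hi, &lo) != 3)
        return 0;

    out->tli = tli;
    out->switchpoint = ((uint64_t) hi << 32) | lo;
    return 1;
}

/*
 * Given entries ordered by ascending switchpoint and the current
 * (latest) timeline, return the timeline that contains position ptr.
 * A position equal to a switchpoint belongs to the newer timeline.
 */
static uint32_t
timeline_of_position(const HistoryEntry *entries, int n,
                     uint32_t current_tli, uint64_t ptr)
{
    for (int i = 0; i < n; i++)
    {
        if (ptr < entries[i].switchpoint)
            return entries[i].tli;
    }
    return current_tli;
}
```

With only the WAL segment recorded, a lookup like timeline_of_position()
could not decide which timeline a position inside the switch segment
belongs to, which I assume is why the exact XLogRecPtr is needed.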

IMO, separating the timeline history file related functions into a new file
is good.
However, I am not sure about splitting RestoreArchivedFile() and
ExecuteRecoveryCommand() into a separate file.
How about splitting off all the archive-related functions instead:
static void XLogArchiveNotify(const char *xlog); 
static void XLogArchiveNotifySeg(XLogSegNo segno); 
static bool XLogArchiveCheckDone(const char *xlog); 
static bool XLogArchiveIsBusy(const char *xlog); 
static void XLogArchiveCleanup(const char *xlog);
..
..

In any case, it would be better if you could split it into multiple patches:
1. The new functionality of "Switching timeline over streaming
replication".
2. The refactoring-related changes.

That would make my testing and review of the new feature patch a little
easier.

With Regards,
Amit Kapila.



