Re: Cascade replication - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Cascade replication
Msg-id CA+U5nMJRCfU-crF68MR+O_37uTXPz8NHFGtGV8ofS2=btdi6Og@mail.gmail.com
In response to Re: Cascade replication  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Cascade replication
List pgsql-hackers
On Tue, Jul 5, 2011 at 4:34 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, Jul 4, 2011 at 6:24 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Tue, Jun 14, 2011 at 6:08 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>>>> The standby must not accept replication connection from that standby itself.
>>>> Otherwise, since any new WAL data would not appear in that standby,
>>>> replication cannot advance any more. As a safeguard against this, I introduced
>>>> new ID to identify each instance. The walsender sends that ID as the fourth
>>>> field of the reply of IDENTIFY_SYSTEM, and then walreceiver checks whether
>>>> the IDs are the same between two servers. If they are the same, which means
>>>> that the standby is just connecting to that standby itself, so walreceiver
>>>> emits ERROR.
>>
>> Thanks for waiting for review.
>
> Thanks for the review!

> I agree to focus on the main problem first. I removed that. Attached
> is the updated version.

Now for the rest of the review...

I'd rather not include another chunk of code related to
wal_keep_segments. The existing code in CreateCheckPoint() should be
refactored so that we call the same code from both CreateCheckPoint()
and CreateRestartPoint().
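
To illustrate the kind of refactoring meant here, a minimal sketch follows. It is not code from the patch: the helper name oldest_segment_to_keep() and the SegNo type are hypothetical, and WAL segments are simplified to a flat 64-bit counter rather than the log id/segment pair the server actually uses. The point is only that the wal_keep_segments arithmetic lives in one place.

```c
#include <stdint.h>

/* Hypothetical simplification: a WAL segment identified by one counter. */
typedef uint64_t SegNo;

/*
 * Return the oldest segment that must be retained.
 *
 * redo_segno        - segment holding the redo pointer of the checkpoint
 *                     or restartpoint being created
 * current_segno     - segment holding the current WAL position
 * wal_keep_segments - the configured setting; 0 disables the extra retention
 */
static SegNo
oldest_segment_to_keep(SegNo redo_segno, SegNo current_segno,
                       int wal_keep_segments)
{
    SegNo cutoff = redo_segno;

    if (wal_keep_segments > 0)
    {
        SegNo keep = (SegNo) wal_keep_segments;
        /* Avoid underflow: never compute a segment below 1. */
        SegNo keep_from = (current_segno > keep) ? current_segno - keep : 1;

        /* Retain more if wal_keep_segments asks for more than redo needs. */
        if (keep_from < cutoff)
            cutoff = keep_from;
    }
    return cutoff;
}
```

Both CreateCheckPoint() and CreateRestartPoint() would then call the one helper before removing old files, instead of each carrying its own copy of the calculation.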

IMHO it's time to get rid of RECOVERYXLOG as an initial target for
de-archived files. That made sense once, but now that we have streaming
it makes more sense for us to de-archive straight onto the correct file
name and let the file be cleaned up later. So de-archiving a file and
then copying it to the new location doesn't seem the right thing to do
(especially copying rather than renaming). RECOVERYXLOG allowed us
to de-archive the file without removing a pre-existing file, so we
must still handle that case - the current patch would fail if a
pre-existing WAL file were there.
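
One possible way to handle the pre-existing file, sketched under assumptions: move the old file aside, restore directly onto the final name, and put the old file back if the restore fails. Nothing here comes from the patch. restore_onto_final_name(), the restore_fn callback (standing in for running restore_command), the ".old" suffix, and the two demo callbacks are all hypothetical, and the sketch relies on POSIX rename() semantics.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical callback standing in for running restore_command. */
typedef int (*restore_fn)(const char *dest_path);

/*
 * Restore a segment directly onto final_path. A pre-existing file is
 * moved aside first and reinstated if the restore fails.
 * Returns 0 on success, -1 on failure.
 */
static int
restore_onto_final_name(const char *final_path, restore_fn restore)
{
    char backup_path[1024];
    int  had_old;
    int  n;

    n = snprintf(backup_path, sizeof(backup_path), "%s.old", final_path);
    if (n < 0 || (size_t) n >= sizeof(backup_path))
        return -1;              /* path too long for the buffer */

    /* Move the pre-existing file out of the way, if there is one. */
    if (rename(final_path, backup_path) == 0)
        had_old = 1;
    else if (errno == ENOENT)
        had_old = 0;
    else
        return -1;

    if (restore(final_path) != 0)
    {
        /* Drop any partial output and reinstate the old file. */
        remove(final_path);
        if (had_old)
            rename(backup_path, final_path);
        return -1;
    }

    if (had_old)
        remove(backup_path);
    return 0;
}

/* Stand-ins for restore_command, for demonstration only. */
static int
demo_restore_ok(const char *dest_path)
{
    FILE *f = fopen(dest_path, "w");

    if (f == NULL)
        return -1;
    fputs("restored", f);
    return fclose(f) == 0 ? 0 : -1;
}

static int
demo_restore_fail(const char *dest_path)
{
    (void) dest_path;           /* simulate "file not in archive" */
    return -1;
}
```

With this shape the file never passes through RECOVERYXLOG and is never copied, and a failed restore leaves the pre-existing segment untouched.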

Those changes will make this code cleaner for the long term.

I don't think we should simply shut down a WALSender when we start up.
That is indistinguishable from a failure, which is going to be very
worrying if we do a switchover. Is there another way to do this? If
not, we should at least emit a log message explaining that the
shutdown was requested and is normal.

It would be possible to have synchronous cascaded replication but that
is probably another patch :-)

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

