Re: Sync Rep: First Thoughts on Code - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Sync Rep: First Thoughts on Code
Msg-id 1228902492.20796.799.camel@hp_dx2400_1
In response to Re: Sync Rep: First Thoughts on Code  ("Fujii Masao" <masao.fujii@gmail.com>)
List pgsql-hackers
On Wed, 2008-12-10 at 14:51 +0900, Fujii Masao wrote:

> Yes, I basically agree with you! The only difference between us is
> whether the primary also has to switch two modes (FLS <-> SLS).
> I think that the primary don't need to stop archiving forcibly when
> replication starts, which should be optional for the user. The user
> who doesn't want to archive can disable archiving by using existing
> mechanism (change archive_command & pg_ctl reload). It's more
> complicated to switch the modes on each servers.

Yes, I see that a manual change of parameter is possible. But it is
difficult to get the timing of the manual change correct and yet
important not to get that wrong. I don't want to spend the next year
answering questions on list about how that works and agreeing that it
isn't ideal.

We should have an optional mechanism that will turn archiving on the
primary off *automatically* when the mode changes. Maybe a third setting
for archive_mode to cater for this, but other ways are possible too.
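
A minimal sketch of how such a third setting might behave, written in
Python purely for illustration. The `AUTO` value and the function name
are hypothetical; they are assumptions for the sake of the example, not
existing settings or server code.

```python
from enum import Enum


class ArchiveMode(Enum):
    OFF = "off"    # never archive
    ON = "on"      # always archive
    AUTO = "auto"  # hypothetical: archive only while not replicating


def primary_should_archive(mode: ArchiveMode, replication_active: bool) -> bool:
    """Decide whether the primary's archiver should run."""
    if mode is ArchiveMode.OFF:
        return False
    if mode is ArchiveMode.ON:
        return True
    # AUTO: stop archiving as soon as replication takes over,
    # resume when it stops, with no manual reload needed.
    return not replication_active
```

The point of the sketch is that the decision follows the replication
state directly, so nobody has to get the timing of a manual change right.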

> For clarity: the user can choose the strategy of archiving from the
> following.
> 
> 1) each primary and standby archives
> 2) only primary archives
> 3) only standby archives
> 4) no server archives

Those are all possible, but they aren't all equally usable as it stands.
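
To make the four choices concrete, here is a small sketch (Python, for
illustration only) that treats each strategy as a pair of per-server
switches. The names `ArchivePlan`, `STRATEGIES` and `archiver_count` are
made up for this example.

```python
from typing import NamedTuple


class ArchivePlan(NamedTuple):
    primary_archives: bool
    standby_archives: bool


# Strategy numbers follow the quoted list above.
STRATEGIES = {
    1: ArchivePlan(primary_archives=True, standby_archives=True),
    2: ArchivePlan(primary_archives=True, standby_archives=False),
    3: ArchivePlan(primary_archives=False, standby_archives=True),
    4: ArchivePlan(primary_archives=False, standby_archives=False),
}


def archiver_count(strategy: int) -> int:
    """Number of servers running an archiver under the given strategy."""
    plan = STRATEGIES[strategy]
    return int(plan.primary_archives) + int(plan.standby_archives)
```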

In my experience most people do things very simply, so (4) is the common
use case. So it needs to Just Work.

We need to cater for a range of use cases, from simple implementations
through to complex multi-node cases. I don't think it's right to assume
that everybody is implementing a complex use case and so we mostly cater
for that.

> The user who don't want to share an archive would choose 1).

If we include a feature you need to explain why it's there. Asking the
question doesn't mean that I'm opposed, just that I'm checking why you
think it's important to have that option.

So, why would you want to run with multiple archives?

> The user who want to share an archive and cannot accept any
> increase of bandwidth would choose 4). On the other hand,
> the user who can accept it would choose 2) or 3). I prefer 2) to
> 3), for multiple standby in the future. And, if 3) is adopted,
> I wonder if we can get a base backup. Can we get it from the
> standby during recovery?

That's an important feature, so we should make it "yes". (I can't
understand why you've built this with the archiver active on the standby
node if this isn't possible.)

People I talk to consider "low impact on primary" to be an important
aspect of this feature. Though if you forced me to prioritise I would
say making (4) automatic is more important than (3).

> > I agree that is the way to do it *if* the archive is not shared. But why
> > would you want to *not* share the archive??
> 
> First of all, I'd not like to buy a machine only for an archive other than
> the primary and standby. Meanwhile, if an archive is located on either
> the primary or standby (which should we locate it on?), post-failure
> processing is complicated.

Are you saying that putting the archive on the primary is an option?

What is complicated about having the archive on the standby server? 

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


