Thread: is sync rep stalled?

From
Robert Haas
Date:
So we've got two patches that implement synchronous replication, and
no agreement on which one, if either, should be committed.  We have no
agreement on how synchronous replication should be configured, and at
most a tenuous agreement that it should involve standby registration.

This is bad.

This feature is important, and we need to get it done.  How do we get
the ball rolling again?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: is sync rep stalled?

From
Fujii Masao
Date:
On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> So we've got two patches that implement synchronous replication, and
> no agreement on which one, if either, should be committed.  We have no
> agreement on how synchronous replication should be configured, and at
> most a tenuous agreement that it should involve standby registration.
>
> This is bad.
>
> This feature is important, and we need to get it done.  How do we get
> the ball rolling again?

ISTM that it still takes long to make consensus on standby registration.
So, how about putting the per-standby parameters in recovery.conf, and
focusing on the basic features in synchronous replication at first?
During that time, we can deepen discussion on standby registration, and
then we can implement that.

The basic features that I mean is for most basic use case, that is, one
master and one synchronous standby case. In detail,

> * Support multiple standbys with various synchronization levels.

Not required for that case.

> * What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.

The wait-forever option is not required for that case. Let's implement
the return-immediately at first.

> * Per-transaction control. Some transactions are important, others are not.

Not required for that case.

> * Quorum commit. Wait until n standbys acknowledge. n=1 and n=all servers can be seen as important special cases of
> this.

Not required for that case.

> * async, recv, fsync and replay levels of synchronization.

At least one of three synchronous levels should be included in the first
commit. I think that either recv or fsync is suitable for first try
because those don't require wake-up signaling from startup process to
walreceiver and are relatively easy to implement.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
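
To make the four levels in the last item concrete, here is a minimal Python sketch. It is not taken from either patch: the names SyncLevel, StandbyProgress and commit_acknowledged are hypothetical, WAL positions are plain integers, and the reading of each level (received, flushed to disk, replayed on the standby) is an assumption based on how the terms are used in this thread.

```python
from dataclasses import dataclass
from enum import Enum


class SyncLevel(Enum):
    ASYNC = "async"    # do not wait for the standby at all
    RECV = "recv"      # wait until the standby has received the WAL
    FSYNC = "fsync"    # wait until the standby has flushed the WAL to disk
    REPLAY = "replay"  # wait until the standby has replayed the WAL


@dataclass
class StandbyProgress:
    """WAL positions reported back by a standby (hypothetical field names)."""
    received: int
    fsynced: int
    replayed: int


def commit_acknowledged(level: SyncLevel, commit_pos: int,
                        progress: StandbyProgress) -> bool:
    """Return True once the standby has reached commit_pos at the given level."""
    if level is SyncLevel.ASYNC:
        return True
    reached = {
        SyncLevel.RECV: progress.received,
        SyncLevel.FSYNC: progress.fsynced,
        SyncLevel.REPLAY: progress.replayed,
    }[level]
    return reached >= commit_pos


# Made-up positions: received is ahead of fsynced, which is ahead of replayed.
progress = StandbyProgress(received=300, fsynced=200, replayed=100)
for level in SyncLevel:
    print(level.value, commit_acknowledged(level, 250, progress))
```

With these made-up positions, the commit at position 250 counts as acknowledged under async and recv, but not yet under fsync or replay.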


Re: is sync rep stalled?

From
Robert Haas
Date:
On Wed, Sep 29, 2010 at 3:56 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> So we've got two patches that implement synchronous replication, and
>> no agreement on which one, if either, should be committed.  We have no
>> agreement on how synchronous replication should be configured, and at
>> most a tenuous agreement that it should involve standby registration.
>>
>> This is bad.
>>
>> This feature is important, and we need to get it done.  How do we get
>> the ball rolling again?
>
> ISTM that it still takes long to make consensus on standby registration.
> So, how about putting the per-standby parameters in recovery.conf, and
> focusing on the basic features in synchronous replication at first?
> During that time, we can deepen discussion on standby registration, and
> then we can implement that.
>
> The basic features that I mean is for most basic use case, that is, one
> master and one synchronous standby case. In detail,
>
>> * Support multiple standbys with various synchronization levels.
>
> Not required for that case.
>
>> * What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.
>
> The wait-forever option is not required for that case. Let's implement
> the return-immediately at first.
>
>> * Per-transaction control. Some transactions are important, others are not.
>
> Not required for that case.
>
>> * Quorum commit. Wait until n standbys acknowledge. n=1 and n=all servers can be seen as important special cases of
>> this.
>
> Not required for that case.
>
>> * async, recv, fsync and replay levels of synchronization.
>
> At least one of three synchronous levels should be included in the first
> commit. I think that either recv or fsync is suitable for first try
> because those don't require wake-up signaling from startup process to
> walreceiver and are relatively easy to implement.

I'm not sure this really gets us anywhere.  We already have two
patches; writing a third one won't fix anything.  We need to decide
which patch can be the basis for future work.  According to my
understanding, the most significant difference between the patches is
the way that ACKs get sent from standby to master.  Whose idea is
better, yours or Simon's?  And why?  Are there other reasons to prefer
one patch to the other?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: is sync rep stalled?

From
Heikki Linnakangas
Date:
On 29.09.2010 10:56, Fujii Masao wrote:
> On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas<robertmhaas@gmail.com>  wrote:
>> So we've got two patches that implement synchronous replication, and
>> no agreement on which one, if either, should be committed.  We have no
>> agreement on how synchronous replication should be configured, and at
>> most a tenuous agreement that it should involve standby registration.
>>
>> This is bad.
>>
>> This feature is important, and we need to get it done.  How do we get
>> the ball rolling again?

Agreed. Actually, given the lack of people jumping in and telling us 
what they'd like to do with the feature, maybe it's not that important 
after all.

> ISTM that it still takes long to make consensus on standby registration.
> So, how about putting the per-standby parameters in recovery.conf, and
> focusing on the basic features in synchronous replication at first?
> During that time, we can deepen discussion on standby registration, and
> then we can implement that.
>
> The basic features that I mean is for most basic use case, that is, one
> master and one synchronous standby case. In detail,

ISTM the problem is exactly that there is no consensus on what the basic 
use case is. I'm sure there's several things you can accomplish with 
synchronous replication, perhaps you could describe what the important 
use case for you is?

>> * Support multiple standbys with various synchronization levels.
>
> Not required for that case.

IMHO at least we'll still need to support asynchronous standbys in the 
same mix, that's an existing feature.

>> * What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.
>
> The wait-forever option is not required for that case. Let's implement
> the return-immediately at first.
>
> ..-
>
>> * async, recv, fsync and replay levels of synchronization.
>
> At least one of three synchronous levels should be included in the first
> commit. I think that either recv or fsync is suitable for first try
> because those don't require wake-up signaling from startup process to
> walreceiver and are relatively easy to implement.

What is the use case for that combination? For zero data loss, you 
*must* wait forever if a standby isn't connected. For keeping a hot 
standby server up-to-date so that you can freely query the standby 
instead of the master, you need replay level synchronization.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
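
The trade-off between the two policies can be illustrated with a toy simulation. This is only a sketch under stated assumptions: run_commits and the policy strings are hypothetical names, each list entry says whether the synchronous standby was reachable when a commit was attempted, and the master is assumed to fail for good right after the last attempt, before the standby can catch up.

```python
def run_commits(standby_connected, policy):
    """Simulate a sequence of commit attempts on the master.

    standby_connected: one boolean per attempted commit, True if the
        synchronous standby is reachable at that moment.
    policy: "wait-forever" or "return-immediately".

    Returns (acknowledged_to_client, present_on_standby).
    """
    if policy not in ("wait-forever", "return-immediately"):
        raise ValueError(f"unknown policy: {policy}")
    acked = 0
    on_standby = 0
    for connected in standby_connected:
        if connected:
            acked += 1
            on_standby += 1
        elif policy == "return-immediately":
            # The client is told the commit succeeded, but the standby
            # never received it.
            acked += 1
        else:
            # wait-forever: this commit blocks and nothing after it
            # is acknowledged.
            break
    return acked, on_standby


link = [True, True, False, False]
for policy in ("wait-forever", "return-immediately"):
    acked, on_standby = run_commits(link, policy)
    print(policy, "acknowledged:", acked, "on standby:", on_standby,
          "lost on failover:", acked - on_standby)
```

Under wait-forever no acknowledged commit is missing from the standby, at the price of the master blocking. Under return-immediately, two commits that clients saw succeed are missing after failover.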


Re: is sync rep stalled?

From
Simon Riggs
Date:
On Thu, 2010-09-30 at 09:09 +0300, Heikki Linnakangas wrote:
> On 29.09.2010 10:56, Fujii Masao wrote:
> > On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas<robertmhaas@gmail.com>  wrote:

> >> This feature is important, and we need to get it done.  How do we get
> >> the ball rolling again?
> 
> Agreed. Actually, given the lack of people jumping in and telling us 
> what they'd like to do with the feature, maybe it's not that important 
> after all.

I don't see anything has stalled. I've been busy for a few days, so
haven't had a chance to follow up on the use cases, as suggested. I'm
busy again today, so cannot reply further. Anyway, taking a few days to
let us think some more about the technical comments is no bad thing.

I think we need to relax about this feature some more because trying to
get something actually done when basic issues need analysis is hard and
that creates tension. Between us we can work out the code in a few days,
once we know which code to write.

What we actually need to do is talk and listen. I'd like to suggest that
we have an online "focus day" (onlist) on Sync Rep on Oct 5 and maybe 6
as well? Meeting in person is possible, but probably impractical. But a
design sprint, not a code sprint. 

This is important and I'm sure we'll work something out. 

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: is sync rep stalled?

From
David Fetter
Date:
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
> On Thu, 2010-09-30 at 09:09 +0300, Heikki Linnakangas wrote:
> > On 29.09.2010 10:56, Fujii Masao wrote:
> > > On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas<robertmhaas@gmail.com>  wrote:
> 
> > >> This feature is important, and we need to get it done.  How do
> > >> we get the ball rolling again?
> > 
> > Agreed. Actually, given the lack of people jumping in and telling
> > us what they'd like to do with the feature, maybe it's not that
> > important after all.
> 
> I don't see anything has stalled.

I do.  We're half way through this commitfest, so if no one's actually
ready to commit one of the patches, I kinda have to bounce them both,
at least to the next CF.

The very likely outcome of that, given that it's a pretty enormous
feature that involves even more enormous amounts of testing on various
hardware, networks, etc., is that we don't get SR in 9.1, and you
among others will be very unhappy.

So yes, it is stalled, and yes, there's a real urgency to actually
getting a baseline something in there in the next couple of weeks.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: is sync rep stalled?

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
>> I don't see anything has stalled.

> I do.  We're half way through this commitfest, so if no one's actually
> ready to commit one of the patches, I kinda have to bounce them both,
> at least to the next CF.

[ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
stuff.  The CF end is still two weeks away.

But while I'm thinking about that...

The actual facts on the ground are that practically no CF work has
gotten done yet (at least not in my house) due to the git move and the
9.0.0 release and the upcoming back-branch releases.  Maybe we shouldn't
have started the CF while all that was going on, but that's water over
the dam now.  What we can do is rethink the scheduled end date.  IMHO
we should push out the end date by at least a week to reflect the lack
of time spent on the CF so far.
        regards, tom lane


Re: is sync rep stalled?

From
Aidan Van Dyk
Date:
On Thu, Sep 30, 2010 at 2:09 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

> Agreed. Actually, given the lack of people jumping in and telling us what
> they'd like to do with the feature, maybe it's not that important after all.

>> The basic features that I mean is for most basic use case, that is, one
>> master and one synchronous standby case. In detail,
>
> ISTM the problem is exactly that there is no consensus on what the basic use
> case is. I'm sure there's several things you can accomplish with synchronous
> replication, perhaps you could describe what the important use case for you
> is?

OK, So I'll throw in my ideal use case.  I'm starting to play with
Magnus's "streaming -> archive".

*that's* what I want, with synchronous.  Yes, again, I'm looking for
"data durability", not "server query-ability", and I'd like to rely
on the PG user-space side of things instead of praying that replicated
block-devices hold together....

If my master flips out, I'm quite happy to do a normal archive
restore.  Except I don't want that last 16MB (or archive timeout) of
transactions lost.  The streaming -> archive in its current state
gets me pretty close, but I'd love to be able to guarantee that my
recovery from that archive has *every* transaction that the master
committed...

a.
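
The "last 16MB" exposure is simple arithmetic. A sketch, assuming the default 16 MB WAL segment size, that a segment is archived the moment it is complete, and ignoring archive_timeout; unarchived_wal_bytes is a hypothetical helper, not a PostgreSQL function.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size


def unarchived_wal_bytes(wal_bytes_written, segment_bytes=WAL_SEGMENT_BYTES):
    """WAL sitting in the current, incomplete (so not yet archived) segment."""
    if wal_bytes_written < 0 or segment_bytes <= 0:
        raise ValueError("sizes must be non-negative and segment size positive")
    return wal_bytes_written % segment_bytes


written = 40 * 1024 * 1024  # made-up figure: 40 MiB of WAL written so far
print(written // WAL_SEGMENT_BYTES, "segments archived")
print(unarchived_wal_bytes(written) // (1024 * 1024), "MiB at risk")
```

In this made-up case two full segments are safely archived and 8 MiB of WAL would be lost with the master. A synchronous archive stream would aim to bring that figure to zero.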



Re: is sync rep stalled?

From
David Fetter
Date:
On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
> >> I don't see anything has stalled.
> 
> > I do.  We're half way through this commitfest, so if no one's
> > actually ready to commit one of the patches, I kinda have to
> > bounce them both, at least to the next CF.
> 
> [ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
> stuff.  The CF end is still two weeks away.

If people are still wrangling over the design, I'd say two weeks is
a ludicrously short time, not a long one.

> But while I'm thinking about that...
> 
> The actual facts on the ground are that practically no CF work has
> gotten done yet (at least not in my house)

Your non-involvement in the first half or more--I'd say maybe 3 weeks
or so--is precisely what commitfests are for.  The point is that
people who are *not* committers need to do a bunch of QA on patches,
review them, get or create new patches as needed.  Only then should a
committer get involved.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: is sync rep stalled?

From
"Kevin Grittner"
Date:
Aidan Van Dyk <aidan@highrise.ca> wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

>> I'm sure there's several things you can accomplish with
>> synchronous replication, perhaps you could describe what the
>> important use case for you is?

> I'm looking for "data durability", not "server query-ability"

Same here.  If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database.  We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.

-Kevin


Re: is sync rep stalled?

From
Heikki Linnakangas
Date:
On 30.09.2010 17:09, Kevin Grittner wrote:
> Aidan Van Dyk<aidan@highrise.ca>  wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  wrote:
>
>>> I'm sure there's several things you can accomplish with
>>> synchronous replication, perhaps you could describe what the
>>> important use case for you is?
>
>> I'm looking for "data durability", not "server query-ability"
>
> Same here.  If we used synchronous replication, the important thing
> for us would be to hold up the master for the minimum time required
> to ensure remote persistence -- not actual application to the remote
> database.  We could tolerate some WAL replay time on recovery better
> than poor commit performance on the master.

You do realize that to be able to guarantee zero data loss, the master 
will have to stop committing new transactions if the streaming stops for 
any reason, like a network glitch. Maybe that's a tradeoff you want, but 
I'm asking because that point isn't clear to many people.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: is sync rep stalled?

From
Yeb Havinga
Date:
Heikki Linnakangas wrote:
> On 30.09.2010 17:09, Kevin Grittner wrote:
>> Aidan Van Dyk<aidan@highrise.ca>  wrote:
>> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  wrote:
>>
>>>> I'm sure there's several things you can accomplish with
>>>> synchronous replication, perhaps you could describe what the
>>>> important use case for you is?
>>
>>> I'm looking for "data durability", not "server query-ability"
>>
>> Same here.  If we used synchronous replication, the important thing
>> for us would be to hold up the master for the minimum time required
>> to ensure remote persistence -- not actual application to the remote
>> database.  We could tolerate some WAL replay time on recovery better
>> than poor commit performance on the master.
>
> You do realize that to be able to guarantee zero data loss, the master 
> will have to stop committing new transactions if the streaming stops 
> for any reason, like a network glitch. Maybe that's a tradeoff you 
> want, but I'm asking because that point isn't clear to many people.

If there's a network glitch, it'd probably affect networked client 
connections as well, so it would mean no extra degradation of service.

-- Yeb



Re: is sync rep stalled?

From
"Kevin Grittner"
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> You do realize that to be able to guarantee zero data loss, the
> master will have to stop committing new transactions if the
> streaming stops for any reason, like a network glitch. Maybe
> that's a tradeoff you want, but I'm asking because that point
> isn't clear to many people.

Yeah, I get that.  I do think the quorum approach or some simplified
special case of it would be important for us -- possibly even a
requirement -- for that reason.

-Kevin


Re: is sync rep stalled?

From
Simon Riggs
Date:
On Thu, 2010-09-30 at 07:06 -0700, David Fetter wrote:
> On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
> > David Fetter <david@fetter.org> writes:
> > > On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
> > >> I don't see anything has stalled.
> > 
> > > I do.  We're half way through this commitfest, so if no one's
> > > actually ready to commit one of the patches, I kinda have to
> > > bounce them both, at least to the next CF.
> > 
> > [ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
> > stuff.  The CF end is still two weeks away.
> 
> If people are still wrangling over the design, I'd say two weeks is
> a ludicrously short time, not a long one.

Yes, there is design work still to do.

What purpose would be served by "bouncing" these patches?

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: is sync rep stalled?

From
Robert Haas
Date:
On Thu, Sep 30, 2010 at 12:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, 2010-09-30 at 07:06 -0700, David Fetter wrote:
>> On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
>> > David Fetter <david@fetter.org> writes:
>> > > On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
>> > >> I don't see anything has stalled.
>> >
>> > > I do.  We're half way through this commitfest, so if no one's
>> > > actually ready to commit one of the patches, I kinda have to
>> > > bounce them both, at least to the next CF.
>> >
>> > [ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
>> > stuff.  The CF end is still two weeks away.
>>
>> If people are still wrangling over the design, I'd say two weeks is
>> a ludicrously short time, not a long one.
>
> Yes, there is design work still to do.
>
> What purpose would be served by "bouncing" these patches?

None whatsoever, IMHO.  That having been said, I would like to see us
make some forward progress.  I'm open to your ideas expressed
up-thread, but I'm not sure whether they'll be sufficient to resolve
the problem.  Seems worth a try, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: is sync rep stalled?

From
Fujii Masao
Date:
On Thu, Sep 30, 2010 at 3:09 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
>>> * Support multiple standbys with various synchronization levels.
>>
>> Not required for that case.
>
> IMHO at least we'll still need to support asynchronous standbys in the same
> mix, that's an existing feature.

My intention is to commit the core part of synchronous replication (which would
be used for every use case) at first. Then we can implement the feature for each
use case.

I agree that 9.1 should support asynchronous standbys in the same mix, but this
seems to be an extended feature rather than part of the very core.

>>> * What happens if a synchronous standby isn't connected at the moment?
>>> Return immediately vs. wait forever.
>>
>> The wait-forever option is not required for that case. Let's implement
>> the return-immediately at first.
>>
>> ..-
>>
>>> * async, recv, fsync and replay levels of synchronization.
>>
>> At least one of three synchronous levels should be included in the first
>> commit. I think that either recv or fsync is suitable for first try
>> because those don't require wake-up signaling from startup process to
>> walreceiver and are relatively easy to implement.
>
> What is the use case for that combination? For zero data loss, you *must*
> wait forever if a standby isn't connected. For keeping a hot standby server
> up-to-date so that you can freely query the standby instead of the master,
> you need replay level synchronization.

For high availability, and zero data loss unless the disk on one of the master
and standby gets corrupted after the other goes down. It's the same use case
that a cluster with a shared disk covers.

I proposed to implement the "return-immediately" at first because it doesn't
require standby registration. But if many people think that the "wait-forever"
is the core rather than the "return-immediately", I'll follow them. We can
implement the "return-immediately" after that.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: is sync rep stalled?

From
David Fetter
Date:
On Fri, Oct 01, 2010 at 07:48:25PM +0900, Fujii Masao wrote:
> I proposed to implement the "return-immediately" at first because it
> doesn't require standby registration. But if many people think that
> the "wait-forever" is the core rather than the "return-immediately",
> I'll follow them.  We can implement the "return-immediately" after
> that.

In my experience, most people who want "synchronous" behavior are
willing to put up with "wait forever," especially when asynchronous
behavior is already available.

In short, +1 for "push 'wait forever' soonest."

Anybody who's got a Secret Base, Hidden in a Hollowed-Out Mountain,
Making Grand Plans While Stroking a Long-Haired Cat[1], should please
to update their public repository, or create a public repository if it
doesn't already exist, and in either case keep it current.

Cheers,
David

[1]  While the Hollowed-Out Mountain trick worked back in the 60s,
it's gotten a little trite.  The cool kids are keeping things pretty
public these days when they plan to go public.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: is sync rep stalled?

From
Dimitri Fontaine
Date:
Fujii Masao <masao.fujii@gmail.com> writes:
> I proposed to implement the "return-immediately" at first because it doesn't
> require standby registration. But if many people think that the "wait-forever"
> is the core rather than the "return-immediately", I'll follow them. We can
> implement the "return-immediately" after that.

Wait forever can be done without standby registration, with quorum commit.

-- 
dim
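
A sketch of why quorum commit needs no registration: the master only counts acknowledgements from whichever standbys happen to be connected, without knowing their names. quorum_reached is a hypothetical name, WAL positions are plain integers, and this is not how either patch implements it.

```python
from typing import Sequence


def quorum_reached(commit_pos: int, acked_positions: Sequence[int],
                   quorum: int) -> bool:
    """True when at least `quorum` connected standbys have acknowledged commit_pos.

    acked_positions holds the latest acknowledged WAL position of each
    standby connected right now; no list of expected standbys is needed.
    """
    if quorum < 0:
        raise ValueError("quorum must be non-negative")
    acks = sum(1 for pos in acked_positions if pos >= commit_pos)
    return acks >= quorum


# Three anonymous standbys are connected; two have reached position 100.
print(quorum_reached(100, [120, 90, 100], quorum=2))
# No standby connected and quorum=1: the commit keeps waiting.
print(quorum_reached(100, [], quorum=1))
# quorum=0 never waits, which behaves like asynchronous replication.
print(quorum_reached(100, [], quorum=0))
```

A commit that waits until this predicate turns true gives "wait forever" whenever fewer than `quorum` standbys are connected, and quorum=1 or quorum=all are the special cases mentioned earlier in the thread.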


Re: is sync rep stalled?

From
Josh Berkus
Date:
On 09/30/2010 10:52 PM, Tom Lane wrote:
>   IMHO
> we should push out the end date by at least a week to reflect the lack
> of time spent on the CF so far.

I agree that we should postpone the end of the CF by one week to deal 
with the distractions people have had.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: is sync rep stalled?

From
Josh Berkus
Date:
> What we actually need to do is talk and listen. I'd like to suggest that
> we have an online "focus day" (onlist) on Sync Rep on Oct 5 and maybe 6
> as well? Meeting in person is possible, but probably impractical. But a
> design sprint, not a code sprint.

I'd suggest something even simpler:

(1) Create a wiki page which lists all of the design decisions we need 
to make in order to finish the specification for synch rep.

(2) Link each item to any prior discussion we've had about the item.

(3) Invite people to comment on the wiki by leaving per-item comments 
and suggestions with their own names.

I believe that right now only a handful of people (Simon, Heikki, Fujii, 
Zoltan) are really acquainted with all of the decisions which need to be 
made.  No wonder the rest of us fly off on minutiae like file formats; we 
really have no sense of scope.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: is sync rep stalled?

From
Markus Wanner
Date:
Hi,

On 10/03/2010 05:52 AM, Josh Berkus wrote:
> (3) Invite people to comment on the wiki by leaving per-item comments
> and suggestions with their own names.

Please keep discussions on the mailing list. On Wikis, those are very
hard to follow (Date or From missing, no offline capabilities, indirect
notification, etc.)

I like Simon's suggestion, but thought of something *more* direct (maybe
IRC), not less (like Wikis).

> I believe that right now only a handful of people (Simon, Heikki, Fujii,
> Zoltan) are really acquainted with all of the decisions which need to be
> made.

I at least try to follow. And I actually think we had quite some DBA
inputs as well.

Regards

Markus


Re: is sync rep stalled?

From
Markus Wanner
Date:
On 09/30/2010 04:54 PM, Yeb Havinga wrote:
> Heikki Linnakangas wrote:
>> You do realize that to be able to guarantee zero data loss, the master
>> will have to stop committing new transactions if the streaming stops
>> for any reason, like a network glitch. Maybe that's a tradeoff you
>> want, but I'm asking because that point isn't clear to many people.
> If there's a network glitch, it'd probably affect networked client
> connections as well, so it would mean no extra degradation of service.

Agreed.

I think the network glitch example is too general, it could affect any
part of the whole network. Even just the connection between the master
and the standby, in which case all client connections would keep up.

Let's quickly think about that scenario. AFAIU in such a case, the
standby would continue to answer read-only queries, independent of what
the master does, right? Or does the standby stop processing read-only
queries in case it loses connection to the master?

It seems to me the latter is required, if we let the master continue to
commit transactions. Otherwise the standby would serve stale data to its
clients without knowing.

Given that scenario, I'd clearly favor a master that stops committing
new transactions, but allow both (i.e. master and standbys) to continue
answering read-only queries.

Regards

Markus Wanner


Re: is sync rep stalled?

From
Markus Wanner
Date:
On 10/01/2010 05:06 PM, Dimitri Fontaine wrote:
> Wait forever can be done without standby registration, with quorum commit.

Yeah, I also think the only reason for standby registration is ease of
configuration (if at all). There's no technical requirement for standby
registration, AFAICS. Or does anybody know of a realistic use case
that's possible with standby registration, but not with quorum commit?

Regards

Markus Wanner


Re: is sync rep stalled?

From
Heikki Linnakangas
Date:
On 04.10.2010 10:03, Markus Wanner wrote:
> On 09/30/2010 04:54 PM, Yeb Havinga wrote:
>> Heikki Linnakangas wrote:
>>> You do realize that to be able to guarantee zero data loss, the master
>>> will have to stop committing new transactions if the streaming stops
>>> for any reason, like a network glitch. Maybe that's a tradeoff you
>>> want, but I'm asking because that point isn't clear to many people.
>> If there's a network glitch, it'd probably affect networked client
>> connections as well, so it would mean no extra degradation of service.
>
> Agreed.
>
> I think the network glitch example is too general, it could affect any
> part of the whole network. Even just the connection between the master
> and the standby, in which case all client connections would keep up.
>
> Let's quickly think about that scenario. AFAIU in such a case, the
> standby would continue to answer read-only queries, independent of what
> the master does, right?

Right.

> Or does the standby stop processing read-only
> queries in case it loses connection to the master?

As far as the current proposals go, no.

> It seems to me the latter is required, if we let the master continue to
> commit transactions. Otherwise the standby would serve stale data to its
> clients without knowing.

Yep. If you want to guarantee that a hot standby doesn't return stale 
data, if the connection is lost you need to either stop processing 
read-only queries in the standby, or stop processing commits in the master.

Note that this assumes that you use the 'replay' synchronization level. 
In the weaker levels, read-only queries can always return stale data.

With 'replay' and hot standby combination, you'll want to set 
max_standby_archive_delay to a very low value, or a read-only query can 
cause master to stop processing commits (or the standby to stop 
accepting new queries, if that's preferred).

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
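
The rule above can be written down as a small predicate. A sketch only: the function and parameter names are hypothetical, the level is passed as a plain string, and it assumes that with 'replay' and a live connection an acknowledged commit is already visible on the standby.

```python
from itertools import product


def standby_reads_can_be_stale(level, link_up,
                               master_commits_when_down,
                               standby_serves_when_down):
    """Can a read-only query on the hot standby miss an acknowledged commit?"""
    if level != "replay":
        # Weaker levels acknowledge before the change is visible on the standby.
        return True
    if link_up:
        return False
    # Link down: reads are stale only if the master keeps committing
    # while the standby keeps answering queries.
    return master_commits_when_down and standby_serves_when_down


# Enumerate the four choices for the 'replay' level with the link down.
for master_commits, standby_serves in product([True, False], repeat=2):
    stale = standby_reads_can_be_stale("replay", False,
                                       master_commits, standby_serves)
    print(f"master commits: {master_commits!s:5}  "
          f"standby serves: {standby_serves!s:5}  "
          f"stale reads possible: {stale}")
```

Only the combination where the master keeps committing and the standby keeps serving allows stale reads, which is why one of the two sides has to stop.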


Re: is sync rep stalled?

From
Markus Wanner
Date:
On 10/04/2010 09:18 AM, Heikki Linnakangas wrote:
> Note that this assumes that you use the 'replay' synchronization level.
> In the weaker levels, read-only queries can always return stale data.

I'm not too fond of those various synchronization levels, but IIUC all
other levels only allow a rather limited staleness. But with a master
that's continuing to commit new transactions and a disconnected standby
that happily continues to answer read-only queries, the age of the
standby's snapshot can grow without limit.

> With 'replay' and hot standby combination, you'll want to set
> max_standby_archive_delay to a very low value, or a read-only query can
> cause master to stop processing commits (or the standby to stop
> accepting new queries, if that's preferred).

Well, given that DML-only transactions aren't prone to such conflicts, I
think of this as a corner case.

Also note, that this requirement seems to apply whether we wait forever
on standby failure or not. (Because even if we don't, there must be some
kind of timeout on the master from the very first suspicion to actually
declare the standby dead - anything else is called async).

Regards

Markus Wanner


Re: is sync rep stalled?

From
Heikki Linnakangas
Date:
On 04.10.2010 10:49, Markus Wanner wrote:
> On 10/04/2010 09:18 AM, Heikki Linnakangas wrote:
>> With 'replay' and hot standby combination, you'll want to set
>> max_standby_archive_delay to a very low value, or a read-only query can
>> cause master to stop processing commits (or the standby to stop
>> accepting new queries, if that's preferred).
>
> Well, given that DML-only transactions aren't prone to such conflicts, I
> think of this as a corner case.

Yes they are. Any DML operation, and even read-only queries IIRC, can 
trigger HOT pruning, which can conflict with a read-only query in a hot 
standby. And then there's autovacuum which can cause conflicts in the 
standby, even if no user transactions are running in the master.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: is sync rep stalled?

From
Fujii Masao
Date:
On Fri, Oct 1, 2010 at 11:16 PM, David Fetter <david@fetter.org> wrote:
> On Fri, Oct 01, 2010 at 07:48:25PM +0900, Fujii Masao wrote:
>> I proposed to implement the "return-immediately" at first because it
>> doesn't require standby registration. But if many people think that
>> the "wait-forever" is the core rather than the "return-immediately",
>> I'll follow them.  We can implement the "return-immediately" after
>> that.
>
> In my experience, most people who want "synchronous" behavior are
> willing to put up with "wait forever," especially when asynchronous
> behavior is already available.
>
> In short, +1 for "push 'wait forever' soonest."

I have one question for clarity:

If we make all the transactions wait until specified standbys have
connected to the master, how do we take a base backup from the
master for those standbys? We seem to be unable to do that because
pg_start_backup also waits forever. Is this right?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: is sync rep stalled?

From
Aidan Van Dyk
Date:
On Mon, Oct 4, 2010 at 10:22 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

> I have one question for clarity:
>
> If we make all the transactions wait until specified standbys have
> connected to the master, how do we take a base backup from the
> master for those standbys? We seem to be unable to do that because
> pg_start_backup also waits forever. Is this right?

Well, in my *opinion*, if you've told the master to not "commit to"
*anything* unless it's synchronously replicated, you should already
have a synchronously replicating slave up and running.

I'm happy with the docs saying (maybe somewhat more politely):
Before configuring your master to be completely,
wait-fully-synchronous, make sure you have a slave capable of being
synchronous ready.  Because if you've told it to never be
un-synchronous, it won't be.


Re: is sync rep stalled?

From
Fujii Masao
Date:
On Tue, Oct 5, 2010 at 2:06 AM, Aidan Van Dyk <aidan@highrise.ca> wrote:
> On Mon, Oct 4, 2010 at 10:22 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>
>> I have one question for clarity:
>>
>> If we make all the transactions wait until specified standbys have
>> connected to the master, how do we take a base backup from the
>> master for those standbys? We seem to be unable to do that because
>> pg_start_backup also waits forever. Is this right?
>
> Well, in my *opinion*, if you've told the master to not "commit to"
> *anything* unless it's synchronously replicated, you should already
> have a synchronously replicating slave up and running.
>
> I'm happy with the docs saying (maybe somewhat more politely):
>  Before configuring your master to be completely,
> wait-fully-synchronous, make sure you have a slave capable of being
> synchronous ready.  Because if you've told it to never be
> un-synchronous, it won't be.

How can we take a base backup for that synchronous standby? You mean
that we should disable the wait-forever option, start the master, take
a base backup, shut down the master, enable the wait-forever option,
start the master, and start the standby from that base backup?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: is sync rep stalled?

From
Tom Lane
Date:
Fujii Masao <masao.fujii@gmail.com> writes:
> On Tue, Oct 5, 2010 at 2:06 AM, Aidan Van Dyk <aidan@highrise.ca> wrote:
>> I'm happy with the docs saying (maybe somewhat more politely):
>> Before configuring your master to be completely,
>> wait-fully-synchronous, make sure you have a slave capable of being
>> synchronous ready.  Because if you've told it to never be
>> un-synchronous, it won't be.

> How can we take a base backup for that synchronous standby? You mean
> that we should disable the wait-forever option, start the master, take
> a base backup, shut down the master, enable the wait-forever option,
> start the master, and start the standby from that base backup?

I think the point here is that it's possible to have sync-rep
configurations in which it's impossible to take a base backup.  That
doesn't seem to me to be unacceptable in itself.  What *is* unacceptable
is to be unable to change the configuration to another state in which
you could take a base backup.  Which is why "keep the config in a system
catalog" doesn't work.
        regards, tom lane


Re: is sync rep stalled?

From
Heikki Linnakangas
Date:
On 04.10.2010 17:22, Fujii Masao wrote:
> If we make all the transactions wait until specified standbys have
> connected to the master, how do we take a base backup from the
> master for those standbys? We seem to be unable to do that because
> pg_start_backup also waits forever. Is this right?

Hmm, pg_start_backup() writes WAL, but it doesn't commit. Only a commit 
needs to wait for acknowledgment from the standby, so 'wait forever' 
behavior doesn't necessarily mean that you can't take a base backup. If 
you run it outside a transaction you get an implicit commit, though, 
which will wait, so you might need to do something odd like "begin; 
select pg_start_backup(); rollback".
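
A toy model in Python may make the distinction concrete. It is a sketch
only: `ToyMaster` and the two helper functions are hypothetical, and the
model assumes nothing more than what is stated above, namely that writing
WAL never waits and only a commit does.

```python
class ToyMaster:
    """Toy model: WAL writes never wait; only commits wait for the standby."""

    def __init__(self, standby_connected: bool):
        self.standby_connected = standby_connected
        self.wal = []          # WAL records written so far
        self.blocked = False   # True once a commit is stuck waiting

    def write_wal(self, record: str) -> None:
        self.wal.append(record)

    def commit(self) -> None:
        # Under 'wait forever', a commit blocks while no standby can ack.
        if not self.standby_connected:
            self.blocked = True

    def rollback(self) -> None:
        # Nothing is committed, so there is nothing to wait for.
        pass


def start_backup_autocommit(master: ToyMaster) -> None:
    master.write_wal("backup start")
    master.commit()            # implicit commit outside a transaction block


def start_backup_then_rollback(master: ToyMaster) -> None:
    master.write_wal("backup start")
    master.rollback()          # "begin; select pg_start_backup(); rollback"


a = ToyMaster(standby_connected=False)
start_backup_autocommit(a)
print(a.blocked)               # the implicit commit is stuck

b = ToyMaster(standby_connected=False)
start_backup_then_rollback(b)
print(b.blocked, b.wal)        # not stuck, and the WAL record was written
```

Note that the real pg_start_backup() takes a label argument, which the
shorthand above leaves out.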

But I agree with Tom that as long as it's possible to change the 
configuration on the fly, it's not a show-stopper if you can't take a 
new base backup while the standby is disconnected.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: is sync rep stalled?

From
Fujii Masao
Date:
On Tue, Oct 5, 2010 at 5:49 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 04.10.2010 17:22, Fujii Masao wrote:
>>
>> If we make all the transactions wait until specified standbys have
>> connected to the master, how do we take a base backup from the
>> master for those standbys? We seem to be unable to do that because
>> pg_start_backup also waits forever. Is this right?
>
> Hmm, pg_start_backup() writes WAL, but it doesn't commit. Only a commit
> needs to wait for acknowledgment from the standby, so 'wait forever'
> behavior doesn't necessarily mean that you can't take a base backup. If you
> run it outside a transaction you get an implicit commit, though, which will
> wait, so you might need to do something odd like "begin; select
> pg_start_backup(); rollback".

Yep. Similarly, we would need to enclose also pg_stop_backup with begin
and rollback.

I have another question: when should the waiting transactions resume?
Is it the moment the standby has connected to the master, or the moment
the standby has caught up with the master? For no data loss, the
latter seems to be required. Right?

The third question: if the WAL file is unfortunately recycled when a
transaction waits for that WAL file to be shipped forever, how should
that transaction behave? Still waiting? Cause PANIC? Give up waiting?
For no data loss, ISTM that the second should be chosen. Right?

This can happen because we can write WAL to the master without waiting
for replication by enclosing a query with begin and rollback, even if
all the transaction *commits* are waiting for replication forever.

> But I agree with Tom that as long as it's possible to change the
> configuration on the fly, it's not a show-stopper if you can't take a new
> base backup while the standby is disconnected.

Yep. If people who want the "wait-forever" can live with such an odd
backup procedure, I have no objection to implement that.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: is sync rep stalled?

From
Heikki Linnakangas
Date:
On 05.10.2010 12:47, Fujii Masao wrote:
> I have another question: when should the waiting transactions resume?
> Is it the moment the standby has connected to the master, or the moment
> the standby has caught up with the master? For no data loss, the
> latter seems to be required. Right?

Yep.
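
As a rough illustration of that rule, assuming WAL positions can be
treated as plain integers (the function name and the numbers are made up
for the example):

```python
def release_waiters(waiting_commit_lsns, standby_lsn):
    """Split waiting commits into those the standby has caught up to
    and those that must keep waiting.

    A merely connected standby releases nothing; a commit resumes only
    once the standby's WAL position has reached that commit's position.
    """
    released = [lsn for lsn in waiting_commit_lsns if lsn <= standby_lsn]
    still_waiting = [lsn for lsn in waiting_commit_lsns if lsn > standby_lsn]
    return released, still_waiting


waiting = [100, 200, 300]

# Standby has just reconnected and has received nothing yet.
print(release_waiters(waiting, standby_lsn=0))

# Standby has caught up to position 200.
print(release_waiters(waiting, standby_lsn=200))
```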

> The third question: if the WAL file is unfortunately recycled when a
> transaction waits for that WAL file to be shipped forever, how should
> that transaction behave? Still waiting? Cause PANIC? Give up waiting?
> For no data loss, ISTM that the second should be chosen. Right?

Right, it should keep waiting.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: is sync rep stalled?

From
Simon Riggs
Date:
On Tue, 2010-10-05 at 18:47 +0900, Fujii Masao wrote:
> On Tue, Oct 5, 2010 at 5:49 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
> > On 04.10.2010 17:22, Fujii Masao wrote:
> >>
> >> If we make all the transactions wait until specified standbys have
> >> connected to the master, how do we take a base backup from the
> >> master for those standbys? We seem to be unable to do that because
> >> pg_start_backup also waits forever. Is this right?
> >
> > Hmm, pg_start_backup() writes WAL, but it doesn't commit. Only a commit
> > needs to wait for acknowledgment from the standby, so 'wait forever'
> > behavior doesn't necessarily mean that you can't take a base backup. If you
> > run it outside a transaction you get an implicit commit, though, which will
> > wait, so you might need to do something odd like "begin; select
> > pg_start_backup(); rollback".
> 
> Yep. Similarly, we would need to enclose also pg_stop_backup with begin
> and rollback.

Presumably we will have an option to *not* wait forever? So we would be
able to set the option prior to running the base backup? So there isn't
any need to do this rollback trick suggested.

pg_start_backup() and pg_stop_backup() have two use cases: 

1) ensuring both are sent through to the standby would make it very easy
to allow backups from the standby. 

2) make sure we don't wait, so we can take a base backup at any time

So there's no argument here to prevent it being in a table.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: is sync rep stalled?

From
Simon Riggs
Date:
On Fri, 2010-10-01 at 07:16 -0700, David Fetter wrote:
> On Fri, Oct 01, 2010 at 07:48:25PM +0900, Fujii Masao wrote:
> > I proposed to implement the "return-immediately" at first because it
> > doesn't require standby registration. But if many people think that
> > the "wait-forever" is the core rather than the "return-immediately",
> > I'll follow them.  We can implement the "return-immediately" after
> > that.
> 
> In my experience, most people who want "synchronous" behavior are
> willing to put up with "wait forever," especially when asynchronous
> behavior is already available.
> 
> In short, +1 for "push 'wait forever' soonest."
> 
> Anybody who's got a Secret Base, Hidden in a Hollowed-Out Mountain,
> Making Grand Plans While Stroking a Long-Haired Cat[1], should please
> to update their public repository, or create a public repository if it
> doesn't already exist, and in either case keep it current.

You've long held the belief that I code in secret and don't reveal my
code to people. Not really sure why, since I've contributed so much, so
openly. Strange.

I am trying to establish a sensible design based upon public discussion.
I'm not working on any code currently; my understanding was that we
would discuss what we were going to do and only then do it.

I *could* add automatic registration or many other features to my patch.
Doing so would take hours or days. How would that help us decide what to
do? I'm not treating this as a race between people's patches; is it a
race? Or is it a discussion and move forwards by mutual agreement
towards something sensible?

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: is sync rep stalled?

From
Simon Riggs
Date:
On Fri, 2010-10-01 at 19:48 +0900, Fujii Masao wrote:

> My intention is to commit the core part of synchronous replication (which would
> be used for every use cases) at first. Then we can implement the feature for each
> use case.

I completely agree that we should commit the core part of sync rep, but
the question is: what is that? We both have equally valid "cores".

> I agree that 9.1 should support asynchronous standbys in the same mix, but this
> seems to be extended feature rather than very core.

That is trivial, so no need to exclude that.

> I proposed to implement the "return-immediately" at first because it doesn't
> require standby registration. But if many people think that the "wait-forever"
> is the core rather than the "return-immediately", I'll follow them. We can
> implement the "return-immediately" after that.

I think it's fair to say that many people don't like the specific form of
standby registration that has been proposed. I really don't mind if it
exists as an option, but it looks way too complex to me to manage for
realistic systems.

Wait-forever needs to be an option. Nobody actually will wait forever,
so if people select it, they will need some form of clusterware to
control it and I don't want to see people forced to use clusterware.

If people do choose wait-forever, then we could also do standby
registration automatically, to give them something to wait for.

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: is sync rep stalled?

From
Aidan Van Dyk
Date:
On Mon, Oct 4, 2010 at 11:48 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

> How can we take a base backup for that synchronous standby? You mean
> that we should disable the wait-forever option, start the master, take
> a base backup, shut down the master, enable the wait-forever option,
> start the master, and start the standby from that base backup?

All I'm saying is that *after* you've configured that everything must
be synchronous is *not* the time to start trying to figure out if your
PITR backups/archive are working, and starting to try and get a slave
replicating synchronously.

Yes, High-Durability sync rep has caveats.  One of them is that you
must have a working synchronous slave before you can enforce
synchronousity.

a.


Re: is sync rep stalled?

From
Fujii Masao
Date:
On Tue, Oct 5, 2010 at 8:25 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Presumably we will have an option to *not* wait forever? So we would be
> able to set the option prior to running the base backup? So there isn't
> any need to do this rollback trick suggested.

At the initial setup of the standby, we can easily disable wait-forever
option and take a base backup. I'm concerned about the case where the
standby goes down while replication is working. ISTM that we cannot
easily disable wait-forever option for backup because that disablement
resumes the waiting transactions.

In this case, we would need to issue rollback. Or we seem to need to
shut down the master, take a cold backup, start the master and start
the standby from that cold backup. Though I'm not sure if this is really
the right procedure.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: is sync rep stalled?

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I think the point here is that it's possible to have sync-rep
> configurations in which it's impossible to take a base backup.

Sorry to be slow. I still don't understand that problem.

I can understand why people want "wait forever", but I can't understand
when the following strange idea applies: consider my non-ready standby
there as a full member of the distributed setup already.

I've been making plenty of noise about this topic in the past, at the
beginning of plans for SR in 9.0 IIRC, pushing Heikki into having a
worked out state machine to figure out what are the known states of a
standby and what we can do with each. We've cancelled that and said it
would maybe be necessary for Synchronous Replication. Here we go, right?

So, first things first: when is it a good idea to consider a standby
that has not yet had its base backup taken, let alone validated that
after taking it the master still has enough WAL for the backup to be
valid as far as initialising the slave goes, as someone we wait forever
on?

I say a standby is registered when it's currently "attached" and already
able to keep up in async. That's a time when you can slow down the
master until this new member catches up to full sync or whatever you've
set up.
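
To make the idea of known standby states concrete, here is a minimal
Python sketch. The state names and the transition table are purely
hypothetical, invented for illustration; only the rule that a standby
counts as registered once it is attached and keeping up in async comes
from the paragraph above.

```python
from enum import Enum, auto


class StandbyState(Enum):
    TAKING_BASE_BACKUP = auto()  # not yet usable at all
    CATCHING_UP = auto()         # attached, still behind the master
    STREAMING_ASYNC = auto()     # attached and keeping up asynchronously
    SYNC = auto()                # commits on the master wait for it


# Hypothetical transition table: which states may follow which.
ALLOWED = {
    StandbyState.TAKING_BASE_BACKUP: {StandbyState.CATCHING_UP},
    StandbyState.CATCHING_UP: {StandbyState.STREAMING_ASYNC},
    StandbyState.STREAMING_ASYNC: {StandbyState.SYNC, StandbyState.CATCHING_UP},
    StandbyState.SYNC: {StandbyState.CATCHING_UP},
}


def advance(current: StandbyState, target: StandbyState) -> StandbyState:
    """Move to the target state, rejecting transitions not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def is_registered(state: StandbyState) -> bool:
    """Registered means attached and at least keeping up in async."""
    return state in (StandbyState.STREAMING_ASYNC, StandbyState.SYNC)


def master_waits_for(state: StandbyState) -> bool:
    """Only a standby that has reached SYNC can hold up commits."""
    return state is StandbyState.SYNC


state = StandbyState.TAKING_BASE_BACKUP
print(is_registered(state), master_waits_for(state))
state = advance(state, StandbyState.CATCHING_UP)
state = advance(state, StandbyState.STREAMING_ASYNC)
print(is_registered(state), master_waits_for(state))
```

In this model a standby without a base backup can never be waited on,
because there is no direct transition from TAKING_BASE_BACKUP to SYNC.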

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Lack of google and archives-fu today means no link to those mails. Yet…


Re: is sync rep stalled?

From
Bruce Momjian
Date:
Heikki Linnakangas wrote:
> On 04.10.2010 10:49, Markus Wanner wrote:
> > On 10/04/2010 09:18 AM, Heikki Linnakangas wrote:
> >> With 'replay' and hot standby combination, you'll want to set
> >> max_standby_archive_delay to a very low value, or a read-only query can
> >> cause master to stop processing commits (or the standby to stop
> >> accepting new queries, if that's preferred).
> >
> > Well, given that DML-only transactions aren't prone to such conflicts, I
> > think of this as a corner case.
> 
> Yes they are. Any DML operation, and even read-only queries IIRC, can 
> trigger HOT pruning, which can conflict with a read-only query in a hot 
> standby. And then there's autovacuum which can cause conflicts in the 
> standby, even if no user transactions are running in the master.

I can confirm that SELECT can trigger HOT pruning, based on research for
my PG West MVCC talk.  Anything that does a tuple lookup can cause it
--- INSERT VALUES does not.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
    EnterpriseDB                             http://enterprisedb.com

    + It's impossible for everything to be true. +