Re: Sync Replication with transaction-controlled durability - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Sync Replication with transaction-controlled durability
Date	October 11, 2010 07:48:57
Msg-id	AANLkTi=U0rBDj7eLZPAQUJ-+GzJoGkUBUuTX4J_hMT0Y@mail.gmail.com
In response to	Re: Sync Replication with transaction-controlled durability (Simon Riggs <simon@2ndQuadrant.com>)
List	pgsql-hackers

On Sat, Oct 9, 2010 at 3:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, 2010-10-08 at 12:23 -0400, Robert Haas wrote:
>
>> It seems like it would be more helpful if you were working on
>> implementing a design that had more than one vote.  As far as I can
>> tell, we have rough consensus that for the first commit we should only
>> worry about the case where k = 1; that is, only one ACK is ever
>> required for commit; and Greg Smith spelled out some more particulars
>> for a minimum acceptable implementation in the second part of the
>> email found here:
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-10/msg00384.php
>
> Robert,
>
> I'm working on k = 1, as suggested by Josh Berkus and with whom many
> people agree. It is a simple default behaviour that will be easy to
> test.
>
> Greg's proposal to implement other alternatives via a function is simply
> a restatement of what I had already proposed: we should have a plugin to
> provide alternate behaviours. We can add the plugin API later once we
> have a stable committed version. I am happy to do that, just as I
> originally proposed.
>
> I don't believe it will be helpful to attempt to implement something
> more complex until we have the basic version.

I agree that we should start with a basic version, but it seems to me
that you're ripping out things which are uncontroversial and leaving
untouched things which are.  To the best of my knowledge, there are no
serious or widespread objections to allowing three synchronous
replication levels: recv, fsync, apply.  There are, however, a number
of people, including me, who don't feel that whether or not the slave
is synchronous should be configured on the slave.  As Greg said:

That would be a simple to configure setup where I list a subset of
"important" nodes, and the appropriate acknowledgement level I want to
hear from one of them. And when one of those nodes gives that
acknowledgement, commit on the master happens too.

I am not going to put words in Greg's mouth, so I won't claim that
when he speaks of listing a subset of important nodes, he actually
means putting a list of them someplace, but that's what I and at least
some other people want.  In your design, AIUI, the list is implied by
the settings on the slaves, not explicit.
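
To make the "explicit list" idea concrete, here is a rough sketch of the
commit rule for k = 1.  It is only an illustration, not a proposed
implementation: the names (SyncConfig, commit_may_proceed, the standby
names) are all hypothetical, and it assumes the three levels are ordered
recv < fsync < apply, so that a stronger acknowledgement satisfies a
weaker requirement.

```python
from dataclasses import dataclass
from enum import IntEnum


class AckLevel(IntEnum):
    """The three levels named in the thread; the ordering is an assumption."""
    RECV = 1
    FSYNC = 2
    APPLY = 3


@dataclass(frozen=True)
class SyncConfig:
    """Hypothetical master-side setting: an explicit node list plus a level."""
    important_nodes: frozenset
    required_level: AckLevel


def commit_may_proceed(config, acks):
    """Return True once any one listed node has acknowledged at the
    required level or stronger (the k = 1 case).

    acks maps a node name to the strongest AckLevel it has reported so far.
    Nodes that are not in the explicit list never count.
    """
    return any(
        acks.get(node, 0) >= config.required_level
        for node in config.important_nodes
    )


# Example: two listed standbys, fsync required from one of them.
config = SyncConfig(frozenset({"standby_a", "standby_b"}), AckLevel.FSYNC)
```

With that configuration, a recv-only acknowledgement from standby_a does
not release the commit, an apply acknowledgement from an unlisted node
does not either, and an fsync or apply acknowledgement from standby_b
does.  The point is only that the list lives in one place on the master.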

I also think that if you're removing things for the first version, the
timeout might be one to rip out.  In between all the discussion of how
and where synchronous replication ought to be configured,
we have had some discussion of whether there should be a timeout, but
almost no discussion of what the behavior of that timeout should be.
Are we going to wait for that timeout on every commit?  That seems
almost certain to reduce a busy master to unusability.  Is it a
timeout before the slave is "declared dead" even though it's still
connected, and if so how does the slave come back to life again?  Is
it a timeout before we forcibly disconnect the slave, and if so how is
it better/worse/different than configuring TCP keepalives?  I'm sure
we can figure out good answers to all of those questions but it might
take a while to get consensus on any particular approach.
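
As a back-of-the-envelope illustration of why the first interpretation
worries me, here is a small sketch comparing two of the possible
behaviours.  It assumes a single session issuing commits one after
another while the slave stays unresponsive the whole time; the function
names and the numbers are made up for the example.

```python
def total_wait_timeout_every_commit(num_commits, timeout_s):
    """Every commit waits out the full timeout before giving up on the ACK."""
    return num_commits * timeout_s


def total_wait_declared_dead(num_commits, timeout_s):
    """Only the first commit waits; after that the slave is treated as dead
    and later commits do not wait for it (until it is somehow revived)."""
    return timeout_s if num_commits > 0 else 0


# 1000 serial commits with a 30 second timeout.
per_commit = total_wait_timeout_every_commit(1000, 30)
declared_dead = total_wait_declared_dead(1000, 30)
```

Under those assumptions the first behaviour costs 30000 seconds of
waiting and the second costs 30, which is why the semantics matter more
than the existence of the setting.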

One other question that occurred to me this morning, not directly
related to anything you're doing here.  What exactly happens if the
user types COMMIT, it hangs for a long time because it can't get an
ACK from any other server, and the user gets tired of waiting and
starts hitting ^C?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company