Re: Sync Replication with transaction-controlled durability - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Sync Replication with transaction-controlled durability |
Date | |
Msg-id | AANLkTi=U0rBDj7eLZPAQUJ-+GzJoGkUBUuTX4J_hMT0Y@mail.gmail.com |
In response to | Re: Sync Replication with transaction-controlled durability (Simon Riggs <simon@2ndQuadrant.com>) |
List | pgsql-hackers |
On Sat, Oct 9, 2010 at 3:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, 2010-10-08 at 12:23 -0400, Robert Haas wrote:
>
>> It seems like it would be more helpful if you were working on
>> implementing a design that had more than one vote. As far as I can
>> tell, we have rough consensus that for the first commit we should only
>> worry about the case where k = 1; that is, only one ACK is ever
>> required for commit; and Greg Smith spelled out some more particulars
>> for a minimum acceptable implementation in the second part of the
>> email found here:
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-10/msg00384.php
>
> Robert,
>
> I'm working on k = 1, as suggested by Josh Berkus and with whom many
> people agree. It is a simple default behaviour that will be easy to
> test.
>
> Greg's proposal to implement other alternatives via a function is simply
> a restatement of what I had already proposed: we should have a plugin to
> provide alternate behaviours. We can add the plugin API later once we
> have a stable committed version. I am happy to do that, just as I
> originally proposed.
>
> I don't believe it will be helpful to attempt to implement something
> more complex until we have the basic version.

I agree that we should start with a basic version, but it seems to me that you're ripping out things which are uncontroversial and leaving untouched things which are. To the best of my knowledge, there are no serious or widespread objections to allowing three synchronous replication levels: recv, fsync, apply. There are, however, a number of people, including me, who don't feel that whether or not the slave is synchronous should be configured on the slave. As Greg said:

That would be a simple to configure setup where I list a subset of "important" nodes, and the appropriate acknowledgement level I want to hear from one of them. And when one of those nodes gives that acknowledgement, commit on the master happens too.
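
To make the setup Greg describes a little more concrete, here is a minimal Python sketch of the k = 1 rule with an explicit list on the master. It is an illustration only, not code from any patch: the names `AckLevel`, `commit_released`, and the standby names are hypothetical, and it assumes the three levels are ordered recv < fsync < apply, so that a stronger acknowledgement also satisfies a weaker requirement.

```python
from enum import IntEnum


class AckLevel(IntEnum):
    # Assumed ordering: a later level implies the earlier ones.
    RECV = 1
    FSYNC = 2
    APPLY = 3


def commit_released(acks, important, required):
    """Return True once any listed standby has acknowledged at `required` or better (k = 1)."""
    return any(
        name in important and level >= required
        for name, level in acks
    )


# Hypothetical explicit list of "important" standbys, kept on the master.
important = {"standby_a", "standby_b"}

# An unlisted standby and a too-weak ACK do not release the commit.
acks = [("standby_c", AckLevel.APPLY), ("standby_a", AckLevel.RECV)]
print(commit_released(acks, important, AckLevel.FSYNC))  # False

# One sufficient ACK from a listed standby does.
acks.append(("standby_b", AckLevel.FSYNC))
print(commit_released(acks, important, AckLevel.FSYNC))  # True
```

The point of the sketch is only that the master decides from its own list which acknowledgements count; nothing here says how or where that list would actually be configured.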
I am not going to put words in Greg's mouth, so I won't claim that when he speaks of listing a subset of important nodes, he actually means putting a list of them someplace, but that's what I and at least some other people want. In your design, AIUI, the list is implied by the settings on the slaves, not explicit.

I also think that if you're removing things for the first version, the timeout might be one to rip out. In between all the discussion of how and where synchronous replication ought to be configured, we have had some discussion of whether there should be a timeout, but almost no discussion of what the behavior of that timeout should be. Are we going to wait for that timeout on every commit? That seems almost certain to reduce a busy master to unusability. Is it a timeout before the slave is "declared dead" even though it's still connected, and if so how does the slave come back to life again? Is it a timeout before we forcibly disconnect the slave, and if so how is it better/worse/different than configuring TCP keepalives? I'm sure we can figure out good answers to all of those questions but it might take a while to get consensus on any particular approach.

One other question that occurred to me this morning, not directly related to anything you're doing here. What exactly happens if the user types COMMIT, it hangs for a long time because it can't get an ACK from any other server, and the user gets tired of waiting and starts hitting ^C?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company