Re: Configuring synchronous replication - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Configuring synchronous replication |
Date | |
Msg-id | 1284716969.1733.3699.camel@ebony |
In response to | Configuring synchronous replication (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Configuring synchronous replication
|
List | pgsql-hackers |
On Fri, 2010-09-17 at 11:09 +0300, Heikki Linnakangas wrote:
> (changed subject again.)
>
> On 17/09/10 10:06, Simon Riggs wrote:
> > I don't think we can determine how far to implement without considering
> > both approaches in detail. With regard to your points below, I don't
> > think any of those points could be committed first.
>
> Yeah, I think we need to decide on the desired feature set first, before
> we dig deeper into the patches. The design and implementation will
> fall out of that.

Well, we've discussed these things many times and talking hasn't got us
very far on its own. We need measurements and neutral assessments.

The patches are simple and we have time.

This isn't just about UI, there are significant and important
differences between the proposals in terms of the capability and control
they offer.

I propose we develop both patches further and performance test them.
Many of the features I have proposed are performance related and people
need to be able to see what is important, and what is not. But not
through mere discussion, we need numbers to show which things matter and
which things don't. And those need to be derived objectively.

> * Support multiple standbys with various synchronization levels.
>
> * What happens if a synchronous standby isn't connected at the moment?
> Return immediately vs. wait forever.
>
> * Per-transaction control. Some transactions are important, others are not.
>
> * Quorum commit. Wait until n standbys acknowledge. n=1 and n=all
> servers can be seen as important special cases of this.
>
> * async, recv, fsync and replay levels of synchronization.

That's a reasonable starting list of points, there may be others.

> So what should the user interface be like? Given the 1st and 2nd
> requirement, we need standby registration. If some standbys are
> important and others are not, the master needs to distinguish between
> them to be able to determine that a transaction is safely delivered to
> the important standbys.

My patch provides those two requirements without standby registration,
so we very clearly don't "need" standby registration.

The question is do we want standby registration on master and if so,
why?

> For per-transaction control, ISTM it would be enough to have a simple
> user-settable GUC like synchronous_commit. Let's call it
> "synchronous_replication_commit" for now.

If you wish to change the name of the GUC away from the one I have
proposed, fine. Please note that aspect isn't important to me and I will
happily concede all such points to the majority view.

> For non-critical transactions,
> you can turn it off. That's very simple for developers to understand and
> use. I don't think we need more fine-grained control than that at
> transaction level, in all the use cases I can think of you have a stream
> of important transactions, mixed with non-important ones like log
> messages that you want to finish fast in a best-effort fashion.

Sounds like we're getting somewhere. See below.

> I'm
> actually tempted to tie that to the existing synchronous_commit GUC, the
> use case seems exactly the same.

http://archives.postgresql.org/pgsql-hackers/2008-07/msg01001.php

Check the date!

I think that particular point is going to confuse us. It will draw much
bike shedding and won't help us decide between patches. It's a nicety
that can be left to a time after we have the core feature committed.

> OTOH, if we do want fine-grained per-transaction control, a simple
> boolean or even an enum GUC doesn't really cut it. For truly
> fine-grained control you want to be able to specify exceptions like
> "wait until this is replayed in slave named 'reporting'" or "don't wait
> for acknowledgment from slave named 'uk-server'". With standby
> registration, we can invent a syntax for specifying overriding rules in
> the transaction. Something like SET replication_exceptions =
> 'reporting=replay, uk-server=async'.
>
> For the control between async/recv/fsync/replay, I like to think in
> terms of
> a) asynchronous vs synchronous
> b) if it's synchronous, how synchronous is it? recv, fsync or replay?
>
> I think it makes most sense to set sync vs. async in the master, and the
> level of synchronicity in the slave. Although I have sympathy for the
> argument that it's simpler if you configure it all from the master side
> as well.

I have catered for such requests by suggesting a plugin that allows you
to implement that complexity without overburdening the core code.

This strikes me as an "ad absurdum" argument. Since the above
over-complexity would doubtless be seen as insane by Tom et al, it
attempts to persuade that we don't need recv, fsync and apply either.

Fujii has long talked about 4 levels of service also. Why change? I had
thought that part was pretty much agreed between all of us.

Without performance tests to demonstrate "why", these do sound hard to
understand. But we should note that DRBD offers recv ("B") and fsync
("C") as separate options. And Oracle implements all 3 of recv, fsync
and apply. Neither of them describes those options so simply and easily
as the way we are proposing with a 4-valued enum (with async as the
fourth option).

If we have only one option for sync_rep = 'on' which of recv | fsync |
apply would it implement? You don't mention that. Which do you choose?
For what reason do you make that restriction? The code doesn't get any
simpler, in my patch at least, from my perspective it would be a
restriction without benefit.

I no longer seek to persuade by words alone. The existence of my patch
means that I think that only measurements and tests will show why I have
been saying these things. We need performance tests. I'm not ready for
them today, but will be very soon. I suspect you aren't either since
from earlier discussions you didn't appear to have much about overall
throughput, only about response times for single transactions.
I'm happy to be proved wrong there.

> Putting all of that together. I think Fujii-san's standby.conf is pretty
> close.
> What it needs is the additional GUC for transaction-level control.

The difference between the patches is not a simple matter of a GUC.

My proposal allows a single standby to provide efficient replies to
multiple requested durability levels all at the same time. With
efficient use of network resources. ISTM that because the other patch
cannot provide that you'd like to persuade us that we don't need that,
ever. You won't sell me on that point, cos I can see lots of uses for
it.

Another use case for you:

* customer orders are important, but we want lots of them, so we use
recv mode for those.

* pricing data hardly ever changes, but when it does we need it to be
applied across the cluster so we don't get read mismatches, so those
rare transactions use apply mode.

If you don't want multiple modes at once, you don't need to use that
feature. But there is no reason to prevent people having the choice,
when a design exists that can provide it.

(A separate and later point, is that I would one day like to annotate
specific tables and functions with different modes, so a sysadmin can
point out which data is important at table level - which is what MySQL
provides by allowing choice of storage engine for particular tables.
Nobody cares about the specific engine, they care about the durability
implications of those choices. This isn't part of the current proposal,
just a later statement of direction.)

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
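
To make the "multiple modes at once" use case above concrete, here is a
minimal sketch in Python. It is not code from either patch. The names
SyncLevel, StandbyProgress, Waiter and releasable are hypothetical, WAL
positions are plain integers, and it assumes the standby reports how far
it has received, fsynced and applied WAL in a single reply, so that one
reply can release waiters at different levels.

```python
from dataclasses import dataclass
from enum import IntEnum


class SyncLevel(IntEnum):
    """The four levels discussed in the thread, weakest first."""
    ASYNC = 0
    RECV = 1
    FSYNC = 2
    APPLY = 3


@dataclass
class StandbyProgress:
    """Hypothetical reply from one standby: WAL position reached per level."""
    recv: int = 0
    fsync: int = 0
    apply: int = 0

    def reached(self, level: SyncLevel) -> int:
        # Only the three synchronous levels have a position to compare.
        positions = {
            SyncLevel.RECV: self.recv,
            SyncLevel.FSYNC: self.fsync,
            SyncLevel.APPLY: self.apply,
        }
        return positions[level]


@dataclass
class Waiter:
    """A committing transaction and the level it asked to wait for."""
    name: str
    commit_lsn: int
    level: SyncLevel


def releasable(waiters, progress):
    """Return the names of transactions that no longer need to wait."""
    done = []
    for w in waiters:
        # ASYNC never waits; the others wait until the standby has
        # passed their commit position at the requested level.
        if w.level == SyncLevel.ASYNC or progress.reached(w.level) >= w.commit_lsn:
            done.append(w.name)
    return done


if __name__ == "__main__":
    waiters = [
        Waiter("customer_order", commit_lsn=100, level=SyncLevel.RECV),
        Waiter("price_change", commit_lsn=90, level=SyncLevel.APPLY),
        Waiter("log_message", commit_lsn=110, level=SyncLevel.ASYNC),
    ]
    # One reply from the standby carries all three positions.
    reply = StandbyProgress(recv=120, fsync=95, apply=80)
    print(releasable(waiters, reply))
```

With that single reply the customer order (recv mode) and the log
message (async) are released, while the price change keeps waiting until
the standby has applied WAL up to its commit position. How the real
patches track and transmit these positions is not shown here.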