Re: Configuring synchronous replication - Mailing list pgsql-hackers
From: Heikki Linnakangas
Subject: Re: Configuring synchronous replication
Msg-id: 4C9345CF.8000708@enterprisedb.com
In response to: Re: Configuring synchronous replication (Simon Riggs <simon@2ndQuadrant.com>)
List: pgsql-hackers
On 17/09/10 12:49, Simon Riggs wrote:
> This isn't just about UI, there are significant and important
> differences between the proposals in terms of the capability and control
> they offer.

Sure. The point of focusing on the UI is that the UI demonstrates what
capability and control a proposal offers.

>> So what should the user interface be like? Given the 1st and 2nd
>> requirement, we need standby registration. If some standbys are
>> important and others are not, the master needs to distinguish between
>> them to be able to determine that a transaction is safely delivered to
>> the important standbys.
>
> My patch provides those two requirements without standby registration,
> so we very clearly don't "need" standby registration.

It's still not clear to me how you would configure things like "wait for
an ack from the reporting slave, but not from other slaves", or "wait
until replayed in the server on the west coast", in your proposal. Maybe
it's possible, but it doesn't seem very intuitive, and it requires
careful configuration in both the master and the slaves.

In your proposal, you also need to be careful not to connect e.g. a test
slave with "synchronous_replication_service = apply" to the master, or
it can possibly shadow a real production slave, acknowledging
transactions that have not yet been received by the real slave. It's
certainly possible to screw up with standby registration too, but you
have more direct control, because the master's behavior is configured in
the master itself rather than being distributed across all the slaves.

> The question is do we want standby registration on master and if so,
> why?

Well, aside from how to configure synchronous replication, standby
registration would help with retaining the right amount of WAL in the
master. wal_keep_segments doesn't guarantee that enough WAL is retained,
and on the other hand, when all standbys are connected, you retain much
more than might be required.

Giving names to slaves also allows you to view their status in the
master in a more intuitive format.
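To make the registration idea concrete: the master could carry a file
along these lines. The file name standby.conf comes from Fujii-san's
patch, but the syntax and the level names here are purely hypothetical,
not what the patch actually implements:

```ini
# standby.conf on the master -- hypothetical syntax, for illustration only
# <standby name>   <synchronization level>
reporting        fsync    # wait until WAL is fsync'd on the reporting slave
ha-standby       apply    # wait until WAL is replayed on the HA standby
testserver       async    # never wait for the test server
```

The point being that with something like this, all the control sits in
one place in the master, and a misconfigured test slave cannot
accidentally acknowledge transactions on behalf of a production slave.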
Something like:

postgres=# SELECT * FROM pg_slave_status;
    name    | connected |  received  |   fsyncd   |  applied
------------+-----------+------------+------------+------------
 reporting  | t         | 0/26000020 | 0/26000020 | 0/25550020
 ha-standby | t         | 0/26000020 | 0/26000020 | 0/26000020
 testserver | f         |            | 0/15000020 |
(3 rows)

>> For the control between async/recv/fsync/replay, I like to think in
>> terms of
>> a) asynchronous vs synchronous
>> b) if it's synchronous, how synchronous is it? recv, fsync or replay?
>>
>> I think it makes most sense to set sync vs. async in the master, and
>> the level of synchronicity in the slave. Although I have sympathy for
>> the argument that it's simpler if you configure it all from the master
>> side as well.
>
> I have catered for such requests by suggesting a plugin that allows you
> to implement that complexity without overburdening the core code.

Well, plugins are certainly one possibility, but then we need to design
the plugin API. I've been thinking along the lines of a proxy, which can
implement whatever logic you want to decide when to send the
acknowledgment. Either way, if we push any features people want to a
proxy or plugin, we need to make sure that the proxy/plugin has all the
necessary information available.

> This strikes me as an "ad absurdum" argument. Since the above
> over-complexity would doubtless be seen as insane by Tom et al, it
> attempts to persuade that we don't need recv, fsync and apply either.
>
> Fujii has long talked about 4 levels of service also. Why change? I had
> thought that part was pretty much agreed between all of us.

Now you lost me. I agree that we need 4 levels of service (at least
ultimately, not necessarily in the first phase).

> Without performance tests to demonstrate "why", these do sound hard to
> understand. But we should note that DRBD offers recv ("B") and fsync
> ("C") as separate options. And Oracle implements all 3 of recv, fsync
> and apply.
> Neither of them describe those options so simply and easily
> as the way we are proposing with a 4 valued enum (with async as the
> fourth option).
>
> If we have only one option for sync_rep = 'on' which of recv | fsync |
> apply would it implement? You don't mention that. Which do you choose?

You would choose between recv, fsync and apply in the slave, with a GUC.

> I no longer seek to persuade by words alone. The existence of my patch
> means that I think that only measurements and tests will show why I
> have been saying these things. We need performance tests.

I don't expect any meaningful differences in terms of performance
between any of the discussed options. The big question right now is what
features we provide and how they're configured. Performance will depend
primarily on the mode you use, and secondarily on the implementation of
that mode. It would be completely premature to do performance testing
yet, IMHO.

>> Putting all of that together, I think Fujii-san's standby.conf is
>> pretty close. What it needs is the additional GUC for
>> transaction-level control.
>
> The difference between the patches is not a simple matter of a GUC.
>
> My proposal allows a single standby to provide efficient replies to
> multiple requested durability levels all at the same time. With
> efficient use of network resources. ISTM that because the other patch
> cannot provide that you'd like to persuade us that we don't need that,
> ever. You won't sell me on that point, cos I can see lots of uses for
> it.

Simon, how the replies are sent is an implementation detail I haven't
given much thought to yet. The reason we delved into that discussion
earlier was that you seemed to contradict yourself, claiming both that
you don't need to send more than one reply per transaction, and that the
standby doesn't need to know the synchronization level. Other than the
curiosity about that contradiction, it doesn't seem like a very
interesting detail to me right now.
It's not a question that drives the rest of the design, but the other
way round. But FWIW, something like your proposal of sending 3
XLogRecPtrs in each reply seems like a good approach.

I'm not sure about using walwriter. I can see that it helps with getting
the 'recv' and 'replay' acknowledgments out faster, but I still have the
scars from starting bgwriter during recovery.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com