Re: Configuring synchronous replication - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Configuring synchronous replication |
Date | |
Msg-id | AANLkTi=DhqLuG+R4zX-cRJCEyECDmZnUk1cjM5ZJV9vJ@mail.gmail.com |
In response to | Re: Configuring synchronous replication (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: Configuring synchronous replication |
List | pgsql-hackers |
On Fri, Sep 24, 2010 at 6:37 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > Earlier you argued that centralizing parameters would make this nice
>> > and simple. Now you're pointing out that we aren't centralizing this
>> > at all, and it won't be simple. We'll have to have a standby.conf set
>> > up that is customised in advance for each standby that might become a
>> > master. Plus we may even need multiple standby.confs in case we have
>> > multiple nodes down. This is exactly what I was seeking to avoid and
>> > exactly what I meant when I asked for an analysis of the failure modes.
>>
>> If you're operating on the notion that no reconfiguration will be
>> necessary when nodes go down, then we have very different notions of
>> what is realistic. I think that "copy the new standby.conf file into
>> place" is going to be the least of the fine admin's problems.
>
> Earlier you argued that setting parameters on each standby was difficult
> and we should centralize things on the master. Now you tell us that
> actually we do need lots of settings on each standby and that to think
> otherwise is not realistic. That's a contradiction.

You've repeatedly accused me and others of contradicting ourselves. I don't think that's helpful in advancing the debate, and I don't think it's what I'm doing. The point I'm trying to make is that when failover happens, lots of reconfiguration is going to be needed. There is just no getting around that.

Let's ignore synchronous replication entirely for a moment. You're running 9.0 and you have 10 slaves. The master dies. You promote a slave. Guess what? You need to look at each slave you didn't promote and adjust its primary_conninfo. You also need to check whether that slave has received an xlog record with a higher LSN than the one you promoted; if it has, you need to take a new base backup. If you skip that check, you may have data corruption - very possibly silent data corruption. Do you dispute this? If so, on which point?

The reason I think we should centralize parameters on the master is that they affect *the behavior of the master*. Controlling, on the slave, whether the master will wait for the slave strikes me (and others) as spooky action at a distance. Configuring, on the slave, whether the master will retain WAL for a disconnected slave is outright byzantine.

Of course, configuring these parameters on the master means that when the master changes, you're going to need a configuration (possibly the same, possibly different) for said parameters on the new master. But since you may be doing a lot of other adjustment at that point anyway (e.g. new base backups, changes in the set of synchronous slaves), that doesn't seem like a big deal.

> The chain of argument used to support this as being a sensible design
> choice is broken or contradictory in more than one place. I think we
> should be looking for a design using the KISS principle, while
> retaining sensible tuning options.

The KISS principle is exactly what I am attempting to apply. Configuring parameters that affect the master on some machine other than the master isn't KISS, to me. You may find that broken or contradictory, but I disagree. I am attempting to disagree respectfully, but statements like the above make me feel like you're flaming, and that's getting under my skin.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
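
To make the failover reconfiguration described above concrete, here is a minimal sketch of the recovery.conf a surviving 9.0 standby might need after a promotion. The host, port, and user values are illustrative assumptions, not details from this thread:

```
# recovery.conf on a standby that was NOT promoted (PostgreSQL 9.0).
# Repoint streaming replication at the newly promoted master; the
# connection details below are placeholders.
standby_mode = 'on'
primary_conninfo = 'host=new-master.example.com port=5432 user=replication'

# Caveat from the discussion above: before repointing, compare this
# standby's replay position (pg_last_xlog_replay_location() in 9.0)
# against the promotion point of the new master. If this standby has
# replayed WAL beyond that point, take a fresh base backup instead of
# repointing it, or you risk silent data corruption.
```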
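
And for the parameter-placement argument, a rough and purely hypothetical sketch of the kind of master-side, per-standby configuration being debated; none of these parameter names existed at the time, and they are placeholders only:

```
# Hypothetical standby.conf on the master (all names are placeholders;
# the thread had not settled on a syntax or parameter names).
# Both behaviors below belong to the master, which is the argument for
# configuring them here rather than on each standby.
[standby "boston"]
synchronous = on           # master waits for this standby at commit
retained_wal_segments = 128  # WAL the master keeps while it is disconnected
```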