Re: 24x7x365 high-volume ops ideas - Mailing list pgsql-general
From: Christopher Browne
Subject: Re: 24x7x365 high-volume ops ideas
Msg-id: m3y8hdnfff.fsf@knuth.knuth.cbbrowne.com
In response to: 24x7x365 high-volume ops ideas ("Ed L." <pgsql@bluepolka.net>)
List: pgsql-general
A long time ago, in a galaxy far, far away, Karim.Nassar@NAU.EDU (Karim Nassar) wrote:

> On Wed, 2004-11-03 at 18:10, Ed L. wrote:
>> unfortunately, the requirement is 100% uptime all the time, and any
>> downtime at all is a liability. Here are some of the issues:
>
> Seems like 100% uptime is always an issue, but not even close to
> reality. I think it's unreasonable to expect a single piece of
> software never to be restarted. Never is a really long time.
>
> For this case, isn't replication sufficient? (FWIW, in 1 month I
> have to answer this same question). Would this work?
>
> * 'Main' db server up 99.78% of time
> * 'Replicant' up 99.78% of time (using slony, dbmirror)
> * When Main goes down (crisis, maintenance), Replicant answers for
>   Main, in a read-only fashion.
> * When Main comes back up, any waiting writes can now happen.
> * Likewise, Replicant can be taken down for maint, then Main syncs
>   to it when going back online.
>
> Is this how it's done?

The challenge lies in two places:

1. You need some mechanism to detect that the "replica" should take
   over, and to actually perform that takeover. That takeover requires
   some way for your application to become aware of the new IP address
   of the DB host.

2. Some changes need to take place to prepare the "replica" to be
   treated as "master." In the case of Slony-I, for instance, you can
   do a full-scale FAILOVER, where you tell it to treat the "main"
   database as dead; at that point the replica becomes the master, and
   the former "master" is discarded as dead. Alternatively, there is
   MOVE SET, which is suitable for planned maintenance: it shifts the
   "master" role from one node to another, so you can take MAIN out of
   service for a while, then add it back, perhaps making it the
   "master" again.

None of these systems _directly_ addresses how apps would get pointed
to the shifting servers.
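To make the distinction concrete, here is a rough slonik sketch of the two operations. The cluster name, node numbers, set id, and conninfo strings are all invented for illustration; check them against your own Slony-I configuration before running anything like this.

```
cluster name = mycluster;
node 1 admin conninfo = 'host=main dbname=app user=slony';
node 2 admin conninfo = 'host=replica dbname=app user=slony';

# Planned maintenance: hand the "master" role to node 2, keeping
# node 1 in the cluster so it can take the role back later.
lock set (id = 1, origin = 1);
move set (id = 1, old origin = 1, new origin = 2);

# Crisis: declare node 1 dead and promote node 2.  Node 1 is
# discarded and would have to be rebuilt to rejoin the cluster.
# failover (id = 1, backup node = 2);
```

MOVE SET is reversible; FAILOVER is not, which is why it is commented out above and why you only reach for it when the old master really is gone.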
A neat approach would be to make pgpool, a C-based connection-pool
manager, Slony-I-aware. If pgpool were the one to submit the MOVE SET
or FAILOVER, it would know which DB to point things at, so
applications passing requests through pgpool would not need to be
aware of any change, beyond perhaps seeing some transactions
terminated. That won't be ready tomorrow... Something needs to be
"smart enough" to point apps to the right place; that's something to
think about...
--
let name="cbbrowne" and tld="linuxfinances.info" in
  String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/advocacy.html
"XFS might (or might not) come out before the year 3000. As far as
kernel patches go, SGI are brilliant. As far as graphics, especially
OpenGL, go, SGI is untouchable. As far as filing systems go, a
concussed doormouse in a tarpit would move faster." -- jd on Slashdot
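P.S. The "smart enough to point apps at the right place" piece can be sketched in a few lines. This is not pgpool and not Slony-I; `pick_db` is a hypothetical helper that just probes an ordered list of candidate servers and returns the first one that is reachable.

```python
import socket

def pick_db(candidates, timeout=2.0):
    """Return the first (host, port) pair that accepts a TCP
    connection, or None if all are down.

    This is only a crude liveness probe: a listening postmaster
    does not prove the node is the current Slony-I set origin, so
    a real takeover tool would also have to consult the
    replication state before redirecting writes."""
    for host, port in candidates:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue
    return None
```

An application (or a pgpool-like proxy in front of it) would call this on connection failure and reconnect to whatever it returns, rather than hard-coding a single DB host.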