Thread: Geographical redundancy
Is there any feasible way to achieve geographical redundancy of a PostgreSQL database?

Say you have a website which uses PG on the backend to read/write data, and you want the website running on two geographically separate servers, with the data synchronizing somehow over the internet.

In case one data center fails, the website is still up and running from the second geographical location (pointed to by, say, a second DNS server).
Thank you.
Ben <bench@silentmedia.com> wrote:

Sure, there are lots of ways. Factors that start constraining things are:

- do both sites have to be online (making changes to the data) at the same time?
- how tightly do both sites have to stay in sync?
- is data loss acceptable if one site suffers a disaster?
- what platform are you running on?
- how much throughput/latency do you have between sites?
- how much downtime is acceptable when switching sites?
On Dec 29, 2006, Dennis replied:

- Yes, both sites have to be online and changing data at the same time.
- Data loss is unacceptable.
- The platform is Gentoo Linux.
- Downtime of up to 1 day is acceptable, as long as there is no data loss.
- Throughput/latency: internet over a 10-megabit line.

Ben <bench@silentmedia.com> wrote:

If you're sure that data loss is unacceptable no matter what happens to either site, then I'm not aware of too many options. As I understand it, pgpool can be configured to send data-altering queries to multiple servers in order to simulate a multi-master cluster, but it's never been clear to me how this works if you use any non-deterministic functions like random(). I'm not aware of any other multi-master synchronous replication solution.

The only option I can think of (and it's ugly, so hopefully somebody can suggest something more elegant) would be to make Postgres sit on top of something like DRBD, so that both sites stay in sync on every write to disk. Then pick one site at which to run your database, and have the application at both sites connect to that active server. In case of a problem, start up Postgres at the other site and have your app reconnect.

This will be slow. It will eat up your bandwidth. It will add complexity. But it won't lose data.
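For concreteness, a minimal sketch of what that DRBD layout might look like (DRBD 8.x syntax; the resource name, hostnames, backing devices, and addresses here are all hypothetical). Protocol C is DRBD's synchronous mode, which is what buys the no-data-loss guarantee at the cost of an inter-site round trip on every write:

    # /etc/drbd.conf -- hypothetical two-site resource
    resource pgdata {
        protocol C;                   # synchronous: a write returns only once it
                                      # is on stable storage at BOTH sites
        on site-a {
            device    /dev/drbd0;
            disk      /dev/sda3;      # local backing device
            address   10.0.0.1:7788;  # must be reachable from the peer site
            meta-disk internal;
        }
        on site-b {
            device    /dev/drbd0;
            disk      /dev/sda3;
            address   10.0.1.1:7788;
            meta-disk internal;
        }
    }

The PostgreSQL data directory lives on a filesystem created on /dev/drbd0, mounted only at whichever site is currently primary; failing over means promoting the other node, mounting the filesystem there, and starting Postgres.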
On Dec 30, 2006, Dennis wrote:

I was thinking of maybe just having the 2nd location receive a PG dump (full or incremental) every so often (every hour to 6 hours), and if the main location fails majorly, restore the PG cluster from the dump and switch the DNS settings on the actual sites. I can make sure all website files are always in sync in both locations.
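A minimal sketch of that dump-and-ship approach, as an hourly job on the primary (the database name, paths, and standby hostname are made up for illustration). pg_dump's custom format (-Fc) lets the standby restore with pg_restore, and pg_dumpall -g captures the roles and tablespaces that a per-database dump omits:

    #!/bin/sh
    # dump_and_ship.sh -- run hourly from cron on the primary
    set -e
    pg_dump -Fc mydb > /backups/mydb.dump    # per-database dump, custom format
    pg_dumpall -g   > /backups/globals.sql   # globals only (roles, tablespaces)
    rsync -a /backups/ standby.example.com:/backups/

Recovery at the second site is then roughly: restore the globals with psql, pg_restore the database dump, and flip DNS. Anything committed after the last successful run is lost, which is the window Ben points out below.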
Raymond O'Donnell <rod@iol.ie> wrote:

On 30 Dec 2006 at 0:22, Dennis wrote:
> I was thinking of maybe just having the 2nd location receive a PG dump
> (full or incremental) every so often ...

Just curious - how do you do an incremental dump? I had to do something similar recently, and ended up effectively rolling my own mini-pg_dump.

--Ray.
Ben <bench@silentmedia.com> wrote:

Well, first off, you can just rsync your archived WAL files. That may be easier than playing with pg_dump:

http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html

But second, and more important given your data-loss requirements: if you do it this way, you have a window in which you can experience data loss. Specifically, after a transaction is committed, that commit is at risk until the next transfer has completed.
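A sketch of that WAL-shipping setup on 8.2, using hypothetical paths and hostname. archive_command is the hook described in the continuous-archiving chapter linked above (%p is the full path of the segment to ship, %f its file name), and archive_timeout, new in 8.2, bounds the data-loss window by forcing a segment switch even when the database is quiet:

    # postgresql.conf on the primary
    archive_command = 'rsync -a %p standby.example.com:/var/lib/pgwal/%f'
    archive_timeout = 300   # switch segments at least every 5 minutes

The standby is seeded from a filesystem-level base backup taken between pg_start_backup() and pg_stop_backup(), and on failover replays the shipped segments via restore_command in recovery.conf. Commits sitting in the current, not-yet-shipped segment are still at risk, exactly as described above.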
On Tue, 2 Jan 2007, Dennis wrote:

Well, I am mainly concerned with catastrophic failure. If the 1st (main) datacenter fails majorly (say fire, earthquake, the db server dies, etc.), I need to be able to restore the websites/data quickly in another location. If I get a data loss of, say, 6-12 hours during a major failure (which should never occur), I am OK with that.

Ben <bench@silentmedia.com> wrote:

OK, well, accepting data loss (even if it is "just" 6-12 hours' worth) really opens up a lot of possibilities... EXCEPT that you also said you want both sites to be able to modify data. Again, there is no real multi-master replication available for Postgres, so you'll have to have both sites at least write to the same database server.
Markus wrote:

Hello Dennis,

As nobody has mentioned it up until now: please check the very nice documentation about High Availability and Failover here:

http://www.postgresql.org/docs/8.2/static/high-availability.html

It explains most of the options and their pros and cons.

Regards,
Markus