Thread: Geographical redundancy

Geographical redundancy

From
Dennis
Date: December 26, 2006
Is there any feasible way to achieve geographical redundancy of a PostgreSQL database?

Say you have a website which uses PG on the backend to read/write data, and you want to have the website running on 2 separate servers distributed geographically, with the data synchronizing somehow over the internet.

In case one data center fails, the website is still up and running from the 2nd geographical location (via, say, a 2nd DNS server).

Thank you.

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: Geographical redundancy

From
Ben
Date:
Sure, there are lots of ways. Factors that start constraining things
are:

- do both sites have to be online (making changes to the data) at the
same time?
- how tightly do both sites have to stay in sync?
- is data loss acceptable if one site suffers a disaster?
- what platform are you running on?
- how much throughput/latency do you have between sites?
- how much downtime is acceptable in switching sites?

On Dec 26, 2006, at 11:41 PM, Dennis wrote:



Re: Geographical redundancy

From
Ben
Date:
If you're sure that data loss is unacceptable no matter what happens
to either site, then I'm not aware of too many options. As I
understand it, pgpool can be configured to send data-altering queries
to multiple servers in order to simulate a multi-master cluster, but
it's never been clear to me how this works if you use any
non-deterministic functions like random(). I'm not aware of any other
multi-master synchronous replication solution.

The only option I can think of (and it's ugly so hopefully somebody
could suggest something more elegant) would be to make postgres sit
on top of something like DRBD, so that both sites can stay in sync
with every write to disk. Then pick a site at which to run your
database, having your application at both sites connect to the active
server. In case of a problem, start up postgres at the other site and
have your app reconnect.

This will be slow. It will eat up your bandwidth. It will add
complexity. But it won't lose data.
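A rough sketch of the failover side of that idea, assuming a hypothetical DRBD resource named "pgdata" already configured between the two sites with protocol C (DRBD's synchronous mode), and the Postgres data directory living on the replicated device:

```shell
# Hypothetical failover runbook for site B, once site A is confirmed dead.
# Resource name, device, and mount point are placeholders.

drbdadm primary pgdata                  # promote the local DRBD replica
mount /dev/drbd0 /var/lib/postgresql    # mount the replicated volume
/etc/init.d/postgresql start            # Gentoo-style init script

# When site A comes back, demote it BEFORE letting DRBD reconnect:
#   drbdadm secondary pgdata   (run on site A)
```

Protocol C is what makes this lose nothing: every disk write blocks until the remote site has acknowledged it, which is also exactly why it will be slow over a WAN.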

On Dec 29, 2006, at 2:38 PM, Dennis wrote:

> - Yes, both sites have to be online and changing data at the same
> time.
> - data loss is unacceptable
> - platform is Gentoo Linux
> - downtime of up to 1 day is acceptable as long as there is no data
> loss
> - throughput latency -> internet over 10megabit line


Re: Geographical redundancy

From
Dennis
Date: December 29, 2006
- Yes, both sites have to be online and changing data at the same time.
- data loss is unacceptable
- platform is Gentoo Linux
- downtime of up to 1 day is acceptable as long as there is no data loss
- throughput/latency -> internet over a 10-megabit line



Re: Geographical redundancy

From
Dennis
Date: December 30, 2006
I was thinking of maybe just having the 2nd location receive a PG dump (full or incremental) every so often (every hour to 6 hours), and if the main location fails majorly, restoring the PG cluster from the dump and switching DNS settings on the actual sites. I can make sure all website files are always in sync in both locations.
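A minimal sketch of that dump-shipping scheme, driven from cron; the database name "webdb", the standby hostname, and all paths are hypothetical:

```shell
#!/bin/sh
# ship_dump.sh -- hypothetical dump shipper; run from cron, e.g.:
#   0 */2 * * *  postgres  /usr/local/bin/ship_dump.sh
set -e
DUMP=/var/backups/webdb-$(date +%Y%m%d%H%M).dump
pg_dump -Fc webdb > "$DUMP"                            # custom-format dump
scp "$DUMP" standby.example.com:/var/backups/incoming/ # push to 2nd site
```

On failover, the standby would create the database and run pg_restore against the newest file; anything committed after the last successful transfer is lost.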




Re: Geographical redundancy

From
"Raymond O'Donnell"
Date:
On 30 Dec 2006 at 0:22, Dennis wrote:

> I was thinking of maybe just having 2nd location receive a PG dump
> (full or incremental) every so often (an hour to 6 hours) and if the

Just curious - how do you do an incremental dump? I had to do
something similar recently, and ended up effectively rolling my own
mini-pg_dump.

--Ray.


----------------------------------------------------------------------

Raymond O'Donnell
Director of Music, Galway Cathedral, Galway, Ireland
rod@iol.ie
----------------------------------------------------------------------
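For what it's worth, one common way to roll such a mini-pg_dump by hand is to stamp rows with a last-modified timestamp and COPY out only the recent ones. A sketch, where the "webdb" database, the "orders" table, and its "modified_at" column are all hypothetical (COPY from a SELECT needs PostgreSQL 8.2+):

```shell
#!/bin/sh
# Hypothetical incremental "dump": export only rows changed since the
# last run, tracked in a state file. This catches inserts and updates
# only -- deletes need separate treatment (e.g. a trigger-fed log table).
set -e
STATE=/var/backups/last_sync
LAST=$(cat "$STATE" 2>/dev/null || echo '1970-01-01 00:00:00')
NOW=$(date -u '+%Y-%m-%d %H:%M:%S')
psql -d webdb -c \
  "COPY (SELECT * FROM orders WHERE modified_at >= '$LAST') TO STDOUT" \
  > /var/backups/orders.incr
echo "$NOW" > "$STATE"   # advance the watermark only after a clean export
```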



Re: Geographical redundancy

From
Ben
Date:
On Sat, 30 Dec 2006, Dennis wrote:


Well, first off, you can just rsync your archived WAL files. That may be
easier than playing with pg_dump:

http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html

But second, and more important given your data-loss desires, if you do it
this way you have a window where you can experience data loss.
Specifically, after a transaction is committed, that commit will be at
risk until the next transfer has completed.
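The moving parts of that WAL-shipping setup are small; a sketch with hypothetical hostnames and paths (archive_command, %p/%f, and the backup functions are the stock 8.x interfaces):

```shell
# On the primary, in postgresql.conf -- each completed WAL segment is
# handed to this command; %p is the segment's path, %f its file name:
#   archive_command = 'rsync %p standby.example.com:/pg_archive/%f'

# One-time base backup the standby will restore before replaying WAL:
psql -c "SELECT pg_start_backup('base')"
rsync -a /var/lib/postgresql/data/ standby.example.com:/pg_base/
psql -c "SELECT pg_stop_backup()"
```

The standby then recovers with a restore_command pointing at /pg_archive. The at-risk window is whatever WAL has not been archived yet, up to one 16 MB segment by default; archive_timeout can bound it in time.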

Re: Geographical redundancy

From
Ben
Date:
OK, well, accepting data loss (even if it is "just" 6-12 hours' worth)
really opens up a lot of possibilities... EXCEPT that you also said you
want both sites to be able to modify data. Again, there is no real
multi-master replication available for postgres, so you'll have to have
both sites at least write to the same database server.

On Tue, 2 Jan 2007, Dennis wrote:

> Well, I am mainly concerned with catastrophic failure. If 1st (main)
> datacenter fails majorly (say fire, earthquake, db server dies etc), I
> need to be able to restore websites/data quickly in another location. If
> I get a data loss of say 6-12 hours during a major failure (which should
> never occur), I am ok with that.

Re: Geographical redundancy

From
Markus Schiltknecht
Date:
Hello Dennis,

Dennis wrote:
> Is there any feasible way to achieve geographical redundancy of postgresql database?

Since nobody has mentioned it yet: please check the very nice
documentation about High Availability and Failover here:

http://www.postgresql.org/docs/8.2/static/high-availability.html

It explains most options and their pros and cons.

Regards

Markus


Re: Geographical redundancy

From
Dennis
Date: January 2, 2007
Well, I am mainly concerned with catastrophic failure. If the 1st (main) datacenter fails majorly (say fire, earthquake, the DB server dies, etc.), I need to be able to restore websites/data quickly in another location. If I get a data loss of say 6-12 hours during a major failure (which should never occur), I am OK with that.
