Thread: replication choices

replication choices

From

Ben

Date:

25 January 2007, 16:18:00

Hi guys. I've inherited a system that I'm looking to add replication to.
It already has some custom replication code, but it's being nice to say
that code less than good. I'm hoping there's an existing project out there
that will work much better. Unfortunately, I'm not seeing anything that
obviously fits my needs, so maybe somebody here can suggest something.

I've got a single cluster in the datacenter and dozens of remote sites.
Many of these sites are on unreliable connections to the internet, and
while they're online more often then not, when their network will go down
isn't known, and even when it's up, the network isn't that fast.

A vast majority of activity occurs at these remote sites, with very little
at the datacenter cluster. That said, the datacenter cluster needs to keep
pretty good copies of most (but not all) of the data at each site.
Obviously the network unrealiability puts a limit on how up to date the
datacenter can be, but loosing data is considered Bad. So, for instance,
restoring the daily backup of each site at the datacenter is too
infrequent.

Each site will replicate to its own schema in the datacenter cluster, so I
don't *think* I need a multi-master solution.... but at the same time,
because data will be coming from multiple sites, simply replaying WAL
files at the datacenter won't work.

In addition, there will be some data changes made at the datacenter that
will need to replicate to all of the remote sites as soon as they're
online. It's ok if data being replicated from the datacenter ends up in a
different schema at the remote sites than the schema which holds the data
that will be replicated back to the datacenter.

My current best guess of what to do is create a global schema at every
database, a local schema at each site, and a schema for each site at the
datacenter. Then I can use Slony to replicate the global schema from the
datacenter to each site, and again use Slony to replicate the local schema
from each site to that site's schema in the datacenter. But I'm not too
familiar with Slony, and from what I understand, using Slony with bad
networks leads to bad problems. I'm also not sure that Slony supports
replicating from multiple sources to the same postgres install, even if
each replication process is writing to a different schema.

Are there any better options? Or is my Slony idea not so bad?

Re: replication choices

From

Bruce Momjian

Date:

25 January 2007, 16:19:38

Have you read the 8.2 documentation about this:

    http://www.postgresql.org/docs/8.2/static/high-availability.html

---------------------------------------------------------------------------

Ben wrote:
> Hi guys. I've inherited a system that I'm looking to add replication to.
> It already has some custom replication code, but it's being nice to say
> that code less than good. I'm hoping there's an existing project out there
> that will work much better. Unfortunately, I'm not seeing anything that
> obviously fits my needs, so maybe somebody here can suggest something.
>
> I've got a single cluster in the datacenter and dozens of remote sites.
> Many of these sites are on unreliable connections to the internet, and
> while they're online more often then not, when their network will go down
> isn't known, and even when it's up, the network isn't that fast.
>
> A vast majority of activity occurs at these remote sites, with very little
> at the datacenter cluster. That said, the datacenter cluster needs to keep
> pretty good copies of most (but not all) of the data at each site.
> Obviously the network unrealiability puts a limit on how up to date the
> datacenter can be, but loosing data is considered Bad. So, for instance,
> restoring the daily backup of each site at the datacenter is too
> infrequent.
>
> Each site will replicate to its own schema in the datacenter cluster, so I
> don't *think* I need a multi-master solution.... but at the same time,
> because data will be coming from multiple sites, simply replaying WAL
> files at the datacenter won't work.
>
> In addition, there will be some data changes made at the datacenter that
> will need to replicate to all of the remote sites as soon as they're
> online. It's ok if data being replicated from the datacenter ends up in a
> different schema at the remote sites than the schema which holds the data
> that will be replicated back to the datacenter.
>
> My current best guess of what to do is create a global schema at every
> database, a local schema at each site, and a schema for each site at the
> datacenter. Then I can use Slony to replicate the global schema from the
> datacenter to each site, and again use Slony to replicate the local schema
> from each site to that site's schema in the datacenter. But I'm not too
> familiar with Slony, and from what I understand, using Slony with bad
> networks leads to bad problems. I'm also not sure that Slony supports
> replicating from multiple sources to the same postgres install, even if
> each replication process is writing to a different schema.
>
> Are there any better options? Or is my Slony idea not so bad?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: replication choices

From

Ben

Date:

25 January 2007, 17:42:36

Yes, but unless I'm missing something, it doesn't look like any of those
options perfectly fit my situation, except perhaps Slony, which is why I'm
leaning that direction now despite my concerns.

Is there a section of this page I should be re-reading?

On Thu, 25 Jan 2007, Bruce Momjian wrote:

>
> Have you read the 8.2 documentation about this:
>
>     http://www.postgresql.org/docs/8.2/static/high-availability.html
>
> ---------------------------------------------------------------------------
>
> Ben wrote:
>> Hi guys. I've inherited a system that I'm looking to add replication to.
>> It already has some custom replication code, but it's being nice to say
>> that code less than good. I'm hoping there's an existing project out there
>> that will work much better. Unfortunately, I'm not seeing anything that
>> obviously fits my needs, so maybe somebody here can suggest something.
>>
>> I've got a single cluster in the datacenter and dozens of remote sites.
>> Many of these sites are on unreliable connections to the internet, and
>> while they're online more often then not, when their network will go down
>> isn't known, and even when it's up, the network isn't that fast.
>>
>> A vast majority of activity occurs at these remote sites, with very little
>> at the datacenter cluster. That said, the datacenter cluster needs to keep
>> pretty good copies of most (but not all) of the data at each site.
>> Obviously the network unrealiability puts a limit on how up to date the
>> datacenter can be, but loosing data is considered Bad. So, for instance,
>> restoring the daily backup of each site at the datacenter is too
>> infrequent.
>>
>> Each site will replicate to its own schema in the datacenter cluster, so I
>> don't *think* I need a multi-master solution.... but at the same time,
>> because data will be coming from multiple sites, simply replaying WAL
>> files at the datacenter won't work.
>>
>> In addition, there will be some data changes made at the datacenter that
>> will need to replicate to all of the remote sites as soon as they're
>> online. It's ok if data being replicated from the datacenter ends up in a
>> different schema at the remote sites than the schema which holds the data
>> that will be replicated back to the datacenter.
>>
>> My current best guess of what to do is create a global schema at every
>> database, a local schema at each site, and a schema for each site at the
>> datacenter. Then I can use Slony to replicate the global schema from the
>> datacenter to each site, and again use Slony to replicate the local schema
>> from each site to that site's schema in the datacenter. But I'm not too
>> familiar with Slony, and from what I understand, using Slony with bad
>> networks leads to bad problems. I'm also not sure that Slony supports
>> replicating from multiple sources to the same postgres install, even if
>> each replication process is writing to a different schema.
>>
>> Are there any better options? Or is my Slony idea not so bad?
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 6: explain analyze is your friend
>
> --
>  Bruce Momjian   bruce@momjian.us
>  EnterpriseDB    http://www.enterprisedb.com
>
>  + If your life is a hard drive, Christ can be your backup. +
>

Re: replication choices

From

"Lenorovitz, Joel"

Date:

31 January 2007, 17:54:12

I believe I have a similar situation involving multiple database
instances acting quasi-independently on a common (at least conceptually)
set of data.  Since each instance can effectively operate independently,
I am uncertain if the term replication is accurate, but here is my
strategy to keep the data properly synchronized.  It is still unproven
so any advice or scrutiny that can be given is welcome.

Situation:
There are multiple sites at which the same database/front-end
application is running.  None of the sites are directly connected to one
another over a network and the only communication between sites is
effectively unidirectional to a central location (i.e., information can
independently go both ways during a communications link, but it's not
real-time duplex).  Each of the sites allows authorized users to perform
any type of change to the data.

Solution:
Each site has 3 different versions of what I call the base schema:
Confirmed, Pending, and Update.  Each of these versions has some special
columns associated with it to capture other information about changes
that are made to it (e.g. timestamp, action(insert,update,delete), and
status).  The central site (which I'm loathe to call 'master') has these
same schemas, plus it has an additional Update schema for each other
site in the system.

During normal use at each non-central site, the Pending schema is the
active schema  from which data is queried and also added, modified, and
deleted.  Each time a record is changed in the Pending schema it's
status is flagged as 'pending' and the new data is copied to the Update
schema.  Also copied to the Update schema is the old data from the
record that was changed.  This effectively makes the Update schema a log
of what each record in the database was changed to, what it was changed
from, and when that happened (in UTC).  The data from the update schema
is then dumped regularly to a flat file.

When any remote site establishes a communications link with the central
site, the flat files of the Update schema from each site are exchanged
and the official synchronization time is taken to be that of the flat
file that was updated least recently (i.e., the older file).  Then, at
each site the data from the flat file is uploaded to the local Updates
schema.  All of the records in the now more populous Update schema are
then processed sequentially by timestamp and applied to the Confirmed
schema so, in theory, the same changes should be simultaneously getting
applied to the Confirmed schemas at both locations in question.
Finally, each record in the Pending schema is set to the value contained
in the Confirmed schema and the flag set back to 'confirmed', the two
sites are considered synchronized, and then the whole process starts
anew.

There are some details that have been glossed over here to eschew
obfuscation, and the actual situation at the central site is more
complex than this in practice, but that is the gist of the approach.  I
do not know of any product, Slony included, that has built in support
for a situation such as this, so I suspect all of the details will have
to be handled in a custom fashion.

Anyhow, Ben, this is my working solution and, from the sounds of it,
yours is the only case I have heard of that has the same set of
challenges.  I am interested in hearing if these ideas will work for you
and/or if anyone knows of any flaws in this methodology or a
better/easier/more reliable means of accomplishing this task.  I should
point out that, in our environment of understandably limited
connectivity, we are definitely more tolerant of the delayed performance
this synchronization strategy will cause than most users/companies would
be.  The important thing for us is that the data integrity is maintained
and that everyone at each site can access and change the data.

Regards,
Joel

-----Original Message-----
From: Ben [mailto:bench@silentmedia.com]
Sent: Thursday, January 25, 2007 1:18 PM
To: pgsql-general@postgresql.org
Subject: replication choices

Hi guys. I've inherited a system that I'm looking to add replication to.

It already has some custom replication code, but it's being nice to say
that code less than good. I'm hoping there's an existing project out
there that will work much better. Unfortunately, I'm not seeing anything
that obviously fits my needs, so maybe somebody here can suggest
something.

I've got a single cluster in the datacenter and dozens of remote sites.
Many of these sites are on unreliable connections to the internet, and
while they're online more often then not, when their network will go
down isn't known, and even when it's up, the network isn't that fast.

A vast majority of activity occurs at these remote sites, with very
little at the datacenter cluster. That said, the datacenter cluster
needs to keep pretty good copies of most (but not all) of the data at
each site.
Obviously the network unrealiability puts a limit on how up to date the
datacenter can be, but loosing data is considered Bad. So, for instance,
restoring the daily backup of each site at the datacenter is too
infrequent.

Each site will replicate to its own schema in the datacenter cluster, so
I don't *think* I need a multi-master solution.... but at the same time,
because data will be coming from multiple sites, simply replaying WAL
files at the datacenter won't work.

In addition, there will be some data changes made at the datacenter that
will need to replicate to all of the remote sites as soon as they're
online. It's ok if data being replicated from the datacenter ends up in
a different schema at the remote sites than the schema which holds the
data that will be replicated back to the datacenter.

My current best guess of what to do is create a global schema at every
database, a local schema at each site, and a schema for each site at the
datacenter. Then I can use Slony to replicate the global schema from the
datacenter to each site, and again use Slony to replicate the local
schema from each site to that site's schema in the datacenter. But I'm
not too familiar with Slony, and from what I understand, using Slony
with bad networks leads to bad problems. I'm also not sure that Slony
supports replicating from multiple sources to the same postgres install,
even if each replication process is writing to a different schema.

Are there any better options? Or is my Slony idea not so bad?

Re: replication choices

From

Guido Neitzer

Date:

31 January 2007, 18:15:20

On 31.01.2007, at 14:53, Lenorovitz, Joel wrote:

> I do not know of any product, Slony included, that has built in
> support
> for a situation such as this, so I suspect all of the details will
> have
> to be handled in a custom fashion.

It is not relevant for you as your are using PostgreSQL (for good
reasons I believe - I use it too, but not exclusively), but from what
I have heard, the guys at FrontBase  [1] have developed a solution
for such scenarios based on their own database engine. If you are
especially looking for such a product and if you are not bound to a
special db engine, you might want to talk to them.

The information about that is, as far as I know, not on their
website, so you might ask the customer support for that.

To make that clear: I have nothing to do with FrontBase and it
clearly has disadvantages and advantages compared to PostgreSQL. I'm
just a user, as I am for PostgreSQL. I use both of them (more often
PostgreSQL) and I always try to use the one that fits the situation
best. That's all and that's why I'm giving this information.

cug

[1] http://www.frontbase.com

Re: replication choices

From

Ben

Date:

31 January 2007, 19:17:50

Well, it sounds like your situation is more difficult than mine, in that
it looks like you might have to deal with the possibility of conflicting
changes. Fortuantely, I don't have that issue, because even though I have
data flowing both ways between my "master" and my sites, it only flows one
way for any given table.

But unlike you (?) I have the additional problem that some data changes at
the central site need to be replicated to multiple sites.

After looking into it more, it appears that Slony log shipping just may
work for me.... if only it weren't for the problem that to ship logs from
the remote sites back to the central site, each remote site needs to have
a normal slony node first, which I don't have the hardware for.

Short of conflict resolution, I don't see any big problems with your
scheme.

On Wed, 31 Jan 2007, Lenorovitz, Joel wrote:

> I believe I have a similar situation involving multiple database
> instances acting quasi-independently on a common (at least conceptually)
> set of data.  Since each instance can effectively operate independently,
> I am uncertain if the term replication is accurate, but here is my
> strategy to keep the data properly synchronized.  It is still unproven
> so any advice or scrutiny that can be given is welcome.
>
> Situation:
> There are multiple sites at which the same database/front-end
> application is running.  None of the sites are directly connected to one
> another over a network and the only communication between sites is
> effectively unidirectional to a central location (i.e., information can
> independently go both ways during a communications link, but it's not
> real-time duplex).  Each of the sites allows authorized users to perform
> any type of change to the data.
>
> Solution:
> Each site has 3 different versions of what I call the base schema:
> Confirmed, Pending, and Update.  Each of these versions has some special
> columns associated with it to capture other information about changes
> that are made to it (e.g. timestamp, action(insert,update,delete), and
> status).  The central site (which I'm loathe to call 'master') has these
> same schemas, plus it has an additional Update schema for each other
> site in the system.
>
> During normal use at each non-central site, the Pending schema is the
> active schema  from which data is queried and also added, modified, and
> deleted.  Each time a record is changed in the Pending schema it's
> status is flagged as 'pending' and the new data is copied to the Update
> schema.  Also copied to the Update schema is the old data from the
> record that was changed.  This effectively makes the Update schema a log
> of what each record in the database was changed to, what it was changed
> from, and when that happened (in UTC).  The data from the update schema
> is then dumped regularly to a flat file.
>
> When any remote site establishes a communications link with the central
> site, the flat files of the Update schema from each site are exchanged
> and the official synchronization time is taken to be that of the flat
> file that was updated least recently (i.e., the older file).  Then, at
> each site the data from the flat file is uploaded to the local Updates
> schema.  All of the records in the now more populous Update schema are
> then processed sequentially by timestamp and applied to the Confirmed
> schema so, in theory, the same changes should be simultaneously getting
> applied to the Confirmed schemas at both locations in question.
> Finally, each record in the Pending schema is set to the value contained
> in the Confirmed schema and the flag set back to 'confirmed', the two
> sites are considered synchronized, and then the whole process starts
> anew.
>
> There are some details that have been glossed over here to eschew
> obfuscation, and the actual situation at the central site is more
> complex than this in practice, but that is the gist of the approach.  I
> do not know of any product, Slony included, that has built in support
> for a situation such as this, so I suspect all of the details will have
> to be handled in a custom fashion.
>
> Anyhow, Ben, this is my working solution and, from the sounds of it,
> yours is the only case I have heard of that has the same set of
> challenges.  I am interested in hearing if these ideas will work for you
> and/or if anyone knows of any flaws in this methodology or a
> better/easier/more reliable means of accomplishing this task.  I should
> point out that, in our environment of understandably limited
> connectivity, we are definitely more tolerant of the delayed performance
> this synchronization strategy will cause than most users/companies would
> be.  The important thing for us is that the data integrity is maintained
> and that everyone at each site can access and change the data.
>
> Regards,
> Joel
>
> -----Original Message-----
> From: Ben [mailto:bench@silentmedia.com]
> Sent: Thursday, January 25, 2007 1:18 PM
> To: pgsql-general@postgresql.org
> Subject: replication choices
>
> Hi guys. I've inherited a system that I'm looking to add replication to.
>
> It already has some custom replication code, but it's being nice to say
> that code less than good. I'm hoping there's an existing project out
> there that will work much better. Unfortunately, I'm not seeing anything
> that obviously fits my needs, so maybe somebody here can suggest
> something.
>
> I've got a single cluster in the datacenter and dozens of remote sites.
> Many of these sites are on unreliable connections to the internet, and
> while they're online more often then not, when their network will go
> down isn't known, and even when it's up, the network isn't that fast.
>
> A vast majority of activity occurs at these remote sites, with very
> little at the datacenter cluster. That said, the datacenter cluster
> needs to keep pretty good copies of most (but not all) of the data at
> each site.
> Obviously the network unrealiability puts a limit on how up to date the
> datacenter can be, but loosing data is considered Bad. So, for instance,
> restoring the daily backup of each site at the datacenter is too
> infrequent.
>
> Each site will replicate to its own schema in the datacenter cluster, so
> I don't *think* I need a multi-master solution.... but at the same time,
> because data will be coming from multiple sites, simply replaying WAL
> files at the datacenter won't work.
>
> In addition, there will be some data changes made at the datacenter that
> will need to replicate to all of the remote sites as soon as they're
> online. It's ok if data being replicated from the datacenter ends up in
> a different schema at the remote sites than the schema which holds the
> data that will be replicated back to the datacenter.
>
> My current best guess of what to do is create a global schema at every
> database, a local schema at each site, and a schema for each site at the
> datacenter. Then I can use Slony to replicate the global schema from the
> datacenter to each site, and again use Slony to replicate the local
> schema from each site to that site's schema in the datacenter. But I'm
> not too familiar with Slony, and from what I understand, using Slony
> with bad networks leads to bad problems. I'm also not sure that Slony
> supports replicating from multiple sources to the same postgres install,
> even if each replication process is writing to a different schema.
>
> Are there any better options? Or is my Slony idea not so bad?
>

Re: replication choices

From

Andrew Sullivan

Date:

06 February 2007, 12:28:57

On Thu, Jan 25, 2007 at 12:17:52PM -0800, Ben wrote:
> familiar with Slony, and from what I understand, using Slony with bad
> networks leads to bad problems. I'm also not sure that Slony supports
> replicating from multiple sources to the same postgres install, even if
> each replication process is writing to a different schema.

Yes, you can have multiple origins into the same database, without a
problem.  I'd be worried for sure about the network unreliability,
though.  You might, however, be able to do this usefully using the
log shipping features of Slony.

I would _not_ worry about the outbound replication from the centre,
assuming that the changes are infrequent.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
                --Brad Holland

Re: replication choices

From

Andrew Sullivan

Date:

06 February 2007, 12:43:53

On Wed, Jan 31, 2007 at 03:17:40PM -0800, Ben wrote:
> the remote sites back to the central site, each remote site needs to have
> a normal slony node first, which I don't have the hardware for.

An answer for this, though a dirty kludge, is to replicate to another
database in the same cluster.  This is really no more load than the
single replication user, although it is expensive at the disk level.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.
        --Philip Greenspun

Re: replication choices

From

Ben

Date:

06 February 2007, 15:51:03

Yeah, log shipping looks like it solves the network problem, except for
the part about how how I must replicate to a normal slony node before I
can get logs to ship. We don't have the hardware to have a secondary
database at every site. :(

On Tue, 6 Feb 2007, Andrew Sullivan wrote:

> On Thu, Jan 25, 2007 at 12:17:52PM -0800, Ben wrote:
>> familiar with Slony, and from what I understand, using Slony with bad
>> networks leads to bad problems. I'm also not sure that Slony supports
>> replicating from multiple sources to the same postgres install, even if
>> each replication process is writing to a different schema.
>
> Yes, you can have multiple origins into the same database, without a
> problem.  I'd be worried for sure about the network unreliability,
> though.  You might, however, be able to do this usefully using the
> log shipping features of Slony.
>
> I would _not_ worry about the outbound replication from the centre,
> assuming that the changes are infrequent.
>
> A
>
> --
> Andrew Sullivan  | ajs@crankycanuck.ca
> In the future this spectacle of the middle classes shocking the avant-
> garde will probably become the textbook definition of Postmodernism.
>                --Brad Holland
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>               http://archives.postgresql.org/
>

Re: replication choices

From

Ben

Date:

06 February 2007, 20:24:51

Just to be clear, this effectively means I double my database writes,
correct?

On Tue, 6 Feb 2007, Andrew Sullivan wrote:

> On Wed, Jan 31, 2007 at 03:17:40PM -0800, Ben wrote:
>> the remote sites back to the central site, each remote site needs to have
>> a normal slony node first, which I don't have the hardware for.
>
> An answer for this, though a dirty kludge, is to replicate to another
> database in the same cluster.  This is really no more load than the
> single replication user, although it is expensive at the disk level.
>
> A
>
> --
> Andrew Sullivan  | ajs@crankycanuck.ca
> The fact that technology doesn't work is no bar to success in the marketplace.
>         --Philip Greenspun
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>               http://archives.postgresql.org/
>