Thread: replication choices
Hi guys. I've inherited a system that I'm looking to add replication to. It already has some custom replication code, but it's being nice to say that code less than good. I'm hoping there's an existing project out there that will work much better. Unfortunately, I'm not seeing anything that obviously fits my needs, so maybe somebody here can suggest something. I've got a single cluster in the datacenter and dozens of remote sites. Many of these sites are on unreliable connections to the internet, and while they're online more often then not, when their network will go down isn't known, and even when it's up, the network isn't that fast. A vast majority of activity occurs at these remote sites, with very little at the datacenter cluster. That said, the datacenter cluster needs to keep pretty good copies of most (but not all) of the data at each site. Obviously the network unrealiability puts a limit on how up to date the datacenter can be, but loosing data is considered Bad. So, for instance, restoring the daily backup of each site at the datacenter is too infrequent. Each site will replicate to its own schema in the datacenter cluster, so I don't *think* I need a multi-master solution.... but at the same time, because data will be coming from multiple sites, simply replaying WAL files at the datacenter won't work. In addition, there will be some data changes made at the datacenter that will need to replicate to all of the remote sites as soon as they're online. It's ok if data being replicated from the datacenter ends up in a different schema at the remote sites than the schema which holds the data that will be replicated back to the datacenter. My current best guess of what to do is create a global schema at every database, a local schema at each site, and a schema for each site at the datacenter. Then I can use Slony to replicate the global schema from the datacenter to each site, and again use Slony to replicate the local schema from each site to that site's schema in the datacenter. But I'm not too familiar with Slony, and from what I understand, using Slony with bad networks leads to bad problems. I'm also not sure that Slony supports replicating from multiple sources to the same postgres install, even if each replication process is writing to a different schema. Are there any better options? Or is my Slony idea not so bad?
Have you read the 8.2 documentation about this: http://www.postgresql.org/docs/8.2/static/high-availability.html --------------------------------------------------------------------------- Ben wrote: > Hi guys. I've inherited a system that I'm looking to add replication to. > It already has some custom replication code, but it's being nice to say > that code less than good. I'm hoping there's an existing project out there > that will work much better. Unfortunately, I'm not seeing anything that > obviously fits my needs, so maybe somebody here can suggest something. > > I've got a single cluster in the datacenter and dozens of remote sites. > Many of these sites are on unreliable connections to the internet, and > while they're online more often then not, when their network will go down > isn't known, and even when it's up, the network isn't that fast. > > A vast majority of activity occurs at these remote sites, with very little > at the datacenter cluster. That said, the datacenter cluster needs to keep > pretty good copies of most (but not all) of the data at each site. > Obviously the network unrealiability puts a limit on how up to date the > datacenter can be, but loosing data is considered Bad. So, for instance, > restoring the daily backup of each site at the datacenter is too > infrequent. > > Each site will replicate to its own schema in the datacenter cluster, so I > don't *think* I need a multi-master solution.... but at the same time, > because data will be coming from multiple sites, simply replaying WAL > files at the datacenter won't work. > > In addition, there will be some data changes made at the datacenter that > will need to replicate to all of the remote sites as soon as they're > online. It's ok if data being replicated from the datacenter ends up in a > different schema at the remote sites than the schema which holds the data > that will be replicated back to the datacenter. > > My current best guess of what to do is create a global schema at every > database, a local schema at each site, and a schema for each site at the > datacenter. Then I can use Slony to replicate the global schema from the > datacenter to each site, and again use Slony to replicate the local schema > from each site to that site's schema in the datacenter. But I'm not too > familiar with Slony, and from what I understand, using Slony with bad > networks leads to bad problems. I'm also not sure that Slony supports > replicating from multiple sources to the same postgres install, even if > each replication process is writing to a different schema. > > Are there any better options? Or is my Slony idea not so bad? > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Yes, but unless I'm missing something, it doesn't look like any of those options perfectly fit my situation, except perhaps Slony, which is why I'm leaning that direction now despite my concerns. Is there a section of this page I should be re-reading? On Thu, 25 Jan 2007, Bruce Momjian wrote: > > Have you read the 8.2 documentation about this: > > http://www.postgresql.org/docs/8.2/static/high-availability.html > > --------------------------------------------------------------------------- > > Ben wrote: >> Hi guys. I've inherited a system that I'm looking to add replication to. >> It already has some custom replication code, but it's being nice to say >> that code less than good. I'm hoping there's an existing project out there >> that will work much better. Unfortunately, I'm not seeing anything that >> obviously fits my needs, so maybe somebody here can suggest something. >> >> I've got a single cluster in the datacenter and dozens of remote sites. >> Many of these sites are on unreliable connections to the internet, and >> while they're online more often then not, when their network will go down >> isn't known, and even when it's up, the network isn't that fast. >> >> A vast majority of activity occurs at these remote sites, with very little >> at the datacenter cluster. That said, the datacenter cluster needs to keep >> pretty good copies of most (but not all) of the data at each site. >> Obviously the network unrealiability puts a limit on how up to date the >> datacenter can be, but loosing data is considered Bad. So, for instance, >> restoring the daily backup of each site at the datacenter is too >> infrequent. >> >> Each site will replicate to its own schema in the datacenter cluster, so I >> don't *think* I need a multi-master solution.... but at the same time, >> because data will be coming from multiple sites, simply replaying WAL >> files at the datacenter won't work. >> >> In addition, there will be some data changes made at the datacenter that >> will need to replicate to all of the remote sites as soon as they're >> online. It's ok if data being replicated from the datacenter ends up in a >> different schema at the remote sites than the schema which holds the data >> that will be replicated back to the datacenter. >> >> My current best guess of what to do is create a global schema at every >> database, a local schema at each site, and a schema for each site at the >> datacenter. Then I can use Slony to replicate the global schema from the >> datacenter to each site, and again use Slony to replicate the local schema >> from each site to that site's schema in the datacenter. But I'm not too >> familiar with Slony, and from what I understand, using Slony with bad >> networks leads to bad problems. I'm also not sure that Slony supports >> replicating from multiple sources to the same postgres install, even if >> each replication process is writing to a different schema. >> >> Are there any better options? Or is my Slony idea not so bad? >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 6: explain analyze is your friend > > -- > Bruce Momjian bruce@momjian.us > EnterpriseDB http://www.enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + >
I believe I have a similar situation involving multiple database instances acting quasi-independently on a common (at least conceptually) set of data. Since each instance can effectively operate independently, I am uncertain if the term replication is accurate, but here is my strategy to keep the data properly synchronized. It is still unproven so any advice or scrutiny that can be given is welcome. Situation: There are multiple sites at which the same database/front-end application is running. None of the sites are directly connected to one another over a network and the only communication between sites is effectively unidirectional to a central location (i.e., information can independently go both ways during a communications link, but it's not real-time duplex). Each of the sites allows authorized users to perform any type of change to the data. Solution: Each site has 3 different versions of what I call the base schema: Confirmed, Pending, and Update. Each of these versions has some special columns associated with it to capture other information about changes that are made to it (e.g. timestamp, action(insert,update,delete), and status). The central site (which I'm loathe to call 'master') has these same schemas, plus it has an additional Update schema for each other site in the system. During normal use at each non-central site, the Pending schema is the active schema from which data is queried and also added, modified, and deleted. Each time a record is changed in the Pending schema it's status is flagged as 'pending' and the new data is copied to the Update schema. Also copied to the Update schema is the old data from the record that was changed. This effectively makes the Update schema a log of what each record in the database was changed to, what it was changed from, and when that happened (in UTC). The data from the update schema is then dumped regularly to a flat file. When any remote site establishes a communications link with the central site, the flat files of the Update schema from each site are exchanged and the official synchronization time is taken to be that of the flat file that was updated least recently (i.e., the older file). Then, at each site the data from the flat file is uploaded to the local Updates schema. All of the records in the now more populous Update schema are then processed sequentially by timestamp and applied to the Confirmed schema so, in theory, the same changes should be simultaneously getting applied to the Confirmed schemas at both locations in question. Finally, each record in the Pending schema is set to the value contained in the Confirmed schema and the flag set back to 'confirmed', the two sites are considered synchronized, and then the whole process starts anew. There are some details that have been glossed over here to eschew obfuscation, and the actual situation at the central site is more complex than this in practice, but that is the gist of the approach. I do not know of any product, Slony included, that has built in support for a situation such as this, so I suspect all of the details will have to be handled in a custom fashion. Anyhow, Ben, this is my working solution and, from the sounds of it, yours is the only case I have heard of that has the same set of challenges. I am interested in hearing if these ideas will work for you and/or if anyone knows of any flaws in this methodology or a better/easier/more reliable means of accomplishing this task. I should point out that, in our environment of understandably limited connectivity, we are definitely more tolerant of the delayed performance this synchronization strategy will cause than most users/companies would be. The important thing for us is that the data integrity is maintained and that everyone at each site can access and change the data. Regards, Joel -----Original Message----- From: Ben [mailto:bench@silentmedia.com] Sent: Thursday, January 25, 2007 1:18 PM To: pgsql-general@postgresql.org Subject: replication choices Hi guys. I've inherited a system that I'm looking to add replication to. It already has some custom replication code, but it's being nice to say that code less than good. I'm hoping there's an existing project out there that will work much better. Unfortunately, I'm not seeing anything that obviously fits my needs, so maybe somebody here can suggest something. I've got a single cluster in the datacenter and dozens of remote sites. Many of these sites are on unreliable connections to the internet, and while they're online more often then not, when their network will go down isn't known, and even when it's up, the network isn't that fast. A vast majority of activity occurs at these remote sites, with very little at the datacenter cluster. That said, the datacenter cluster needs to keep pretty good copies of most (but not all) of the data at each site. Obviously the network unrealiability puts a limit on how up to date the datacenter can be, but loosing data is considered Bad. So, for instance, restoring the daily backup of each site at the datacenter is too infrequent. Each site will replicate to its own schema in the datacenter cluster, so I don't *think* I need a multi-master solution.... but at the same time, because data will be coming from multiple sites, simply replaying WAL files at the datacenter won't work. In addition, there will be some data changes made at the datacenter that will need to replicate to all of the remote sites as soon as they're online. It's ok if data being replicated from the datacenter ends up in a different schema at the remote sites than the schema which holds the data that will be replicated back to the datacenter. My current best guess of what to do is create a global schema at every database, a local schema at each site, and a schema for each site at the datacenter. Then I can use Slony to replicate the global schema from the datacenter to each site, and again use Slony to replicate the local schema from each site to that site's schema in the datacenter. But I'm not too familiar with Slony, and from what I understand, using Slony with bad networks leads to bad problems. I'm also not sure that Slony supports replicating from multiple sources to the same postgres install, even if each replication process is writing to a different schema. Are there any better options? Or is my Slony idea not so bad?
On 31.01.2007, at 14:53, Lenorovitz, Joel wrote: > I do not know of any product, Slony included, that has built in > support > for a situation such as this, so I suspect all of the details will > have > to be handled in a custom fashion. It is not relevant for you as your are using PostgreSQL (for good reasons I believe - I use it too, but not exclusively), but from what I have heard, the guys at FrontBase [1] have developed a solution for such scenarios based on their own database engine. If you are especially looking for such a product and if you are not bound to a special db engine, you might want to talk to them. The information about that is, as far as I know, not on their website, so you might ask the customer support for that. To make that clear: I have nothing to do with FrontBase and it clearly has disadvantages and advantages compared to PostgreSQL. I'm just a user, as I am for PostgreSQL. I use both of them (more often PostgreSQL) and I always try to use the one that fits the situation best. That's all and that's why I'm giving this information. cug [1] http://www.frontbase.com
Well, it sounds like your situation is more difficult than mine, in that it looks like you might have to deal with the possibility of conflicting changes. Fortuantely, I don't have that issue, because even though I have data flowing both ways between my "master" and my sites, it only flows one way for any given table. But unlike you (?) I have the additional problem that some data changes at the central site need to be replicated to multiple sites. After looking into it more, it appears that Slony log shipping just may work for me.... if only it weren't for the problem that to ship logs from the remote sites back to the central site, each remote site needs to have a normal slony node first, which I don't have the hardware for. Short of conflict resolution, I don't see any big problems with your scheme. On Wed, 31 Jan 2007, Lenorovitz, Joel wrote: > I believe I have a similar situation involving multiple database > instances acting quasi-independently on a common (at least conceptually) > set of data. Since each instance can effectively operate independently, > I am uncertain if the term replication is accurate, but here is my > strategy to keep the data properly synchronized. It is still unproven > so any advice or scrutiny that can be given is welcome. > > Situation: > There are multiple sites at which the same database/front-end > application is running. None of the sites are directly connected to one > another over a network and the only communication between sites is > effectively unidirectional to a central location (i.e., information can > independently go both ways during a communications link, but it's not > real-time duplex). Each of the sites allows authorized users to perform > any type of change to the data. > > Solution: > Each site has 3 different versions of what I call the base schema: > Confirmed, Pending, and Update. Each of these versions has some special > columns associated with it to capture other information about changes > that are made to it (e.g. timestamp, action(insert,update,delete), and > status). The central site (which I'm loathe to call 'master') has these > same schemas, plus it has an additional Update schema for each other > site in the system. > > During normal use at each non-central site, the Pending schema is the > active schema from which data is queried and also added, modified, and > deleted. Each time a record is changed in the Pending schema it's > status is flagged as 'pending' and the new data is copied to the Update > schema. Also copied to the Update schema is the old data from the > record that was changed. This effectively makes the Update schema a log > of what each record in the database was changed to, what it was changed > from, and when that happened (in UTC). The data from the update schema > is then dumped regularly to a flat file. > > When any remote site establishes a communications link with the central > site, the flat files of the Update schema from each site are exchanged > and the official synchronization time is taken to be that of the flat > file that was updated least recently (i.e., the older file). Then, at > each site the data from the flat file is uploaded to the local Updates > schema. All of the records in the now more populous Update schema are > then processed sequentially by timestamp and applied to the Confirmed > schema so, in theory, the same changes should be simultaneously getting > applied to the Confirmed schemas at both locations in question. > Finally, each record in the Pending schema is set to the value contained > in the Confirmed schema and the flag set back to 'confirmed', the two > sites are considered synchronized, and then the whole process starts > anew. > > There are some details that have been glossed over here to eschew > obfuscation, and the actual situation at the central site is more > complex than this in practice, but that is the gist of the approach. I > do not know of any product, Slony included, that has built in support > for a situation such as this, so I suspect all of the details will have > to be handled in a custom fashion. > > Anyhow, Ben, this is my working solution and, from the sounds of it, > yours is the only case I have heard of that has the same set of > challenges. I am interested in hearing if these ideas will work for you > and/or if anyone knows of any flaws in this methodology or a > better/easier/more reliable means of accomplishing this task. I should > point out that, in our environment of understandably limited > connectivity, we are definitely more tolerant of the delayed performance > this synchronization strategy will cause than most users/companies would > be. The important thing for us is that the data integrity is maintained > and that everyone at each site can access and change the data. > > Regards, > Joel > > -----Original Message----- > From: Ben [mailto:bench@silentmedia.com] > Sent: Thursday, January 25, 2007 1:18 PM > To: pgsql-general@postgresql.org > Subject: replication choices > > Hi guys. I've inherited a system that I'm looking to add replication to. > > It already has some custom replication code, but it's being nice to say > that code less than good. I'm hoping there's an existing project out > there that will work much better. Unfortunately, I'm not seeing anything > that obviously fits my needs, so maybe somebody here can suggest > something. > > I've got a single cluster in the datacenter and dozens of remote sites. > Many of these sites are on unreliable connections to the internet, and > while they're online more often then not, when their network will go > down isn't known, and even when it's up, the network isn't that fast. > > A vast majority of activity occurs at these remote sites, with very > little at the datacenter cluster. That said, the datacenter cluster > needs to keep pretty good copies of most (but not all) of the data at > each site. > Obviously the network unrealiability puts a limit on how up to date the > datacenter can be, but loosing data is considered Bad. So, for instance, > restoring the daily backup of each site at the datacenter is too > infrequent. > > Each site will replicate to its own schema in the datacenter cluster, so > I don't *think* I need a multi-master solution.... but at the same time, > because data will be coming from multiple sites, simply replaying WAL > files at the datacenter won't work. > > In addition, there will be some data changes made at the datacenter that > will need to replicate to all of the remote sites as soon as they're > online. It's ok if data being replicated from the datacenter ends up in > a different schema at the remote sites than the schema which holds the > data that will be replicated back to the datacenter. > > My current best guess of what to do is create a global schema at every > database, a local schema at each site, and a schema for each site at the > datacenter. Then I can use Slony to replicate the global schema from the > datacenter to each site, and again use Slony to replicate the local > schema from each site to that site's schema in the datacenter. But I'm > not too familiar with Slony, and from what I understand, using Slony > with bad networks leads to bad problems. I'm also not sure that Slony > supports replicating from multiple sources to the same postgres install, > even if each replication process is writing to a different schema. > > Are there any better options? Or is my Slony idea not so bad? >
On Thu, Jan 25, 2007 at 12:17:52PM -0800, Ben wrote: > familiar with Slony, and from what I understand, using Slony with bad > networks leads to bad problems. I'm also not sure that Slony supports > replicating from multiple sources to the same postgres install, even if > each replication process is writing to a different schema. Yes, you can have multiple origins into the same database, without a problem. I'd be worried for sure about the network unreliability, though. You might, however, be able to do this usefully using the log shipping features of Slony. I would _not_ worry about the outbound replication from the centre, assuming that the changes are infrequent. A -- Andrew Sullivan | ajs@crankycanuck.ca In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland
On Wed, Jan 31, 2007 at 03:17:40PM -0800, Ben wrote: > the remote sites back to the central site, each remote site needs to have > a normal slony node first, which I don't have the hardware for. An answer for this, though a dirty kludge, is to replicate to another database in the same cluster. This is really no more load than the single replication user, although it is expensive at the disk level. A -- Andrew Sullivan | ajs@crankycanuck.ca The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
Yeah, log shipping looks like it solves the network problem, except for the part about how how I must replicate to a normal slony node before I can get logs to ship. We don't have the hardware to have a secondary database at every site. :( On Tue, 6 Feb 2007, Andrew Sullivan wrote: > On Thu, Jan 25, 2007 at 12:17:52PM -0800, Ben wrote: >> familiar with Slony, and from what I understand, using Slony with bad >> networks leads to bad problems. I'm also not sure that Slony supports >> replicating from multiple sources to the same postgres install, even if >> each replication process is writing to a different schema. > > Yes, you can have multiple origins into the same database, without a > problem. I'd be worried for sure about the network unreliability, > though. You might, however, be able to do this usefully using the > log shipping features of Slony. > > I would _not_ worry about the outbound replication from the centre, > assuming that the changes are infrequent. > > A > > -- > Andrew Sullivan | ajs@crankycanuck.ca > In the future this spectacle of the middle classes shocking the avant- > garde will probably become the textbook definition of Postmodernism. > --Brad Holland > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/ >
Just to be clear, this effectively means I double my database writes, correct? On Tue, 6 Feb 2007, Andrew Sullivan wrote: > On Wed, Jan 31, 2007 at 03:17:40PM -0800, Ben wrote: >> the remote sites back to the central site, each remote site needs to have >> a normal slony node first, which I don't have the hardware for. > > An answer for this, though a dirty kludge, is to replicate to another > database in the same cluster. This is really no more load than the > single replication user, although it is expensive at the disk level. > > A > > -- > Andrew Sullivan | ajs@crankycanuck.ca > The fact that technology doesn't work is no bar to success in the marketplace. > --Philip Greenspun > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org/ >