Thread: The state of PG replication in 2008/Q2?
My company finally has the means to install a new database server for replication. I have Googled and found a lot of sparse information out there regarding replication systems for PostgreSQL and a lot of it looks very out-of-date. Can I please get some ideas from those of you that are currently using fail-over replication systems? What advantage does your solution have? What are the "gotchas" I need to worry about? My desire would be to have a parallel server that could act as a hot standby system with automatic fail over in a multi-master role. If our primary server goes down for whatever reason, the secondary would take over and handle the load seamlessly. I think this is really the "holy grail" scenario and I understand how difficult it is to achieve. Especially since we make frequent use of sequences in our databases. If MM is too difficult, I'm willing to accept a hot-standby read-only system that will handle queries until we can fix whatever ails the master. We are primary an OLAP environment but there is a constant stream of inserts into the databases. There are 47 different databases hosted on the primary server and this number will continue to scale up to whatever the server seems to support. The reason I mention this number is that it seems that those systems that make heavy use of schema changes require a lot of "fiddling". For a single database, this doesn't seem too problematic, but any manual work involved and administrative overhead will scale at the same rate as the database count grows and I certainly want to minimize as much fiddling as possible. We are using 8.3 and the total combined size for the PG data directory is 226G. Hopefully I didn't neglect to include more relevant information. As always, thank you for your insight. -Dan
Hi Dan! Its true, many of the replication options that exists for PostgreSQL have not seen any updates in a while. If you only looking for redundancy and not a performance gain you should look at PostgreSQL PITR (http://www.postgresql.org/docs/8.1/static/backup-online.html ) For Master-Slave replication i think that Slony http://www.slony.info/ is most up to date. But it does not support DDL changes. You may wich to look at pgpool http://pgpool.projects.postgresql.org/ it supports Synchronous replication (wich is good for data integrity, but can be bad for performance). These are some of the open source options. I do not have any experience with the commercial onces. Best regards, Mathias http://www.pastbedti.me/ On 21 aug 2008, at 19.53, Dan Harris wrote: > My company finally has the means to install a new database server > for replication. I have Googled and found a lot of sparse > information out there regarding replication systems for PostgreSQL > and a lot of it looks very out-of-date. Can I please get some ideas > from those of you that are currently using fail-over replication > systems? What advantage does your solution have? What are the > "gotchas" I need to worry about? > > My desire would be to have a parallel server that could act as a hot > standby system with automatic fail over in a multi-master role. If > our primary server goes down for whatever reason, the secondary > would take over and handle the load seamlessly. I think this is > really the "holy grail" scenario and I understand how difficult it > is to achieve. Especially since we make frequent use of sequences > in our databases. If MM is too difficult, I'm willing to accept a > hot-standby read-only system that will handle queries until we can > fix whatever ails the master. > We are primary an OLAP environment but there is a constant stream of > inserts into the databases. There are 47 different databases hosted > on the primary server and this number will continue to scale up to > whatever the server seems to support. The reason I mention this > number is that it seems that those systems that make heavy use of > schema changes require a lot of "fiddling". For a single database, > this doesn't seem too problematic, but any manual work involved and > administrative overhead will scale at the same rate as the database > count grows and I certainly want to minimize as much fiddling as > possible. > > We are using 8.3 and the total combined size for the PG data > directory is 226G. Hopefully I didn't neglect to include more > relevant information. > > As always, thank you for your insight. > > -Dan > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org > ) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Attachment
On Thu, Aug 21, 2008 at 10:53:05PM +0200, Mathias Stjernström wrote: > For Master-Slave replication i think that Slony http://www.slony.info/ is > most up to date. But it does not support DDL changes. This isn't quite true. It supports DDL; it just doesn't support it in the normal way, and is broken by applications doing DDL as part of the regular operation. A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
On Thursday 21 August 2008, Dan Harris <fbsd@drivefaster.net> wrote: > Especially since we make frequent use of sequences in our databases. If > MM is too difficult, I'm willing to accept a hot-standby read-only > system that will handle queries until we can fix whatever ails the > master. A heartbeat+DRBD solution might make more sense than database-level replication to achieve this. -- Alan
Yes thats true. It does support DDL changes but not in a automatic way. You have to execute all DDL changes with a separate script. What's the status of http://www.commandprompt.com/products/mammothreplicator/ ? Best regards, Mathias http://www.pastbedti.me/ On 21 aug 2008, at 23.04, Andrew Sullivan wrote: > On Thu, Aug 21, 2008 at 10:53:05PM +0200, Mathias Stjernström wrote: > >> For Master-Slave replication i think that Slony http://www.slony.info/ >> is >> most up to date. But it does not support DDL changes. > > This isn't quite true. It supports DDL; it just doesn't support it in > the normal way, and is broken by applications doing DDL as part of the > regular operation. > > A > > -- > Andrew Sullivan > ajs@commandprompt.com > +1 503 667 4564 x104 > http://www.commandprompt.com/ > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org > ) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Attachment
On Thu, 21 Aug 2008 23:21:26 +0200 Mathias Stjernström <mathias@globalinn.com> wrote: > Yes thats true. It does support DDL changes but not in a automatic > way. You have to execute all DDL changes with a separate script. > > What's the status of > http://www.commandprompt.com/products/mammothreplicator/ ? > It is about to go open source but it doesn't replicate DDL either. Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Mathias Stjernström wrote: > Yes thats true. It does support DDL changes but not in a automatic way. > You have to execute all DDL changes with a separate script. > That's true, but it's quite simple to do with the provided perl script(s) - slonik_execute_script. I've had to make use of it a few times and have had no problems. -salman
On Thursday 21 August 2008, salman <salmanb@quietcaresystems.com> wrote: > Mathias Stjernström wrote: > > Yes thats true. It does support DDL changes but not in a automatic way. > > You have to execute all DDL changes with a separate script. > > That's true, but it's quite simple to do with the provided perl > script(s) - slonik_execute_script. I've had to make use of it a few > times and have had no problems. I do it almost every day, and it is not all that simple if your configuration is complex. The original poster would require at least 47 different Slony clusters, for starters. The complications from adding and dropping tables and sequences across 47 databases, and trying to keep Slony up to date throughout, staggers the imagination, honestly. -- Alan
Joshua Drake wrote: > On Thu, 21 Aug 2008 23:21:26 +0200 > Mathias Stjernström <mathias@globalinn.com> wrote: > > > Yes thats true. It does support DDL changes but not in a automatic > > way. You have to execute all DDL changes with a separate script. > > > > What's the status of > > http://www.commandprompt.com/products/mammothreplicator/ ? > > It is about to go open source but it doesn't replicate DDL either. It doesn't replicate multiple databases either. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Thu, 21 Aug 2008, Mathias Stjernstr?m wrote: > Hi Dan! > > Its true, many of the replication options that exists for PostgreSQL have not > seen any updates in a while. > > If you only looking for redundancy and not a performance gain you should look > at PostgreSQL PITR > (http://www.postgresql.org/docs/8.1/static/backup-online.html) > > For Master-Slave replication i think that Slony http://www.slony.info/ is > most up to date. But it does not support DDL changes. > > You may wich to look at pgpool http://pgpool.projects.postgresql.org/ it > supports Synchronous replication (wich is good for data integrity, but can be > bad for performance). > > These are some of the open source options. I do not have any experience with > the commercial onces. a couple of months ago there was a lot of news about a WAL based replication engine. one that was closed source, but possibly getting opened shortly, and also the decision by the core devs to add one into the base distro. what's been happening on this front? from my understanding the first versions of this would not support queries of the replica, but would provide for the consistancy needed for reliable failover. David Lang
On Thu, 21 Aug 2008 17:54:11 -0400 Alvaro Herrera <alvherre@commandprompt.com> wrote: > Joshua Drake wrote: > > On Thu, 21 Aug 2008 23:21:26 +0200 > > Mathias Stjernström <mathias@globalinn.com> wrote: > > > > > Yes thats true. It does support DDL changes but not in a > > > automatic way. You have to execute all DDL changes with a > > > separate script. > > > > > > What's the status of > > > http://www.commandprompt.com/products/mammothreplicator/ ? > > > > It is about to go open source but it doesn't replicate DDL either. > > It doesn't replicate multiple databases either. > True Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> My company finally has the means to install a new database server for > replication. I have Googled and found a lot of sparse information out > there regarding replication systems for PostgreSQL and a lot of it > looks very out-of-date. Can I please get some ideas from those of you > that are currently using fail-over replication systems? What > advantage does your solution have? What are the "gotchas" I need to > worry about? > > My desire would be to have a parallel server that could act as a hot > standby system with automatic fail over in a multi-master role. If > our primary server goes down for whatever reason, the secondary would > take over and handle the load seamlessly. I think this is really the > "holy grail" scenario and I understand how difficult it is to > achieve. Especially since we make frequent use of sequences in our > databases. If MM is too difficult, I'm willing to accept a > hot-standby read-only system that will handle queries until we can fix > whatever ails the master. > We are primary an OLAP environment but there is a constant stream of > inserts into the databases. There are 47 different databases hosted > on the primary server and this number will continue to scale up to > whatever the server seems to support. The reason I mention this > number is that it seems that those systems that make heavy use of > schema changes require a lot of "fiddling". For a single database, > this doesn't seem too problematic, but any manual work involved and > administrative overhead will scale at the same rate as the database > count grows and I certainly want to minimize as much fiddling as > possible. > > If you really need "only" need automatic failover than use DRBD + Heartbeat somebody already mentioned. We are using this solution since 3 years now. With DRBD replication is done at filesystem block level. So you don't have to bother about changes done with a DDL statement and Heartbeat will automatically failover if one server goes down. It's really stable. If you want MM you should give Cybercluster a try. (http://www.postgresql.at/english/pr_cybercluster_e.html) They offer good support and is Open Source since a few month now. Robert
Yup, but sometimes you are not in charge of the DDL changes. You may have many different users that make changes or for example if you are working with Ruby On Rails you have something thats called Migrations that handles all DDL changes in those situations it can get really complicated with those slony scripts. My experience is that automatic handling of DDL changes is a very important feature of a replication system of curse not in all systems but in many. I am also very interested in the WAL replication that David Lang asked about. Best regards, Mathias Stjernström http://www.pastbedti.me/ On 21 aug 2008, at 23.26, salman wrote: > > > Mathias Stjernström wrote: >> Yes thats true. It does support DDL changes but not in a automatic >> way. You have to execute all DDL changes with a separate script. > > That's true, but it's quite simple to do with the provided perl > script(s) - slonik_execute_script. I've had to make use of it a few > times and have had no problems. > > -salman
Attachment
I Agree with Robert but i never heard of Cybercluster before. Does anyone have any experience with Cybercluster? It sounds really interesting! Best regards, Mathias Stjernström http://www.pastbedti.me/ On 22 aug 2008, at 08.18, RW wrote: > >> My company finally has the means to install a new database server >> for replication. I have Googled and found a lot of sparse >> information out there regarding replication systems for PostgreSQL >> and a lot of it looks very out-of-date. Can I please get some >> ideas from those of you that are currently using fail-over >> replication systems? What advantage does your solution have? What >> are the "gotchas" I need to worry about? >> >> My desire would be to have a parallel server that could act as a >> hot standby system with automatic fail over in a multi-master >> role. If our primary server goes down for whatever reason, the >> secondary would take over and handle the load seamlessly. I think >> this is really the "holy grail" scenario and I understand how >> difficult it is to achieve. Especially since we make frequent use >> of sequences in our databases. If MM is too difficult, I'm willing >> to accept a hot-standby read-only system that will handle queries >> until we can fix whatever ails the master. >> We are primary an OLAP environment but there is a constant stream >> of inserts into the databases. There are 47 different databases >> hosted on the primary server and this number will continue to scale >> up to whatever the server seems to support. The reason I mention >> this number is that it seems that those systems that make heavy use >> of schema changes require a lot of "fiddling". For a single >> database, this doesn't seem too problematic, but any manual work >> involved and administrative overhead will scale at the same rate as >> the database count grows and I certainly want to minimize as much >> fiddling as possible. >> >> > If you really need "only" need automatic failover than use DRBD + > Heartbeat > somebody already mentioned. We are using this solution since 3 years > now. > With DRBD replication is done at filesystem block level. So you > don't have to > bother about changes done with a DDL statement and Heartbeat will > automatically failover if one server goes down. It's really stable. > > If you want MM you should give Cybercluster a try. (http://www.postgresql.at/english/pr_cybercluster_e.html > ) > They offer good support and is Open Source since a few month now. > > Robert > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org > ) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Attachment
Dan Harris wrote: > My desire would be to have a parallel server that could act as a hot > standby system with automatic fail over in a multi-master role. I will add my "me too" for DRBD + Heartbeat.
Hi Mathias, On Aug 22, 2008, at 8:35 AM, Mathias Stjernström wrote: > I Agree with Robert but i never heard of Cybercluster before. > Does anyone have any experience with Cybercluster? It sounds really > interesting! Some months ago i took a look into cybercluster. At that point cybercluster was basically postgres-source 8.3 patched already with pgcluster sources. Best regards, Jan
Jan Otto wrote: > Hi Mathias, > > On Aug 22, 2008, at 8:35 AM, Mathias Stjernström wrote: > >> I Agree with Robert but i never heard of Cybercluster before. Does >> anyone have any experience with Cybercluster? It sounds really >> interesting! > > Some months ago i took a look into cybercluster. At that point > cybercluster was basically postgres-source 8.3 patched already with > pgcluster sources. > I do believe it is a packaged version of pgcluster Does anyone have experience with bucardo? It's still a recent addition to open source offerings. No DDL replication but it does support two masters. -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz