Thread: The state of PG replication in 2008/Q2?

The state of PG replication in 2008/Q2?

From
Dan Harris
Date:
My company finally has the means to install a new database server for
replication.  I have Googled and found a lot of sparse information out
there regarding replication systems for PostgreSQL and a lot of it looks
very out-of-date.  Can I please get some ideas from those of you that
are currently using fail-over replication systems?  What advantage does
your solution have?  What are the "gotchas" I need to worry about?

My desire would be to have a parallel server that could act as a hot
standby system with automatic fail over in a multi-master role.  If our
primary server goes down for whatever reason, the secondary would take
over and handle the load seamlessly.  I think this is really the "holy
grail" scenario and I understand how difficult it is to achieve.
Especially since we make frequent use of sequences in our databases.  If
MM is too difficult, I'm willing to accept a hot-standby read-only
system that will handle queries until we can fix whatever ails the master.

We are primary an OLAP environment but there is a constant stream of
inserts into the databases.  There are 47 different databases hosted on
the primary server and this number will continue to scale up to whatever
the server seems to support.  The reason I mention this number is that
it seems that those systems that make heavy use of schema changes
require a lot of "fiddling".  For a single database, this doesn't seem
too problematic, but any manual work involved and administrative
overhead will scale at the same rate as the database count grows and I
certainly want to minimize as much fiddling as possible.

We are using 8.3 and the total combined size for the PG data directory
is 226G.  Hopefully I didn't neglect to include more relevant information.

As always, thank you for your insight.

-Dan



Re: The state of PG replication in 2008/Q2?

From
Mathias Stjernström
Date:
Hi Dan!

Its true, many of the replication options that exists for PostgreSQL
have not seen any updates in a while.

If you only looking for redundancy and not a performance gain you
should look at PostgreSQL PITR (http://www.postgresql.org/docs/8.1/static/backup-online.html
)

For Master-Slave replication i think that Slony http://www.slony.info/
is most up to date. But it does not support DDL changes.

You may wich to look at pgpool http://pgpool.projects.postgresql.org/
it supports Synchronous replication (wich is good for data integrity,
but can be bad for performance).

These are some of the open source options. I do not have any
experience with the commercial onces.

Best regards,
Mathias

http://www.pastbedti.me/


On 21 aug 2008, at 19.53, Dan Harris wrote:

> My company finally has the means to install a new database server
> for replication.  I have Googled and found a lot of sparse
> information out there regarding replication systems for PostgreSQL
> and a lot of it looks very out-of-date.  Can I please get some ideas
> from those of you that are currently using fail-over replication
> systems?  What advantage does your solution have?  What are the
> "gotchas" I need to worry about?
>
> My desire would be to have a parallel server that could act as a hot
> standby system with automatic fail over in a multi-master role.  If
> our primary server goes down for whatever reason, the secondary
> would take over and handle the load seamlessly.  I think this is
> really the "holy grail" scenario and I understand how difficult it
> is to achieve.  Especially since we make frequent use of sequences
> in our databases.  If MM is too difficult, I'm willing to accept a
> hot-standby read-only system that will handle queries until we can
> fix whatever ails the master.
> We are primary an OLAP environment but there is a constant stream of
> inserts into the databases.  There are 47 different databases hosted
> on the primary server and this number will continue to scale up to
> whatever the server seems to support.  The reason I mention this
> number is that it seems that those systems that make heavy use of
> schema changes require a lot of "fiddling".  For a single database,
> this doesn't seem too problematic, but any manual work involved and
> administrative overhead will scale at the same rate as the database
> count grows and I certainly want to minimize as much fiddling as
> possible.
>
> We are using 8.3 and the total combined size for the PG data
> directory is 226G.  Hopefully I didn't neglect to include more
> relevant information.
>
> As always, thank you for your insight.
>
> -Dan
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Attachment

Re: The state of PG replication in 2008/Q2?

From
Andrew Sullivan
Date:
On Thu, Aug 21, 2008 at 10:53:05PM +0200, Mathias Stjernström wrote:

> For Master-Slave replication i think that Slony http://www.slony.info/ is
> most up to date. But it does not support DDL changes.

This isn't quite true.  It supports DDL; it just doesn't support it in
the normal way, and is broken by applications doing DDL as part of the
regular operation.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: The state of PG replication in 2008/Q2?

From
Alan Hodgson
Date:
On Thursday 21 August 2008, Dan Harris <fbsd@drivefaster.net> wrote:
> Especially since we make frequent use of sequences in our databases.  If
> MM is too difficult, I'm willing to accept a hot-standby read-only
> system that will handle queries until we can fix whatever ails the
> master.

A heartbeat+DRBD solution might make more sense than database-level
replication to achieve this.

--
Alan

Re: The state of PG replication in 2008/Q2?

From
Mathias Stjernström
Date:
Yes thats true. It does support DDL changes but not in a automatic
way. You have to execute all DDL changes with a separate script.

What's the status of http://www.commandprompt.com/products/mammothreplicator/
  ?

Best regards,
Mathias

http://www.pastbedti.me/


On 21 aug 2008, at 23.04, Andrew Sullivan wrote:

> On Thu, Aug 21, 2008 at 10:53:05PM +0200, Mathias Stjernström wrote:
>
>> For Master-Slave replication i think that Slony http://www.slony.info/
>>  is
>> most up to date. But it does not support DDL changes.
>
> This isn't quite true.  It supports DDL; it just doesn't support it in
> the normal way, and is broken by applications doing DDL as part of the
> regular operation.
>
> A
>
> --
> Andrew Sullivan
> ajs@commandprompt.com
> +1 503 667 4564 x104
> http://www.commandprompt.com/
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Attachment

Re: The state of PG replication in 2008/Q2?

From
Joshua Drake
Date:
On Thu, 21 Aug 2008 23:21:26 +0200
Mathias Stjernström <mathias@globalinn.com> wrote:

> Yes thats true. It does support DDL changes but not in a automatic
> way. You have to execute all DDL changes with a separate script.
>
> What's the status of
> http://www.commandprompt.com/products/mammothreplicator/ ?
>

It is about to go open source but it doesn't replicate DDL either.

Joshua D. Drake
--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



Re: The state of PG replication in 2008/Q2?

From
salman
Date:

Mathias Stjernström wrote:
> Yes thats true. It does support DDL changes but not in a automatic way.
> You have to execute all DDL changes with a separate script.
>

That's true, but it's quite simple to do with the provided perl
script(s) - slonik_execute_script. I've had to make use of it a few
times and have had no problems.

-salman

Re: The state of PG replication in 2008/Q2?

From
Alan Hodgson
Date:
On Thursday 21 August 2008, salman <salmanb@quietcaresystems.com> wrote:
> Mathias Stjernström wrote:
> > Yes thats true. It does support DDL changes but not in a automatic way.
> > You have to execute all DDL changes with a separate script.
>
> That's true, but it's quite simple to do with the provided perl
> script(s) - slonik_execute_script. I've had to make use of it a few
> times and have had no problems.

I do it almost every day, and it is not all that simple if your
configuration is complex. The original poster would require at least 47
different Slony clusters, for starters. The complications from adding and
dropping tables and sequences across 47 databases, and trying to keep Slony
up to date throughout, staggers the imagination, honestly.

--
Alan

Re: The state of PG replication in 2008/Q2?

From
Alvaro Herrera
Date:
Joshua Drake wrote:
> On Thu, 21 Aug 2008 23:21:26 +0200
> Mathias Stjernström <mathias@globalinn.com> wrote:
>
> > Yes thats true. It does support DDL changes but not in a automatic
> > way. You have to execute all DDL changes with a separate script.
> >
> > What's the status of
> > http://www.commandprompt.com/products/mammothreplicator/ ?
>
> It is about to go open source but it doesn't replicate DDL either.

It doesn't replicate multiple databases either.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: The state of PG replication in 2008/Q2?

From
david@lang.hm
Date:
On Thu, 21 Aug 2008, Mathias Stjernstr?m wrote:

> Hi Dan!
>
> Its true, many of the replication options that exists for PostgreSQL have not
> seen any updates in a while.
>
> If you only looking for redundancy and not a performance gain you should look
> at PostgreSQL PITR
> (http://www.postgresql.org/docs/8.1/static/backup-online.html)
>
> For Master-Slave replication i think that Slony http://www.slony.info/ is
> most up to date. But it does not support DDL changes.
>
> You may wich to look at pgpool http://pgpool.projects.postgresql.org/ it
> supports Synchronous replication (wich is good for data integrity, but can be
> bad for performance).
>
> These are some of the open source options. I do not have any experience with
> the commercial onces.

a couple of months ago there was a lot of news about a WAL based
replication engine. one that was closed source, but possibly getting
opened shortly, and also the decision by the core devs to add one into the
base distro.

what's been happening on this front?

from my understanding the first versions of this would not support queries
of the replica, but would provide for the consistancy needed for reliable
failover.

David Lang

Re: The state of PG replication in 2008/Q2?

From
Joshua Drake
Date:
On Thu, 21 Aug 2008 17:54:11 -0400
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> Joshua Drake wrote:
> > On Thu, 21 Aug 2008 23:21:26 +0200
> > Mathias Stjernström <mathias@globalinn.com> wrote:
> >
> > > Yes thats true. It does support DDL changes but not in a
> > > automatic way. You have to execute all DDL changes with a
> > > separate script.
> > >
> > > What's the status of
> > > http://www.commandprompt.com/products/mammothreplicator/ ?
> >
> > It is about to go open source but it doesn't replicate DDL either.
>
> It doesn't replicate multiple databases either.
>

True

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



Re: The state of PG replication in 2008/Q2?

From
RW
Date:
> My company finally has the means to install a new database server for
> replication.  I have Googled and found a lot of sparse information out
> there regarding replication systems for PostgreSQL and a lot of it
> looks very out-of-date.  Can I please get some ideas from those of you
> that are currently using fail-over replication systems?  What
> advantage does your solution have?  What are the "gotchas" I need to
> worry about?
>
> My desire would be to have a parallel server that could act as a hot
> standby system with automatic fail over in a multi-master role.  If
> our primary server goes down for whatever reason, the secondary would
> take over and handle the load seamlessly.  I think this is really the
> "holy grail" scenario and I understand how difficult it is to
> achieve.  Especially since we make frequent use of sequences in our
> databases.  If MM is too difficult, I'm willing to accept a
> hot-standby read-only system that will handle queries until we can fix
> whatever ails the master.
> We are primary an OLAP environment but there is a constant stream of
> inserts into the databases.  There are 47 different databases hosted
> on the primary server and this number will continue to scale up to
> whatever the server seems to support.  The reason I mention this
> number is that it seems that those systems that make heavy use of
> schema changes require a lot of "fiddling".  For a single database,
> this doesn't seem too problematic, but any manual work involved and
> administrative overhead will scale at the same rate as the database
> count grows and I certainly want to minimize as much fiddling as
> possible.
>
>
If you really need "only" need automatic failover than use DRBD + Heartbeat
somebody already mentioned. We are using this solution since 3 years now.
With DRBD replication is done at filesystem block level. So you don't
have to
bother about changes done with a DDL statement and Heartbeat will
automatically failover if one server goes down. It's really stable.

If you want MM you should give Cybercluster a try.
(http://www.postgresql.at/english/pr_cybercluster_e.html)
They offer good support and is Open Source since a few month now.

Robert


Re: The state of PG replication in 2008/Q2?

From
Mathias Stjernström
Date:
Yup, but sometimes you are not in charge of the DDL changes. You may
have many different users that make changes or for example if you are
working with Ruby On Rails you have something thats called Migrations
that handles all DDL changes in those situations it can get really
complicated with those slony scripts. My experience is that automatic
handling of DDL changes is a very important feature of a replication
system of curse not in all systems but in many.

I am also very interested in the WAL replication that David Lang asked
about.

Best regards,
Mathias Stjernström

http://www.pastbedti.me/



On 21 aug 2008, at 23.26, salman wrote:

>
>
> Mathias Stjernström wrote:
>> Yes thats true. It does support DDL changes but not in a automatic
>> way. You have to execute all DDL changes with a separate script.
>
> That's true, but it's quite simple to do with the provided perl
> script(s) - slonik_execute_script. I've had to make use of it a few
> times and have had no problems.
>
> -salman


Attachment

Re: The state of PG replication in 2008/Q2?

From
Mathias Stjernström
Date:
I Agree with Robert but i never heard of Cybercluster before.
Does anyone have any experience with Cybercluster? It sounds really
interesting!

Best regards,
Mathias Stjernström

http://www.pastbedti.me/


On 22 aug 2008, at 08.18, RW wrote:

>
>> My company finally has the means to install a new database server
>> for replication.  I have Googled and found a lot of sparse
>> information out there regarding replication systems for PostgreSQL
>> and a lot of it looks very out-of-date.  Can I please get some
>> ideas from those of you that are currently using fail-over
>> replication systems?  What advantage does your solution have?  What
>> are the "gotchas" I need to worry about?
>>
>> My desire would be to have a parallel server that could act as a
>> hot standby system with automatic fail over in a multi-master
>> role.  If our primary server goes down for whatever reason, the
>> secondary would take over and handle the load seamlessly.  I think
>> this is really the "holy grail" scenario and I understand how
>> difficult it is to achieve.  Especially since we make frequent use
>> of sequences in our databases.  If MM is too difficult, I'm willing
>> to accept a hot-standby read-only system that will handle queries
>> until we can fix whatever ails the master.
>> We are primary an OLAP environment but there is a constant stream
>> of inserts into the databases.  There are 47 different databases
>> hosted on the primary server and this number will continue to scale
>> up to whatever the server seems to support.  The reason I mention
>> this number is that it seems that those systems that make heavy use
>> of schema changes require a lot of "fiddling".  For a single
>> database, this doesn't seem too problematic, but any manual work
>> involved and administrative overhead will scale at the same rate as
>> the database count grows and I certainly want to minimize as much
>> fiddling as possible.
>>
>>
> If you really need "only" need automatic failover than use DRBD +
> Heartbeat
> somebody already mentioned. We are using this solution since 3 years
> now.
> With DRBD replication is done at filesystem block level. So you
> don't have to
> bother about changes done with a DDL statement and Heartbeat will
> automatically failover if one server goes down. It's really stable.
>
> If you want MM you should give Cybercluster a try. (http://www.postgresql.at/english/pr_cybercluster_e.html
> )
> They offer good support and is Open Source since a few month now.
>
> Robert
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Attachment

Re: The state of PG replication in 2008/Q2?

From
Peter Eisentraut
Date:
Dan Harris wrote:
> My desire would be to have a parallel server that could act as a hot
> standby system with automatic fail over in a multi-master role.

I will add my "me too" for DRBD + Heartbeat.

Re: The state of PG replication in 2008/Q2?

From
Jan Otto
Date:
Hi Mathias,

On Aug 22, 2008, at 8:35 AM, Mathias Stjernström wrote:

> I Agree with Robert but i never heard of Cybercluster before.
> Does anyone have any experience with Cybercluster? It sounds really
> interesting!

Some months ago i took a look into cybercluster. At that point
cybercluster was
basically postgres-source 8.3 patched already with pgcluster sources.

Best regards,

Jan

Re: The state of PG replication in 2008/Q2?

From
Shane Ambler
Date:
Jan Otto wrote:
> Hi Mathias,
>
> On Aug 22, 2008, at 8:35 AM, Mathias Stjernström wrote:
>
>> I Agree with Robert but i never heard of Cybercluster before. Does
>> anyone have any experience with Cybercluster? It sounds really
>> interesting!
>
> Some months ago i took a look into cybercluster. At that point
> cybercluster was basically postgres-source 8.3 patched already with
> pgcluster sources.
>

I do believe it is a packaged version of pgcluster

Does anyone have experience with bucardo? It's still a recent addition
to open source offerings. No DDL replication but it does support two
masters.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz