Thread: Replication options?

Replication options?

From
"Liam Lesboch"
Date:
Greetings,

Yesterday there was a brief discussion about replication software for
PostgreSQL. My boss and I saw only two replication products for PostgreSQL
mentioned. We found no reviews on the internet that discussed either one or
compared them. We are considering PostgreSQL as an option for migrating
from our present platform, Microsoft SQL Server, which has replication and
stored procedures. Can people direct me to third-party reviews online
of these replication products, so that we can use them as examples when
evaluating products and planning the migration?

Thank you,

Liam



Re: Replication options?

From
"Liam Lesboch"
Date:
Thank you very much.

When I perform the Google search:
http://www.google.com/search?q=slony-i+review

I do not find reviews or critiques of Slony-I. Are there many companies
using it for their enterprise-level database systems? Without reviews in the
magazines, my boss's discomfort with PostgreSQL will not be remedied by
just the project maintainers' word of mouth.

Liam



>From: Bruce Momjian <pgman@candle.pha.pa.us>
>To: Liam Lesboch <liamlesboch@hotmail.com>
>CC: pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Replication options?
>Date: Tue, 10 Aug 2004 13:12:39 -0400 (EDT)
>
>
>Most people are using Slony for master/slave replication.  You can
>search for it easily.


Re: Replication options?

From
Bruce Momjian
Date:
Most people are using Slony for master/slave replication.  You can
search for it easily.


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Replication options?

From
"Liam Lesboch"
Date:
The Slashdot post today about the beta release of 8.0 caught the
attention of my boss and me. There were many comments about the replication
issue, and we saw many posts about Slony-I in particular. Maybe this is the
only viable option for PostgreSQL? There are others that cost money, but
nowhere did we find an article that discussed them in any form of critique
or tutorial (good or bad), and that is a concern for my boss. What large
companies use replication for PostgreSQL?

Liam


>From: "Liam Lesboch" <liamlesboch@hotmail.com>
>To: pgsql-general@postgresql.org
>CC: pgman@candle.pha.pa.us
>Subject: Re: [GENERAL] Replication options?
>Date: Tue, 10 Aug 2004 20:29:08 +0000
>
>Thank you very much.
>
>When I perform the Google search:
>http://www.google.com/search?q=slony-i+review
>
>I do not find reviews or critiques of Slony-I. Are there many companies
>using it for their enterprise-level database systems? Without reviews in the
>magazines, my boss's discomfort with PostgreSQL will not be remedied by
>just the project maintainers' word of mouth.
>
>Liam

Re: Replication options?

From
"Scott Marlowe"
Date:
On Tue, 2004-08-10 at 16:53, Liam Lesboch wrote:
> The Slashdot post today about the beta release of 8.0 caught the
> attention of my boss and me. There were many comments about the replication
> issue, and we saw many posts about Slony-I in particular. Maybe this is the
> only viable option for PostgreSQL? There are others that cost money, but
> nowhere did we find an article that discussed them in any form of critique
> or tutorial (good or bad), and that is a concern for my boss. What large
> companies use replication for PostgreSQL?

Back in the days of 6.5.3 when we were looking at using it at the last
company I was at, I built my own test suite to make sure postgresql
could handle the load we were going to throw at it.

No matter what the guys in the nice suits from big companies tell you to
sell their product, you owe it to yourself and your company to prove /
disprove the THEORY that a particular piece of software will do what you
need.

Who knows, maybe you could be the one writing the article on Slony-I
that someone else reads before they try it.


Re: Replication options?

From
Jeff Eckermann
Date:
--- Liam Lesboch <liamlesboch@hotmail.com> wrote:

> The Slashdot post today about the beta release of 8.0 caught the
> attention of my boss and me. There were many comments about the replication
> issue, and we saw many posts about Slony-I in particular. Maybe this is the
> only viable option for PostgreSQL? There are others that cost money, but
> nowhere did we find an article that discussed them in any form of critique
> or tutorial (good or bad), and that is a concern for my boss. What large
> companies use replication for PostgreSQL?

You are not likely to read too many reviews of
Slony-I, simply because it is a brand new product.
But many people have been using it already, even as
development code, with good results.

That probably won't impress your bosses.  If you need
a track record, then erServer might be what you need.
erServer is a commercially produced product that was
(is still?) used by Afilias, the provider of registry
services for the .info and .org domains.  That's
serious testing; very large databases and lots of
traffic.

Note that Afilias paid for the development of Slony-I;
they have employed PostgreSQL core developer Jan Wieck
to do that (and probably other things as well).

erServer has now been open sourced, so you can get it
for free.  erServer was created by PostgreSQL, Inc.,
which also provides support services for PostgreSQL
(another point your bosses might be interested in).
Check http://www.pgsql.com for more information.

You can get more information about commercial support
for PostgreSQL at
http://techdocs.postgresql.org/companies.php

HTH


Re: Replication options?

From
"Liam Lesboch"
Date:


>From: Jeff Eckermann <jeff_eckermann@yahoo.com>
>To: Liam Lesboch <liamlesboch@hotmail.com>, pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Replication options?
>Date: Wed, 11 Aug 2004 07:20:14 -0700 (PDT)
>
>
>You are not likely to read too many reviews of
>Slony-I, simply because it is a brand new product.
>But many people have been using it already, even as
>development code, with good results.
>
>That probably won't impress your bosses.  If you need
>a track record, then erServer might be what you need.
>erServer is a commercially produced product that was
>(is still?) used by Afilias, the provider of registry
>services for the .info and .org domains.  That's
>serious testing; very large databases and lots of
>traffic.
>
>Note that Afilias paid for the development of Slony-I;
>they have employed PostgreSQL core developer Jan Wieck
>to do that (and probably other things as well).
>
>erServer has now been open sourced, so you can get it
>for free.  erServer was created by PostgreSQL, Inc.,
>which also provides support services for PostgreSQL
>(another point your bosses might be interested in).
>Check http://www.pgsql.com for more information.
>
>You can get more information about commercial support
>for PostgreSQL at
>http://techdocs.postgresql.org/companies.php
>
>HTH
>

Thank you for taking the time to provide another option for me with some
background information.

Liam



Re: Replication options?

From
Tom Lane
Date:
Jeff Eckermann <jeff_eckermann@yahoo.com> writes:
> That probably won't impress your bosses.  If you need
> a track record, then erServer might be what you need.
> erServer is a commercially produced product that was
> (is still?) used by Afilias, the provider of registry
> services for the .info and .org domains.  That's
> serious testing; very large databases and lots of
> traffic.

> Note that Afilias paid for the development of Slony-I;

... because they were quite unhappy with erServer ...

Now erServer did work for them, but it required significant amounts of
tuning and constant babysitting by the DBA.  (If Andrew Sullivan is
paying attention to this thread, he can offer lots of gory details.)
I can also personally testify that getting erServer set up is a major
pain in the rear.  I haven't messed with Slony, but all reports are that
it's a substantially better piece of code.

            regards, tom lane

Re: Replication options?

From
Vivek Khera
Date:
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:

>> serious testing; very large databases and lots of
>> traffic.

>> Note that Afilias paid for the development of Slony-I;

TL> ... because they were quite unhappy with erServer ...


The major point Jan brings up when you talk about this is that
eRServer offers a nice failover model.  But then what?  How do you get
back to your production server?  Swapping master/slave to do
maintenance on the master using eRServer is not really something a
mere mortal can do, either.  I paid for a license to use eRServer.  I
wouldn't ever do that again...  It was unable to solve my main problem
with multi-million row tables, but it did work on a small production
database with a few hundred rows just fine (albeit with 200MB+ memory
footprint!)



--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Replication options?

From
Jeff Eckermann
Date:
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Jeff Eckermann <jeff_eckermann@yahoo.com> writes:
> > That probably won't impress your bosses.  If you need
> > a track record, then erServer might be what you need.
> > erServer is a commercially produced product that was
> > (is still?) used by Afilias, the provider of registry
> > services for the .info and .org domains.  That's
> > serious testing; very large databases and lots of
> > traffic.
>
> > Note that Afilias paid for the development of Slony-I;
>
> ... because they were quite unhappy with erServer ...
>
> Now erServer did work for them, but it required significant amounts of
> tuning and constant babysitting by the DBA.  (If Andrew Sullivan is
> paying attention to this thread, he can offer lots of gory details.)
> I can also personally testify that getting erServer set up is a major
> pain in the rear.  I haven't messed with Slony, but all reports are that
> it's a substantially better piece of code.
>
>             regards, tom lane
>

Granted...
Liam's bosses want something with a history of
successful use in a serious production situation, and
erServer at least has that.  Slony has been around for
too short a time to make that claim, yet.







Re: Replication options?

From
Peter Eisentraut
Date:
Liam Lesboch wrote:
> The Slashdot post today about the beta release of 8.0 caught the
> attention of my boss and me. There were many comments about the replication
> issue, and we saw many posts about Slony-I in particular. Maybe this is the
> only viable option for PostgreSQL? There are others that cost money, but
> nowhere did we find an article that discussed them in any form of critique
> or tutorial (good or bad), and that is a concern for my boss. What large
> companies use replication for PostgreSQL?

Slony-I and eRServer give you an asynchronous master/slave replication
system that allows you to set up load balancing or data warehouse type
things, or even failover with some replication gap.  In my experience,
most people don't need the load balancing part.  If all you're after is
securing your database system against hardware failures (which is what most
people are after), I suggest you set up two machines with shared
storage (talk to your hardware vendor) or a replicating file system
(like DRBD) and make the two machines monitor each other so that only
one machine has the database mounted at any time.
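
The "only one machine has the database mounted" rule can be sketched in a few lines. This is only an illustration of the decision logic, not a tested high-availability setup: the `Node` class, the `failover_step` function, the heartbeat threshold and the `fenced` flag are all hypothetical names invented for this sketch, and a real cluster would rely on its cluster manager and hardware fencing to enforce the same rule.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster member; has_storage is True while it mounts the database volume."""
    name: str
    has_storage: bool = False


def failover_step(active: Node, standby: Node, missed_heartbeats: int,
                  threshold: int = 3, fenced: bool = False) -> str:
    """Decide what the standby should do next.

    The standby never mounts the storage until the old active node is
    confirmed fenced (powered off or cut off from the disks), so the two
    nodes can never have the database mounted at the same time.
    """
    if missed_heartbeats < threshold:
        return "wait"      # the active node still looks alive
    if not fenced:
        return "fence"     # isolate the old active node first
    active.has_storage = False
    standby.has_storage = True
    return "takeover"      # now it is safe to mount and start PostgreSQL


db1 = Node("db1", has_storage=True)
db2 = Node("db2")
for missed, fenced in [(1, False), (3, False), (3, True)]:
    print(missed, fenced, failover_step(db1, db2, missed, fenced=fenced))
```

The design point is the ordering: fencing comes before takeover, because a standby that mounts the volume while the old master is merely slow would corrupt the data directory.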

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Replication options?

From
"Joshua D. Drake"
Date:
> Granted...
> Liam's bosses want something with a history of
> successful use in a serious production situation, and
> erServer at least has that.  Slony has been around for
> too short a time to make that claim, yet.

There is also Mammoth Replicator which is an integrated replication
approach that does not require any triggers. It is a commercial product
though.

Sincerely,

Joshua D. Drake





--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL


Re: Replication options?

From
"Liam Lesboch"
Date:
>From: "Joshua D. Drake" <jd@commandprompt.com>
>To: Jeff Eckermann <jeff_eckermann@yahoo.com>
>CC: Tom Lane <tgl@sss.pgh.pa.us>,Liam Lesboch <liamlesboch@hotmail.com>,
>pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Replication options?
>Date: Wed, 11 Aug 2004 13:38:54 -0700
>
>
>>Granted...
>>Liam's bosses want something with a history of
>>successful use in a serious production situation, and
>>erServer at least has that.  Slony has been around for
>>too short a time to make that claim, yet.
>
>There is also Mammoth Replicator which is an integrated replication
>approach that does not require any triggers. It is a commercial product
>though.
>

Thank you for the information. Are there any differences between
trigger-based and non-trigger systems? We are a commercial company that
sells no (software) products but runs many servers. There is no BSD version
of Mammoth on the website, and we have many servers on BSD as well as Linux.

Does Slony-I replication operate on BSD? Are there reviews of
Mammoth online?

Liam



Re: Replication options?

From
"Liam Lesboch"
Date:


>From: Peter Eisentraut <peter_e@gmx.net>
>To: "Liam Lesboch" <liamlesboch@hotmail.com>,pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Replication options?
>Date: Wed, 11 Aug 2004 21:54:28 +0200
>
>Slony-I and eRServer give you an asynchronous master/slave replication
>system that allows you to setup load balancing or data warehouse type
>things, or even failover with some replication gap.  In my experience,
>most people don't need the load balancing part.  If all you're after is
>securing your database system against hardware failures (which most
>people are after), I suggest you set up two machines with a shared
>storage (talk to your hardware vendor) or a replicating file system
>(like DRBD) and make the two machines monitor each other so that only
>one machine has the database mounted at any time.
>

The boss is seeking a system spanning 5 countries, where each country has
its own copy of the master server in real time:
Master -> SubMaster(s) (one per country) -> slaves (1-3 per country)

Make sense?
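
To make the layout above concrete, here is a small sketch that models it as a tree and lists the chain of providers a change would pass through. It assumes nothing about any particular replication product: the node names, `build_topology` and `path_to` are hypothetical, and the country codes are placeholders.

```python
def build_topology(countries, slaves_per_country):
    """Return a dict mapping each provider to the nodes that subscribe to it."""
    topology = {"master": []}
    for country in countries:
        submaster = f"submaster-{country}"
        topology["master"].append(submaster)
        topology[submaster] = [
            f"slave-{country}-{i}" for i in range(1, slaves_per_country + 1)
        ]
    return topology


def path_to(topology, node, root="master"):
    """Return the provider chain from root down to node, or None if absent."""
    if root == node:
        return [root]
    for child in topology.get(root, []):
        chain = path_to(topology, node, child)
        if chain:
            return [root] + chain
    return None


topology = build_topology(["de", "fr", "jp", "us", "br"], slaves_per_country=2)
print(path_to(topology, "slave-jp-2"))
```

With asynchronous replication each hop in the chain adds some delay, so "real time" at the slaves would in practice mean "a short lag behind the master", and how short is something to measure on the actual links between countries.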

Liam



Re: Replication options?

From
"Liam Lesboch"
Date:


>From: Jeff Eckermann <jeff_eckermann@yahoo.com>
>To: Tom Lane <tgl@sss.pgh.pa.us>
>CC: Liam Lesboch <liamlesboch@hotmail.com>, pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Replication options?
>Date: Wed, 11 Aug 2004 12:38:34 -0700 (PDT)
>
>Granted...
>Liam's bosses want something with a history of
>successful use in a serious production situation, and
>erServer at least has that.  Slony has been around for
>too short a time to make that claim, yet.
>

Thank you. You understand my position. I am the advocate for using open
source technologies in our production environment. My boss, as they say, has
one foot out the door and one inside, or is dipping his toes in the
water rather than jumping in head first. He reads the magazines and shows me
the reviews in them, and it takes much work to show him the other stories
that have not been written yet.

Liam



Re: Replication options?

From
Andrew Rawnsley
Date:
On Aug 11, 2004, at 12:02 PM, Tom Lane wrote:

> Jeff Eckermann <jeff_eckermann@yahoo.com> writes:
>> That probably won't impress your bosses.  If you need
>> a track record, then erServer might be what you need.
>> erServer is a commercially produced product that was
>> (is still?) used by Afilias, the provider of registry
>> services for the .info and .org domains.  That's
>> serious testing; very large databases and lots of
>> traffic.
>
>> Note that Afilias paid for the development of Slony-I;
>
> ... because they were quite unhappy with erServer ...
>
> Now erServer did work for them, but it required significant amounts of
> tuning and constant babysitting by the DBA.  (If Andrew Sullivan is
> paying attention to this thread, he can offer lots of gory details.)
> I can also personally testify that getting erServer set up is a major
> pain in the rear.  I haven't messed with Slony, but all reports are
> that
> it's a substantially better piece of code.
>
>             regards, tom lane


As the last person to do any significant work on the eRServer code (at
least the open-source version), I would have to agree that Slony is the
direction to take looking forward. I would like to think we managed to fix
a lot of problems and improved its usability quite a bit, but Andrew
Sullivan and I agreed that it's a dead end: there will be a final packaging
of what's in CVS and that will be it, unless someone else wants to pick it
up.


--------------------

Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com


Re: Replication options?

From
Andrew Sullivan
Date:
I'll try this again, since it doesn't seem to have made it to the
list.


On Wed, Aug 11, 2004 at 12:02:07PM -0400, Tom Lane wrote:
> Now erServer did work for them, but it required significant amounts of
> tuning and constant babysitting by the DBA.  (If Andrew Sullivan is
> paying attention to this thread, he can offer lots of gory details.)
> I can also personally testify that getting erServer set up is a major
> pain in the rear.  I haven't messed with Slony, but all reports are that
> it's a substantially better piece of code.

I can indeed provide gory details.  Erserver worked for us, and was
able to handle the load we gave it (at times pretty substantial).
But it had a number of flaws.  Some of these were mere matters of
implementation, and some were (in my view) fundamental.  Since I've
been observing radio silence on the list lately, I feel entitled to
blather on at length now.  So, below is the gore, and the reasons we
finally decided to abandon erserver.  This is very similar to the
negative part of what I had to say at OSCON, so if you were bored by
me there, you'll find this equally boring.

A.  First, the implementation faults.  As Vivek Khera pointed out,
the failover and set up support is not strong.

1.  Setting up erserver on a system which is not already replicated
is a major pain.  (We didn't have this problem because we always
launched with erserver support in place.)  On a database of a few gig,
you could easily have to take 24 hours downtime to get it set up.
Some of that was just faulty implementation, and if you have a single
not null unique column on every table, the problem is more to do with
poorly conceived setup scripts.  But finding this out turns out to
depend on having available an expert in the system (and as far as I
know, almost all the experts on it actually work here at Afilias.  I
did put together some notes on this topic for the BSD version of
erserver.  They're at
<http://gborg.postgresql.org/pipermail/erserver-general/2003-October/000169.html>
or <http://tinyurl.com/66b89>.)

2.  Switchover is also a pain (we don't like to talk about failover:
erserver is, like Slony-I, async, and failover more or less
automatically risks stranding data on the dead master).  There are
some automation scripts which make it a little easier, but the basic
problem is getting your slave into a condition where it can actually
take over from the master.  The slaves in erserver really don't know
enough about the master to be in a position to do this.  It _is_
possible: I've done it.  It's not fun.  (Failing back is even less
fun, and essentially requires you to build a new slave.  See A1, above.
If you're going to use erserver as a disaster-avoidance system, you
need two identical servers, so that any one can play the role of
master.)

3.  The engine was written in Java.  Java is a nice language, but the
JDK from Sun imposes a 3 G limit on the size of the JVM.  If you get
far enough behind, the VM just blows up, and then you have no hope of
recovering.  This is a _very_ serious limitation for high traffic
sites.  It also turned out to be completely fatal for certain users
who wanted to replicate large objects: one object would be enough to
make the system fail (for reasons that are too incredible to go into,
the process actually has two copies of the data at one time during a
part of processing.  This is just a bug, though a dangerous one).

4.  The logging code was deliberately obfuscatory.  For some reason,
the person who originally wrote the Java code (note _not_ the original
code from PostgreSQL, Inc.) decided to wrap all the error handling in
an outer layer which returned the line number of the error handler
every time it threw an exception.  This meant that, from looking at
the logs, every case of a bug looks like it happened at the same
place.  You can imagine how much fun it was to fix things.  Every
person I've ever known who looked at the logging code suffered
retinal damage -- it was that bad.  (This is actually fixed in the
PostgreSQL-commercial version of the software, BTW.)
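
To illustrate the logging problem in A4, here is a minimal sketch in Python (the original code was Java, and none of this is erserver's actual code). The function names are hypothetical. One handler reports its own location, so every failure looks the same; the other reports where the exception was actually raised.

```python
import traceback


def parse_row(raw):
    return int(raw)  # raises ValueError for bad input


def report_from_handler(raw):
    """Anti-pattern: describe the error by where the handler lives."""
    try:
        return parse_row(raw)
    except ValueError:
        here = traceback.extract_stack()[-1]  # frame of this handler
        return here.name


def report_from_origin(raw):
    """Describe the error by where the exception was raised."""
    try:
        return parse_row(raw)
    except ValueError as exc:
        origin = traceback.extract_tb(exc.__traceback__)[-1]
        return origin.name


hidden = report_from_handler("not-a-number")
useful = report_from_origin("not-a-number")
print(hidden, useful)
```

With the first style, every bug in the log points at the handler; with the second, the log names the function that actually failed.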

B.  Second, the fundamental errors.

1.  The first big problem came from something we thought was an
advantage: erserver replicates only the latest version of the row.
This reduces the replication overhead considerably, and for a long
time I was a great proponent of this approach.  I was wrong, because
the performance overhead that it imposes under certain perverse kinds
of loads is well and truly awful.  Even in the normal circumstance,
the performance penalty is noticeable; but it's not a problem if you
have enough excess capacity.  When that capacity is squeezed, you
run into a lot of pain.  In such cases, the replication application
starts to slow down.  Get under really heavy load, and you start to
have to worry about the JVM limits outlined in A3.  This can be
dealt with, but you absolutely need to hold its hand when things are
bad.  Your DBAs have better things to do, I assume.

The decision to send only the last row also cost some functionality,
because you can't build an historical-database slave with erserver,
unfortunately (if a row gets updated twice in the space of one
transaction, you won't see two changes on the slave, but only the
final state of the changed row).
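
The loss of history can be shown with a toy model in Python. This is only a sketch under simplifying assumptions, not how erserver or Slony-I is implemented: changes are (primary key, row) pairs, and the two functions are hypothetical stand-ins for "latest version only" and "every change" replication.

```python
def replicate_latest_state(changes):
    """Keep only the final version of each row, keyed by primary key."""
    latest = {}
    for pk, row in changes:
        latest[pk] = row  # a later change overwrites any earlier one
    return list(latest.items())


def replicate_every_change(changes):
    """Ship each change in the order it happened."""
    return list(changes)


# Two updates to the same row (pk 1), then one update to another row.
changes = [
    (1, {"balance": 100}),
    (1, {"balance": 250}),
    (2, {"balance": 40}),
]

latest_only = replicate_latest_state(changes)
full_history = replicate_every_change(changes)
print(latest_only)
print(full_history)
```

The slave fed by `replicate_latest_state` never sees the intermediate balance of 100, which is why a historical-database slave cannot be built on that approach.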

2.  Finally, there is the problem that the snapshot applications
occasionally could get into the situation where applying rows to the
slave would result either in bad data (bad) or errors on unique
indexes (also bad).  You had to choose between making your slave even
more unlike your master or potentially getting called in the middle
of the night to hand-fix the deadlock condition.  (Some further
discussion of this feature of the software is at
<http://gborg.postgresql.org/pipermail/erserver-general/2003-October/000185.html>
or <http://tinyurl.com/5erj7>.)

It is really the items in B that finally convinced us that we had to
give up on the erserver code and work on a fresh system.  I think
Jan will confirm that his Slony-I work drew some useful inspiration
from the erserver code (in particular, the magic that Vadim
performed).  But ultimately, erserver taught us as much about what
_else_ you needed before you got a real replication system.  In
particular, we felt that you needed more knowledge at all the nodes
than erserver was able to provide.  By contrast, you can usefully
think of Slony-I as a cluster-communication system which happens to
specialise in keeping the data the same on all subscribing nodes.

This isn't to say that erserver is not undergoing development.  I
understand from Geoff Davidson of PostgreSQL, Inc, that they are
continuing work on the product, with an eye to a multi-master
distributed system and automatic failover.  I think such developments
would be welcomed by PostgreSQL users.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@ca.afilias.info>                              M2P 2A8
                                        +1 416 646 3304 x4110


Re: Replication options?

From
Jan Wieck
Date:
On 8/11/2004 5:07 PM, Liam Lesboch wrote:

>>From: "Joshua D. Drake" <jd@commandprompt.com>
>>To: Jeff Eckermann <jeff_eckermann@yahoo.com>
>>CC: Tom Lane <tgl@sss.pgh.pa.us>,Liam Lesboch <liamlesboch@hotmail.com>,
>>pgsql-general@postgresql.org
>>Subject: Re: [GENERAL] Replication options?
>>Date: Wed, 11 Aug 2004 13:38:54 -0700
>>
>>
>>>Granted...
>>>Liam's bosses want something with a history of
>>>successful use in a serious production situation, and
>>>erServer at least has that.  Slony has been around for
>>>too short a time to make that claim, yet.
>>
>>There is also Mammoth Replicator which is an integrated replication
>>approach that does not require any triggers. It is a commercial product
>>though.
>>
>
> Thank you for your info. Are there any differences of the trigger versus non
> trigger systems? We are a commercial companie that has no products
> (software) but have many servers. There is no BSD version of mamoth on the
> website and we have many servers on BSD and the linux servers.

There are some differences. Before going into the details let me make
clear that in the Slony-I case we are talking about generic triggers
written in C, that make extensive use of the internal prepared execution
plan features and take advantage of being able to access the system
catalog cache as well as any builtin functionality. My former work on
the foreign key implementation of PostgreSQL looks very similar.

I think the major operational difference between Mammoth Replicator and
Slony-I is where both collect the replication information. Slony-I
collects single row changes as log rows in regular database tables. It
filters out unchanged columns, so that only changed values and the
primary key of the row together with the column names appear in the log.
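
As a rough sketch of that filtering idea (in Python, purely illustrative): the function name and the dictionary layout below are assumptions made for the example, not Slony-I's real log format, and the real triggers are written in C.

```python
def build_log_row(table, pk_columns, old_row, new_row):
    """Return a log entry holding the primary key and only the changed columns."""
    changed = {
        col: val for col, val in new_row.items() if old_row.get(col) != val
    }
    if not changed:
        return None  # nothing changed, so nothing to log
    key = {col: old_row[col] for col in pk_columns}
    return {"table": table, "key": key, "changed": changed}


old = {"id": 7, "name": "ann", "email": "ann@example.com"}
new = {"id": 7, "name": "ann", "email": "ann@example.org"}

entry = build_log_row("users", ["id"], old, new)
print(entry)
```

Only the `email` column appears in the entry, together with the key needed to find the row on the subscriber.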

As far as I know (Joshua, please clarify here) Mammoth Replicator writes
its own, binary journal containing the changes that need to be applied
to the replica.

At first glance, inserting into database tables might look more
expensive. But there are some fine details that make it worth taking a
second look. One side effect of doing this is that collecting the
replication log together with changing the data on the origin (master)
is automatically covered by the exact same ACID properties the database
provides. I am not sure if or how Replicator guarantees that under all
possible circumstances and server crash situations the replication log
journal will contain exactly all committed transactions, and only those.
To make a transaction durable, the changes first have to be recorded in
PostgreSQL's crash recovery WAL. Only after that data is flushed to
disk can it be assumed that the transaction will be redone in the case
of an immediately following crash. If a replication system now logs the
commit event before the WAL operation happens, it is possible that the
transaction does not commit on the master due to a crash, but it will be
replayed and committed on the slaves. On the other hand, if the
replication logging of the commit is done after the WAL operation, it
must be assured that WAL replay during crash recovery also causes the
replication log journal to be recovered or repeated. In short, the
replication log must be covered by the same redo mechanism the crash
recovery system uses.
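
The "same ACID properties" point can be demonstrated with a small sketch. It uses SQLite from Python's standard library as a stand-in, not PostgreSQL or Slony-I, and the table and trigger names are invented for the example. The assumption it illustrates is only this: when a trigger writes the log row in the same transaction as the data change, the two commit or roll back together.

```python
import sqlite3


def make_db():
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
        CREATE TABLE repl_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                               account_id INTEGER, new_balance INTEGER);
        CREATE TRIGGER account_log AFTER UPDATE ON account
        BEGIN
            INSERT INTO repl_log (account_id, new_balance)
            VALUES (NEW.id, NEW.balance);
        END;
        INSERT INTO account VALUES (1, 100);
    """)
    return conn


def balance(conn):
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


def log_count(conn):
    return conn.execute("SELECT count(*) FROM repl_log").fetchone()[0]


conn = make_db()

# A transaction that never commits leaves no trace in data or log.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("ROLLBACK")
after_rollback = (balance(conn), log_count(conn))

# A committed transaction shows up in both places.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 175 WHERE id = 1")
conn.execute("COMMIT")
after_commit = (balance(conn), log_count(conn))

print(after_rollback, after_commit)
```

A journal kept outside the database has to reproduce this all-or-nothing behaviour by other means, which is the question raised above.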

All of this matters only if one does not immediately slam on the big
red panic button and issue a full failover when the main server
crashes, but rather tries to bring the main server back. If
it can boot, has no FS inconsistencies and PostgreSQL's crash recovery
succeeds too, there is no need to fail over any more and one will want
to resume normal operation. If now the strict synchronization between
what PostgreSQL's crash recovery mechanism does restore and what will be
applied to the slave systems cannot be guaranteed, then there is the
possibility of loss of synchronization between master and slaves. You
just lost the data integrity of your backup server.

Slony-I has the replication log journal covered by PostgreSQL's native
ACID properties. I assume Joshua can explain how Mammoth Replicator
solves this problem.

Another important difference is automatic replication of schema changes.
This is a feature often asked for, and I have no idea where that wish
comes from. Certainly it does not stem from too much thinking about the
problem. Slony-I does not attempt to go in that direction. A trigger
based solution like Slony-I cannot do it anyway. Again, Joshua, what are
the plans or status with Replicator on that?

The reason why I consider automatic schema replication a subintelligent
idea is simply that it makes special purpose configurations impossible.
If one only needs a full backup server for failover, it sure is useful.
But what about using several slaves as load balanced search engines,
while another slave is the data warehouse and yet another one is the
reporting server? It would certainly be desirable to maintain the
indexes used by the search engines only on the search engines, or have
some special triggers firing only on the data warehouse and again have
only a subset of tables replicated to the reporting server. Is that all
possible with a 100% schema and data copy replication system? No, it is
not.
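
A special purpose layout of that kind might be described as follows. This is a hypothetical sketch in Python, not Slony-I configuration syntax; the node, table, index, and trigger names are all invented for illustration.

```python
# Hypothetical node layout: each node subscribes to its own set of tables
# and may carry indexes or triggers that exist only on that node.
ALL_TABLES = {"customers", "orders", "order_items", "products"}

NODES = {
    "failover": {"tables": ALL_TABLES,
                 "local_indexes": set(), "local_triggers": set()},
    "search_1": {"tables": {"products"},
                 "local_indexes": {"products_name_search_idx"},
                 "local_triggers": set()},
    "warehouse": {"tables": ALL_TABLES,
                  "local_indexes": set(),
                  "local_triggers": {"orders_rollup_trg"}},
    "reporting": {"tables": {"orders", "order_items"},
                  "local_indexes": set(), "local_triggers": set()},
}


def changes_for_node(node_name, changes):
    """Keep only the changes for tables the node subscribes to."""
    subscribed = NODES[node_name]["tables"]
    return [c for c in changes if c["table"] in subscribed]


changes = [
    {"table": "customers", "pk": 1},
    {"table": "orders", "pk": 10},
    {"table": "products", "pk": 5},
]

reporting_changes = changes_for_node("reporting", changes)
search_changes = changes_for_node("search_1", changes)
print(reporting_changes, search_changes)
```

A system that copies schema and data 100% to every node has no place to express the per-node differences in `NODES`.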

>
> Does the Slony-I replications operation on BSD? Is there a reviews of the
> Mammoth online?

I use FreeBSD 4.9 for most of the development. In general Slony-I should
run on every PostgreSQL supported Unix platform that provides pthreads.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: Replication options?

From
Christopher Browne
Date:
After a long battle with technology, JanWieck@Yahoo.com (Jan Wieck), an earthling, wrote:
> Another important difference is automatic replication of schema
> changes. This is a feature often asked for, and I have no idea where
> that wish comes from.

It actually looks like quite a positive thing to apply "universal
schema" changes via something like the following:

EXECUTE SCRIPT (
    SET ID = 1,
    FILENAME = '2004-08-21-add-two-tables-and-7-indices.sql',
    EVENT NODE = 1
);

This approach imposes a certain amount of "discipline" on how things
are done, which seems like a feature, as opposed to a bug...
--
output = reverse("gro.gultn" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/linux.html
There is a theory that states: "If anyone finds  out what the universe
is for, it will disappear and  be replaced by something more bizarrely
inexplicable." There is another theory  that states: "This has already
happened..." -Douglas Adams, "Hitch-Hikers Guide to the Galaxy"

Re: Replication options?

From
"Joshua D. Drake"
Date:
> As far as I know (Joshua please clearify here) Mammoth Replicator writes
> its own, binary journal containing the changes that need to be applied
> to the replica.


Yes that is correct.

>
> On a first look inserting into database tables might look more
> expensive. But there are some fine details that make it worth taking a
> second look. One side effect of doing this is that collecting the
> replication log together with changing the data on the origin (master)
> is automatically covered by the exact same ACID properties the database
> provides. I am not sure if or how Replicator guarantees that under all
> possible circumstances and server crash situations the replication log
> journal will contain exacly all committed transactions, and only those.

Replicator does not replicate until the transaction is committed.

> To make a transaction durable, the changes first have to be recorded in
> PostgreSQL's crash recovery WAL. Only after that data is flushed to the
> disk it can be assumed that the transaction will be redone in the case
> of an immediately following crash. If a replication system now logs the
> commit event before the WAL operation happens, it is possible that the
> transaction does not commit on the master due to a crash, but it will be
> replayed and committed on the slaves. On the other side if the
> replication logging of the commit is done after the WAL operation, it
> must be assured that WAL replay during crash recovery also causes
> replication log journal to be recovered or repeated. In short, the
> replication log must be covered by the same redo mechanism the crash
> recovery system uses.

This I will have to verify with our programmers as to exactly "when" the
replication occurs.

>
> This all is only important for the case that one does not immediately
> slam on the big red panic button and issues a full failover when the
> main server crashes, but rather tries to bring the main server back. If
> it can boot,

That is correct. It is important to remember that one should not just
"failover". You need to know why it happened and what happened. If you
take two minutes to bring that master back up, it is often apparent what
needs to be done to quickly get the machine into operational condition.


> has no FS inconsistencies and PostgreSQL's crash recovery
> succeeds too, there is no need to fail over any more and one will want
> to resume normal operation. If now the strict synchronization between
> what PostgreSQL's crash recovery mechanism does restore and what will be
> applied to the slave systems cannot be guaranteed, then there is the
> possibility of loss of synchronization between master and slaves. You
> just lost the data integrity of your backup server.

If Replicator loses synchronization, it will attempt to resync. If it
cannot, it will completely wipe a slave and perform a full dump to make
sure the slave in question is in sync.

> Slony-I has the replication log journal covered by PostgreSQL's native
> ACID properties. I assume Joshua can explain how Mammoth Replicator
> solves this problem.
>
> Another important difference is automatic replication of schema changes.
> This is a feature often asked for, and I have no idea where that wish
> comes from. Certainly it does not stem from too much thinking about the
> problem. Slony-I does not attempt to go into that direction. A trigger
> based solution like Slony-I cannot do it anyway. Again, Joshua, what's
> the plans or status with Replicator on that?

We will not replicate schema changes, as it would be crazy on many
levels :)

> The reason why I consider automatic schema replication a subintelligent
> idea is simply that it makes special purpose configurations impossible.
> If one only needs a full backup server for failover, it sure is usefull.

I don't even know if I agree with that. A database schema should be
relatively static once it goes into production. If you have to make a
database schema change, then you should test, test, test on dev machines
and then schedule an outage for the schema changes.

Also even if we were to replicate schema changes it would almost
guarantee a full dump on every change you made.


>> Does the Slony-I replications operation on BSD? Is there a reviews of
>> the Mammoth online?
>

BSD is a platform that is coming for Replicator. The majority (95%) of
our demand has been on Linux, which is thus obviously our most supported
platform.

Next would be Solaris, and then about half a dozen requests for BSD.

Sincerely,

Joshua D. Drake



>
> I use FreeBSD 4.9 for most of the development. In general Slony-I should
> run on every PostgreSQL supported Unix platform that provides pthreads.
>
>
> Jan
>


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

Re: Replication options?

From
"Liam Lesboch"
Date:
Thank you for your overviews and responses. Much appreciation. Where might
one find reviews or testimonials of your product and pricing?

Liam


>From: "Joshua D. Drake" <jd@commandprompt.com>
>To: Jan Wieck <JanWieck@Yahoo.com>
>CC: Liam Lesboch <liamlesboch@hotmail.com>,pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Replication options?
>Date: Thu, 12 Aug 2004 09:02:34 -0700
>
>>As far as I know (Joshua please clearify here) Mammoth Replicator writes
>>its own, binary journal containing the changes that need to be applied to
>>the replica.
>
>
>Yes that is correct.
>
>>
>>On a first look inserting into database tables might look more expensive.
>>But there are some fine details that make it worth taking a second look.
>>One side effect of doing this is that collecting the replication log
>>together with changing the data on the origin (master) is automatically
>>covered by the exact same ACID properties the database provides. I am not
>>sure if or how Replicator guarantees that under all possible circumstances
>>and server crash situations the replication log journal will contain
>>exacly all committed transactions, and only those.
>
>Replicator does not replicate until the transaction is committed.
>
>>To make a transaction durable, the changes first have to be recorded in
>>PostgreSQL's crash recovery WAL. Only after that data is flushed to the
>>disk it can be assumed that the transaction will be redone in the case of
>>an immediately following crash. If a replication system now logs the
>>commit event before the WAL operation happens, it is possible that the
>>transaction does not commit on the master due to a crash, but it will be
>>replayed and committed on the slaves. On the other side if the replication
>>logging of the commit is done after the WAL operation, it must be assured
>>that WAL replay during crash recovery also causes replication log journal
>>to be recovered or repeated. In short, the replication log must be covered
>>by the same redo mechanism the crash recovery system uses.
>
>This I will have to verify with our programmers as to exactly "when" the
>replication occurs.
>
>>
>>This all is only important for the case that one does not immediately slam
>>on the big red panic button and issues a full failover when the main
>>server crashes, but rather tries to bring the main server back. If it can
>>boot,
>
>That is correct. It is important to remember that one should not just
>"failover". You need to know why... what happen, happen. If you take two
>minutes bring that master back up it is often apparent what needs to be
>done quickly to get the machine in operational condition.
>
>
>>has no FS inconsistencies and PostgreSQL's crash recovery succeeds too,
>>there is no need to fail over any more and one will want to resume normal
>>operation. If now the strict synchronization between what PostgreSQL's
>>crash recovery mechanism does restore and what will be applied to the
>>slave systems cannot be guaranteed, then there is the possibility of loss
>>of synchronization between master and slaves. You just lost the data
>>integrity of your backup server.
>
>If Replicator looses synchronization, it will attempt to resync. If it
>can not, it will completely wipe a slave and perform a full dump to make
>sure the slave in question is in sync.
>
>>Slony-I has the replication log journal covered by PostgreSQL's native
>>ACID properties. I assume Joshua can explain how Mammoth Replicator solves
>>this problem.
>>
>>Another important difference is automatic replication of schema changes.
>>This is a feature often asked for, and I have no idea where that wish
>>comes from. Certainly it does not stem from too much thinking about the
>>problem. Slony-I does not attempt to go into that direction. A trigger
>>based solution like Slony-I cannot do it anyway. Again, Joshua, what's the
>>plans or status with Replicator on that?
>
>We will not as it would be on many levels of crazy :) replicate schema
>changes.
>
>>The reason why I consider automatic schema replication a subintelligent
>>idea is simply that it makes special purpose configurations impossible. If
>>one only needs a full backup server for failover, it sure is usefull.
>
>I don't even know if I agree with that. A database schema should be
>relatively static once it goes into production. If you have to make a
>database schema change then one should test, test, test on dev machines
>and then schedule an outage for the schema changes.
>
>Also even if we were to replicate schema changes it would almost guarantee
>a full dump on every change you made.
>
>
>>>Does the Slony-I replications operation on BSD? Is there a reviews of the
>>>Mammoth online?
>>
>
>BSD is a platform that is coming for Replicator. The majority (95%) of our
>demand has been on Linux and thus is obviously our most supported platform.
>
>Next would be Solaris, and then about half a dozen requests for BSD.
>
>Sincerely,
>
>Joshua D. Drake
>
>
>
>>
>>I use FreeBSD 4.9 for most of the development. In general Slony-I should
>>run on every PostgreSQL supported Unix platform that provides pthreads.
>>
>>
>>Jan
>>
>
>
>--
>Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
>Postgresql support, programming shared hosting and dedicated hosting.
>+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
>Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL


Re: Replication options?

From
Jan Wieck
Date:
On 8/12/2004 12:02 PM, Joshua D. Drake wrote:

>> To make a transaction durable, the changes first have to be recorded in
>> PostgreSQL's crash recovery WAL. Only after that data is flushed to the
>> disk it can be assumed that the transaction will be redone in the case
>> of an immediately following crash. If a replication system now logs the
>> commit event before the WAL operation happens, it is possible that the
>> transaction does not commit on the master due to a crash, but it will be
>> replayed and committed on the slaves. On the other side if the
>> replication logging of the commit is done after the WAL operation, it
>> must be assured that WAL replay during crash recovery also causes
>> replication log journal to be recovered or repeated. In short, the
>> replication log must be covered by the same redo mechanism the crash
>> recovery system uses.
>
> This I will have to verify with our programmers as to exactly "when" the
> replication occurs.

Joshua,

you never followed up on this one.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: Replication options?

From
"Joshua D. Drake"
Date:
>>> logging of the commit is done after the WAL operation, it must be
>>> assured that WAL replay during crash recovery also causes
>>> replication log journal to be recovered or repeated. In short, the
>>> replication log must be covered by the same redo mechanism the crash
>>> recovery system uses.
>>
>>
>> This I will have to verify with our programmers as to exactly "when"
>> the replication occurs.
>
>
> Joshua,
>
> you never followed up to this one.

As of 1.3.1 (the current version) we also perform WAL processing to
ensure that the transaction is correctly replicated in case of a crash.

Sincerely,

Joshua D. Drake



>
>
> Jan
>


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL