Re: Replication options? - Mailing list pgsql-general

From Jan Wieck
Subject Re: Replication options?
Date
Msg-id 411B8587.40100@Yahoo.com
Whole thread Raw
In response to Re: Replication options?  ("Liam Lesboch" <liamlesboch@hotmail.com>)
Responses Re: Replication options?
List pgsql-general
On 8/11/2004 5:07 PM, Liam Lesboch wrote:

>>From: "Joshua D. Drake" <jd@commandprompt.com>
>>To: Jeff Eckermann <jeff_eckermann@yahoo.com>
>>CC: Tom Lane <tgl@sss.pgh.pa.us>,Liam Lesboch <liamlesboch@hotmail.com>,
>>pgsql-general@postgresql.org
>>Subject: Re: [GENERAL] Replication options?
>>Date: Wed, 11 Aug 2004 13:38:54 -0700
>>
>>
>>>Granted...
>>>Liam's bosses want something with a history of
>>>successful use in a serious production situation, and
>>>erServer at least has that.  Slony has been around for
>>>too short a time to make that claim, yet.
>>
>>There is also Mammoth Replicator which is an integrated replication
>>approach that does not require any triggers. It is a commercial product
>>though.
>>
>
> Thank you for your info. Are there any differences of the trigger versus non
> trigger systems? We are a commercial companie that has no products
> (software) but have many servers. There is no BSD version of mamoth on the
> website and we have many servers on BSD and the linux servers.

There are some differences. Before going into the details let me make
clear that in the Slony-I case we are talking about generic triggers
written in C, that make extensive use of the internal prepared execution
plan features and take advantage of being able to access the system
catalog cache as well as any builtin functionality. My former work on
the foreign key implementation of PostgreSQL looks very similar.

I think the major operational difference between Mammoth Replicator and
Slony-I is where both collect the replication information. Slony-I
collects single row changes as log rows in regular database tables. It
filters out unchaged columns, so that only changed values and the
primary key of the row together with the column names appear in the log.

As far as I know (Joshua please clearify here) Mammoth Replicator writes
its own, binary journal containing the changes that need to be applied
to the replica.

On a first look inserting into database tables might look more
expensive. But there are some fine details that make it worth taking a
second look. One side effect of doing this is that collecting the
replication log together with changing the data on the origin (master)
is automatically covered by the exact same ACID properties the database
provides. I am not sure if or how Replicator guarantees that under all
possible circumstances and server crash situations the replication log
journal will contain exacly all committed transactions, and only those.
To make a transaction durable, the changes first have to be recorded in
PostgreSQL's crash recovery WAL. Only after that data is flushed to the
disk it can be assumed that the transaction will be redone in the case
of an immediately following crash. If a replication system now logs the
commit event before the WAL operation happens, it is possible that the
transaction does not commit on the master due to a crash, but it will be
replayed and committed on the slaves. On the other side if the
replication logging of the commit is done after the WAL operation, it
must be assured that WAL replay during crash recovery also causes
replication log journal to be recovered or repeated. In short, the
replication log must be covered by the same redo mechanism the crash
recovery system uses.

This all is only important for the case that one does not immediately
slam on the big red panic button and issues a full failover when the
main server crashes, but rather tries to bring the main server back. If
it can boot, has no FS inconsistencies and PostgreSQL's crash recovery
succeeds too, there is no need to fail over any more and one will want
to resume normal operation. If now the strict synchronization between
what PostgreSQL's crash recovery mechanism does restore and what will be
applied to the slave systems cannot be guaranteed, then there is the
possibility of loss of synchronization between master and slaves. You
just lost the data integrity of your backup server.

Slony-I has the replication log journal covered by PostgreSQL's native
ACID properties. I assume Joshua can explain how Mammoth Replicator
solves this problem.

Another important difference is automatic replication of schema changes.
This is a feature often asked for, and I have no idea where that wish
comes from. Certainly it does not stem from too much thinking about the
problem. Slony-I does not attempt to go into that direction. A trigger
based solution like Slony-I cannot do it anyway. Again, Joshua, what's
the plans or status with Replicator on that?

The reason why I consider automatic schema replication a subintelligent
idea is simply that it makes special purpose configurations impossible.
If one only needs a full backup server for failover, it sure is usefull.
But what about using several slaves as load balanced search engines,
while another slave is the data warehouse and yet another one is the
reporting server? It would certainly be desirable to maintain the
indexes used by the search engines only on the search engines, or have
some special triggers firing only on the data warehouse and again have
only a subset of tables replicated to the reporting server. Is that all
possible with a 100% schema and data copy replication system? No, it is
not.

>
> Does the Slony-I replications operation on BSD? Is there a reviews of the
> Mammoth online?

I use FreeBSD 4.9 for most of the development. In general Slony-I should
run on every PostgreSQL supported Unix platform that provides pthreads.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: pl pgsql grammer file contains error
Next
From: Tom Lane
Date:
Subject: Re: datpath error