Re: Replication options? - Mailing list pgsql-general
From: Jan Wieck
Subject: Re: Replication options?
Date:
Msg-id: 411B8587.40100@Yahoo.com
In response to: Re: Replication options? ("Liam Lesboch" <liamlesboch@hotmail.com>)
Responses: Re: Replication options?
List: pgsql-general
On 8/11/2004 5:07 PM, Liam Lesboch wrote:
>> From: "Joshua D. Drake" <jd@commandprompt.com>
>> To: Jeff Eckermann <jeff_eckermann@yahoo.com>
>> CC: Tom Lane <tgl@sss.pgh.pa.us>, Liam Lesboch <liamlesboch@hotmail.com>, pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] Replication options?
>> Date: Wed, 11 Aug 2004 13:38:54 -0700
>>
>>> Granted... Liam's bosses want something with a history of
>>> successful use in a serious production situation, and
>>> erServer at least has that. Slony has been around for
>>> too short a time to make that claim, yet.
>>
>> There is also Mammoth Replicator, which is an integrated replication
>> approach that does not require any triggers. It is a commercial product
>> though.
>
> Thank you for your info. Are there any differences between the trigger and
> non-trigger systems? We are a commercial company that has no software
> products but many servers. There is no BSD version of Mammoth on the
> website, and we have many servers on BSD as well as Linux servers.

There are some differences. Before going into the details, let me make clear that in the Slony-I case we are talking about generic triggers written in C that make extensive use of the internal prepared execution plan features and take advantage of access to the system catalog cache as well as any built-in functionality. My former work on the foreign key implementation of PostgreSQL looks very similar.

I think the major operational difference between Mammoth Replicator and Slony-I is where each collects the replication information. Slony-I collects single-row changes as log rows in regular database tables. It filters out unchanged columns, so that only the changed values and the primary key of the row, together with the column names, appear in the log. As far as I know (Joshua, please clarify here), Mammoth Replicator writes its own binary journal containing the changes that need to be applied to the replica.
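The column filtering described above can be sketched in a few lines. This is a hypothetical illustration in Python, not Slony-I's actual C trigger code; the function name and row representation are invented for the example.

```python
# Hypothetical sketch of what a row-level log trigger can filter out:
# for an UPDATE, record only the primary key plus the columns whose
# values actually changed, instead of the whole row.

def log_update(pk_cols, old_row, new_row):
    """Return a log entry for an UPDATE: primary key + changed columns."""
    entry = {col: new_row[col] for col in pk_cols}          # always keep the PK
    changed = {col: val for col, val in new_row.items()
               if col not in pk_cols and old_row.get(col) != val}
    entry.update(changed)
    return entry

old = {"id": 7, "name": "Liam", "email": "liam@example.com", "visits": 3}
new = {"id": 7, "name": "Liam", "email": "liam@example.com", "visits": 4}

print(log_update(["id"], old, new))   # only "id" and the changed "visits" appear
```

For a wide table where a single counter column changes, this keeps the log row small: the unchanged name and email columns never reach the log.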
On first look, inserting into database tables might seem more expensive. But there are some fine details that make it worth a second look. One side effect of doing it this way is that collecting the replication log together with changing the data on the origin (master) is automatically covered by the exact same ACID properties the database provides. I am not sure if or how Replicator guarantees that under all possible circumstances and server crash situations the replication log journal will contain exactly all committed transactions, and only those.

To make a transaction durable, the changes first have to be recorded in PostgreSQL's crash recovery WAL. Only after that data is flushed to disk can it be assumed that the transaction will be redone in the case of an immediately following crash. If a replication system logs the commit event before the WAL operation happens, it is possible that the transaction does not commit on the master due to a crash, but is still replayed and committed on the slaves. If, on the other hand, the replication logging of the commit is done after the WAL operation, it must be ensured that WAL replay during crash recovery also causes the replication log journal to be recovered or repeated. In short, the replication log must be covered by the same redo mechanism the crash recovery system uses.

All of this only matters if one does not immediately slam the big red panic button and issue a full failover when the main server crashes, but rather tries to bring the main server back. If it can boot, has no filesystem inconsistencies, and PostgreSQL's crash recovery succeeds too, there is no need to fail over any more, and one will want to resume normal operation. If the strict synchronization between what PostgreSQL's crash recovery restores and what will be applied to the slave systems cannot be guaranteed, then there is the possibility of a loss of synchronization between master and slaves.
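The ordering hazard just described can be sketched as a toy simulation. This is purely illustrative Python under invented names; it is not Replicator's actual design, only a demonstration of why the commit record and the replication journal entry must be covered by one redo mechanism.

```python
# Hypothetical sketch of the commit-ordering hazard: if the replication
# journal records a commit before the WAL flush, a crash in between
# leaves the slave with a transaction the master never durably committed.
# Conversely, WAL-first with a crash loses the journal entry unless WAL
# replay also regenerates it.

def run(journal_first, crash_between):
    """Return (wal, journal) contents after a commit attempt."""
    wal, journal = [], []
    txn = "txn-42"
    if journal_first:
        journal.append(txn)          # replication journal written first
        if crash_between:
            return wal, journal      # crash before the WAL flush
        wal.append(txn)
    else:
        wal.append(txn)              # WAL flushed first
        if crash_between:
            return wal, journal      # crash before the journal write
        journal.append(txn)
    return wal, journal

# Journal-first ordering plus a crash: the master's crash recovery will
# not redo txn-42, but the slaves will replay it from the journal.
wal, journal = run(journal_first=True, crash_between=True)
print("master redoes:", wal, "slaves replay:", journal)
```

Either ordering fails on its own after a crash; only tying the journal write into the same redo-covered commit path keeps master and slaves consistent.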
In that case you have just lost the data integrity of your backup server. Slony-I has the replication log journal covered by PostgreSQL's native ACID properties. I assume Joshua can explain how Mammoth Replicator solves this problem.

Another important difference is automatic replication of schema changes. This is a feature often asked for, and I have no idea where that wish comes from. Certainly it does not stem from too much thinking about the problem. Slony-I does not attempt to go in that direction; a trigger-based solution like Slony-I cannot do it anyway. Again, Joshua, what are the plans or status with Replicator on that?

The reason I consider automatic schema replication a subintelligent idea is simply that it makes special-purpose configurations impossible. If one only needs a full backup server for failover, it sure is useful. But what about using several slaves as load-balanced search engines, while another slave is the data warehouse and yet another is the reporting server? It would certainly be desirable to maintain the indexes used by the search engines only on the search engines, to have some special triggers firing only on the data warehouse, and to replicate only a subset of tables to the reporting server. Is all of that possible with a 100% schema-and-data-copy replication system? No, it is not.

> Does the Slony-I replication operate on BSD? Are there reviews of
> Mammoth online?

I use FreeBSD 4.9 for most of the development. In general, Slony-I should run on every PostgreSQL-supported Unix platform that provides pthreads.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #