replication docs: split single vs. multi-master - Mailing list pgsql-patches

From:      Markus Schiltknecht
Subject:   replication docs: split single vs. multi-master
Date:
Msg-id:    455AEF4F.3010304@bluegap.ch
Responses: Re: replication docs: split single vs. multi-master
List:      pgsql-patches
Hi,

As promised on -docs, here comes my proposal on how to improve the replication documentation. The patches are split as follows and have to be applied in order:

replication_doku_1.diff: the smallest possible one-word change, to warm up...

replication_doku_2.diff: moves down "Clustering For Parallel Query Execution", because it's not a replication type but a feature; see the explanation below.

replication_doku_3.diff: this is the most important part, splitting all replication types into single- and multi-master replication. I'm new to SGML, so please bear with me if this is not the right way to do it...

"Shared-Disk Failover" IMO does not fall into a replication category. Should we mention there that 'sharing' a disk using NFS or some such is not recommended (and, more importantly, does not work as a multi-master replication solution)?

I've added a general paragraph describing Single-Master Replication. I state that 'Single-Master Replication is always asynchronous'. Can anybody think of a counter-example, or a use case for synchronous Single-Master Replication? The argument to put down is: if you go synchronous, why don't you do Multi-Master right away?

Most of the "Clustering for Load Balancing" text applies to all synchronous Multi-Master Replication algorithms, even to "Query Broadcasting". Thus it became the general description of Multi-Master Replication, and the section "Clustering for Load Balancing" has been removed.

replication_doku_4.diff: these are the text modifications I made to adjust to the new structure. I've adjusted the Multi-Master Replication text to really be appropriate for all existing solutions. "Query Broadcasting" has some corrections, mainly so that it sticks to describing that algorithm and none of the general properties of Multi-Master Replication. I've added two sections describing the 2PC and Distributed SHMEM algorithms, which belong in that category and cover all of the previous text.
Except that I've removed the mention of Oracle RAC in favor of Pgpool-II. IMO this makes it clearer what replication types exist and how to categorize them.

I'm tempted to mention the Postgres-R algorithm as a fourth sub-section of Multi-Master Replication, as it's quite different from all the others in many aspects. But I urgently need to go to work now... besides, I'm heavily biased regarding Postgres-R, so probably someone else should write that paragraph. :-)

The only downside of the structure I'm proposing here is that the non-replication algorithms fall off somewhat, namely "Shared-Disk Failover", "Data Partitioning", "Parallel Query Execution" and "Commercial Solutions". For me, "Data Partitioning" as well as "Parallel Query Execution" are possible optimizations which can be run on top of replicated data. They don't replicate data and are thus not replication solutions, but grouping those two together would make sense.

So. I really have to go to work now!

Regards

Markus

*** doc/src/sgml/failover.sgml	2006-11-15 08:52:25.000000000 +0100
--- doc/src/sgml/failover.sgml	2006-11-15 08:58:34.000000000 +0100
***************
*** 126,132 ****
  </para>

  <para>
! Such partitioning implements both failover and load balancing. Failover
  is achieved because the data resides on both servers, and this is an
  ideal way to enable failover if the servers share a slow communication
  channel. Load balancing is possible because read requests can go to any
--- 126,132 ----
  </para>

  <para>
! Such partitioning provides both failover and load balancing. Failover
  is achieved because the data resides on both servers, and this is an
  ideal way to enable failover if the servers share a slow communication
  channel. Load balancing is possible because read requests can go to any

*** doc/src/sgml/failover.sgml	2006-11-15 08:58:34.000000000 +0100
--- doc/src/sgml/failover.sgml	2006-11-15 09:10:53.000000000 +0100
***************
*** 114,150 ****
  </para>
  </sect1>

- <sect1 id="data-partitioning">
- <title>Data Partitioning</title>
- 
- <para>
- Data partitioning splits tables into data sets. Each set can only be
- modified by one server. For example, data can be partitioned by
- offices, e.g. London and Paris. While London and Paris servers have all
- data records, only London can modify London records, and Paris can only
- modify Paris records.
- </para>
- 
- <para>
- Such partitioning provides both failover and load balancing. Failover
- is achieved because the data resides on both servers, and this is an
- ideal way to enable failover if the servers share a slow communication
- channel. Load balancing is possible because read requests can go to any
- of the servers, and write requests are split among the servers. Of
- course, the communication to keep all the servers up-to-date adds
- overhead, so ideally the write load should be low, or localized as in
- the London/Paris example above.
- </para>
- 
- <para>
- Data partitioning is usually handled by application code, though rules
- and triggers can be used to keep the read-only data sets current. Slony-I
- can also be used in such a setup. While Slony-I replicates only entire
- tables, London and Paris can be placed in separate tables, and
- inheritance can be used to access both tables using a single table name.
- </para>
- </sect1>
- 
  <sect1 id="query-broadcast-load-balancing">
  <title>Query Broadcast Load Balancing</title>
--- 114,119 ----
***************
*** 198,203 ****
--- 167,203 ----
  </para>
  </sect1>

+ <sect1 id="data-partitioning">
+ <title>Data Partitioning</title>
+ 
+ <para>
+ Data partitioning splits tables into data sets. Each set can only be
+ modified by one server. For example, data can be partitioned by
+ offices, e.g. London and Paris. While London and Paris servers have all
+ data records, only London can modify London records, and Paris can only
+ modify Paris records.
+ </para>
+ 
+ <para>
+ Such partitioning provides both failover and load balancing. Failover
+ is achieved because the data resides on both servers, and this is an
+ ideal way to enable failover if the servers share a slow communication
+ channel. Load balancing is possible because read requests can go to any
+ of the servers, and write requests are split among the servers. Of
+ course, the communication to keep all the servers up-to-date adds
+ overhead, so ideally the write load should be low, or localized as in
+ the London/Paris example above.
+ </para>
+ 
+ <para>
+ Data partitioning is usually handled by application code, though rules
+ and triggers can be used to keep the read-only data sets current. Slony-I
+ can also be used in such a setup. While Slony-I replicates only entire
+ tables, London and Paris can be placed in separate tables, and
+ inheritance can be used to access both tables using a single table name.
+ </para>
+ </sect1>
+ 
  <sect1 id="clustering-for-parallel-query-execution">
  <title>Clustering For Parallel Query Execution</title>

*** doc/src/sgml/failover.sgml	2006-11-15 09:10:53.000000000 +0100
--- doc/src/sgml/failover.sgml	2006-11-15 11:11:12.000000000 +0100
***************
*** 83,89 ****
  </para>
  </sect1>

! <sect1 id="warm-standby-using-point-in-time-recovery">
  <title>Warm Standby Using Point-In-Time Recovery</title>

  <para>
--- 83,107 ----
  </para>
  </sect1>

! <sect1 id="single-master-replication">
! <indexterm><primary>single-master</></>
! 
! <title>Single-Master Replication</title>
! 
! <para>
! When using only one server to process data-modifying transactions we
! speak of a master server and Single-Master Replication. It is used
! to allow one or more slave servers to stay current and therefore
! provide a quick failover for high availability.
! </para>
! 
! <para>
! Single-Master Replication is always asynchronous to allow the master
! to process data-modifying transactions at full speed, but with the
! downside of possible data loss during failover.
! </para>
! 
! <sect2 id="warm-standby-using-point-in-time-recovery">
  <title>Warm Standby Using Point-In-Time Recovery</title>

  <para>
***************
*** 94,102 ****
  made the new master database server. This is asynchronous and
  can only be done for the entire database server.
  </para>
! </sect1>

! <sect1 id="continuously-running-replication-server">
  <title>Continuously Running Replication Server</title>

  <para>
--- 112,120 ----
  made the new master database server. This is asynchronous and
  can only be done for the entire database server.
  </para>
! </sect2>

! <sect2 id="continuously-running-replication-server">
  <title>Continuously Running Replication Server</title>

  <para>
***************
*** 112,121 ****
  granularity. It updates the backup server in batches, so the
  replication is asynchronous and might lose data during a fail over.
  </para>
  </sect1>

! <sect1 id="query-broadcast-load-balancing">
! <title>Query Broadcast Load Balancing</title>

  <para>
  Query broadcast load balancing is accomplished by having a
--- 130,160 ----
  granularity. It updates the backup server in batches, so the
  replication is asynchronous and might lose data during a fail over.
  </para>
+ </sect2>
+ </sect1>
! 
! <sect1 id="multi-master-replication">
! 
! <indexterm><primary>multi-master</></>
! 
! <title>Multi-Master Replication</title>
! 
! <para>
! In clustering, each server can accept write requests, and modified
! data is transmitted from the original server to every other
! server before each transaction commits. Heavy write activity
! can cause excessive locking, leading to poor performance. In
! fact, write performance is often worse than that of a single
! server. Read requests can be sent to any server. Clustering
! is best for mostly read workloads, though its big advantage is
! that any server can accept write requests — there is no need
! to partition workloads between read/write and read-only servers.
! </para>
! 
! <sect2 id="query-broadcast">
! <title>Query Broadcasting</title>

  <para>
  Query broadcast load balancing is accomplished by having a
***************
*** 138,170 ****
  care must be taken that all transactions either commit or abort
  on all servers. Pgpool is an example of this type of replication.
  </para>
! </sect1>
! 
! <sect1 id="clustering-for-load-balancing">
! <title>Clustering For Load Balancing</title>
! 
! <para>
! In clustering, each server can accept write requests, and modified
! data is transmitted from the original server to every other
! server before each transaction commits. Heavy write activity
! can cause excessive locking, leading to poor performance. In
! fact, write performance is often worse than that of a single
! server. Read requests can be sent to any server. Clustering
! is best for mostly read workloads, though its big advantage is
! that any server can accept write requests — there is no need
! to partition workloads between read/write and read-only servers.
! </para>

- <para>
- Clustering is implemented by <productname>Oracle</> in their
- <productname><acronym>RAC</></> product. <productname>PostgreSQL</>
- does not offer this type of load balancing, though
- <productname>PostgreSQL</> two-phase commit (<xref
- linkend="sql-prepare-transaction"
- endterm="sql-prepare-transaction-title"> and <xref
- linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">)
- can be used to implement this in application code or middleware.
- </para>
  </sect1>

  <sect1 id="data-partitioning">
--- 177,184 ----
  care must be taken that all transactions either commit or abort
  on all servers. Pgpool is an example of this type of replication.
  </para>
! </sect2>
  </sect1>

  <sect1 id="data-partitioning">

*** doc/src/sgml/failover.sgml	2006-11-15 11:11:12.000000000 +0100
--- doc/src/sgml/failover.sgml	2006-11-15 11:05:58.000000000 +0100
***************
*** 141,181 ****
  <title>Multi-Master Replication</title>

! <para>
! In clustering, each server can accept write requests, and modified
! data is transmitted from the original server to every other
! server before each transaction commits. Heavy write activity
! can cause excessive locking, leading to poor performance. In
! fact, write performance is often worse than that of a single
! server. Read requests can be sent to any server. Clustering
! is best for mostly read workloads, though its big advantage is
! that any server can accept write requests — there is no need
! to partition workloads between read/write and read-only servers.
! </para>

  <sect2 id="query-broadcast">
  <title>Query Broadcasting</title>

  <para>
! Query broadcast load balancing is accomplished by having a
! program intercept every SQL query and send it to all servers.
! This is unique because most replication solutions have the write
! server propagate its changes to the other servers. With query
! broadcasting, each server operates independently. Read-only
! queries can be sent to a single server because there is no need
! for all servers to process it.
  </para>

  <para>
! One limitation of this solution is that functions like
! <function>random()</>, <function>CURRENT_TIMESTAMP</>, and
! sequences can have different values on different servers. This
! is because each server operates independently, and because SQL
! queries are broadcast (and not actual modified rows). If this
! is unacceptable, applications must query such values from a
! single server and then use those values in write queries. Also,
! care must be taken that all transactions either commit or abort
! on all servers. Pgpool is an example of this type of replication.
  </para>
  </sect2>
--- 141,215 ----
  <title>Multi-Master Replication</title>

! <para>
! With Multi-Master Replication, each server can accept write requests,
! and modified data is transmitted to all servers. Heavy write activity
! causes network traffic and excessive locking, leading to poor performance
! especially for synchronous Multi-Master Replication. The write
! performance can therefore be worse than that of a single server. Thus
! Multi-Master Replication works best for mostly read workloads, though
! its big advantage is that any server can accept write requests —
! there is no need to partition workloads between read/write and read-only
! servers.
! </para>
! 
! <para>
! Because every server has a consistent copy (replica) of the data, read
! requests can be sent to any server. In asynchronous Multi-Master
! Replication the servers' replicas diverge until re-synchronized.
! Likewise, conflicts in committed transactions can only be detected during
! re-synchronization. So either the application has to ensure not to send
! conflicting transactions or the conflicts have to be resolved somehow
! during re-synchronization of the servers.
! </para>

  <sect2 id="query-broadcast">
  <title>Query Broadcasting</title>

  <para>
! Query broadcasting is often accomplished by having a program intercept
! data-modifying SQL queries and send them to all servers. With query
! broadcasting, each server operates independently, thus every
! data-modifying transaction has to be processed on every server. Pgpool
! is an example of this type of replication.
! </para>
! 
! <para>
! One challenge in implementing this solution is that functions like
! <function>random()</> or <function>CURRENT_TIMESTAMP</> can return
! different values on different servers. So if only plain SQL gets
! transmitted, the application has to cope with these differences.
! Some solutions try to correctly handle these situations by executing
! the non-deterministic functions on only one server and propagating
! the result together with the rest of the SQL to the others.
! </para>
! </sect2>
! 
! <sect2 id="two-phase-commit-replication">
! <title>Using Two Phase Commit</title>
! 
! <para>
! <productname>PostgreSQL</productname> offers two-phase commit
! (<xref linkend="sql-prepare-transaction"
! endterm="sql-prepare-transaction-title"> and <xref
! linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">)
! which can be used to implement synchronous Multi-Master replication
! in application code or middleware. As with Query Broadcasting, care
! has to be taken with non-deterministic functions.
  </para>
+ </sect2>
+ 
+ <sect2 id="distributed-shared-memory-replication">
+ <title>Using Distributed Shared Memory</title>

  <para>
! A different approach is to use Distributed Shared Memory to propagate
! the modified data and all the locks of data-modifying transactions.
! This can be done on the level of the operating system, e.g. with
! OpenMosix, or within the database itself. However, both methods
! generate a lot of network traffic and do not scale very well for that
! reason. Pgpool-II is an effort to implement Distributed Shared Memory
! Replication for PostgreSQL.
  </para>
  </sect2>
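A side note on the "Using Two Phase Commit" section above: the coordination it leaves to application code or middleware can be sketched roughly as follows. This is a minimal, hypothetical Python simulation, not code from any existing middleware; the `Node` class is an in-memory stand-in for a server speaking PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK PREPARED.

```python
# Hypothetical sketch of two-phase-commit coordination for synchronous
# multi-master replication, done in middleware as the proposed section
# suggests. Node is a toy stand-in for a real database server.

class Node:
    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.committed = []   # statements durably applied on this node
        self.prepared = None  # statement awaiting phase two

    def prepare(self, statement):
        # Phase one: do the work, keep locks, report success or failure
        # (stands in for PREPARE TRANSACTION).
        if self.fail_on_prepare:
            return False
        self.prepared = statement
        return True

    def commit_prepared(self):
        # Phase two: make the prepared work durable (COMMIT PREPARED).
        self.committed.append(self.prepared)
        self.prepared = None

    def rollback_prepared(self):
        # Abort path (ROLLBACK PREPARED).
        self.prepared = None

def replicate(nodes, statement):
    """Commit `statement` on every node, or on none of them."""
    prepared = [n for n in nodes if n.prepare(statement)]
    if len(prepared) == len(nodes):
        for n in nodes:
            n.commit_prepared()
        return True
    for n in prepared:  # some node failed to prepare: undo the others
        n.rollback_prepared()
    return False

# All nodes prepare, so all commit.
nodes = [Node("a"), Node("b")]
assert replicate(nodes, "UPDATE t SET x = 1") is True
assert all(n.committed == ["UPDATE t SET x = 1"] for n in nodes)

# One node cannot prepare, so nothing commits anywhere.
nodes = [Node("a"), Node("b", fail_on_prepare=True)]
assert replicate(nodes, "UPDATE t SET x = 2") is False
assert all(n.committed == [] for n in nodes)
```

The point of the sketch is the all-or-nothing rule: phase two runs only after every node has prepared successfully, which is also why heavy write activity holds locks across all servers and hurts write performance, as the Multi-Master Replication introduction notes. A real implementation would additionally propagate the results of non-deterministic functions, as discussed under Query Broadcasting.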