Thread: Replication Documentation
Here's a patch to add in the material on replication recently discussed on pgsql.docs. I'm not thrilled that there were only a few comments made; I'd be happy to see "slicing and dicing" to see this made more useful. Index: filelist.sgml =================================================================== RCS file: /projects/cvsroot/pgsql/doc/src/sgml/filelist.sgml,v retrieving revision 1.44 diff -c -u -r1.44 filelist.sgml --- filelist.sgml 12 Sep 2005 22:11:38 -0000 1.44 +++ filelist.sgml 1 Aug 2006 20:00:00 -0000 @@ -44,6 +44,7 @@ <!entity config SYSTEM "config.sgml"> <!entity user-manag SYSTEM "user-manag.sgml"> <!entity wal SYSTEM "wal.sgml"> +<!entity replication SYSTEM "replication.sgml"> <!-- programmer's guide --> <!entity dfunc SYSTEM "dfunc.sgml"> Index: postgres.sgml =================================================================== RCS file: /projects/cvsroot/pgsql/doc/src/sgml/postgres.sgml,v retrieving revision 1.77 diff -c -u -r1.77 postgres.sgml --- postgres.sgml 10 Mar 2006 19:10:48 -0000 1.77 +++ postgres.sgml 1 Aug 2006 20:00:00 -0000 @@ -155,6 +155,7 @@ &diskusage; &wal; ®ress; + &replication; </part> ---- Then add the following as .../doc/src/sgml/replication.sgml <!-- $PostgreSQL$ --> <chapter id="replication"> <title> Replication </title> <indexterm><primary>replication</primary></indexterm> <para> People frequently ask about what replication options are available for <productname>PostgreSQL</productname>. Unfortunately, there are so many approaches and models to this that are useful for different purposes that things tend to get confusing. </para> <para> At perhaps the most primitive level, one might use <xref linkend="backup"> tools, whether <xref linkend="app-pgdump"> or <xref linkend="continuous-archiving"> to create additional copies of databases. This <emphasis>doesn't</emphasis> provide any way to keep the replicas up to date; to bring the state of things to a different point in time requires bringing up another copy. There is no way, with these tools, for updates on a <quote>master</quote> system to automatically propagate to the replicas.</para> <sect1> <title> Categorization of Replication Systems </title> <para> Looking at replication systems, there are a number of ways in which they may be viewed: <itemizedlist> <listitem><para> Single master versus multimaster.</para> <para> That is, whether there is a single database considered <quote>master</quote>, where all update operations are required to be submitted, or the alternative, multimaster, where updates may be submitted to any of several databases.</para> <para> Multimaster replication is vastly more complex and expensive, because of the need to deal with the possibility of conflicting updates. The simplest example of this is where a replicated database manages inventory; the question is, what happens when requests go to different database nodes requesting a particular piece of inventory?</para> <para> Synchronous multimaster replication introduces the need to distribute locks across the systems, which, in research work done with Postgres-R and Slony-II, has proven to be very expensive. </para></listitem> <listitem><para> Synchronous versus asynchronous</para> <para>Synchronous systems are ones where updates must be accepted on all the databases before they are permitted to <command>COMMIT</command>. </para> <para> Asynchronous systems propagate updates to the other databases later. This permits the possibility that one database may have data significantly behind others. Whether or not being behind is acceptable or not will depend on the nature of the application.</para> <para> Asynchronous multimaster replication introduces the possibility that conflicting updates will be accepted by multiple nodes, as they don't know, at <command>COMMIT</command> time, that the updates conflict. It is then necessary to have some sort of conflict resolution system, which can't really be generalized as a generic database facility. An instance of this that is commonly seen is in the <productname>PalmOS HotSync</productname> system; the <quote>general policy</quote> when conflicts are noticed is to allow both conflicting records to persist until a human can intervene. That may be quite acceptable for an address book; it's <emphasis>not</emphasis> fine for OLTP systems. </para> </listitem> <listitem><para> Update capture methods </para> <para> Common methods include having triggers on tables, capturing SQL statements, and capturing transaction log (WAL) updates </para> <itemizedlist> <listitem><para> Triggers, as used in eRServer and Slony-I, have the advantage of capturing updates at the end of processing when all column values have been finalized. The use of transaction visibility (MVCC) and ordering can provide strong guarantees on consistency. </para> <para> Of course, firing a trigger for each tuple update comes at a not inconsiderable cost: a statement that touches 10,000 tuples will fire the trigger 10,000 times, and transform, on the subscriber, into 10,000 SQL statements.</para></listitem> <listitem><para> Statement capture almost exactly reverses the issues, as compared to triggers.</para> <para> There are no strong guarantees on consistency: any sort of nondeterministic query can <quote>corrupt</quote> things by introducing differences between nodes. Here are four examples of cases where naive statement capture is sure to get things wrong:</para> <itemizedlist> <listitem><para><command>INSERT INTO mytable (txntime, product, quantity, taxes, total) values (now(), 'AB-275', 10, 45, 250.00);</command></para> <para> Some replication systems parse the queries, replacing date requests with timestamps. </para> </listitem> <listitem><para><command>INSERT INTO table2 (random() * 50);</command></para> <para> In this case, nondeterminism is fairly much the point!</para> </listitem> <listitem><para>Any use of sequnce values as defaults, particularly with per-connection value cacheing, will open up occasions for values to diverge between nodes.</para></listitem> <listitem><para><command>INSERT INTO tab1 (txn_type, tdate, quantity, units, price) SELECT * FROM tab2 ORDER BY txn_type limit 50;</command></para> <para> There are many variations on this which will turn out badly: </para> <itemizedlist> <listitem><para>If there are default fields in tab1 that are set using sequences, the only way to even hope for the same ordering is to have an <command>ORDER BY</command> clause that ensures identical ordering on both hosts.</para></listitem> <listitem><para> If the ordering isn't a suitable total ordering, the requests for data from tab2 may find different data on different hosts.</para></listitem> <listitem><para>Columns with a default of <function>now()</function> will be troublesome as mentioned earlier, and this makes the problem harder because unlike in the earlier query, where one might substitute '2006-09-02 04:42:23-00' for <function>now()</function>, this requires a substantial rewriting of the query.</para></listitem> </itemizedlist> </listitem> </itemizedlist> </listitem> </itemizedlist> </listitem> </itemizedlist> </para> </sect1> <sect1 id="replicationsystems"> <title> PostgreSQL Replication Systems and Their Uses </title> <para> Based on the preceding taxonomy, we may categorize various replication systems, which should be helpful in determining what they may be best used for, and whether they are compatible with your <quote>use case.</quote></para> <sect2><title> Slony-I</title> <para> Slony-I is a single-master to multiple subscriber asynchronous replication system that captures updates using triggers. </para> <para> For many systems, it is not clear how to initialize replication on a new node some time after a system has been set up in production. Slony-I was specifically designed to provide the ability to introduce new nodes without the need to interrupt activity on the master node. </para> <para> It has, a particular merit, that, by only using components internal to PostgreSQL, it is compatible with multiple versions of PostgreSQL. This lends it especially to assisting at upgrading systems from one version of PostgreSQL to another without requiring a long outage. </para> <para> It suffers from three particular problems:</para> <itemizedlist> <listitem><para> Despite improvements from earlier versions, it is fairly complex to configure and administer.</para></listitem> <listitem><para> It can only replicate changes that can be captured using triggers. </para> <para> There is a handling for sequences, which comes via polling, but Slony-I <emphasis>does not</emphasis> provide an automatic way to replicate other sorts of objects. </listitem> <listitem><para> The handling of DDL changes is somewhat fragile, and exists as something of a bag on the side. </para> <para> There has been loose discussion as to how to address that; useful comprehensive answers have not emerged. </listitem> </itemizedlist> <sect3> <title> Use Cases </title> <para> Slony-I has proven useful for the following sorts of usages: </para> <itemizedlist> <listitem><para> Upgrading from one PostgreSQL release to another with only brief downtime. </para></listitem> <listitem><para> Providing extra database copies that are nearly up to date that may be used to offload read activity from the <quote>master</quote> database system. </para></listitem> <listitem><para> Providing extra database copies that are nearly up to date that may be used as failover targets. </para> </listitem> </itemizedlist> </sect2> <sect2><title> pgpool </title> <para> <application>pgpool</application> was initially created by Tatsuo Isshii as a portable alternative to Java connection pool modules. He subsequently observed that it wouldn't take very much effort to extend it to create a simple replication system: if it is forwarding SQL queries to a PostgreSQL instance, extending that to two databases is very straightforward. </para> <para> It suffers, by nature, from the problems associated with replicating using capture of SQL statements; any sort of nondeterminism in the replicated statements will cause the databases to diverge. </para> <para> On the other hand, it is very easy to install and configure; for users with simple requirements, that can suffice. </para> <para> A <application>pgpool-2</application> is under way which introduces a more sophisticated query parser to try to address the nondeterminism issues; that may limit ongoing support for the legacy version.</para> <sect3> <title> Use Cases </title> <para> pgpool has proven useful for the following sorts of usages: </para> <itemizedlist> <listitem><para> Dividing read-only database activity between two database instances. </para></listitem> <listitem><para> Providing a simple replication system for systems that do not make use of nondeterministic update queries. </para></listitem> </itemizedlist> </sect3> </sect2> <sect2> <title> PITR - Point In Time Recovery </title> <para> If you have a database cluster that supports a large number of database instances (<emphasis>e.g.</emphasis> - varying values for PGDATABASE), connection-managing systems like pgpool and systems like Slony-I which require a manager process for each database for each node that is replicated will turn out quite badly.</para> <para> For instance, if you have a database cluster that hosts 300 databases, as would be the case in a "web hosting" situation, for Slony-I to replicate all of this data, it would have to have 300 slon processes for each node. </para> <para> PITR is likely to be more suitable in this case; that doesn't provide you with a usable replica running, but it can recover <emphasis>all</emphasis> of the tables in <emphasis>all</emphasis> of the databases on the backend.</para> </sect2> <sect2> <title> Postgres-R </title> <para> This has been a research project at McGill University, building a multimaster synchronous replication system which uses a group communications system (<emphasis>e.g.</emphasis> - <ulink url="http://www.spread.org/"> Spread</ulink>) to control propagation of update requests, which it captures via adding <quote>hooks</quote> to the database engine to detect changes. </para> <para> Being a research project, the key has been to learn about replication as opposed to provide a <quote> production grade </quote> replication system. For a considerable period of time it was only at all usable on rather old releases of PostgreSQL; it is now available for recent releases. </para> <para> The handling of DDL changes has long been somewhat controversial; several attempts to implement DDL handlers have been made, none of which has yet <quote>stuck.</quote> </para> <para> The Slony-II project inherited directly from Postgres-R, with an intent to create a multimaster synchronous replication system atop a group communications system, but then to proceed to something more of <quote>production grade</quote>. </para> <para> The notable distinction from Postgres-R was that, in order to find conflicts earlier, and to diminish the amount of work needing to be done at the synchronization point, Slony-II would try to publish and promote lock requests as soon as possible. (It is possible for this to worsen behaviour in some cases.)</para> <para> Unfortunately several problems emerged: </para> <itemizedlist> <listitem><para> The available open source group communications systems turn out to neither be fast enough nor reliable enough for the purpose. </para></listitem> <listitem><para> One of the goals was for there to be as little need as possible to modify applications to deal with replication. </para> <para> Unfortunately, there turn out to be some cases where competing updates (e.g. - for updates to account balances) would cause multimaster replication to reject transactions due to concurrency problems with high frequency. </para> </listitem> </itemizedlist> <para> As a result of those problems, Slony-II efforts have fallen off somewhat. </para> <para> The remaining developers plan to join together efforts for these two projects. There are working prototypes, but it is not clear when <quote>production grade</quote> versions will emerge. </para> </sect2> </sect1> </chapter> <!-- Keep this comment at the end of the file Local variables: mode:sgml sgml-omittag:nil sgml-shorttag:t sgml-minimize-attributes:nil sgml-always-quote-attributes:t sgml-indent-step:1 sgml-indent-data:t sgml-parent-document:postgres.sgml sgml-default-dtd-file:"./reference.ced" sgml-exposed-tags:nil sgml-local-catalogs:("/usr/lib/sgml/catalog") sgml-local-ecat-files:nil End: --> -- output = ("cbbrowne" "@" "cbbrowne.com") http://cbbrowne.com/info/ What's another word for synonym?
Chris Browne wrote: > Here's a patch to add in the material on replication recently > discussed on pgsql.docs. I'm not thrilled that there were only a few > comments made; I'd be happy to see "slicing and dicing" to see this > made more useful. s/e.g. -/e.g.,/ s/ - /–/ The indentation of the SGML file seems at odds with our conventions (we don't use tabs, for one thing.) You mention this: > <para> Common methods include having triggers on tables, > capturing SQL statements, and capturing transaction log (WAL) > updates </para> However you don't mention anything about WAL captures. Mentioning that PITR is one of these would be good. In the last few paragraphs, the title is about Postgres-R but then you comment on Slony-II. Should the title mention both? > <para> As a result of those problems, Slony-II efforts have fallen > off somewhat. </para> s/those/these/ ? Otherwise looks good to my untrained eyes. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
s/sequnce/sequence/
Nice work!
Nice work!
-- Korry Douglas korryd@enterprisedb.com EnterpriseDB http://www.enterprisedb.com |
Thanks for mentioning about pgpool! > <sect2><title> pgpool </title> > > <para> <application>pgpool</application> was initially created by > Tatsuo Isshii as a portable alternative to Java connection pool > modules. He subsequently observed that it wouldn't take very much > effort to extend it to create a simple replication system: if it > is forwarding SQL queries to a PostgreSQL instance, extending that > to two databases is very straightforward. </para> > > <para> It suffers, by nature, from the problems associated with > replicating using capture of SQL statements; any sort of > nondeterminism in the replicated statements will cause the > databases to diverge. </para> > > <para> On the other hand, it is very easy to install and > configure; for users with simple requirements, that can > suffice. </para> > > <para> A <application>pgpool-2</application> is under way which > introduces a more sophisticated query parser to try to address the > nondeterminism issues; that may limit ongoing support for the > legacy version.</para> pgpool-II (not pgpool-2, please) does not try to resolve nondeterminism issues but try to add parallel SELECT query execution. Also we will continue to support legacy version until pgpool-II becomes stable enough. Also you might want to add pgpool development site URL. FYI, pgpool-II presentation material for PostgreSQL Anniversary Summit can be obtained from: http://www.sraoss.co.jp/event_seminar/2006/pgpool_feat_and_devel.pdf -- Tatsuo Ishii SRA OSS, Inc. Japan
Chris Browne wrote: > Here's a patch to add in the material on replication recently > discussed on pgsql.docs. I'm not thrilled that there were only a few > comments made; I'd be happy to see "slicing and dicing" to see this > made more useful. The agreed-to process was 1. post information on pgsql-general 1.a. solicit comments 2. put information page on web site 3. link from documentation to web site You seem to have short-circuited all that. I don't think this sort of material belongs directly into the PostgreSQL documentation. -- Peter Eisentraut http://developer.postgresql.org/~petere/
> > 1. post information on pgsql-general > 1.a. solicit comments > 2. put information page on web site > 3. link from documentation to web site > > You seem to have short-circuited all that. > > I don't think this sort of material belongs directly into the PostgreSQL > documentation. It might be interesting to have some links in the external projects area for replication, but a section of its own doesn't seem relevant. Joshua D. Drkae -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Joshua D. Drake wrote: > >I don't think this sort of material belongs directly into the PostgreSQL > >documentation. Why not? > It might be interesting to have some links in the external projects area > for replication, but a section of its own doesn't seem relevant. I disagree about "having some links". Maybe we should consider adding this as a section in the external projects chapter, instead of having a chapter of its own, but "some links" seems a little short on actual contents. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Joshua D. Drake wrote: > >>> I don't think this sort of material belongs directly into the PostgreSQL >>> documentation. > > Why not? Well Peter said that, not me :) > >> It might be interesting to have some links in the external projects area >> for replication, but a section of its own doesn't seem relevant. > > I disagree about "having some links". Maybe we should consider adding > this as a section in the external projects chapter, instead of having a > chapter of its own, but "some links" seems a little short on actual > contents. O.k. more specifically, I think that the content (even if it is a section) probably deserves discussion in the external projects section. Joshua D. Drake > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Joshua D. Drake wrote: > Alvaro Herrera wrote: > >Joshua D. Drake wrote: > > > >>>I don't think this sort of material belongs directly into the PostgreSQL > >>>documentation. > > > >Why not? > > Well Peter said that, not me :) I know, but I though I'd post one message instead of two. (In fact I didn't even think about it -- I just assume it's clear.) > >>It might be interesting to have some links in the external projects area > >>for replication, but a section of its own doesn't seem relevant. > > > >I disagree about "having some links". Maybe we should consider adding > >this as a section in the external projects chapter, instead of having a > >chapter of its own, but "some links" seems a little short on actual > >contents. > > O.k. more specifically, I think that the content (even if it is a > section) probably deserves discussion in the external projects section. Sure, see my suggestion above. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
peter_e@gmx.net (Peter Eisentraut) wrote: > Chris Browne wrote: >> Here's a patch to add in the material on replication recently >> discussed on pgsql.docs. I'm not thrilled that there were only a few >> comments made; I'd be happy to see "slicing and dicing" to see this >> made more useful. > > The agreed-to process was > > 1. post information on pgsql-general > 1.a. solicit comments > 2. put information page on web site > 3. link from documentation to web site > > You seem to have short-circuited all that. > > I don't think this sort of material belongs directly into the PostgreSQL > documentation. I don't recall that anyone agreed to do anything in particular, let alone the process being formalized thus. Bruce was looking for there to be some form of overview of the free replication options so he'd have some kind of tale to tell about it. Apparently the issue comes up fairly frequently. 1. I posted information on pgsql-docs 1.a. I solicited comments 2. There being not many of those, I have put together something that could fit into the documentation. I frankly don't care all that much where the material goes; if it ought to be some place else other than in the documentation tree proper, I'm fine with that. -- select 'cbbrowne' || '@' || 'gmail.com'; http://linuxdatabases.info/info/postgresql.html "How much more helpful could I be than to provide you with the appropriate e-mail address? I could engrave it on a clue-by-four and deliver it to you in Chicago, I suppose." -- Seen on Slashdot...
>> I don't think this sort of material belongs directly into the PostgreSQL >> documentation. > > I don't recall that anyone agreed to do anything in particular, let > alone the process being formalized thus. > > Bruce was looking for there to be some form of overview of the free > replication options so he'd have some kind of tale to tell about it. > Apparently the issue comes up fairly frequently. Then lets FAQ it. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Hello, Peter Eisentraut wrote: > 1. post information on pgsql-general > 1.a. solicit comments > 2. put information page on web site > 3. link from documentation to web site I don't remember such a clear agreement either. I'm glad Chris has written something. And posting it to -docs seems a much better fit, IMHO. Also, I think we didn't really agree on where exactly to put what information. See my previous mail on -hackers for my opinion on that. > I don't think this sort of material belongs directly into the PostgreSQL > documentation. I agree with that. Regards Markus