Re: Future In-Core Replication - Mailing list pgsql-hackers
From: Hannu Krosing
Subject: Re: Future In-Core Replication
Date:
Msg-id: 1335642042.3919.53.camel@hvost
In response to: Re: Future In-Core Replication (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: Future In-Core Replication; Re: Future In-Core Replication
List: pgsql-hackers
On Sat, 2012-04-28 at 09:36 +0100, Simon Riggs wrote:
> On Fri, Apr 27, 2012 at 11:50 PM, Christopher Browne <cbbrowne@gmail.com> wrote:
> > On Fri, Apr 27, 2012 at 4:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> What I'm hoping to do is to build a basic prototype of logical
> >> replication using WAL translation, so we can inspect it to see what
> >> the downsides are. It's an extremely non-trivial problem and so I
> >> expect there to be mountains to climb. There are other routes to
> >> logical replication, with messages marshalled in a similar way to
> >> Slony/Londiste/Bucardo/Mammoth(?). So there are options, with
> >> measurements to be made and discussions to be had.
> >
> > I'll note that the latest version of Slony ...has made a substantial change to its data
> > representation....
>
> The basic model I'm working to is that "logical replication" will ship
> Logical Change Records (LCRs) using the same transport mechanism that
> we built for WAL.

One outcome of this LCR approach is probably that you will be shipping
changes as they are made, and on the slave you have to either apply them
in N parallel transactions, committing each transaction when the LCR for
the corresponding transaction says so, or collect the LCRs before
applying, then apply and commit the committed-on-master transactions in
commit order and throw away the aborted ones. The optimal approach will
probably be some combination of the two: collect and apply the short
transactions, and start replay in a separate transaction if the commit
does not arrive within N ms.

As to what LCRs should contain, they will probably be logical
equivalents of INSERT, UPDATE ... LIMIT 1, DELETE ... LIMIT 1, TRUNCATE
and all DDL. The DDL could actually stay "raw" (as in LCRs for system
tables) on the generator side, as hopefully the rule that "system tables
can't have triggers" does not apply when generating the LCRs on the WAL
path. If we need to go back to ALTER TABLE ... commands, then this is
probably wisest to leave to the client. The client here could also be
some xReader-like middleman.

I would even go as far as to propose a DML-WITH-LIMIT-1 variant to be
added to PostgreSQL's SQL syntax, so that the LCRs could be converted to
SQL text for some tasks and thus be easy to process using generic
text-based tools. DML-WITH-LIMIT-1 is required to do single logical
updates on tables with non-unique rows. And as for any logical updates,
we will have a huge performance problem when doing UPDATE or DELETE on a
large table with no indexes, but fortunately this problem is on the
slave, not the master ;)

Generating and shipping the LCRs at WAL-generation time, or perhaps even
a bit earlier, will have the huge performance benefit of not doing the
double writing of captured events on the master which is currently
needed for several reasons, the main one being determining which
transactions commit, and in what order. (This can't be solved on the
master without a local event log table, as we don't have commit/rollback
triggers.) If we delegate that part away from the master, then this
alone enables us to be almost as fast as a WAL-based replica in most
cases, even when we have a different logical structure on the slaves.

> How the LCRs are produced and how they are applied is a subject for
> debate and measurement. We're lucky enough to have a variety of
> mechanisms to compare, Slony 1.0/2.0, Slony 2.2/Londiste/Bucardo and
> its worth adding WAL translation there also. My initial thought is
> that WAL translation has many positive aspects to it and we are
> investigating. There are also some variants on those themes, such as
> the one you discussed above.
>
> You probably won't recognise this as such, but I hope that people
> might see that I'm hoping to build Slony 3.0, Londiste++ etc. At some
> point, we'll all say "thats not Slony", but we'll also say (Josh
> already did) "thats not binary replication". But it will be the
> descendant of all.
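To make the collect-then-apply idea concrete, here is a minimal sketch
in Python (purely illustrative; the stream format and field names are
invented for the example, not a proposed wire format): buffer the LCRs
of each transaction, replay a transaction's changes only when its commit
LCR arrives, in commit order, and drop aborted transactions entirely.

```python
# Illustrative sketch of the "collect first, then apply in commit order"
# LCR strategy. All record shapes here are invented for the example.

def apply_in_commit_order(lcr_stream):
    """Buffer per-transaction changes; replay only committed xacts,
    in the order their commit records arrive on the stream."""
    pending = {}   # xid -> buffered change LCRs, transaction still open
    applied = []   # changes replayed on the slave, in commit order

    for lcr in lcr_stream:
        xid = lcr["xid"]
        if lcr["kind"] == "change":
            pending.setdefault(xid, []).append(lcr["change"])
        elif lcr["kind"] == "commit":
            # commit order on the master defines apply order on the slave
            applied.extend(pending.pop(xid, []))
        elif lcr["kind"] == "abort":
            # aborted transactions are simply thrown away
            pending.pop(xid, None)
    return applied

stream = [
    {"xid": 1, "kind": "change", "change": "INSERT a"},
    {"xid": 2, "kind": "change", "change": "INSERT b"},
    {"xid": 2, "kind": "commit"},   # xid 2 commits first on the master
    {"xid": 3, "kind": "change", "change": "INSERT c"},
    {"xid": 3, "kind": "abort"},    # xid 3 rolled back: discarded
    {"xid": 1, "kind": "commit"},
]
print(apply_in_commit_order(stream))  # ['INSERT b', 'INSERT a']
```

The other strategy in the text - N parallel apply transactions, each
committed when its commit LCR arrives - trades this buffering for open
transactions on the slave; the hybrid keeps short transactions buffered
and spills long-running ones into their own apply transaction.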
If we get efficient and flexible logical change event generation on the
master, then I'm sure the current trigger-based logical replication
providers will switch to it (for full replication) or at least add an
extra LCR source.

It may still make sense to leave some flexibility on the master side, so
that some decisions - possibly even complex ones - could be made when
generating the LCRs. What I would like is to have some of this exposed
to userspace via a function which developers could use to push their own
LCRs.

As mentioned above, a significant part of this approach can be
prototyped from user-level triggers as soon as we have triggers on
commit and rollback, even though at slightly reduced performance. That
is, it will still have the trigger overhead, but we can omit all the
extra writing and re-reading and the event-table management on the
master.

Wanting to play with Streaming Logical Replication (as opposed to the
current Chunked Logical Replication) is also one of the reasons I
complained when the "command triggers" patch was kicked out of 9.2.

> Backwards compatibility is not a goal, please note, but only because
> that will complicate matters intensely.

Currently there really is nothing similar enough that this could be
backward compatible with :)

> --
> Simon Riggs                   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
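The trigger-based prototype described above could be sketched roughly
like this (Python pseudocode of the mechanism only - the hook names and
the sink are invented, not any real PostgreSQL API): row-level triggers
accumulate changes in a per-transaction in-memory buffer, and a commit
trigger streams the buffer out, so nothing is ever written to, re-read
from, or cleaned out of an event table on the master.

```python
# Sketch of trigger-based LCR capture once commit/rollback triggers
# exist: buffer changes in memory per transaction, emit on commit.
# Names (on_row_change, on_commit, ...) are invented for illustration.

class TriggerCapture:
    def __init__(self, sink):
        self.sink = sink       # e.g. the WAL-like transport for LCRs
        self.buffers = {}      # xid -> buffered changes, memory only

    def on_row_change(self, xid, change):      # per-row trigger
        self.buffers.setdefault(xid, []).append(change)

    def on_commit(self, xid):                  # commit trigger
        # stream the whole transaction; no event table to re-read
        for change in self.buffers.pop(xid, []):
            self.sink.append(change)
        self.sink.append(("commit", xid))

    def on_rollback(self, xid):                # rollback trigger
        self.buffers.pop(xid, None)            # nothing ever hit disk

out = []
cap = TriggerCapture(out)
cap.on_row_change(7, ("insert", "t1", "row A"))
cap.on_row_change(8, ("insert", "t1", "row B"))
cap.on_rollback(8)      # aborted: its changes are simply dropped
cap.on_commit(7)
print(out)  # [('insert', 't1', 'row A'), ('commit', 7)]
```

This keeps the per-row trigger overhead the text mentions, but the
double write (event table insert plus later read-and-delete) disappears,
which is the claimed performance win of streaming at capture time.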