Re: Future In-Core Replication - Mailing list pgsql-hackers
From | Christopher Browne |
---|---|
Subject | Re: Future In-Core Replication |
Date | |
Msg-id | CAFNqd5U-6=i7+p_NZOSiXJu39VsA_3JH3+hRvVoaWhCrjfWXyQ@mail.gmail.com |
In response to | Re: Future In-Core Replication (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: Future In-Core Replication |
List | pgsql-hackers |
On Fri, Apr 27, 2012 at 4:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> What I'm hoping to do is to build a basic prototype of logical
> replication using WAL translation, so we can inspect it to see what
> the downsides are. It's an extremely non-trivial problem and so I
> expect there to be mountains to climb. There are other routes to
> logical replication, with messages marshalled in a similar way to
> Slony/Londiste/Bucardo/Mammoth(?). So there are options, with
> measurements to be made and discussions to be had.

I'll note that the latest version of Slony, expected to be 2.2 (which generally seems to work, but we're stuck at the moment waiting to get free cycles to QA it) has made a substantial change to its data representation.

The triggers used to cook data into a sort of "fractional WHERE clause," transforming an I/U/D into a string that you'd trivially combine with the string INSERT INTO/UPDATE/DELETE FROM to get the logical update. If there was need to do anything fancier, you'd be left having to have a "fractional SQL parser" to split the data out by hand.

New in 2.2 is that the log data is split out into an array of text values, which means that if someone wanted to do some transformation, such as filtering on value, or filtering out columns, they could modify the application-of-updates code to query for the data that they want to fiddle with. No parser needed.

It's doubtless worthwhile to take a peek at that to make sure it informs your data representation appropriately. It's important to have data represented in a fashion that is amenable to manipulation, and that decidedly wasn't the case pre-2.2.

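
To make the contrast between the two representations concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the row layouts, the flat `[column, value, ...]` list, and the function names `apply_fragment`, `drop_columns`, and `build_insert` are all hypothetical stand-ins, not Slony's actual log schema or code.

```python
# Hypothetical illustration only: simplified stand-ins for the two styles of
# log data described above, not Slony's real tables or functions.

def apply_fragment(cmdtype, table, fragment):
    """Old style: the log row carries a pre-cooked SQL fragment.

    Replaying is trivial string concatenation, but inspecting or changing
    individual columns would require parsing the fragment.
    """
    prefix = {"I": "INSERT INTO", "U": "UPDATE", "D": "DELETE FROM"}[cmdtype]
    return f"{prefix} {table} {fragment}"


def drop_columns(args, unwanted):
    """New style: args is a flat list [col1, val1, col2, val2, ...].

    Filtering out columns is plain list manipulation; no SQL parser needed.
    """
    pairs = zip(args[0::2], args[1::2])
    return [item for col, val in pairs if col not in unwanted
            for item in (col, val)]


def build_insert(table, args):
    """Turn a flat column/value list into a parameterized INSERT."""
    cols = args[0::2]
    vals = args[1::2]
    placeholders = ", ".join(["%s"] * len(vals))
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, vals


if __name__ == "__main__":
    # Old style: the fragment is opaque text.
    print(apply_fragment("I", "accounts",
                         "(id, name, ssn) VALUES ('1', 'alice', '123')"))

    # New style: drop a column before applying the change.
    row = ["id", "1", "name", "alice", "ssn", "123"]
    print(build_insert("accounts", drop_columns(row, {"ssn"})))
```

With the string form, removing the `ssn` column would mean picking the fragment apart by hand; with the list form it is a one-line filter applied before the statement is built.
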
I wonder if a meaningful transport mechanism might involve combining:

a) A trigger that indicates that some data needs to be captured in a "logical" form (rather than the presently pretty purely physical form of WAL)

b) Perhaps a way of capturing logical updates in WAL

c) One of the old ideas that fell through was to try to capture commit timestamps via triggers. Doing it directly turned out to be too controversial to get in. Perhaps that's something that could be captured via some process that parses WAL.

Something seems wrong about that in that it mixes together updates of multiple forms into WAL, physical *and* logical, and perhaps that implies that there should be an altogether separate "logical updates log." (LUL?) That still involves capturing updates in a duplicative fashion, e.g. - WAL + LUL, which seems somehow wrong. Or perhaps I'm tilting at a windmill here.

With Slony/Londiste/Bucardo, we're capturing "LUL" in some tables, meaning that it gets written both to the tables' data files as well as WAL. Adding a binary LUL eliminates those table files and attendant WAL updates, thus providing some savings.

[Insert a LULCATS joke here...]

Perhaps I've just had too much coffee...

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"