Re: logical changeset generation v4 - Heikki's thoughts about the patch state - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: logical changeset generation v4 - Heikki's thoughts about the patch state |
Date | |
Msg-id | CA+TgmoYhkMpkB8JZYhVei--h-onT-kT-Ko8bHvrrBUsewi_u-Q@mail.gmail.com Whole thread Raw |
In response to | Re: logical changeset generation v4 - Heikki's thoughts about the patch state (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: logical changeset generation v4 - Heikki's thoughts
about the patch state
Re: logical changeset generation v4 - Heikki's thoughts about the patch state Re: logical changeset generation v4 - Heikki's thoughts about the patch state |
List | pgsql-hackers |
On Thu, Jan 24, 2013 at 6:14 AM, Andres Freund <andres@2ndquadrant.com> wrote: > Thats way much more along the lines of what I am afraid of than the > performance stuff - but Heikki cited those, so I replied to that. > > Note that I didn't say this must, must go in - I just don't think > Heikki's reasoning about why not hit the nail on the head. Fair enough, no argument. Before getting bogged down in technical commentary, let me say this very clearly: I am enormously grateful for your work on this project. Logical replication based on WAL decoding is a feature of enormous value that PostgreSQL has needed for a long time, and your work has made that look like an achievable goal. Furthermore, it seems to me that you have pursued the community process with all the vigor and sincerity for which anyone could ask. Serious design concerns were raised early in the process and you made radical changes to the design which I believe have improved it tremendously, and you've continued to display an outstanding attitude at every phase of this process about which I can't say enough good things. There is no question in my mind that this work is going to be the beginning of a process that revolutionizes the way people think about replication and PostgreSQL, and you deserve our sincere thanks for that. Now, the bad news is, I don't think it's very reasonable to try to commit this to 9.3. I think it is just too much stuff too late in the cycle. I've reviewed some of the patches from time to time but there is a lot more stuff and it's big and complicated and it's not really clear that we have the interface quite right yet, even though I think it's also clear that we are a lot of closer than we were. I don't want to be fixing that during beta, much less after release. > I tried very, very hard to get the basics of the design & interface > solid. Which obviously doesn't man I am succeeding - luckily not being > superhuman after all ;). And I think thats very much where input is > desparetely needed and where I failed to raise enough attention. The > "output plugin" interface follewed by the walsender interface is what > needs to be most closely vetted. > Those are the permanent, user/developer exposed UI and the one we should > try to keep as stable as possible. > > The output plugin callbacks are defined here: > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/include/replication/output_plugin.h;hb=xlog-decoding-rebasing-cf4 > To make it more agnostic of the technology to implement changeset > extraction we possibly should replace the ReorderBuffer(TXN|Change) > structs being passed by something more implementation agnostic. > > walsender interface: > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/replication/repl_gram.y;hb=xlog-decoding-rebasing-cf4 > The interesting new commands are: > 1) K_INIT_LOGICAL_REPLICATION NAME NAME > 2) K_START_LOGICAL_REPLICATION NAME RECPTR plugin_options > 3) K_FREE_LOGICAL_REPLICATION NAME > > 1 & 3 allocate (respectively free) the permanent state associated with > one changeset consumer whereas START_LOGICAL_REPLICATION streams out > changes starting at RECPTR. Forgive me for not having looked at the patch, but to what extent is all this, ah, documented? > Btw, there are currently *no* changes to the wal format at all if > wal_format < logical except that xl_running_xacts are logged more > frequently which obviously could easily be made conditional. Baring bugs > of course. > The changes with wal_level>=logical aren't that big either imo: > * heap_insert, heap_update prevent full page writes from removing their > normal record by using a separate XLogRecData block for the buffer and > the record > * heap_delete adds more data (the pkey of the tuple) after the unchanged > xl_heap_delete struct > * On changes to catalog tables (relfilenode, tid, cmin, cmax) are logged. > > No changes to mvcc for normal backends at all, unless you count the very > slightly changed *Satisfies interface (getting passed a HeapTuple > instead of HeapTupleHeader). > > I am not sure what you're concerned about WRT the on-disk format of the > tuples? We are pretty much nailed down on that due to pg_upgrade'ability > anyway and it could be changed from this patches POV without a problem, > the output plugin just sees normal HeapTuples? Or are you concerned > about the code extracting them from the xlog records? Mostly, my concern is that you've accidentally broken something, or that your code will turn out to be flaky in ways we can't now predict.My only really specific concern at this point is aboutthe special treatment of catalog tables. We've never done anything like that before, and it feels like a bad idea. In particular, the fact that you have to WAL-log new information about cmin/cmax really suggests that we're committing ourselves to the MVCC infrastructure in a way that we weren't previously. There's some category of stuff that our MVCC implementation didn't previously require us to persist on disk which, after this, it will. I don't understand exactly where the boundaries of that are in terms of future changes we might want to make - but I don't like moving the goalposts in that area. > So I think the "won't break anything else" argument can be made rather > fairly if the heapam.c changes, which aren't that complex, are vetted > closely. > > Now, the disucssion about all the code thats active *during* decoding is > something else entirely :/ > >> You >> agreed with Tom that 9.2 is the buggiest release in recent memory, but >> I think logical replication could easily be an order of magnitude >> worse. > > I unfortunately think that not providing more builtin capabilities in > this area also has significant dangers. Imo this is one of the weakest, > or even the weakest, area of postgres. > > I personally have the impression that just about nobody did actual beta > testing of the lastest releases, especially 9.2, and that is the reason > why its the buggiest recent release. > >> I also have serious concerns about checksums and foreign key locks. >> Any single one of those three patches could really inflict >> unprecedented damage on our community's reputation for stability and >> reliability if they turn out to be seriously buggy, and unfortunately >> I don't consider that an unlikely outcome. I don't know what to do >> about it, either. > > I see your point although I would attest both having far more danger of > collateral damage than logical decoding itself. Especially fklocks > pretty much has to be active by default and there's not much that can > be reasonably done about that. > I tried to give fklocks a very thorough look but unfortunately I didn't > know anything beforehand about most areas of the code it touches which > obviously limits the amount of dangers one is seeing (FWIW I am still > mostly concerned with multixact logging and the locking changes in > heapam.c itself). Yeah, I don't disagree with any of that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: