Re: Timeline following for logical slots - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Timeline following for logical slots |
Date | |
Msg-id | 20160404151529.GD25969@awork2.anarazel.de |
In response to | Re: Timeline following for logical slots (Craig Ringer <craig@2ndquadrant.com>) |
Responses | Re: Timeline following for logical slots |
List | pgsql-hackers |
On 2016-04-04 22:59:41 +0800, Craig Ringer wrote:
> Assuming that here you mean separate slots on different machines
> replicating via physical rep:

No, I don't.

> We don't currently allow the creation of a logical slot on a standby. Nor
> replay from it, even to advance it without receiving the decoded
> changes.

Yes. I know.

> > I think the right way to do this is to focus on failover for logical
> > rep, with separate slots. The whole idea of integrating this physical
> > rep imo makes this a *lot* more complex than necessary. Not all that
> > many people are going to want to physical rep and logical rep.

> If you're saying we should focus on failover between nodes that're
> themselves connected using logical replication rather than physical
> replication, I really have to strongly disagree.
>
> TL;DR for book-length below: We're a long, long way from being able to
> deliver even vaguely decent logical rep based failover.

I don't buy that.

> * Robust sequence decoding and replication. If you were following the later
> parts of that discussion you will've seen how fun that's going to be, but
> it's the simplest of all of the problems.

Unconvinced. People used londiste and slony for years without that, and
it's not even remotely at the top of the list of problems with either.

> * Logical decoding and sending of in-progress xacts, so the logical client
> can already be most of the way through receiving a big xact when it
> commits. Without this we have a huge lag spike whenever a big xact happens,
> since we must first finish decoding it in to a reorder buffer and can only
> then *begin* to send it to the client. During which time no later xacts may
> be decoded or replayed to the client. If you're running that rare thing,
> the perfect pure OLTP system, you won't care... but good luck finding one
> in the real world.

So? If you're using logical rep, you've always had that.

> * Either parallel apply on the client side or at least buffering of
> in-progress xacts on the client side so they can be safely flushed to disk
> and confirmed, allowing receive to continue while replay is done on the
> client. Otherwise sync rep is completely impractical... and there's no
> shortage of systems out there that can't afford to lose any transactions.
> Or at least have some crucial transactions they can't lose.

That's more or less trivial to implement. In an extension.

> * Robust, seamless DDL replication, so things don't just break randomly.
> This makes the other points above look nice and simple by comparison.

We're talking about a system which involves logical decoding. Whether
you have failover via physical rep or not doesn't have anything to do
with this point.

> Physical rep *works*. Robustly. Reliably. With decent performance. It's
> proven. It supports sync rep. I'm confident telling people to use it.

Sure? And nothing prevents you from using it.

> I don't think there's any realistic way we're going to get there for
> logical rep in 9.6+n for n<2 unless a whole lot more people get on board
> and work on it. Even then.

I think the primary problem here is that you're focusing on things that
just are not very interesting for the majority of users, and which thus
won't get very enthusiastic help. The way to make progress is to get
something basic in, and then iterate from there. Instead you're
focussing on the fringes, which nobody cares about, because the basics
aren't there.

FWIW, I plan to aggressively work on in-core (9.7) logical rep starting
in a few weeks. If we can coordinate on that end, I'm very happy, if not
then not.

> Right now we can deliver logical failover for DBs that:
> (b) don't do DDL, ever, or only do some limited DDL via direct admin
> commands where they can call some kind of helper function to queue and
> apply the DDL;
> (c) don't do big transactions or don't care about unbounded lag;
> (d) don't need synchronous replication or don't care about possibly large
> delays before commit is confirmed;
> (e) only manage role creation (among other things) via very strict
> processes that can be absolutely guaranteed to run on all nodes

And just about nothing of that has to do with any of the recent patches
sent towards logical rep. I agree about the problems, but you should
attack them, instead of building way too complicated workarounds which
barely anybody will find interesting.

> ... which in my view isn't a great many databases.

Meh^2. Slony and londiste have these problems. And the reason they're
not used isn't primarily those, but that they're external and waaayyy
too complicated to use.

> Physical rep has *none* of those problems. (Sure, it has others, but we're
> used to them).

So?

-Andres