Re: Timeline following for logical slots - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Timeline following for logical slots |
Date | |
Msg-id | 20160404100116.GB25969@awork2.anarazel.de |
In response to | Re: Timeline following for logical slots (Craig Ringer <craig@2ndquadrant.com>) |
Responses | Re: Timeline following for logical slots |
List | pgsql-hackers |
On 2016-04-04 17:50:02 +0800, Craig Ringer wrote:
> To rephrase per my understanding: The client only specifies the point it
> wants to start seeing decoded commits. Decoding starts from the slot's
> restart_lsn, and that's the point from which the accumulation of reorder
> buffer contents begins, the snapshot building process begins, and where
> accumulation of relcache invalidation information begins. At restart_lsn no
> xact that is to be emitted to the client may yet be in progress. Decoding,

s/yet/already/

> whether or not the xacts will be fed to the output plugin callbacks,
> requires access to the system catalogs. Therefore catalog_xmin reported by
> the slot must be >= the real effective catalog_xmin of the heap and valid
> at the restart_lsn, not just the confirmed flush point or the point the
> client specifies to resume fetching changes from.

Hm. Maybe I'm misunderstanding you here, but doesn't it have to be <=?

> On the original copy of the slot on the pre-failover master the restart_lsn
> would've been further ahead, as would the catalog_xmin. So catalog rows
> have been purged.

+may

> So it's necessary to ensure that the slot's restart_lsn and catalog_xmin
> are advanced in a timely, consistent manner on the replica's copy of the
> slot at a point where no vacuum changes to the catalog that could remove
> needed tuples have been replayed.

Right.

> The only way I can think of to do that really reliably right now, without
> full failover slots, is to use the newly committed pluggable WAL mechanism
> and add a hook to SaveSlotToPath() so slot info can be captured, injected
> in WAL, and replayed on the replica.

I personally think the primary answer is to use separate slots on
different machines. Failover slots can be an extension to that at some
point, but I think they're a secondary goal.

> It'd also be necessary to move
> CheckPointReplicationSlots() out of CheckPointGuts() to the start of a
> checkpoint/restartpoint when WAL writing is still permitted, like the
> failover slots patch does.

Ugh. That makes me rather wary.

> Basically, failover slots as a plugin using a hook, without the
> additions to base backup commands and the backup label.

I'm going to be *VERY* hard to convince that adding a hook inside
checkpointing code is acceptable.

> I'd really hate 9.6 to go out with - still - no way to use logical decoding
> in a basic, bog-standard HA/failover environment. It overwhelmingly limits
> their utility and it's becoming a major drag on practical use of the
> feature. That's a difficulty given that the failover slots patch isn't
> especially trivial and you've shown that lazy sync of slot state is not
> sufficient.

I think the right way to do this is to focus on failover for logical
rep, with separate slots. The whole idea of integrating this physical
rep imo makes this a *lot* more complex than necessary. Not all that
many people are going to want to physical rep and logical rep.

> The restart_lsn from the newer copy of the slot is, as you said, a point we
> know we can reconstruct visibility info.

We can on the master. There's absolutely no guarantee that the
associated serialized snapshot is present on the standby.

Andres