Re: [HACKERS] Logical decoding on standby - Mailing list pgsql-hackers
From | Petr Jelinek |
---|---|
Subject | Re: [HACKERS] Logical decoding on standby |
Date | |
Msg-id | 989b919c-e6c8-662b-2c07-93eba85f700d@2ndquadrant.com Whole thread Raw |
In response to | Re: [HACKERS] Logical decoding on standby (Craig Ringer <craig@2ndquadrant.com>) |
List | pgsql-hackers |
On 21/12/16 04:06, Craig Ringer wrote: > On 20 December 2016 at 15:03, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote: > >>> The biggest change in this patch, and the main intrusive part, is that >>> procArray->replication_slot_catalog_xmin is no longer directly used by >>> vacuum. Instead, a new ShmemVariableCache->oldestCatalogXmin field is >>> added, with a corresponding CheckPoint field. >>> [snip] >> >> If this mechanism would not be needed most of the time, wouldn't it be >> better to not have it and just have a way to ask physical slot about >> what's the current reserved catalog_xmin (in most cases the standby >> should actually know what it is since it's sending the hs_feedback, but >> first slot created on such standby may need to check). > > Yes, and that was actually my originally preferred approach, though > the one above does offer the advantage that if something goes wrong we > can detect it and fail gracefully. Possibly not worth the complexity > though. > > Your approach requires us to make very sure that hot_standby_feedback > does not get turned off by user or become ineffective once we're > replicating, though, since we won't have any way to detect when needed > tuples are removed. We'd probably just bail out with relcache/syscache > lookup errors, but I can't guarantee we wouldn't crash if we tried > logical decoding on WAL where needed catalogs have been removed. > > I initially ran into trouble doing that, but now think I have a way forward. > >> WRT preventing >> hs_feedback going off, we can refuse to start with hs_feedback off when >> there are logical slots detected. > > Yes. There are some ordering issues there though. We load slots quite > late in startup and they don't get tracked in checkpoints. So we might > launch the walreceiver before we load slots and notice their needed > xmin/catalog_xmin. So we need to prevent sending of > hot_standby_feedback until slots are loaded, or load slots earlier in > startup. The former sounds less intrusive and safer - probably just > add an "initialized" flag to ReplicationSlotCtlData and suppress > hs_feedback until it becomes true. > > We'd also have to suppress the validation callback action on the > hot_standby_feedback GUC until we know replication slot state is > initialised, and perform the check during slot startup instead. The > hot_standby_feedback GUC validation check might get called before > shmem is even set up so we have to guard against attempts to access a > shmem segment that may not event exist yet. > > The general idea is workable though. Refuse to start if logical slots > exist and hot_standby_feedback is off or walreceiver isn't using a > physical slot. Refuse to allow hot_standby_feedback to change > These are all problems associated with replication slots being in shmem if I understand correctly. I wonder, could we put just bool someplace which is available early that says if there are any logical slots defined? We don't actually need all the slot info, just to know if there are some. > >> You may ask what if user drops the slot and recreates or somehow >> otherwise messes up catalog_xmin on master, well, considering that under >> my proposal we'd first (on connect) check the slot for catalog_xmin we'd >> know about it so we'd either mark the local slots broken/drop them or >> plainly refuse to connect to the master same way as if it didn't have >> required WAL anymore (not sure which behavior is better). Note that user >> could mess up catalog_xmin even in your design the same way, so it's not >> really a regression. > > Agreed. Checking catalog_xmin of the slot when we connect is > sufficient to guard against that, assuming we can trust that the > catalog_xmin is actually in effect on the master. Consider cascading > setups, where we set our catalog_xmin but it might not be "locked in" > until the middle cascaded server relays it to the master. > > I have a proposed solution to that which I'll outline in a separate > patch+post; it ties in to some work on addressing the race between hot > standby feedback taking effect and queries starting on the hot > standby. It boils down to "add a hot_standby_feedback reply protocol > message". > >> Plus >> it might even be okay to only allow creating logical slots on standbys >> connected directly to master in v1. > > True. I didn't consider that. > > We haven't had much luck in the past with such limitations, but > personally I'd consider it a perfectly reasonable one. > I think it's infinitely better with that limitation than the status quo. Especially for failover scenario (you usually won't failover to replica down the cascade as it's always more behind). Not to mention that with every level of cascading you get automatically more lag which means more bloat so it might not even be all that desirable to go that route immediately in v1 when we don't have way to control that bloat/maximum slot lag. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: