Re: [HACKERS] Logical decoding on standby - Mailing list pgsql-hackers

From: Craig Ringer
Subject: Re: [HACKERS] Logical decoding on standby
Msg-id: CAMsr+YHggfrs4CDKBNLmmMgtw81cfnzbQ5Y_m1fvaUUxF6Ttyg@mail.gmail.com
In response to: Re: [HACKERS] Logical decoding on standby (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses: Re: [HACKERS] Logical decoding on standby (2 replies)
List: pgsql-hackers
On 20 December 2016 at 15:03, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:

>> The biggest change in this patch, and the main intrusive part, is that
>> procArray->replication_slot_catalog_xmin is no longer directly used by
>> vacuum. Instead, a new ShmemVariableCache->oldestCatalogXmin field is
>> added, with a corresponding CheckPoint field.
>> [snip]
>
> If this mechanism would not be needed most of the time, wouldn't it be
> better to not have it, and just have a way to ask the physical slot
> what its currently reserved catalog_xmin is? (In most cases the standby
> should actually know what it is, since it's sending the hs_feedback,
> but the first slot created on such a standby may need to check.)

Yes, and that was actually my originally preferred approach, though the one above does offer the advantage that if something goes wrong we can detect it and fail gracefully. Possibly not worth the complexity, though.

Your approach requires us to make very sure that hot_standby_feedback does not get turned off by the user or become ineffective once we're replicating, since we won't have any way to detect when needed tuples are removed. We'd probably just bail out with relcache/syscache lookup errors, but I can't guarantee we wouldn't crash if we tried logical decoding on WAL where needed catalog tuples have been removed. I initially ran into trouble doing that, but now think I have a way forward.

> WRT preventing hs_feedback going off, we can refuse to start with
> hs_feedback off when there are logical slots detected.

Yes, though there are some ordering issues there. We load slots quite late in startup and they don't get tracked in checkpoints, so we might launch the walreceiver before we load slots and notice their needed xmin/catalog_xmin. So we need to prevent sending of hot_standby_feedback until slots are loaded, or load slots earlier in startup. The former sounds less intrusive and safer - probably just add an "initialized" flag to ReplicationSlotCtlData and suppress hs_feedback until it becomes true.

We'd also have to suppress the validation callback action on the hot_standby_feedback GUC until we know replication slot state is initialised, and perform the check during slot startup instead. The hot_standby_feedback GUC validation check might get called before shmem is even set up, so we have to guard against attempts to access a shmem segment that may not even exist yet.

The general idea is workable, though: refuse to start if logical slots exist and hot_standby_feedback is off or the walreceiver isn't using a physical slot, and refuse to allow hot_standby_feedback to be turned off while such slots exist.

> We can also refuse to connect to the master without a physical slot if
> there are logical slots detected. I don't see a problem with either of
> those.

Agreed. We must also be able to reliably enforce that the walreceiver is using a replication slot to connect to the master, and refuse to start if it is not. The user could change recovery.conf and restart the walreceiver while we're running, after we perform that check, so the walreceiver must also refuse to start if logical replication slots exist but it has no primary slot name configured.
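To make that concrete, something like the following rough sketch is the kind of startup-time check I have in mind. This is illustration only, not code from the attached patches: StandbyHasLogicalSlots() is a made-up helper, and I'm assuming the check runs after StartupReplicationSlots() has restored slot state from disk, with the recovery.conf primary slot name passed in by the caller.

#include "postgres.h"

#include "replication/slot.h"
#include "replication/walreceiver.h"	/* for hot_standby_feedback */

/*
 * Made-up helper: return true if any in-use logical slot exists.
 * A slot is logical iff it is bound to a database.
 */
static bool
StandbyHasLogicalSlots(void)
{
	bool		found = false;
	int			i;

	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
	for (i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

		if (s->in_use && s->data.database != InvalidOid)
		{
			found = true;
			break;
		}
	}
	LWLockRelease(ReplicationSlotControlLock);

	return found;
}

/*
 * Refuse to continue as a standby if logical slots exist but the
 * configuration cannot keep their catalog_xmin safe on the master.
 */
static void
CheckLogicalSlotSafetyOnStandby(const char *primary_slot_name)
{
	if (!StandbyHasLogicalSlots())
		return;

	if (!hot_standby_feedback)
		ereport(FATAL,
				(errmsg("logical replication slots exist on this standby, "
						"but hot_standby_feedback is off")));

	if (primary_slot_name == NULL || primary_slot_name[0] == '\0')
		ereport(FATAL,
				(errmsg("logical replication slots exist on this standby, "
						"but no primary slot name is configured")));
}

Done this way, the check is cheap (one pass over the slot control array under a shared lock) and fails hard before the walreceiver can send any feedback, which is the ordering property discussed above.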
> You may ask what if the user drops the slot and recreates it, or
> somehow otherwise messes up catalog_xmin on the master. Well,
> considering that under my proposal we'd first (on connect) check the
> slot for catalog_xmin, we'd know about it, so we'd either mark the
> local slots broken/drop them, or plainly refuse to connect to the
> master, the same way as if it didn't have the required WAL anymore
> (not sure which behaviour is better). Note that the user could mess up
> catalog_xmin even in your design the same way, so it's not really a
> regression.

Agreed. Checking the catalog_xmin of the slot when we connect is sufficient to guard against that, assuming we can trust that the catalog_xmin is actually in effect on the master. Consider cascading setups, where we set our catalog_xmin but it might not be "locked in" until the middle cascaded server relays it to the master. I have a proposed solution to that which I'll outline in a separate patch+post; it ties in to some work on addressing the race between hot standby feedback taking effect and queries starting on the hot standby. It boils down to "add a hot_standby_feedback reply protocol message".

> Plus it might even be okay to only allow creating logical slots on
> standbys connected directly to the master in v1.

True. I didn't consider that. We haven't had much luck in the past with such limitations, but personally I'd consider it a perfectly reasonable one.

> But in 0003 I don't understand the following code:
>
>> +	if (endpos != InvalidXLogRecPtr && !do_start_slot)
>> +	{
>> +		fprintf(stderr,
>> +				_("%s: cannot use --create-slot or --drop-slot together with --endpos\n"),
>> +				progname);
>> +		fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
>> +				progname);
>> +		exit(1);
>> +	}
>
> Why are --create-slot and --endpos not allowed together?

What would --create-slot with --endpos do?

Ah. I had not realised that it is legal to do

    pg_recvlogical -S test --create-slot --start -f - -d 'test'

i.e. "create a slot, then in the same invocation begin decoding from it". I misread and thought that --create-slot and --start were mutually exclusive. I will address that.

> 0005 is timeline following, which is IMHO ready for committer, as are
> 0006 and 0008, and I still maintain the opinion that these should go
> in soon.

I wonder if I should re-order 0005 and 0006 so we can commit the hot_standby test improvements before logical decoding timeline following.

> I think parts of this could be committed separately and are ready for
> committer IMHO, but there is no way in the CF application to mark only
> part of a patch set for committer to attract their attention.

Yeah. I raised that before and nobody was really sure what to do about it. It's confusing to post patches on the same thread under separate CF entries. It's also confusing to post patches on a nest of inter-related threads so that each thread can be tracked by a separate CF entry.

At the moment I'm aiming to progressively get the underlying infrastructure/test stuff in so we can focus on the core feature.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services