Race condition in InvalidateObsoleteReplicationSlots() - Mailing list pgsql-hackers
From | Andres Freund
Subject | Race condition in InvalidateObsoleteReplicationSlots()
Date |
Msg-id | 20210408001037.wfmk6jud36auhfqm@alap3.anarazel.de
Responses | Re: Race condition in InvalidateObsoleteReplicationSlots()
List | pgsql-hackers
Hi,

I was looking at InvalidateObsoleteReplicationSlots() while reviewing /
polishing the logical decoding on standby patch, which led me to notice
that I think there's a race in InvalidateObsoleteReplicationSlots()
(because ResolveRecoveryConflictWithLogicalSlots has a closely related
one).

void
InvalidateObsoleteReplicationSlots(XLogSegNo oldestSegno)
{
...
	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
	for (int i = 0; i < max_replication_slots; i++)
	{
...
		if (XLogRecPtrIsInvalid(restart_lsn) ||
			restart_lsn >= oldestLSN)
			continue;
		LWLockRelease(ReplicationSlotControlLock);
...
		for (;;)
		{
...
			wspid = ReplicationSlotAcquireInternal(s, NULL, SAB_Inquire);
...
			SpinLockAcquire(&s->mutex);
			s->data.invalidated_at = s->data.restart_lsn;
			s->data.restart_lsn = InvalidXLogRecPtr;
			SpinLockRelease(&s->mutex);
...

As far as I can tell there's no guarantee that the slot wasn't
concurrently dropped and another replication slot created at the same
offset in ReplicationSlotCtl->replication_slots. Which we then promptly
would invalidate, regardless of the new slot not actually needing to be
invalidated.

Note that this is different from the race mentioned in a comment:

			/*
			 * Signal to terminate the process that owns the slot.
			 *
			 * There is the race condition where other process may own
			 * the slot after the process using it was terminated and before
			 * this process owns it. To handle this case, we signal again
			 * if the PID of the owning process is changed than the last.
			 *
			 * XXX This logic assumes that the same PID is not reused
			 * very quickly.
			 */

It's one thing to terminate a connection erroneously; permanently
breaking a replica by invalidating the wrong slot is, imo, a different
matter.
Interestingly this problem seems to have been present both in

commit c6550776394e25c1620bc8258427c8f1d448080d
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date:   2020-04-07 18:35:00 -0400

    Allow users to limit storage reserved by replication slots

and in

commit f9e9704f09daf882f5a1cf1fbe3f5a3150ae2bb9
Author: Fujii Masao <fujii@postgresql.org>
Date:   2020-06-19 17:15:52 +0900

    Fix issues in invalidation of obsolete replication slots.

I think this can be solved in two different ways:

1) Hold ReplicationSlotAllocationLock with LW_SHARED across most of
   InvalidateObsoleteReplicationSlots(). That way nobody could re-create
   a new slot in the to-be-obsoleted slot's place.

2) Atomically check whether the slot needs to be invalidated and, if so,
   try to acquire it, without releasing ReplicationSlotControlLock
   between those two steps. Signal the owner to release the slot iff we
   couldn't acquire it; in that case wait and then recheck whether the
   slot still needs to be dropped.

To me 2) seems better, because we can then also be sure that the slot
still needs to be obsoleted, rather than potentially doing so
unnecessarily.

It looks to me like several of the problems here stem from trying to
reuse code from ReplicationSlotAcquireInternal() (which before this was
just named ReplicationSlotAcquire()). I don't think that makes sense,
because cases like this want to check a condition and acquire the slot
only if it holds. IOW, I think this basically needs to look like
ReplicationSlotsDropDBSlots(), except that a different condition is
checked, and the if (active_pid) case needs to prepare a condition
variable, signal the owner, and then wait on the condition variable,
restarting afterwards.

Greetings,

Andres Freund