Thread: Missing LWLock protection in pgstat_reset_replslot()
Hi hackers, I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, we don't have any guarantee that the slot is active (then preventing it to be dropped/recreated) when this function is executed. Attached a patch to add the missing protection. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On 01/03/2024 12:15, Bertrand Drouvot wrote: > Hi hackers, > > I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, we > don't have any guarantee that the slot is active (then preventing it to be > dropped/recreated) when this function is executed. Yes, so it seems at quick glance. We have a similar issue in pgstat_fetch_replslot(); it might return stats for wrong slot, if the slot is dropped/recreated concurrently. Do we care? > --- a/src/backend/utils/activity/pgstat_replslot.c > +++ b/src/backend/utils/activity/pgstat_replslot.c > @@ -46,6 +46,8 @@ pgstat_reset_replslot(const char *name) > > Assert(name != NULL); > > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > + > /* Check if the slot exits with the given name. */ > slot = SearchNamedReplicationSlot(name, true); SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED mode, when you pass need_lock=true. So that at least should be changed to false. -- Heikki Linnakangas Neon (https://neon.tech)
On Tue, Mar 5, 2024 at 1:25 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED > mode, when you pass need_lock=true. So that at least should be changed > to false. > Also don't we need to release the lock when we return here: /* * Nothing to do for physical slots as we collect stats only for logical * slots. */ if (SlotIsPhysical(slot)) return; thanks Shveta
Hi, On Tue, Mar 05, 2024 at 09:55:32AM +0200, Heikki Linnakangas wrote: > On 01/03/2024 12:15, Bertrand Drouvot wrote: > > Hi hackers, > > > > I think that pgstat_reset_replslot() is missing LWLock protection. Indeed, we > > don't have any guarantee that the slot is active (then preventing it to be > > dropped/recreated) when this function is executed. > > Yes, so it seems at quick glance. Thanks for looking at it! > We have a similar issue in > pgstat_fetch_replslot(); it might return stats for wrong slot, if the slot > is dropped/recreated concurrently. Good catch! > Do we care? Yeah, I think we should: done in v2 attached. > > --- a/src/backend/utils/activity/pgstat_replslot.c > > +++ b/src/backend/utils/activity/pgstat_replslot.c > > @@ -46,6 +46,8 @@ pgstat_reset_replslot(const char *name) > > Assert(name != NULL); > > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > > + > > /* Check if the slot exits with the given name. */ > > slot = SearchNamedReplicationSlot(name, true); > > SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED mode, > when you pass need_lock=true. So that at least should be changed to false. > Right, done in v2. Also had to add an extra "need_lock" argument to get_replslot_index() for the same reason while taking care of pgstat_fetch_replslot(). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Hi, On Tue, Mar 05, 2024 at 02:19:19PM +0530, shveta malik wrote: > On Tue, Mar 5, 2024 at 1:25 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > > SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED > > mode, when you pass need_lock=true. So that at least should be changed > > to false. > > > > Also don't we need to release the lock when we return here: > > /* > * Nothing to do for physical slots as we collect stats only for logical > * slots. > */ > if (SlotIsPhysical(slot)) > return; D'oh! Thanks! Fixed in v2 shared up-thread. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > /* > > * Nothing to do for physical slots as we collect stats only for logical > > * slots. > > */ > > if (SlotIsPhysical(slot)) > > return; > > D'oh! Thanks! Fixed in v2 shared up-thread. Thanks. Can we try to get rid of multiple LwLockRelease in pgstat_reset_replslot(). Is this any better? /* - * Nothing to do for physical slots as we collect stats only for logical - * slots. + * Reset stats if it is a logical slot. Nothing to do for physical slots + * as we collect stats only for logical slots. */ - if (SlotIsPhysical(slot)) - { - LWLockRelease(ReplicationSlotControlLock); - return; - } - - /* reset this one entry */ - pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid, - ReplicationSlotIndex(slot)); + if (SlotIsLogical(slot)) + pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid, + ReplicationSlotIndex(slot)); LWLockRelease(ReplicationSlotControlLock); Something similar in pgstat_fetch_replslot() perhaps? thanks Shveta
Hi, On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote: > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > Thanks. Can we try to get rid of multiple LwLockRelease in > pgstat_reset_replslot(). Is this any better? > > /* > - * Nothing to do for physical slots as we collect stats only for logical > - * slots. > + * Reset stats if it is a logical slot. Nothing to do for physical slots > + * as we collect stats only for logical slots. > */ > - if (SlotIsPhysical(slot)) > - { > - LWLockRelease(ReplicationSlotControlLock); > - return; > - } > - > - /* reset this one entry */ > - pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid, > - ReplicationSlotIndex(slot)); > + if (SlotIsLogical(slot)) > + pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid, > + ReplicationSlotIndex(slot)); > > LWLockRelease(ReplicationSlotControlLock); > Yeah, it's easier to read and probably reduce the pgstat_replslot.o object file size a bit for non optimized build. > Something similar in pgstat_fetch_replslot() perhaps? Yeah, all of the above done in v3 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Mar 06, 2024 at 09:05:59AM +0000, Bertrand Drouvot wrote: > Yeah, all of the above done in v3 attached. Interesting, so this relies on the slot index to ensure the unicity of the stat entries. And if the entry pointing to this ID is updated we may refer to just incorrect data. The inconsistency you could get for the fetch and reset cases are narrow, but at least what you are proposing here would protect the index lookup until the entry is copied from shmem, so this offers a better consistency protection when querying pg_stat_get_replication_slot() on a fetch, while avoiding a reset of incorrect data under concurrent activity. In passing.. pgstat_create_replslot() and pgstat_drop_replslot() rely on the assumption that the LWLock ReplicationSlotAllocationLock is taken while calling these routines. Perhaps that would be worth some extra Assert(LWLockHeldByMeInMode()) in pgstat_replslot.c for these two? Not directly related to this problem. -- Michael
Attachment
On Wed, Mar 6, 2024 at 2:36 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote: > > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > Thanks. Can we try to get rid of multiple LwLockRelease in > > pgstat_reset_replslot(). Is this any better? > > > > /* > > - * Nothing to do for physical slots as we collect stats only for logical > > - * slots. > > + * Reset stats if it is a logical slot. Nothing to do for physical slots > > + * as we collect stats only for logical slots. > > */ > > - if (SlotIsPhysical(slot)) > > - { > > - LWLockRelease(ReplicationSlotControlLock); > > - return; > > - } > > - > > - /* reset this one entry */ > > - pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid, > > - ReplicationSlotIndex(slot)); > > + if (SlotIsLogical(slot)) > > + pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid, > > + ReplicationSlotIndex(slot)); > > > > LWLockRelease(ReplicationSlotControlLock); > > > > Yeah, it's easier to read and probably reduce the pgstat_replslot.o object file > size a bit for non optimized build. > > > Something similar in pgstat_fetch_replslot() perhaps? > > Yeah, all of the above done in v3 attached. > Thanks for the patch. For the fix in pgstat_fetch_replslot(), even with the lock in fetch call, there are chances that the concerned slot can be dropped and recreated. --It can happen in a small window in pg_stat_get_replication_slot() when we are consuming the return values of pgstat_fetch_replslot (using slotent). --Also it can happen at a later stage when we have switched to fetching the next slot (obtained from 'pg_replication_slots' through view' pg_stat_replication_slots'), the previous one can be dropped. Ultimately the results of system view 'pg_replication_slots' can still give us non-existing or re-created slots. But yes, I do not deny that it gives us better consistency protection. Do you feel that the lock in pgstat_fetch_replslot() should be moved to its caller when we are done copying the results of that slot? This will improve the protection slightly. thanks Shveta
On Thu, Mar 07, 2024 at 10:57:28AM +0530, shveta malik wrote: > --It can happen in a small window in pg_stat_get_replication_slot() > when we are consuming the return values of pgstat_fetch_replslot > (using slotent). Yeah, it is possible that what you retrieve from pgstat_fetch_replslot() does not refer exactly to the slot's content under concurrent activity, but you cannot protect a single scan of pg_stat_replication_slots as of an effect of its design: pg_stat_get_replication_slot() has to be called multiple times. The patch at least makes sure that the copy of the slot's stats retrieved by pgstat_fetch_entry() is slightly more consistent, but you cannot do better than that except if the data retrieved from pg_replication_slots and its stats are fetched in the same context function call, holding the replslot LWLock for the whole scan duration. > Do you feel that the lock in pgstat_fetch_replslot() should be moved > to its caller when we are done copying the results of that slot? This > will improve the protection slightly. I don't see what extra protection this would offer, as pg_stat_get_replication_slot() is called once for each slot. Feel free to correct me if I'm missing something. -- Michael
Attachment
On Thu, Mar 7, 2024 at 11:12 AM Michael Paquier <michael@paquier.xyz> wrote: > > Yeah, it is possible that what you retrieve from > pgstat_fetch_replslot() does not refer exactly to the slot's content > under concurrent activity, but you cannot protect a single scan of > pg_stat_replication_slots as of an effect of its design: > pg_stat_get_replication_slot() has to be called multiple times. The > patch at least makes sure that the copy of the slot's stats retrieved > by pgstat_fetch_entry() is slightly more consistent Right, I understand that. , but you cannot do > better than that except if the data retrieved from > pg_replication_slots and its stats are fetched in the same context > function call, holding the replslot LWLock for the whole scan > duration. Yes, agreed. > > > Do you feel that the lock in pgstat_fetch_replslot() should be moved > > to its caller when we are done copying the results of that slot? This > > will improve the protection slightly. > > I don't see what extra protection this would offer, as > pg_stat_get_replication_slot() is called once for each slot. Feel > free to correct me if I'm missing something. It slightly improves the chances. pgstat_fetch_replslot is only called from pg_stat_get_replication_slot(), I thought it might be better to acquire lock before we call pgstat_fetch_replslot and release once we are done copying the results for that particular slot. But I also understand that it will not prevent someone from dropping that slot at a later stage when the rest of the calls of pg_stat_get_replication_slot() are still pending. So I am okay with the current one. thanks Shveta
On Thu, Mar 07, 2024 at 11:30:55AM +0530, shveta malik wrote: > It slightly improves the chances. pgstat_fetch_replslot is only > called from pg_stat_get_replication_slot(), I thought it might be > better to acquire lock before we call pgstat_fetch_replslot and > release once we are done copying the results for that particular slot. > But I also understand that it will not prevent someone from dropping > that slot at a later stage when the rest of the calls of > pg_stat_get_replication_slot() are still pending. I doubt that there will be more callers of pgstat_fetch_replslot() in the near future, but at least we would be a bit safer with these internals IDs when manipulating the slots, when considered in isolation of this API call > So I am okay with the current one. Okay, noted. Let's give a couple of days to others, in case there are more comments. (Patch looked OK here after a second look this afternoon.) -- Michael
Attachment
Hi, On Thu, Mar 07, 2024 at 02:17:53PM +0900, Michael Paquier wrote: > On Wed, Mar 06, 2024 at 09:05:59AM +0000, Bertrand Drouvot wrote: > > Yeah, all of the above done in v3 attached. > > In passing.. pgstat_create_replslot() and pgstat_drop_replslot() rely > on the assumption that the LWLock ReplicationSlotAllocationLock is > taken while calling these routines. Perhaps that would be worth some > extra Assert(LWLockHeldByMeInMode()) in pgstat_replslot.c for these > two? Not directly related to this problem. Yeah, good point: I'll create a dedicated patch for that. Note that currently pgstat_drop_replslot() would not satisfy this new Assert when being called from InvalidatePossiblyObsoleteSlot(). I think this call should be removed and created a dedicated thread for that [1]. [1]: https://www.postgresql.org/message-id/ZermH08Eq6YydHpO%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Fri, Mar 08, 2024 at 10:26:21AM +0000, Bertrand Drouvot wrote: > Yeah, good point: I'll create a dedicated patch for that. Sounds good to me. > Note that currently pgstat_drop_replslot() would not satisfy this new Assert > when being called from InvalidatePossiblyObsoleteSlot(). I think this call > should be removed and created a dedicated thread for that [1]. > > [1]: https://www.postgresql.org/message-id/ZermH08Eq6YydHpO%40ip-10-97-1-34.eu-west-3.compute.internal Thanks. -- Michael
Attachment
On Fri, Mar 08, 2024 at 07:46:26PM +0900, Michael Paquier wrote: > Sounds good to me. I've applied the patch of this thread as b36fbd9f8da1, though I did not see a huge point in backpatching as at the end this is just a consistency improvement. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Fri, Mar 08, 2024 at 07:46:26PM +0900, Michael Paquier wrote: >> Sounds good to me. > I've applied the patch of this thread as b36fbd9f8da1, though I did > not see a huge point in backpatching as at the end this is just a > consistency improvement. I've closed the CF entry for this [1] as committed. Please re-open it if there was something left to do here. regards, tom lane [1] https://commitfest.postgresql.org/47/4878/
On Tue, Apr 02, 2024 at 03:18:24PM -0400, Tom Lane wrote: > I've closed the CF entry for this [1] as committed. Please re-open > it if there was something left to do here. > > [1] https://commitfest.postgresql.org/47/4878/ Thanks, I was not aware of this one. -- Michael
Attachment
Hi, On Wed, Dec 04, 2024 at 03:45:13AM +0300, Anton A. Melnikov wrote: > Hi! > > b36fbd9f8d message says that inconsistency may still be possible because > statistics are not completely consistent for a single scan of > pg_stat_replication_slots under concurrent replication slot drop or > creation activity. > > Seems there is a reproduction of such a case via isolation test. > Please see the repslot_stat.spec attached. Thanks for the report! While I agree that your test case does produce a failed assertion, I don't think it's linked to b36fbd9f8d (which focused on retrieving consistent stats). As far your test case, it produces: TRAP: failed Assert("!ps->dropped"), File: "pgstat.c", Line: 1400, PID: 189292 I did some test and from what I can see: This is due to the fact that the "dropped" entry is "still in pgStatLocal.shared_hash". Indeed, during the shutdown, Session 1 is going through: pgstat_report_disconnect()->…->pgstat_get_entry_ref()->pgstat_gc_entry_refs()-> pgstat_release_entry_ref()-> “Shared stats entry has been reinitialized, so do not drop” So it's not dropping the entry because It's going in the "Shared stats entry has been reinitialized" case in pgstat_release_entry_ref(). It's doing so because the previous test: " if (pg_atomic_read_u32(&entry_ref->shared_entry->generation) == entry_ref->generation) " is false (entry_ref->shared_entry->generation is 1 while entry_ref->generation is 0). I need to think more about it but it seems to me that those values make sense, so maybe we should drop the entry for this particular case (shmem_exit()). Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On Thu, Dec 05, 2024 at 01:26:38PM +0900, Michael Paquier wrote: > On Wed, Dec 04, 2024 at 03:20:03PM +0000, Bertrand Drouvot wrote: > > I need to think more about it but it seems to me that those values make sense, > > so maybe we should drop the entry for this particular case (shmem_exit()). > > The values reported in the central hash table make sense to me. Yeah. > What > does not is that we could hold in the local cache of the process doing > the peeks references to stats from a different slot when doing drops > and creates that would reuse the same stats key. > > I have added in the backends some logs to get into the details of the > sequence with this specific test regarding the generation markup and > the refcount (feel free to use the 0002 attached, grep for "key > 4/0/0" in your logs by replaying the test), and here some details > based on the two permutations in the test sent: Agree. I had created about the same 0002 and was seeing the exact same behavior that lead to my previous message. BTW, that's a good idea to share your 0002, I'll keep in mind to do the same in the future (for ease of discussion). > The assertion is right to complain here at shutdown when writing the > stats: we shouldn't try to drop this entry. Yes. > I've spent a few hours > checking the whole, and we are doing the right things with > gc_request_count, pgStatSharedRefAge and the local cache refresh, as > I've thought first that we were not aggressive with this part of the > cache resets. The problem, as far as I can see, is that > pgstat_gc_entry_refs() did not get the call that it needs to think > about refreshing its local references when an entry updates its > generation number; it only checks if the entry has been dropped, the > generation check is equally important after a reinit because the local > reference is also outdated. I had the same diagnostic but did not reach the "what should be done" yet. I agree with the proposal and with the fact that 818119afcc did miss to use the "generation" in pgstat_gc_entry_refs() to also discard reinitialized entries: - if (!entry_ref->shared_entry->dropped) + /* + * "generation" checks for the case of entries being reinitialized, and + * "dropped" for the case where these are.. dropped. + */ + if (!entry_ref->shared_entry->dropped && + pg_atomic_read_u32(&entry_ref->shared_entry->generation) == + entry_ref->generation) Yeah that seems the right thing to do for me too. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com