Thread: Re: shared-memory based stats collector - v70
I'm trying to wrap my head around the shared memory stats collector infrastructure from <20220406030008.2qxipjxo776dwnqs@alap3.anarazel.de> committed in 5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd. I have one question about locking -- afaics there's nothing protecting reading the shared memory stats. There is an lwlock protecting concurrent updates of the shared memory stats, but that lock isn't taken when we read the stats. Are we ok relying on atomic 64-bit reads or is there something else going on that I'm missing? In particular I'm looking at pgstat.c:847 in pgstat_fetch_entry() which does this: memcpy(stats_data, pgstat_get_entry_data(kind, entry_ref->shared_stats), kind_info->shared_data_len); stats_data is the returned copy of the stats entry with all the statistics in it. But it's copied from the shared memory location directly using memcpy and there's no locking or change counter or anything protecting this memcpy that I can see. -- greg
At Mon, 8 Aug 2022 11:53:19 -0400, Greg Stark <stark@mit.edu> wrote in > I'm trying to wrap my head around the shared memory stats collector > infrastructure from > <20220406030008.2qxipjxo776dwnqs@alap3.anarazel.de> committed in > 5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd. > > I have one question about locking -- afaics there's nothing protecting > reading the shared memory stats. There is an lwlock protecting > concurrent updates of the shared memory stats, but that lock isn't > taken when we read the stats. Are we ok relying on atomic 64-bit reads > or is there something else going on that I'm missing? > > In particular I'm looking at pgstat.c:847 in pgstat_fetch_entry() > which does this: > > memcpy(stats_data, > pgstat_get_entry_data(kind, entry_ref->shared_stats), > kind_info->shared_data_len); > > stats_data is the returned copy of the stats entry with all the > statistics in it. But it's copied from the shared memory location > directly using memcpy and there's no locking or change counter or > anything protecting this memcpy that I can see. We take LW_SHARED while creating a snapshot of fixed-numbered stats. On the other hand we don't for variable-numbered stats. I agree to you, that we need that also for variable-numbered stats. If I'm not missing something, it's strange that pgstat_lock_entry() only takes LW_EXCLUSIVE. The atached changes the interface of pgstat_lock_entry() but there's only one user since another read of shared stats entry is not using reference. Thus the interface change might be too much. If I just add bare LWLockAcquire/Release() to pgstat_fetch_entry,the amount of the patch will be reduced. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Hi, On 2022-08-09 17:24:35 +0900, Kyotaro Horiguchi wrote: > At Mon, 8 Aug 2022 11:53:19 -0400, Greg Stark <stark@mit.edu> wrote in > > I'm trying to wrap my head around the shared memory stats collector > > infrastructure from > > <20220406030008.2qxipjxo776dwnqs@alap3.anarazel.de> committed in > > 5891c7a8ed8f2d3d577e7eea34dacff12d7b6bbd. > > > > I have one question about locking -- afaics there's nothing protecting > > reading the shared memory stats. There is an lwlock protecting > > concurrent updates of the shared memory stats, but that lock isn't > > taken when we read the stats. Are we ok relying on atomic 64-bit reads > > or is there something else going on that I'm missing? Yes, that's not right. Not sure how it ended up that way. There was a lot of refactoring and pushing down the locking into different places, I guess it got lost somewhere on the way :(. It's unlikely to be a large problem, but we should fix it. > If I'm not missing something, it's strange that pgstat_lock_entry() > only takes LW_EXCLUSIVE. I think it makes some sense, given that there's a larger number of callers for that in various stats-emitting code. Perhaps we could just add a separate function with a _shared() suffix? > The atached changes the interface of > pgstat_lock_entry() but there's only one user since another read of > shared stats entry is not using reference. Thus the interface change > might be too much. If I just add bare LWLockAcquire/Release() to > pgstat_fetch_entry,the amount of the patch will be reduced. Could you try the pgstat_lock_entry_shared() approach? Greetings, Andres Freund
At Tue, 9 Aug 2022 09:53:19 -0700, Andres Freund <andres@anarazel.de> wrote in > Hi, > > On 2022-08-09 17:24:35 +0900, Kyotaro Horiguchi wrote: > > If I'm not missing something, it's strange that pgstat_lock_entry() > > only takes LW_EXCLUSIVE. > > I think it makes some sense, given that there's a larger number of callers for > that in various stats-emitting code. Perhaps we could just add a separate > function with a _shared() suffix? Sure. That was an alternative I had in my mind. > > The atached changes the interface of > > pgstat_lock_entry() but there's only one user since another read of > > shared stats entry is not using reference. Thus the interface change > > might be too much. If I just add bare LWLockAcquire/Release() to > > pgstat_fetch_entry,the amount of the patch will be reduced. > > Could you try the pgstat_lock_entry_shared() approach? Of course. Please find the attached. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Hi, On 8/10/22 4:39 AM, Kyotaro Horiguchi wrote: > At Tue, 9 Aug 2022 09:53:19 -0700, Andres Freund <andres@anarazel.de> wrote in >> Could you try the pgstat_lock_entry_shared() approach? > Of course. Please find the attached. Thanks for the patch! It looks good to me. One nit comment though, instead of: + /* + * Take lwlock directly instead of using pg_stat_lock_entry_shared() + * which requires a reference. + */ what about? + /* + * Acquire the LWLock directly instead of using pg_stat_lock_entry_shared() + * which requires a reference. + */ I think that's more consistent with other comments mentioning LWLock acquisition. Regards, -- Bertrand Drouvot Amazon Web Services: https://aws.amazon.com
At Wed, 10 Aug 2022 14:02:34 +0200, "Drouvot, Bertrand" <bdrouvot@amazon.com> wrote in > what about? > > + /* > + * Acquire the LWLock directly instead of using > pg_stat_lock_entry_shared() > + * which requires a reference. > + */ > > > I think that's more consistent with other comments mentioning LWLock > acquisition. Sure. Thaks!. I did that in the attached. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Attachment
Hi, On 8/22/22 4:32 AM, Kyotaro Horiguchi wrote: > At Wed, 10 Aug 2022 14:02:34 +0200, "Drouvot, Bertrand" <bdrouvot@amazon.com> wrote in >> what about? >> >> + /* >> + * Acquire the LWLock directly instead of using >> pg_stat_lock_entry_shared() >> + * which requires a reference. >> + */ >> >> >> I think that's more consistent with other comments mentioning LWLock >> acquisition. > Sure. Thaks!. I did that in the attached. Thank you! The patch looks good to me. Regards, Bertrand
Hi, On 2022-08-22 11:32:14 +0900, Kyotaro Horiguchi wrote: > At Wed, 10 Aug 2022 14:02:34 +0200, "Drouvot, Bertrand" <bdrouvot@amazon.com> wrote in > > what about? > > > > + /* > > + * Acquire the LWLock directly instead of using > > pg_stat_lock_entry_shared() > > + * which requires a reference. > > + */ > > > > > > I think that's more consistent with other comments mentioning LWLock > > acquisition. > > Sure. Thaks!. I did that in the attached. Pushed, thanks! Greetings, Andres Freund