Re: Enhancing Memory Context Statistics Reporting - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: Enhancing Memory Context Statistics Reporting
Msg-id CAH2L28vAxcePsqV1AbjYeU4QAojyXS5M39d5MVGCSgAcoNybkQ@mail.gmail.com
In response to Re: Enhancing Memory Context Statistics Reporting  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
Hi,

On Sat, Jan 25, 2025 at 3:50 AM Tomas Vondra <tomas@vondra.me> wrote:


On 1/24/25 14:47, Rahila Syed wrote:
>
> Hi,
>
>
>     Just an idea; as another option, how about blocking new requests to
>     the target process (e.g., causing them to fail with an error or
>     returning NULL with a warning) if a previous request is still pending?
>     Users can simply retry the request if it fails. IMO failing quickly
>     seems preferable to getting stuck for a while in cases with concurrent
>     requests.
>
> Thank you for the suggestion. I agree that it is better to fail
> early and avoid waiting for a timeout in such cases. I will add a
> "pending request" tracker for this in shared memory. This approach
> will help prevent sending a concurrent request if a request for the
> same backend is still being processed.
>

AFAIK these failures should be extremely rare - we're only talking about
that because the workload I used for testing is highly concurrent, i.e.
it requests memory context info extremely often. I doubt anyone sane is
going to do that in practice ...
Yes, that makes sense. 
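As an aside, the "pending request" tracker I mentioned above could be as
simple as one flag per backend in shared memory, so a requester either claims
the slot or fails fast. A rough sketch follows; the struct and function names
are purely illustrative and not taken from the patch:

#include "postgres.h"
#include "port/atomics.h"

typedef struct MemCtxRequestSlot
{
    pg_atomic_flag  request_pending;    /* is a request already in flight? */
} MemCtxRequestSlot;

/*
 * Hypothetical shared array, one entry per backend, each flag initialized
 * at shared memory creation with pg_atomic_init_flag().
 */
static MemCtxRequestSlot *MemCtxRequestSlots;

/*
 * Try to claim the target backend's slot.  Returns false if another request
 * is still pending, so the caller can error out (or warn) immediately
 * instead of blocking until a timeout.
 */
static bool
mcxt_try_start_request(int target_proc_number)
{
    return pg_atomic_test_set_flag(
        &MemCtxRequestSlots[target_proc_number].request_pending);
}

/* Called once the target backend has published its statistics. */
static void
mcxt_finish_request(int target_proc_number)
{
    pg_atomic_clear_flag(
        &MemCtxRequestSlots[target_proc_number].request_pending);
}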
 
> IMO, one downside of throwing an error in such cases is that the
> users might wonder whether they need to take corrective action, even
> though the issue is actually going to resolve itself and they just
> need to retry. Therefore, issuing a warning or displaying previously
> updated statistics might be a better alternative to throwing an
> error.
>

Wouldn't this be mostly mitigated by adding proper detail/hint to the
error message? Sure, the user can always ignore that (especially when
calling this from a script), but well ... we can only do so much.
 
 OK.
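For what it's worth, if we go with an error, something along these lines
should make the "just retry" expectation explicit. The error code and wording
are only a suggestion, and target_pid stands in for the PID passed to the
function; this is not what the patch currently does:

    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_IN_USE),
             errmsg("memory context statistics for PID %d are already being collected",
                    (int) target_pid),
             errdetail("Another request for this backend is still being processed."),
             errhint("This condition is transient; retry the function call.")));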

All this makes me think about how we shared pgstat data before the shmem
approach was introduced in PG15. Until then the process signaled pgstat
collector, and the collector wrote the statistics into a file, with a
timestamp. And the process used the timestamp to decide if it's fresh
enough ... Wouldn't the same approach work here?

I imagined it would work something like this:

requesting backend:
-------------------
* set request_ts to current timestamp
* signal the target process, to generate memory context info
* wait until the DSA gets filled with stats_ts > request_ts
* return the data, don't erase anything

target backend
--------------
* clear the signal
* generate the statistics
* set stats_ts to current timestamp
* wake all the backends waiting for the stats (through CV)

I see v11 does almost this, except that it accepts somewhat stale data.
That's correct.
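For clarity, the shared state and the publishing side of that handshake could
look roughly like this. The struct and function names are only illustrative
and are not v11 code; locking around stats_ts is elided for brevity:

#include "postgres.h"
#include "storage/condition_variable.h"
#include "utils/timestamp.h"

typedef struct MemCtxSharedStats
{
    ConditionVariable   stats_cv;   /* broadcast when new stats are published */
    TimestampTz         stats_ts;   /* when the statistics were published */
    /* the statistics themselves live in DSA, referenced from here */
} MemCtxSharedStats;

/*
 * Target backend, after handling the signal and filling the DSA with its
 * memory context statistics.
 */
static void
mcxt_publish_stats(MemCtxSharedStats *shared)
{
    shared->stats_ts = GetCurrentTimestamp();
    ConditionVariableBroadcast(&shared->stats_cv);  /* wake all waiters */
}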
 
But why would that be necessary? I don't think it's needed, and I don't
think we should accept data from before the process sends the signal.

This is done in an attempt to prevent concurrent requests from timing out.
In such cases, the data produced in response to another request is likely to
already be in the dynamic shared memory. Hence, instead of waiting for the
latest data and risking a timeout, the function displays any available
statistics that are newer than a defined threshold. Additionally, since we
can't distinguish between sequential and concurrent requests, we accept
somewhat stale data for all requests.

I realize this approach has some issues, mainly regarding how to determine
an appropriate threshold, i.e. a limit on how stale the data may be.

Therefore, I agree that it makes sense to display only the data that is
published after the request is made. If such data can't be published in time
due to concurrent requests or other delays, the function should detect this
and return as soon as possible.
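Concretely, on the requesting side (continuing the illustrative
MemCtxSharedStats sketch above) the wait could be bounded with
ConditionVariableTimedSleep(), so the function reports the problem and
returns instead of blocking indefinitely. The timeout value, wait event, and
function name below are placeholders, not v11 code:

#include "utils/wait_event.h"

#define MEMCTX_STATS_TIMEOUT_MS 5000    /* give up after 5 seconds */

static bool
mcxt_wait_for_fresh_stats(MemCtxSharedStats *shared, TimestampTz request_ts)
{
    ConditionVariablePrepareToSleep(&shared->stats_cv);

    /* Accept only data published after we signaled the target process. */
    while (shared->stats_ts <= request_ts)
    {
        /* ConditionVariableTimedSleep() returns true if the sleep timed out */
        if (ConditionVariableTimedSleep(&shared->stats_cv,
                                        MEMCTX_STATS_TIMEOUT_MS,
                                        WAIT_EVENT_MQ_INTERNAL))
        {
            ConditionVariableCancelSleep();
            ereport(WARNING,
                    (errmsg("timed out waiting for memory context statistics"),
                     errhint("The target backend may be busy; retry the call.")));
            return false;       /* caller returns NULL right away */
        }
    }

    ConditionVariableCancelSleep();
    return true;                /* fresh statistics are available */
}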


Thank you,
Rahila Syed
