Re: Enhancing Memory Context Statistics Reporting - Mailing list pgsql-hackers
| From | torikoshia |
|---|---|
| Subject | Re: Enhancing Memory Context Statistics Reporting |
| Date | |
| Msg-id | c31942268d228c69cd457a503de6926b@oss.nttdata.com Whole thread Raw |
| In response to | Re: Enhancing Memory Context Statistics Reporting (Rahila Syed <rahilasyed90@gmail.com>) |
| Responses |
Re: Enhancing Memory Context Statistics Reporting
|
| List | pgsql-hackers |
On 2025-08-14 07:35, Rahila Syed wrote:
> Hi Torikoshia,
>
> Thank you for reviewing the patch.
>
>> This function is added at the end of Table "9.96. Server Signaling
>> Functions", but since pg_get_process_memory_contexts outputs
>> essentially
>> the same information as pg_log_backend_memory_contexts, it might be
>> better to place them next to each other in the table.
>
> The idea was to place the new addition at the end of the table instead
> of in the middle.
> I’m fine with putting them together, though. I’ll do that in the
> next version unless there’s a
> reason not to.
>
>>> + <parameter>stats_timestamp</parameter>
>> <type>timestamptz</type> )
>>
>>> +typedef struct MemoryStatsDSHashEntry
>>> +{
>>> + char key[64];
>>> + ConditionVariable memcxt_cv;
>>> + int proc_id;
>>> + int total_stats;
>>> + bool summary;
>>> + dsa_pointer memstats_dsa_pointer;
>>> + TimestampTz stats_timestamp;
>>> +} MemoryStatsDSHashEntry;
>>
>> stats_timestamp appears only in the two places below in the patch,
>> but
>> it does not seem to be actually output.
>> Is this column unnecessary?
>
> Thank you for pointing this out. This is removed in the attached
> patch, as it was a
> remnant from the previous design. As old statistics are discarded in
> the new design,
> a timestamp field is not needed anymore.
>
>> Specifying 0 for timeout causes a crash:
>> Should 0 be handled safely and treated as “no timeout”, or
>> rejected as
>> an error?
>
> Good catch.
> The crash has been resolved in the attached patch. It was caused by a
> missing
> ConditionVariableCancelSleep() call when exiting without statistics
> due to a timeout value of 0.
> A 0 timeout means that statistics should only be retrieved if they are
> immediately available,
> without waiting. We could exit with a warning/error saying "too low
> timeout", but I think it's worthwhile
> to try fetching the statistics if possible.
>
>> Similarly, specifying a negative value for timeout still works:
>>
>> =# select * from pg_get_process_memory_contexts(30590, true,
>> -10);
>>
>> It might be better to reject negative values similar to
>> pg_terminate_backend().
>
> Fixed as suggested by you in the attached patch.
> Currently, negative values are interpreted as an indefinite wait for
> statistics.
> This could cause the client to hang if the server process exits
> without providing statistics.
> To avoid this, it would be better to exit after displaying a warning
> when the user specifies
> negative timeouts.
>
>>> + /* Retreive the client key fo publishing statistics */
>>
>> fo -> for?
>
> Fixed.
>
>>> + */
>>> +#define MEMSTATS_WAIT_TIMEOUT 100
>>
>> MEMSTATS_WAIT_TIMEOUT is defined, but it doesn’t seem to be used.
>
> This is removed now as it was a leftover from the previous design.
>
> The attached patch also fixes an assertion failure I observed when a
> client times out
> before the last requested process can publish its statistics. A client
> frees the memory
> reserved for storing the statistics when it exits the function after
> timeout. Since a
> server process was notified, it might attempt to read the same client
> entry and access the dsa
> memory reserved for statistics resulting in the assertion
> failure. I resolved this by including a check for this scenario and
> then exiting the handler
> function accordingly.
Thanks for updating the patch!
However, when I ran pg_get_process_memory_contexts() with summary =
true, it took a while and returned nothing:
=# select pg_get_process_memory_contexts(pg_backend_pid(), true, 1)
from pg_stat_activity ;
pg_get_process_memory_contexts
--------------------------------
(0 rows)
Time: 6026.291 ms (00:06.026)
Since v32 patch quickly returned the memory contexts as expected with
the same parameter specified, there seems to be some degradation. Could
you check it?
--
Regards,
--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.
pgsql-hackers by date: