Thread: Re: Enhancing Memory Context Statistics Reporting
On 2024-10-22 03:24, Rahila Syed wrote: > Hi, > > PostgreSQL provides following capabilities for reporting memory > contexts statistics. > 1. pg_get_backend_memory_contexts(); [1] > 2. pg_log_backend_memory_contexts(pid); [2] > > [1] provides a view of memory context statistics for a local backend, > while [2] prints the memory context statistics of any backend or > auxiliary > process to the PostgreSQL logs. Although [1] offers detailed > statistics, > it is limited to the local backend, restricting its use to PostgreSQL > client backends only. > On the other hand, [2] provides the statistics for all backends but > logs them in a file, > which may not be convenient for quick access. > > I propose enhancing memory context statistics reporting by combining > these > capabilities and offering a view of memory statistics for all > PostgreSQL backends > and auxiliary processes. Thanks for working on this! I originally tried to develop something like your proposal in [2], but there were some difficulties and settled down to implement pg_log_backend_memory_contexts(). > Attached is a patch that implements this functionality. It introduces > a SQL function > that takes the PID of a backend as an argument, returning a set of > records, > each containing statistics for a single memory context. The underlying > C function > sends a signal to the backend and waits for it to publish its memory > context statistics > before returning them to the user. The publishing backend copies > these statistics > during the next CHECK_FOR_INTERRUPTS call. I remember waiting for dumping memory contexts stats could cause trouble considering some erroneous cases. For example, just after the target process finished dumping stats, pg_get_remote_backend_memory_contexts() caller is terminated before reading the stats, calling pg_get_remote_backend_memory_contexts() has no response any more: [session1]$ psql (40699)=# $ kill -s SIGSTOP 40699 [session2] psql (40866)=# select * FROM pg_get_remote_backend_memory_contexts('40699', false); -- waiting $ kill -s SIGSTOP 40866 $ kill -s SIGCONT 40699 [session3] psql (47656) $ select pg_terminate_backend(40866); $ kill -s SIGCONT 40866 -- session2 terminated [session3] (47656)=# select * FROM pg_get_remote_backend_memory_contexts('47656', false); -- no response It seems the reason is memCtxState->in_use is now and memCtxState->proc_id is 40699. We can continue to use pg_get_remote_backend_memory_contexts() after specifying 40699, but it'd be hard to understand for users. > This approach facilitates on-demand publication of memory statistics > for a specific backend, rather than collecting them at regular > intervals. > Since past memory context statistics may no longer be relevant, > there is little value in retaining historical data. Any collected > statistics > can be discarded once read by the client backend. > > A fixed-size shared memory block, currently accommodating 30 records, > is used to store the statistics. This number was chosen arbitrarily, > as it covers all parent contexts at level 1 (i.e., direct children of > the top memory context) > based on my tests. > Further experiments are needed to determine the optimal number > for summarizing memory statistics. > > Any additional statistics that exceed the shared memory capacity > are written to a file per backend in the PG_TEMP_FILES_DIR. The client > backend > first reads from the shared memory, and if necessary, retrieves the > remaining data from the file, > combining everything into a unified view. The files are cleaned up > automatically > if a backend crashes or during server restarts. > > The statistics are reported in a breadth-first search order of the > memory context tree, > with parent contexts reported before their children. This provides a > cumulative summary > before diving into the details of each child context's consumption. > > The rationale behind the shared memory chunk is to ensure that the > majority of contexts which are the direct children of > TopMemoryContext, > fit into memory > This allows a client to request a summary of memory statistics, > which can be served from memory without the overhead of file access, > unless necessary. > > A publishing backend signals waiting client backends using a condition > > variable when it has finished writing its statistics to memory. > The client backend checks whether the statistics belong to the > requested backend. > If not, it continues waiting on the condition variable, timing out > after 2 minutes. > This timeout is an arbitrary choice, and further work is required to > determine > a more practical value. > > All backends use the same memory space to publish their statistics. > Before publishing, a backend checks whether the previous statistics > have been > successfully read by a client using a shared flag, "in_use." > This flag is set by the publishing backend and cleared by the client > backend once the data is read. If a backend cannot publish due to > shared > memory being occupied, it exits the interrupt processing code, > and the client backend times out with a warning. > > Please find below an example query to fetch memory contexts from the > backend > with id '106114'. Second argument -'get_summary' is 'false', > indicating a request for statistics of all the contexts. > > postgres=# > select * FROM pg_get_remote_backend_memory_contexts('116292', false) > LIMIT 2; > -[ RECORD 1 ]-+---------------------- > name | TopMemoryContext > ident | > type | AllocSet > path | {0} > total_bytes | 97696 > total_nblocks | 5 > free_bytes | 15376 > free_chunks | 11 > used_bytes | 82320 > pid | 116292 > -[ RECORD 2 ]-+---------------------- > name | RowDescriptionContext > ident | > type | AllocSet > path | {0,1} > total_bytes | 8192 > total_nblocks | 1 > free_bytes | 6912 > free_chunks | 0 > used_bytes | 1280 > pid | 116292 32d3ed8165f821f introduced 1-based path to pg_backend_memory_contexts, but pg_get_remote_backend_memory_contexts() seems to have 0-base path. pg_backend_memory_contexts has "level" column, but pg_get_remote_backend_memory_contexts doesn't. Are there any reasons for these? > TODO: > 1. Determine the behaviour when the statistics don't fit in one file. > > [1] PostgreSQL: Re: Creating a function for exposing memory usage of > backend process [1] > > [2] PostgreSQL: Re: Get memory contexts of an arbitrary backend > process [2] > > Thank you, > Rahila Syed > > > > Links: > ------ > [1] > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2F0a768ae1-1703-59c7-86cc-7068ff5e318c%2540oss.nttdata.com&data=05%7C02%7Csyedrahila%40microsoft.com%7C3b35e97c29cf4796042408dcee8a4dbb%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638647525436604911%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=cbO2DBP6IsgMPTEVFNh%2FKeq4IoK3MZvTpzKkCQzNPMo%3D&reserved=0 > [2] > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql.org%2Fmessage-id%2Fbea016ad-d1a7-f01d-a7e8-01106a1de77f%2540oss.nttdata.com&data=05%7C02%7Csyedrahila%40microsoft.com%7C3b35e97c29cf4796042408dcee8a4dbb%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638647525436629740%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=UCwkwg6kikVEf0oHf3%2BlliA%2FTUdMG%2F0cOiMta7fjPPk%3D&reserved=0 -- Regards, -- Atsushi Torikoshi Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.
Hi Torikoshia,
Thank you for reviewing the patch!
On Wed, Oct 23, 2024 at 9:28 AM torikoshia <torikoshia@oss.nttdata.com> wrote:
On 2024-10-22 03:24, Rahila Syed wrote:
> Hi,
>
> PostgreSQL provides following capabilities for reporting memory
> contexts statistics.
> 1. pg_get_backend_memory_contexts(); [1]
> 2. pg_log_backend_memory_contexts(pid); [2]
>
> [1] provides a view of memory context statistics for a local backend,
> while [2] prints the memory context statistics of any backend or
> auxiliary
> process to the PostgreSQL logs. Although [1] offers detailed
> statistics,
> it is limited to the local backend, restricting its use to PostgreSQL
> client backends only.
> On the other hand, [2] provides the statistics for all backends but
> logs them in a file,
> which may not be convenient for quick access.
>
> I propose enhancing memory context statistics reporting by combining
> these
> capabilities and offering a view of memory statistics for all
> PostgreSQL backends
> and auxiliary processes.
Thanks for working on this!
I originally tried to develop something like your proposal in [2], but
there were some difficulties and settled down to implement
pg_log_backend_memory_contexts().
Yes. I am revisiting this problem :)
> Attached is a patch that implements this functionality. It introduces
> a SQL function
> that takes the PID of a backend as an argument, returning a set of
> records,
> each containing statistics for a single memory context. The underlying
> C function
> sends a signal to the backend and waits for it to publish its memory
> context statistics
> before returning them to the user. The publishing backend copies
> these statistics
> during the next CHECK_FOR_INTERRUPTS call.
I remember waiting for dumping memory contexts stats could cause trouble
considering some erroneous cases.
For example, just after the target process finished dumping stats,
pg_get_remote_backend_memory_contexts() caller is terminated before
reading the stats, calling pg_get_remote_backend_memory_contexts() has
no response any more:
[session1]$ psql
(40699)=#
$ kill -s SIGSTOP 40699
[session2] psql
(40866)=# select * FROM
pg_get_remote_backend_memory_contexts('40699', false); -- waiting
$ kill -s SIGSTOP 40866
$ kill -s SIGCONT 40699
[session3] psql
(47656) $ select pg_terminate_backend(40866);
$ kill -s SIGCONT 40866 -- session2 terminated
[session3] (47656)=# select * FROM
pg_get_remote_backend_memory_contexts('47656', false); -- no response
It seems the reason is memCtxState->in_use is now and
memCtxState->proc_id is 40699.
We can continue to use pg_get_remote_backend_memory_contexts() after
specifying 40699, but it'd be hard to understand for users.
Thanks for testing and reporting. While I am not able to reproduce this problem,
I think this may be happening because the requesting backend/caller is terminated
before it gets a chance to mark memCtxState->in_use as false.
In this case memCtxState->in_use should be marked as
'false' possibly during the processing of ProcDiePending in
ProcessInterrupts().
> This approach facilitates on-demand publication of memory statistics
> for a specific backend, rather than collecting them at regular
> intervals.
> Since past memory context statistics may no longer be relevant,
> there is little value in retaining historical data. Any collected
> statistics
> can be discarded once read by the client backend.
>
> A fixed-size shared memory block, currently accommodating 30 records,
> is used to store the statistics. This number was chosen arbitrarily,
> as it covers all parent contexts at level 1 (i.e., direct children of
> the top memory context)
> based on my tests.
> Further experiments are needed to determine the optimal number
> for summarizing memory statistics.
>
> Any additional statistics that exceed the shared memory capacity
> are written to a file per backend in the PG_TEMP_FILES_DIR. The client
> backend
> first reads from the shared memory, and if necessary, retrieves the
> remaining data from the file,
> combining everything into a unified view. The files are cleaned up
> automatically
> if a backend crashes or during server restarts.
>
> The statistics are reported in a breadth-first search order of the
> memory context tree,
> with parent contexts reported before their children. This provides a
> cumulative summary
> before diving into the details of each child context's consumption.
>
> The rationale behind the shared memory chunk is to ensure that the
> majority of contexts which are the direct children of
> TopMemoryContext,
> fit into memory
> This allows a client to request a summary of memory statistics,
> which can be served from memory without the overhead of file access,
> unless necessary.
>
> A publishing backend signals waiting client backends using a condition
>
> variable when it has finished writing its statistics to memory.
> The client backend checks whether the statistics belong to the
> requested backend.
> If not, it continues waiting on the condition variable, timing out
> after 2 minutes.
> This timeout is an arbitrary choice, and further work is required to
> determine
> a more practical value.
>
> All backends use the same memory space to publish their statistics.
> Before publishing, a backend checks whether the previous statistics
> have been
> successfully read by a client using a shared flag, "in_use."
> This flag is set by the publishing backend and cleared by the client
> backend once the data is read. If a backend cannot publish due to
> shared
> memory being occupied, it exits the interrupt processing code,
> and the client backend times out with a warning.
>
> Please find below an example query to fetch memory contexts from the
> backend
> with id '106114'. Second argument -'get_summary' is 'false',
> indicating a request for statistics of all the contexts.
>
> postgres=#
> select * FROM pg_get_remote_backend_memory_contexts('116292', false)
> LIMIT 2;
> -[ RECORD 1 ]-+----------------------
> name | TopMemoryContext
> ident |
> type | AllocSet
> path | {0}
> total_bytes | 97696
> total_nblocks | 5
> free_bytes | 15376
> free_chunks | 11
> used_bytes | 82320
> pid | 116292
> -[ RECORD 2 ]-+----------------------
> name | RowDescriptionContext
> ident |
> type | AllocSet
> path | {0,1}
> total_bytes | 8192
> total_nblocks | 1
> free_bytes | 6912
> free_chunks | 0
> used_bytes | 1280
> pid | 116292
32d3ed8165f821f introduced 1-based path to pg_backend_memory_contexts,
but pg_get_remote_backend_memory_contexts() seems to have 0-base path.
Right. I will change it to match this commit.
pg_backend_memory_contexts has "level" column, but
pg_get_remote_backend_memory_contexts doesn't.
Are there any reasons for these?
No particular reason, I can add this column as well.
Thank you,
Rahila Syed
On 2024-10-24 14:59, Rahila Syed wrote: > Hi Torikoshia, > > Thank you for reviewing the patch! > > On Wed, Oct 23, 2024 at 9:28 AM torikoshia > <torikoshia@oss.nttdata.com> wrote: > >> On 2024-10-22 03:24, Rahila Syed wrote: >>> Hi, >>> >>> PostgreSQL provides following capabilities for reporting memory >>> contexts statistics. >>> 1. pg_get_backend_memory_contexts(); [1] >>> 2. pg_log_backend_memory_contexts(pid); [2] >>> >>> [1] provides a view of memory context statistics for a local >> backend, >>> while [2] prints the memory context statistics of any backend or >>> auxiliary >>> process to the PostgreSQL logs. Although [1] offers detailed >>> statistics, >>> it is limited to the local backend, restricting its use to >> PostgreSQL >>> client backends only. >>> On the other hand, [2] provides the statistics for all backends >> but >>> logs them in a file, >>> which may not be convenient for quick access. >>> >>> I propose enhancing memory context statistics reporting by >> combining >>> these >>> capabilities and offering a view of memory statistics for all >>> PostgreSQL backends >>> and auxiliary processes. >> >> Thanks for working on this! >> >> I originally tried to develop something like your proposal in [2], >> but >> there were some difficulties and settled down to implement >> pg_log_backend_memory_contexts(). > > Yes. I am revisiting this problem :) > >>> Attached is a patch that implements this functionality. It >> introduces >>> a SQL function >>> that takes the PID of a backend as an argument, returning a set of >>> records, >>> each containing statistics for a single memory context. The >> underlying >>> C function >>> sends a signal to the backend and waits for it to publish its >> memory >>> context statistics >>> before returning them to the user. The publishing backend copies >>> these statistics >>> during the next CHECK_FOR_INTERRUPTS call. >> >> I remember waiting for dumping memory contexts stats could cause >> trouble >> considering some erroneous cases. >> >> For example, just after the target process finished dumping stats, >> pg_get_remote_backend_memory_contexts() caller is terminated before >> reading the stats, calling pg_get_remote_backend_memory_contexts() >> has >> no response any more: >> >> [session1]$ psql >> (40699)=# >> >> $ kill -s SIGSTOP 40699 >> >> [session2] psql >> (40866)=# select * FROM >> pg_get_remote_backend_memory_contexts('40699', false); -- waiting >> >> $ kill -s SIGSTOP 40866 >> >> $ kill -s SIGCONT 40699 >> >> [session3] psql >> (47656) $ select pg_terminate_backend(40866); >> >> $ kill -s SIGCONT 40866 -- session2 terminated >> >> [session3] (47656)=# select * FROM >> pg_get_remote_backend_memory_contexts('47656', false); -- no >> response >> >> It seems the reason is memCtxState->in_use is now and >> memCtxState->proc_id is 40699. >> We can continue to use pg_get_remote_backend_memory_contexts() after >> >> specifying 40699, but it'd be hard to understand for users. > > Thanks for testing and reporting. While I am not able to reproduce > this problem, > I think this may be happening because the requesting backend/caller is > terminated > before it gets a chance to mark memCtxState->in_use as false. Yeah, when I attached a debugger to 47656 when it was waiting on pg_get_remote_backend_memory_contexts('47656', false), memCtxState->in_use was true as you suspected: (lldb) p memCtxState->in_use (bool) $1 = true (lldb) p memCtxState->proc_id (int) $2 = 40699 (lldb) p pid (int) $3 = 47656 > In this case memCtxState->in_use should be marked as > 'false' possibly during the processing of ProcDiePending in > ProcessInterrupts(). > >>> This approach facilitates on-demand publication of memory >> statistics >>> for a specific backend, rather than collecting them at regular >>> intervals. >>> Since past memory context statistics may no longer be relevant, >>> there is little value in retaining historical data. Any collected >>> statistics >>> can be discarded once read by the client backend. >>> >>> A fixed-size shared memory block, currently accommodating 30 >> records, >>> is used to store the statistics. This number was chosen >> arbitrarily, >>> as it covers all parent contexts at level 1 (i.e., direct >> children of >>> the top memory context) >>> based on my tests. >>> Further experiments are needed to determine the optimal number >>> for summarizing memory statistics. >>> >>> Any additional statistics that exceed the shared memory capacity >>> are written to a file per backend in the PG_TEMP_FILES_DIR. The >> client >>> backend >>> first reads from the shared memory, and if necessary, retrieves >> the >>> remaining data from the file, >>> combining everything into a unified view. The files are cleaned up >>> automatically >>> if a backend crashes or during server restarts. >>> >>> The statistics are reported in a breadth-first search order of the >>> memory context tree, >>> with parent contexts reported before their children. This >> provides a >>> cumulative summary >>> before diving into the details of each child context's >> consumption. >>> >>> The rationale behind the shared memory chunk is to ensure that the >>> majority of contexts which are the direct children of >>> TopMemoryContext, >>> fit into memory >>> This allows a client to request a summary of memory statistics, >>> which can be served from memory without the overhead of file >> access, >>> unless necessary. >>> >>> A publishing backend signals waiting client backends using a >> condition >>> >>> variable when it has finished writing its statistics to memory. >>> The client backend checks whether the statistics belong to the >>> requested backend. >>> If not, it continues waiting on the condition variable, timing out >>> after 2 minutes. >>> This timeout is an arbitrary choice, and further work is required >> to >>> determine >>> a more practical value. >>> >>> All backends use the same memory space to publish their >> statistics. >>> Before publishing, a backend checks whether the previous >> statistics >>> have been >>> successfully read by a client using a shared flag, "in_use." >>> This flag is set by the publishing backend and cleared by the >> client >>> backend once the data is read. If a backend cannot publish due to >>> shared >>> memory being occupied, it exits the interrupt processing code, >>> and the client backend times out with a warning. >>> >>> Please find below an example query to fetch memory contexts from >> the >>> backend >>> with id '106114'. Second argument -'get_summary' is 'false', >>> indicating a request for statistics of all the contexts. >>> >>> postgres=# >>> select * FROM pg_get_remote_backend_memory_contexts('116292', >> false) >>> LIMIT 2; >>> -[ RECORD 1 ]-+---------------------- >>> name | TopMemoryContext >>> ident | >>> type | AllocSet >>> path | {0} >>> total_bytes | 97696 >>> total_nblocks | 5 >>> free_bytes | 15376 >>> free_chunks | 11 >>> used_bytes | 82320 >>> pid | 116292 >>> -[ RECORD 2 ]-+---------------------- >>> name | RowDescriptionContext >>> ident | >>> type | AllocSet >>> path | {0,1} >>> total_bytes | 8192 >>> total_nblocks | 1 >>> free_bytes | 6912 >>> free_chunks | 0 >>> used_bytes | 1280 >>> pid | 116292 >> >> 32d3ed8165f821f introduced 1-based path to >> pg_backend_memory_contexts, >> but pg_get_remote_backend_memory_contexts() seems to have 0-base >> path. > > Right. I will change it to match this commit. > >> pg_backend_memory_contexts has "level" column, but >> pg_get_remote_backend_memory_contexts doesn't. >> >> Are there any reasons for these? > > No particular reason, I can add this column as well. > > Thank you, > Rahila Syed -- Regards, -- Atsushi Torikoshi Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.