Re: Enhancing Memory Context Statistics Reporting - Mailing list pgsql-hackers

From torikoshia
Subject Re: Enhancing Memory Context Statistics Reporting
Date
Msg-id 2c54ef2a67a3da3a8bc15428ed85e58a@oss.nttdata.com
Whole thread Raw
In response to Re: Enhancing Memory Context Statistics Reporting  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers
On 2024-10-24 14:59, Rahila Syed wrote:
> Hi Torikoshia,
> 
> Thank you for reviewing the patch!
> 
> On Wed, Oct 23, 2024 at 9:28 AM torikoshia
> <torikoshia@oss.nttdata.com> wrote:
> 
>> On 2024-10-22 03:24, Rahila Syed wrote:
>>> Hi,
>>> 
>>> PostgreSQL provides following capabilities for reporting memory
>>> contexts statistics.
>>> 1. pg_get_backend_memory_contexts(); [1]
>>> 2. pg_log_backend_memory_contexts(pid); [2]
>>> 
>>> [1]  provides a view of memory context statistics for a local
>> backend,
>>> while [2] prints the memory context statistics of any backend or
>>> auxiliary
>>> process to the PostgreSQL logs. Although [1] offers detailed
>>> statistics,
>>> it is limited to the local backend, restricting its use to
>> PostgreSQL
>>> client backends only.
>>> On the other hand, [2] provides the statistics for all backends
>> but
>>> logs them in a file,
>>> which may not be convenient for quick access.
>>> 
>>> I propose enhancing memory context statistics reporting by
>> combining
>>> these
>>> capabilities and offering a view of memory statistics for all
>>> PostgreSQL backends
>>> and auxiliary processes.
>> 
>> Thanks for working on this!
>> 
>> I originally tried to develop something like your proposal in [2],
>> but
>> there were some difficulties and settled down to implement
>> pg_log_backend_memory_contexts().
> 
> Yes. I am revisiting this problem :)
> 
>>> Attached is a patch that implements this functionality. It
>> introduces
>>> a SQL function
>>> that takes the PID of a backend as an argument, returning a set of
>>> records,
>>> each containing statistics for a single memory context. The
>> underlying
>>> C function
>>> sends a signal to the backend and waits for it to publish its
>> memory
>>> context statistics
>>> before returning them to the user. The publishing backend copies
>>> these statistics
>>> during the next CHECK_FOR_INTERRUPTS call.
>> 
>> I remember waiting for dumping memory contexts stats could cause
>> trouble
>> considering some erroneous cases.
>> 
>> For example, just after the target process finished dumping stats,
>> pg_get_remote_backend_memory_contexts() caller is terminated before
>> reading the stats, calling pg_get_remote_backend_memory_contexts()
>> has
>> no response any more:
>> 
>> [session1]$ psql
>> (40699)=#
>> 
>> $ kill -s SIGSTOP 40699
>> 
>> [session2] psql
>> (40866)=# select * FROM
>> pg_get_remote_backend_memory_contexts('40699', false); -- waiting
>> 
>> $ kill -s SIGSTOP 40866
>> 
>> $ kill -s SIGCONT 40699
>> 
>> [session3] psql
>> (47656) $ select pg_terminate_backend(40866);
>> 
>> $ kill -s SIGCONT 40866 -- session2 terminated
>> 
>> [session3] (47656)=# select * FROM
>> pg_get_remote_backend_memory_contexts('47656', false); -- no
>> response
>> 
>> It seems the reason is memCtxState->in_use is now and
>> memCtxState->proc_id is 40699.
>> We can continue to use pg_get_remote_backend_memory_contexts() after
>> 
>> specifying 40699, but it'd be hard to understand for users.
> 
> Thanks for testing and reporting. While I am not able to reproduce
> this problem,
> I think this may be happening because the requesting backend/caller is
> terminated
> before it gets a chance to mark  memCtxState->in_use as false.

Yeah, when I attached a debugger to 47656 when it was waiting on 
pg_get_remote_backend_memory_contexts('47656', false), 
memCtxState->in_use was true as you suspected:

   (lldb) p memCtxState->in_use
   (bool) $1 = true
   (lldb) p memCtxState->proc_id
   (int) $2 = 40699
   (lldb) p pid
   (int) $3 = 47656

> In this case memCtxState->in_use should be marked as
> 'false' possibly during the processing of ProcDiePending in
> ProcessInterrupts().
> 
>>> This approach facilitates on-demand publication of memory
>> statistics
>>> for a specific backend, rather than collecting them at regular
>>> intervals.
>>> Since past memory context statistics may no longer be relevant,
>>> there is little value in retaining historical data. Any collected
>>> statistics
>>> can be discarded once read by the client backend.
>>> 
>>> A fixed-size shared memory block, currently accommodating 30
>> records,
>>> is used to store the statistics. This number was chosen
>> arbitrarily,
>>> as it covers all parent contexts at level 1 (i.e., direct
>> children of
>>> the top memory context)
>>> based on my tests.
>>> Further experiments are needed to determine the optimal number
>>> for summarizing memory statistics.
>>> 
>>> Any additional statistics that exceed the shared memory capacity
>>> are written to a file per backend in the PG_TEMP_FILES_DIR. The
>> client
>>> backend
>>> first reads from the shared memory, and if necessary, retrieves
>> the
>>> remaining data from the file,
>>> combining everything into a unified view. The files are cleaned up
>>> automatically
>>> if a backend crashes or during server restarts.
>>> 
>>> The statistics are reported in a breadth-first search order of the
>>> memory context tree,
>>> with parent contexts reported before their children. This
>> provides a
>>> cumulative summary
>>> before diving into the details of each child context's
>> consumption.
>>> 
>>> The rationale behind the shared memory chunk is to ensure that the
>>> majority of contexts which are the direct children of
>>> TopMemoryContext,
>>> fit into memory
>>> This allows a client to request a summary of memory statistics,
>>> which can be served from memory without the overhead of file
>> access,
>>> unless necessary.
>>> 
>>> A publishing backend signals waiting client backends using a
>> condition
>>> 
>>> variable when it has finished writing its statistics to memory.
>>> The client backend checks whether the statistics belong to the
>>> requested backend.
>>> If not, it continues waiting on the condition variable, timing out
>>> after 2 minutes.
>>> This timeout is an arbitrary choice, and further work is required
>> to
>>> determine
>>> a more practical value.
>>> 
>>> All backends use the same memory space to publish their
>> statistics.
>>> Before publishing, a backend checks whether the previous
>> statistics
>>> have been
>>> successfully read by a client using a shared flag, "in_use."
>>> This flag is set by the publishing backend and cleared by the
>> client
>>> backend once the data is read. If a backend cannot publish due to
>>> shared
>>> memory being occupied, it exits the interrupt processing code,
>>> and the client backend times out with a warning.
>>> 
>>> Please find below an example query to fetch memory contexts from
>> the
>>> backend
>>> with id '106114'. Second argument -'get_summary' is 'false',
>>> indicating a request for statistics of all the contexts.
>>> 
>>> postgres=#
>>> select * FROM pg_get_remote_backend_memory_contexts('116292',
>> false)
>>> LIMIT 2;
>>> -[ RECORD 1 ]-+----------------------
>>> name          | TopMemoryContext
>>> ident         |
>>> type          | AllocSet
>>> path          | {0}
>>> total_bytes   | 97696
>>> total_nblocks | 5
>>> free_bytes    | 15376
>>> free_chunks   | 11
>>> used_bytes    | 82320
>>> pid           | 116292
>>> -[ RECORD 2 ]-+----------------------
>>> name          | RowDescriptionContext
>>> ident         |
>>> type          | AllocSet
>>> path          | {0,1}
>>> total_bytes   | 8192
>>> total_nblocks | 1
>>> free_bytes    | 6912
>>> free_chunks   | 0
>>> used_bytes    | 1280
>>> pid           | 116292
>> 
>> 32d3ed8165f821f introduced 1-based path to
>> pg_backend_memory_contexts,
>> but pg_get_remote_backend_memory_contexts() seems to have 0-base
>> path.
> 
> Right. I will change it to match this commit.
> 
>> pg_backend_memory_contexts has "level" column, but
>> pg_get_remote_backend_memory_contexts doesn't.
>> 
>> Are there any reasons for these?
> 
>  No particular reason, I can add this column as well.
> 
> Thank you,
> Rahila Syed

-- 
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.



pgsql-hackers by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Re: Changing the default random_page_cost value
Next
From: Melanie Plageman
Date:
Subject: Docs Build in CI failing with "failed to load external entity"