Re: Creating a function for exposing memory usage of backend process - Mailing list pgsql-hackers
From: Fujii Masao
Subject: Re: Creating a function for exposing memory usage of backend process
Date:
Msg-id: 398eb7cd-5f75-be1c-3897-36865b7bfabf@oss.nttdata.com
In response to: Creating a function for exposing memory usage of backend process (torikoshia <torikoshia@oss.nttdata.com>)
Responses: Re: Creating a function for exposing memory usage of backend process
           Re: Creating a function for exposing memory usage of backend process
List: pgsql-hackers
On 2020/06/17 22:00, torikoshia wrote:
> Hi,
>
> As you may know better than I do, backend processes sometimes use a lot
> of memory for various reasons, such as caches, prepared statements and
> cursors.
> When supporting PostgreSQL, I sometimes face situations where I have to
> investigate the cause of memory bloat.
>
> AFAIK, the way to examine it is to attach a debugger and call
> MemoryContextStats(TopMemoryContext); however, I find that difficult
> for a few reasons:
>
> - some production environments don't allow us to run a debugger easily
> - the many lines of memory context output are hard to analyze

Agreed. A feature to view how local memory contexts are used in each
process would be very useful!

> Using an extension (pg_stat_get_memory_context() in pg_cheat_funcs[1]),
> we can get a view of the memory contexts, but only those of the backend
> that executed pg_stat_get_memory_context().
>
>
> [user interface]
> If we had a function exposing the memory contexts of a specified PID,
> we could easily examine them.
> I imagine a user interface something like this:
>
> =# SELECT * FROM pg_stat_get_backend_memory_context(PID);

I'm afraid that this interface is not convenient when we want to monitor
the usage of local memory contexts across all processes. For example,
I'd like to monitor how much memory is used in total to store prepared
statement information.

For that purpose, I wonder whether it's more convenient to provide a
view displaying the memory context usage of all processes. To provide
that view, every process would need to save its local memory context
usage into shared memory, or into special files, at some convenient
timing. For example, backends could do that at the end of every query
execution (while waiting for the next request from the client). The
query on the view would then scan and display all that information. Of
course, there would be several issues with this idea.
One issue is the performance overhead caused when each process stores
its own memory context usage somewhere. Even if backends do that while
waiting for the next client request, non-negligible overhead might
occur, so a performance test is necessary. This also means that we
cannot see the memory context usage of a process in the middle of query
execution, since it's saved only at the end of the query. If local
memory bloat occurs only during query execution and we want to
investigate it, we would still need to use gdb to output the memory
context information.

Another issue is that a large amount of shared memory might be necessary
to save the memory context usage of all processes. We could save the
usage information into files instead, but that would cause more
overhead. If we use shared memory, a parameter similar to
track_activity_query_size might be necessary; that is, backends would
save only the specified number of memory context entries, and if it's
zero, the feature would be disabled. We should also reduce the amount of
information to save. For example, instead of saving all the memory
context information that MemoryContextStats() prints, it might be better
to save summary stats (per memory context type) derived from them.

>            name           |       parent       | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes | some other attributes..
> --------------------------+--------------------+-------+-------------+---------------+------------+-------------+------------
>  TopMemoryContext         |                    |     0 |       68720 |             5 |       9936 |          16 |      58784
>  TopTransactionContext    | TopMemoryContext   |     1 |        8192 |             1 |       7720 |           0 |        472
>  PL/pgSQL function        | TopMemoryContext   |     1 |       16384 |             2 |       5912 |           1 |      10472
>  PL/pgSQL function        | TopMemoryContext   |     1 |       32768 |             3 |      15824 |           3 |      16944
>  dynahash                 | TopMemoryContext   |     1 |        8192 |             1 |        512 |           0 |       7680
>  ...
> [rough implementation ideas and challenges]
> I suppose communication between the process which runs
> pg_stat_get_backend_memory_context() (referred to as A) and the
> target backend (referred to as B) works like this:
>
> 1. A sends a message to B, ordering it to dump its memory contexts
> 2. B dumps its memory contexts to some shared area
> 3. A reads the shared area and returns it to the function invoker
>
> To do so, there seem to be some challenges.
>
> (1) how to share memory context information between backend processes
> The amount of memory context data varies greatly depending on the
> situation, so it's not appropriate to share the memory contexts using
> fixed-size shared memory.
> Also, using files under 'stats_temp_directory' seems difficult given
> the background of the shared-memory based stats collector proposal[2].
> Instead, I'm thinking about using dsm_mq, which allows messages of
> arbitrary length to be sent and received.
>
> (2) how to send messages requesting memory contexts
> Communicating via a signal seems simple, but assigning a dedicated
> signal number for this purpose seems wasteful.
> I'm thinking about using fixed shared memory to hold a dsm_mq handle.
> To send a message, A creates a dsm_mq and puts its handle in the shared
> memory area. When B finds a handle, B dumps its memory contexts to the
> corresponding dsm_mq.
>
> However, enabling B to find the handle requires checking the shared
> memory periodically. I'm not sure about the suitable location and
> timing for this check yet, and I doubt this way of communication is
> acceptable because it adds a certain load to all the backends.
>
> (3) clarifying the necessary attributes
> As far as I can tell from the past discussion[3], it's not so clear
> what kind of information should be exposed regarding memory contexts.
>
>
> As a first step, to deal with (3) I'd like to add
> pg_stat_get_backend_memory_context(), whose target is limited to the
> local backend process.
+1

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION