Thread: Creating a function for exposing memory usage of backend process
Hi, As you may know better than I do, backend processes sometimes use a lot of memory because of the various reasons like caches, prepared statements and cursors. When supporting PostgreSQL, I face situations for investigating the reason of memory bloat. AFAIK, the way to examine it is attaching a debugger and call MemoryContextStats(TopMemoryContext), however, I feel some difficulties doing it: - some production environments don't allow us to run a debugger easily - many lines about memory contexts are hard to analyze Using an extension(pg_stat_get_memory_context() in pg_cheat_funcs[1]), we can get the view of the memory contexts, but it's the memory contexts of the backend which executed the pg_stat_get_memory_context(). [user interface] If we have a function exposing memory contexts for specified PID, we can easily examine them. I imagine a user interface something like this: =# SELECT * FROM pg_stat_get_backend_memory_context(PID); name | parent | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes | some other attibutes.. --------------------------+--------------------+-------+-------------+---------------+------------+-------------+------------ TopMemoryContext | | 0 | 68720 | 5 | 9936 | 16 | 58784 TopTransactionContext | TopMemoryContext | 1 | 8192 | 1 | 7720 | 0 | 472 PL/pgSQL function | TopMemoryContext | 1 | 16384 | 2 | 5912 | 1 | 10472 PL/pgSQL function | TopMemoryContext | 1 | 32768 | 3 | 15824 | 3 | 16944 dynahash | TopMemoryContext | 1 | 8192 | 1 | 512 | 0 | 7680 ... [rough implementation ideas and challenges] I suppose communication between a process which runs pg_stat_get_backend_memory_context()(referred to as A) and target backend(referred to as B) is like: 1. A sends a message to B and order to dump the memory contexts 2. B dumps its memory contexts to some shared area 3. A reads the shared area and returns it to the function invoker To do so, there seem some challenges. (1) how to share memory contexts information between backend processes The amount of memory contexts greatly varies depending on the situation, so it's not appropriate to share the memory contexts using fixed shared memory. Also using the file on 'stats_temp_directory' seems difficult thinking about the background of the shared-memory based stats collector proposal[2]. Instead, I'm thinking about using dsm_mq, which allows messages of arbitrary length to be sent and receive. (2) how to send messages wanting memory contexts Communicating via signal seems simple but assigning a specific number of signal for this purpose seems wasting. I'm thinking about using fixed shared memory to put dsm_mq handle. To send a message, A creates a dsm_mq and put the handle on the shared memory area. When B founds a handle, B dumps the memory contexts to the corresponding dsm_mq. However, enabling B to find the handle needs to check the shared memory periodically. I'm not sure the suitable location and timing for this checking yet, and doubt this way of communication is acceptable because it gives certain additional loads to all the backends. (3) clarifying the necessary attributes As far as reading the past disucussion[3], it's not so clear what kind of information should be exposed regarding memory contexts. As a first step, to deal with (3) I'd like to add pg_stat_get_backend_memory_context() which target is limited to the local backend process. Thanks for reading and how do you think? [1] https://github.com/MasaoFujii/pg_cheat_funcs#setof-record-pg_stat_get_memory_context [2] https://www.postgresql.org/message-id/flat/20180629.173418.190173462.horiguchi.kyotaro@lab.ntt.co.jp [3] https://www.postgresql.org/message-id/20190805171608.g22gxwmfr2r7uf6t%40alap3.anarazel.de Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2020/06/17 22:00, torikoshia wrote: > Hi, > > As you may know better than I do, backend processes sometimes use a lot > of memory because of the various reasons like caches, prepared > statements and cursors. > When supporting PostgreSQL, I face situations for investigating the > reason of memory bloat. > > AFAIK, the way to examine it is attaching a debugger and call > MemoryContextStats(TopMemoryContext), however, I feel some difficulties > doing it: > > - some production environments don't allow us to run a debugger easily > - many lines about memory contexts are hard to analyze Agreed. The feature to view how local memory contexts are used in each process is very useful! > Using an extension(pg_stat_get_memory_context() in pg_cheat_funcs[1]), > we can get the view of the memory contexts, but it's the memory contexts > of the backend which executed the pg_stat_get_memory_context(). > > > [user interface] > If we have a function exposing memory contexts for specified PID, > we can easily examine them. > I imagine a user interface something like this: > > =# SELECT * FROM pg_stat_get_backend_memory_context(PID); I'm afraid that this interface is not convenient when we want to monitor the usages of local memory contexts for all the processes. For example, I'd like to monitor how much memory is totally used to store prepared statements information. For that purpose, I wonder if it's more convenient to provide the view displaying the memory context usages for all the processes. To provide that view, all the processes need to save their local memory context usages into the shared memory or the special files in their convenient timing. For example, backends do that every end of query execution (during waiting for new request from clients). OTOH, the query on the view scans and displays all those information. Of course there would be several issues in this idea. One issue is the performance overhead caused when each process stores its own memory context usage to somewhere. Even if backends do that during waiting for new request from clients, non-negligible overhead might happen. Performance test is necessary. Also this means that we cannot see the memory context usage of the process in the middle of query execution since it's saved at the end of query. If local memory bloat occurs only during query execution and we want to investigate it, we still need to use gdb to output the memory context information. Another issue is that the large amount of shared memory might be necessary to save the memory context usages for all the proceses. We can save the usage information into the file instead, but which would cause more overhead. If we use shared memory, the similar parameter like track_activity_query_size might be necessary. That is, backends save only the specified number of memory context information. If it's zero, the feature is disabled. Also we should reduce the same of information to save. For example, instead of saving all memory context information like MemoryContextStats() prints, it might be better to save the summary stats (per memory context type) from them. > > name | parent | level | total_bytes | total_nblocks | free_bytes | free_chunks | used_bytes| some other attibutes.. > --------------------------+--------------------+-------+-------------+---------------+------------+-------------+------------ > TopMemoryContext | | 0 | 68720 | 5 | 9936 | 16 | 58784 > TopTransactionContext | TopMemoryContext | 1 | 8192 | 1 | 7720 | 0 | 472 > PL/pgSQL function | TopMemoryContext | 1 | 16384 | 2 | 5912 | 1 | 10472 > PL/pgSQL function | TopMemoryContext | 1 | 32768 | 3 | 15824 | 3 | 16944 > dynahash | TopMemoryContext | 1 | 8192 | 1 | 512 | 0 | 7680 > ... > > > [rough implementation ideas and challenges] > I suppose communication between a process which runs > pg_stat_get_backend_memory_context()(referred to as A) and > target backend(referred to as B) is like: > > 1. A sends a message to B and order to dump the memory contexts > 2. B dumps its memory contexts to some shared area > 3. A reads the shared area and returns it to the function invoker > > To do so, there seem some challenges. > > (1) how to share memory contexts information between backend processes > The amount of memory contexts greatly varies depending on the > situation, so it's not appropriate to share the memory contexts using > fixed shared memory. > Also using the file on 'stats_temp_directory' seems difficult thinking > about the background of the shared-memory based stats collector > proposal[2]. > Instead, I'm thinking about using dsm_mq, which allows messages of > arbitrary length to be sent and receive. > > (2) how to send messages wanting memory contexts > Communicating via signal seems simple but assigning a specific number > of signal for this purpose seems wasting. > I'm thinking about using fixed shared memory to put dsm_mq handle. > To send a message, A creates a dsm_mq and put the handle on the shared > memory area. When B founds a handle, B dumps the memory contexts to the > corresponding dsm_mq. > > However, enabling B to find the handle needs to check the shared memory > periodically. I'm not sure the suitable location and timing for this > checking yet, and doubt this way of communication is acceptable because > it gives certain additional loads to all the backends. > > (3) clarifying the necessary attributes > As far as reading the past disucussion[3], it's not so clear what kind > of information should be exposed regarding memory contexts. > > > As a first step, to deal with (3) I'd like to add > pg_stat_get_backend_memory_context() which target is limited to the > local backend process. +1 Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Wed, Jun 17, 2020 at 11:56 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > As a first step, to deal with (3) I'd like to add > > pg_stat_get_backend_memory_context() which target is limited to the > > local backend process. > > +1 +1 from me, too. Something that exposed this via shared memory would be quite useful, and there are other things we'd like to expose similarly, such as the plan(s) from the running queries, or even just the untruncated query string(s). I'd like to have a good infrastructure for that sort of thing, but I think it's quite tricky to do properly. You need a variable-size chunk of shared memory, so probably it has to use DSM somehow, and you need to update it as things change, but if you update it too frequently performance will stink. If you ping processes to do the updates, how do you know when they've completed them, and what happens if they don't respond in a timely fashion? These are probably all solvable problems, but I don't think they are very easy ones. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi ! On Thu, Jun 18, 2020 at 12:56 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > Agreed. The feature to view how local memory contexts are used in > each process is very useful! +1 > > =# SELECT * FROM pg_stat_get_backend_memory_context(PID); > > I'm afraid that this interface is not convenient when we want to monitor > the usages of local memory contexts for all the processes. For example, > I'd like to monitor how much memory is totally used to store prepared > statements information. For that purpose, I wonder if it's more convenient > to provide the view displaying the memory context usages for > all the processes. How about separating a function that examines memory consumption trends for all processes and a function that examines memory consumption for a particular phase of a particular process? For the former, as Fujii said, the function shows the output limited information for each context type. All processes calculation and output the information at idle status. I think the latter is useful for debugging and other purposes. For example, imagine preparing a function for registration like the following. =# SELECT pg_stat_get_backend_memory_context_regist (pid, context, max_children, calc_point) pid: A target process context: The top level of the context of interest max_children: Maximum number of output for the target context's children (similar to MemoryContextStatsInternal()'s max_children) calc_point: Single or multiple position(s) to calculate and output context information (Existing hooks such as planner_hook, executor_start, etc.. could be used. ) This function informs the target PID to output the information of the specified context at the specified calc_point. When the target PID process reaches the calc_point, it calculates and output the context information one time to a file or DSM. (Currently PostgreSQL has no formal ways of externally modifying the parameters of a particular process, so it may need to be implemented...) Sometimes I want to know the memory usage in the planning phase or others with a query_string and/or plan_tree that before target process move to the idle status. So it would be nice to retrieve memory usage at some arbitrary point in time ! Regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Hi, While going through the mail chain on relation, plan and catalogue caching [1], I'm thinking on the lines that is there a way to know the current relation, plan and catalogue cache sizes? If there is a way already, please ignore this and it would be grateful if someone point me to that. Posting this here as I felt it's relevant. If there is no such way to know the cache sizes and other info such as statistics, number of entries, cache misses, hits etc. can the approach discussed here be applied? If the user knows the cache statistics and other information, may be we can allow user to take appropriate actions such as allowing him to delete few entries through a command or some other way. I'm sorry, If I'm diverting the topic being discussed in this mail thread, please ignore if it is irrelevant. [1] - https://www.postgresql.org/message-id/flat/20161219.201505.11562604.horiguchi.kyotaro%40lab.ntt.co.jp With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Hi, On Fri, Jun 26, 2020 at 3:42 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > While going through the mail chain on relation, plan and catalogue > caching [1], I'm thinking on the lines that is there a way to know the > current relation, plan and catalogue cache sizes? If there is a way > already, please ignore this and it would be grateful if someone point > me to that. AFAIK the only way to get statistics on PostgreSQL's backend internal local memory usage is to use MemoryContextStats() via gdb to output the information to the log, so far. > If there is no such way to know the cache sizes and other info such as > statistics, number of entries, cache misses, hits etc. can the > approach discussed here be applied? I think it's partially yes. > If the user knows the cache statistics and other information, may be > we can allow user to take appropriate actions such as allowing him to > delete few entries through a command or some other way. Yeah, one of the purposes of the features we are discussing here is to use them for such situation. Regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
On 2020-06-20 03:11, Robert Haas wrote: > On Wed, Jun 17, 2020 at 11:56 PM Fujii Masao > <masao.fujii@oss.nttdata.com> wrote: >> > As a first step, to deal with (3) I'd like to add >> > pg_stat_get_backend_memory_context() which target is limited to the >> > local backend process. >> >> +1 > > +1 from me, too. Attached a patch that adds a function exposing memory usage of local backend. It's almost same as pg_cheat_funcs's pg_stat_get_memory_context(). I've also added MemoryContexts identifier because it seems useful to distinguish the same kind of memory contexts. For example, when there are many prepared statements we can distinguish them using identifiers, which shows the cached query. =# SELECT name, ident FROM pg_stat_get_memory_contexts() WHERE name = 'CachedPlanSource'; name | ident ------------------+-------------------------------- CachedPlanSource | PREPARE q1(text) AS SELECT ..; CachedPlanSource | PREPARE q2(text) AS SELECT ..; (2 rows) > Something that exposed this via shared memory would > be quite useful, and there are other things we'd like to expose > similarly, such as the plan(s) from the running queries, or even just > the untruncated query string(s). I'd like to have a good > infrastructure for that sort of thing, but I think it's quite tricky > to do properly. You need a variable-size chunk of shared memory, so > probably it has to use DSM somehow, and you need to update it as > things change, but if you update it too frequently performance will > stink. If you ping processes to do the updates, how do you know when > they've completed them, and what happens if they don't respond in a > timely fashion? These are probably all solvable problems, but I don't > think they are very easy ones. Thanks for your comments! It seems hard as you pointed out. I'm going to consider it along with the advice of Fujii and Kasahara. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/06/29 12:01, torikoshia wrote: > On 2020-06-20 03:11, Robert Haas wrote: >> On Wed, Jun 17, 2020 at 11:56 PM Fujii Masao >> <masao.fujii@oss.nttdata.com> wrote: >>> > As a first step, to deal with (3) I'd like to add >>> > pg_stat_get_backend_memory_context() which target is limited to the >>> > local backend process. >>> >>> +1 >> >> +1 from me, too. > > Attached a patch that adds a function exposing memory usage of local > backend. Thanks for the patch! Could you add this patch to Commitfest 2020-07? > > It's almost same as pg_cheat_funcs's pg_stat_get_memory_context(). This patch provides only the function, but isn't it convenient to provide the view like pg_shmem_allocations? > I've also added MemoryContexts identifier because it seems useful to > distinguish the same kind of memory contexts. Sounds good. But isn't it better to document each column? Otherwise, users cannot undersntad what "ident" column indicates. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
> > If there is no such way to know the cache sizes and other info such as > > statistics, number of entries, cache misses, hits etc. can the > > approach discussed here be applied? > I think it's partially yes. > > > If the user knows the cache statistics and other information, may be > > we can allow user to take appropriate actions such as allowing him to > > delete few entries through a command or some other way. > Yeah, one of the purposes of the features we are discussing here is to > use them for such situation. > Thanks Kasahara for the response. I will try to understand more about getting the cache statistics and also will study the possibility of applying this approach being discussed here in this thread. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
On Mon, Jun 29, 2020 at 3:13 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > Could you add this patch to Commitfest 2020-07? Thanks for notifying, I've added it. BTW, I registered you as an author because this patch used lots of pg_cheat_funcs' codes. https://commitfest.postgresql.org/28/2622/ > This patch provides only the function, but isn't it convenient to > provide the view like pg_shmem_allocations? > Sounds good. But isn't it better to document each column? > Otherwise, users cannot undersntad what "ident" column indicates. Agreed. Attached a patch for creating a view for local memory context and its explanation on the document. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/07/01 14:48, torikoshia wrote: > On Mon, Jun 29, 2020 at 3:13 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > >> Could you add this patch to Commitfest 2020-07? > > Thanks for notifying, I've added it. > BTW, I registered you as an author because this patch used > lots of pg_cheat_funcs' codes. > > https://commitfest.postgresql.org/28/2622/ Thanks! > >> This patch provides only the function, but isn't it convenient to >> provide the view like pg_shmem_allocations? > >> Sounds good. But isn't it better to document each column? >> Otherwise, users cannot undersntad what "ident" column indicates. > > Agreed. > Attached a patch for creating a view for local memory context > and its explanation on the document. Thanks for updating the patch! You treat pg_stat_local_memory_contexts view as a dynamic statistics view. But isn't it better to treat it as just system view like pg_shmem_allocations or pg_prepared_statements because it's not statistics information? If yes, we would need to rename the view, move the documentation from monitoring.sgml to catalogs.sgml, and move the code from pgstat.c to the more appropriate source file. + tupdesc = rsinfo->setDesc; + tupstore = rsinfo->setResult; These seem not to be necessary. + /* + * It seems preferable to label dynahash contexts with just the hash table + * name. Those are already unique enough, so the "dynahash" part isn't + * very helpful, and this way is more consistent with pre-v11 practice. + */ + if (ident && strcmp(name, "dynahash") == 0) + { + name = ident; + ident = NULL; + } IMO it seems better to report both name and ident even in the case of dynahash than report only ident (as name). We can easily understand the context is used for dynahash when it's reported. But you think it's better to report NULL rather than "dynahash"? +/* ---------- + * The max bytes for showing identifiers of MemoryContext. + * ---------- + */ +#define MEMORY_CONTEXT_IDENT_SIZE 1024 Do we really need this upper size limit? Could you tell me why this is necessary? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
> On 1 Jul 2020, at 07:48, torikoshia <torikoshia@oss.nttdata.com> wrote: > Attached a patch for creating a view for local memory context > and its explanation on the document. For the next version (if there will be one), please remove the catversion bump from the patch as it will otherwise just break patch application without constant rebasing (as it's done now). The committer will handle the catversion change if/when it gets committed. cheers ./daniel
On Wed, Jul 1, 2020 at 4:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: Thanks for reviewing! > You treat pg_stat_local_memory_contexts view as a dynamic statistics > view. > But isn't it better to treat it as just system view like > pg_shmem_allocations > or pg_prepared_statements because it's not statistics information? If > yes, > we would need to rename the view, move the documentation from > monitoring.sgml to catalogs.sgml, and move the code from pgstat.c to > the more appropriate source file. Agreed. At first, I thought not only statistical but dynamic information about exactly what is going on was OK when reading the sentence on the manual below. > PostgreSQL also supports reporting dynamic information about exactly > what is going on in the system right now, such as the exact command > currently being executed by other server processes, and which other > connections exist in the system. This facility is independent of the > collector process. However, now I feel something strange because existing pg_stats_* views seem to be per cluster information but the view I'm adding is about per backend information. I'm going to do some renaming and transportations. - view name: pg_memory_contexts - function name: pg_get_memory_contexts() - source file: mainly src/backend/utils/mmgr/mcxt.c > + tupdesc = rsinfo->setDesc; > + tupstore = rsinfo->setResult; > > These seem not to be necessary. Thanks! > + /* > + * It seems preferable to label dynahash contexts with just the > hash table > + * name. Those are already unique enough, so the "dynahash" > part isn't > + * very helpful, and this way is more consistent with pre-v11 > practice. > + */ > + if (ident && strcmp(name, "dynahash") == 0) > + { > + name = ident; > + ident = NULL; > + } > > IMO it seems better to report both name and ident even in the case of > dynahash than report only ident (as name). We can easily understand > the context is used for dynahash when it's reported. But you think it's > better to report NULL rather than "dynahash"? These codes come from MemoryContextStatsPrint() and my intension is to keep consistent with it. > +/* ---------- > + * The max bytes for showing identifiers of MemoryContext. > + * ---------- > + */ > +#define MEMORY_CONTEXT_IDENT_SIZE 1024 > > Do we really need this upper size limit? Could you tell me why > this is necessary? It also derived from MemoryContextStatsPrint(). I suppose it is for keeping readability of the log.. I'm going to follow the discussion on the mailing list and find why these codes were introduced. If there's no important reason to do the same in our context, I'll change them. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2020-07-01 20:47, Daniel Gustafsson wrote: > For the next version (if there will be one), please remove the > catversion bump > from the patch as it will otherwise just break patch application > without > constant rebasing (as it's done now). The committer will handle the > catversion > change if/when it gets committed. > > cheers ./daniel Thanks for telling me it! I'll do that way next time. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2020/07/01 22:15, torikoshia wrote: > On Wed, Jul 1, 2020 at 4:43 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > Thanks for reviewing! > >> You treat pg_stat_local_memory_contexts view as a dynamic statistics view. >> But isn't it better to treat it as just system view like pg_shmem_allocations >> or pg_prepared_statements because it's not statistics information? If yes, >> we would need to rename the view, move the documentation from >> monitoring.sgml to catalogs.sgml, and move the code from pgstat.c to >> the more appropriate source file. > > Agreed. > At first, I thought not only statistical but dynamic information about exactly > what is going on was OK when reading the sentence on the manual below. > >> PostgreSQL also supports reporting dynamic information about exactly what is going on in the system right now, such asthe exact command currently being executed by other server processes, and which other connections exist in the system.This facility is independent of the collector process. > > However, now I feel something strange because existing pg_stats_* views seem > to be per cluster information but the view I'm adding is about per backend > information. > > I'm going to do some renaming and transportations. > > - view name: pg_memory_contexts > - function name: pg_get_memory_contexts() > - source file: mainly src/backend/utils/mmgr/mcxt.c > > >> + tupdesc = rsinfo->setDesc; >> + tupstore = rsinfo->setResult; >> >> These seem not to be necessary. > > Thanks! > >> + /* >> + * It seems preferable to label dynahash contexts with just the hash table >> + * name. Those are already unique enough, so the "dynahash" part isn't >> + * very helpful, and this way is more consistent with pre-v11 practice. >> + */ >> + if (ident && strcmp(name, "dynahash") == 0) >> + { >> + name = ident; >> + ident = NULL; >> + } >> >> IMO it seems better to report both name and ident even in the case of >> dynahash than report only ident (as name). We can easily understand >> the context is used for dynahash when it's reported. But you think it's >> better to report NULL rather than "dynahash"? > > These codes come from MemoryContextStatsPrint() and my intension is to > keep consistent with it. Ok, understood! I agree that it's strange to display different names for the same memory context between this view and logging. It's helpful if the comment there refers to MemoryContextStatsPrint() and mentions the reason that you told. > >> +/* ---------- >> + * The max bytes for showing identifiers of MemoryContext. >> + * ---------- >> + */ >> +#define MEMORY_CONTEXT_IDENT_SIZE 1024 >> >> Do we really need this upper size limit? Could you tell me why >> this is necessary? > > It also derived from MemoryContextStatsPrint(). > I suppose it is for keeping readability of the log.. Understood. I may want to change the upper limit of query size to display. But at the first step, I'm fine with limitting 1024 bytes. > > I'm going to follow the discussion on the mailing list and find why > these codes were introduced. https://www.postgresql.org/message-id/12319.1521999065%40sss.pgh.pa.us Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Wed, Jul 1, 2020 at 10:15 PM torikoshia <torikoshia@oss.nttdata.com> wrote: > I'm going to do some renaming and transportations. > > - view name: pg_memory_contexts > - function name: pg_get_memory_contexts() > - source file: mainly src/backend/utils/mmgr/mcxt.c Attached an updated patch. On Wed, Jul 1, 2020 at 10:58 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > Ok, understood! I agree that it's strange to display different names > for the same memory context between this view and logging. > > It's helpful if the comment there refers to MemoryContextStatsPrint() > and mentions the reason that you told. Agreed. I changed the comments. > > It also derived from MemoryContextStatsPrint(). > > I suppose it is for keeping readability of the log.. > > Understood. I may want to change the upper limit of query size to > display. > But at the first step, I'm fine with limitting 1024 bytes. Thanks, I've left it as it was. > > I'm going to follow the discussion on the mailing list and find why > > these codes were introduced. > > https://www.postgresql.org/message-id/12319.1521999065%40sss.pgh.pa.us Thanks for sharing! Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/07/03 11:45, torikoshia wrote: > On Wed, Jul 1, 2020 at 10:15 PM torikoshia <torikoshia@oss.nttdata.com> wrote: >> I'm going to do some renaming and transportations. >> >> - view name: pg_memory_contexts I like more specific name like pg_backend_memory_contexts. But I'd like to hear more opinions about the name from others. >> - function name: pg_get_memory_contexts() >> - source file: mainly src/backend/utils/mmgr/mcxt.c > > Attached an updated patch. Thanks for updating the patch! + <structfield>level</structfield> <type>integer</type> In catalog.sgml, "int4" and "int8" are used in other catalogs tables. So "integer" in the above should be "int4"? + <structfield>total_bytes</structfield> <type>bigint</type> "bigint" should be "int8"? + Identification information of the memory context. This field is truncated if the identification field is longer than1024 characters "characters" should be "bytes"? It's a bit confusing to have both "This field" and "the identification field" in one description. What about just "This field is truncated at 1024 bytes"? + <para> + Total bytes requested from malloc Isn't it better not to use "malloc" in the description? For example, what about something like "Total bytes allocated for this memory context"? +#define PG_STAT_GET_MEMORY_CONTEXT_COLS 9 Isn't it better to rename this to PG_GET_MEMORY_CONTEXTS_COLS for the consistency with the function name? + memset(nulls, 0, sizeof(nulls)); "values[]" also should be initialized with zero? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Fri, Jul 3, 2020 at 7:33 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: Thanks for your review! > I like more specific name like pg_backend_memory_contexts. Agreed. When I was trying to add this function as statistics function, I thought that naming pg_stat_getbackend_memory_context() might make people regarded it as a "per-backend statistics function", whose parameter is backend ID number. So I removed "backend", but now there is no necessity to do so. > But I'd like to hear more opinions about the name from others. I changed the name to pg_backend_memory_contexts for the time being. >> - function name: pg_get_memory_contexts() >> - source file: mainly src/backend/utils/mmgr/mcxt.c >> + Identification information of the memory context. This field >> is truncated if the identification field is longer than 1024 >> characters > > "characters" should be "bytes"? Fixed, but I used "characters" while referring to the descriptions on the manual of pg_stat_activity.query below. | By default the query text is truncated at 1024 characters; It has nothing to do with this thread, but considering multibyte characters, it also may be better to change it to "bytes". Regarding the other comments, I revised the patch as you pointed. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/07/06 12:12, torikoshia wrote: > On Fri, Jul 3, 2020 at 7:33 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > Thanks for your review! > >> I like more specific name like pg_backend_memory_contexts. > > Agreed. > > When I was trying to add this function as statistics function, > I thought that naming pg_stat_getbackend_memory_context() > might make people regarded it as a "per-backend statistics > function", whose parameter is backend ID number. > So I removed "backend", but now there is no necessity to do > so. > >> But I'd like to hear more opinions about the name from others. > > I changed the name to pg_backend_memory_contexts for the time > being. +1 >>> - function name: pg_get_memory_contexts() >>> - source file: mainly src/backend/utils/mmgr/mcxt.c > > >>> + Identification information of the memory context. This field is truncated if the identification field is longerthan 1024 characters >> >> "characters" should be "bytes"? > > Fixed, but I used "characters" while referring to the > descriptions on the manual of pg_stat_activity.query > below. > > | By default the query text is truncated at 1024 characters; > > It has nothing to do with this thread, but considering > multibyte characters, it also may be better to change it > to "bytes". Yeah, I agree we should write the separate patch fixing that. You will? If not, I will do that later. > Regarding the other comments, I revised the patch as you pointed. Thanks for updating the patch! The patch basically looks good to me/ Here are some minor comments: +#define MEMORY_CONTEXT_IDENT_SIZE 1024 This macro varible name sounds like the maximum allowed length of ident that each menory context has. But actually this limits the maximum bytes of ident to display. So I think that it's better to rename this macro to something like MEMORY_CONTEXT_IDENT_DISPLAY_SIZE. Thought? +#define PG_GET_MEMORY_CONTEXTS_COLS 9 + Datum values[PG_GET_MEMORY_CONTEXTS_COLS]; + bool nulls[PG_GET_MEMORY_CONTEXTS_COLS]; This macro variable name should be PG_GET_BACKEND_MEMORY_CONTEXTS_COLS for the consistency with the function name? +{ oid => '2282', descr => 'statistics: information about all memory contexts of local backend', Isn't it better to remove "statistics: " from the above description? + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>parent</structfield> <type>text</type> There can be multiple memory contexts with the same name. So I'm afraid that it's difficult to identify the actual parent memory context from this "parent" column. This is ok when logging memory contexts by calling MemoryContextStats() via gdb. Because child memory contexts are printed just under their parent, with indents. But this doesn't work in the view. To identify the actual parent memory or calculate the memory contexts tree from the view, we might need to assign unique ID to each memory context and display it. But IMO this is overkill. So I'm fine with current "parent" column. Thought? Do you have any better idea? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020-07-06 15:16, Fujii Masao wrote: > On 2020/07/06 12:12, torikoshia wrote: >> On Fri, Jul 3, 2020 at 7:33 PM Fujii Masao >> <masao.fujii@oss.nttdata.com> wrote: >> >> Thanks for your review! >> >>> I like more specific name like pg_backend_memory_contexts. >> >> Agreed. >> >> When I was trying to add this function as statistics function, >> I thought that naming pg_stat_getbackend_memory_context() >> might make people regarded it as a "per-backend statistics >> function", whose parameter is backend ID number. >> So I removed "backend", but now there is no necessity to do >> so. >> >>> But I'd like to hear more opinions about the name from others. >> >> I changed the name to pg_backend_memory_contexts for the time >> being. > > +1 > > >>>> - function name: pg_get_memory_contexts() >>>> - source file: mainly src/backend/utils/mmgr/mcxt.c >> >> >>>> + Identification information of the memory context. This field >>>> is truncated if the identification field is longer than 1024 >>>> characters >>> >>> "characters" should be "bytes"? >> >> Fixed, but I used "characters" while referring to the >> descriptions on the manual of pg_stat_activity.query >> below. >> >> | By default the query text is truncated at 1024 characters; >> >> It has nothing to do with this thread, but considering >> multibyte characters, it also may be better to change it >> to "bytes". > > Yeah, I agree we should write the separate patch fixing that. You will? > If not, I will do that later. Thanks, I will try it! >> Regarding the other comments, I revised the patch as you pointed. > > Thanks for updating the patch! The patch basically looks good to me/ > Here are some minor comments: > > +#define MEMORY_CONTEXT_IDENT_SIZE 1024 > > This macro varible name sounds like the maximum allowed length of ident > that > each menory context has. But actually this limits the maximum bytes of > ident > to display. So I think that it's better to rename this macro to > something like > MEMORY_CONTEXT_IDENT_DISPLAY_SIZE. Thought? Agreed. MEMORY_CONTEXT_IDENT_DISPLAY_SIZE seems more accurate. > +#define PG_GET_MEMORY_CONTEXTS_COLS 9 > + Datum values[PG_GET_MEMORY_CONTEXTS_COLS]; > + bool nulls[PG_GET_MEMORY_CONTEXTS_COLS]; > > This macro variable name should be PG_GET_BACKEND_MEMORY_CONTEXTS_COLS > for the consistency with the function name? Thanks! Fixed it. > > +{ oid => '2282', descr => 'statistics: information about all memory > contexts of local backend', > > Isn't it better to remove "statistics: " from the above description? Yeah, it's my oversight. > > + <row> > + <entry role="catalog_table_entry"><para > role="column_definition"> > + <structfield>parent</structfield> <type>text</type> > > There can be multiple memory contexts with the same name. So I'm afraid > that it's difficult to identify the actual parent memory context from > this > "parent" column. This is ok when logging memory contexts by calling > MemoryContextStats() via gdb. Because child memory contexts are printed > just under their parent, with indents. But this doesn't work in the > view. > To identify the actual parent memory or calculate the memory contexts > tree > from the view, we might need to assign unique ID to each memory context > and display it. But IMO this is overkill. So I'm fine with current > "parent" > column. Thought? Do you have any better idea? Indeed. I also feel it's not usual to assign a unique ID, which can vary every time the view displayed. We show each context using a recursive function and this is a kind of depth-first search. So, as far as I understand, we can identify its parent using both the "parent" column and the order of the rows. Documenting these things may worth for who want to identify the relation between parents and children. Of course, in the relational model, the order of relation does not have meaning so it's also unusual in this sense.. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/07/07 22:02, torikoshia wrote: > On 2020-07-06 15:16, Fujii Masao wrote: >> On 2020/07/06 12:12, torikoshia wrote: >>> On Fri, Jul 3, 2020 at 7:33 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: >>> >>> Thanks for your review! >>> >>>> I like more specific name like pg_backend_memory_contexts. >>> >>> Agreed. >>> >>> When I was trying to add this function as statistics function, >>> I thought that naming pg_stat_getbackend_memory_context() >>> might make people regarded it as a "per-backend statistics >>> function", whose parameter is backend ID number. >>> So I removed "backend", but now there is no necessity to do >>> so. >>> >>>> But I'd like to hear more opinions about the name from others. >>> >>> I changed the name to pg_backend_memory_contexts for the time >>> being. >> >> +1 >> >> >>>>> - function name: pg_get_memory_contexts() >>>>> - source file: mainly src/backend/utils/mmgr/mcxt.c >>> >>> >>>>> + Identification information of the memory context. This field is truncated if the identification field is longerthan 1024 characters >>>> >>>> "characters" should be "bytes"? >>> >>> Fixed, but I used "characters" while referring to the >>> descriptions on the manual of pg_stat_activity.query >>> below. >>> >>> | By default the query text is truncated at 1024 characters; >>> >>> It has nothing to do with this thread, but considering >>> multibyte characters, it also may be better to change it >>> to "bytes". >> >> Yeah, I agree we should write the separate patch fixing that. You will? >> If not, I will do that later. > > Thanks, I will try it! Thanks! > >>> Regarding the other comments, I revised the patch as you pointed. >> >> Thanks for updating the patch! The patch basically looks good to me/ >> Here are some minor comments: >> >> +#define MEMORY_CONTEXT_IDENT_SIZE 1024 >> >> This macro varible name sounds like the maximum allowed length of ident that >> each menory context has. But actually this limits the maximum bytes of ident >> to display. So I think that it's better to rename this macro to something like >> MEMORY_CONTEXT_IDENT_DISPLAY_SIZE. Thought? > > Agreed. > MEMORY_CONTEXT_IDENT_DISPLAY_SIZE seems more accurate. > >> +#define PG_GET_MEMORY_CONTEXTS_COLS 9 >> + Datum values[PG_GET_MEMORY_CONTEXTS_COLS]; >> + bool nulls[PG_GET_MEMORY_CONTEXTS_COLS]; >> >> This macro variable name should be PG_GET_BACKEND_MEMORY_CONTEXTS_COLS >> for the consistency with the function name? > > Thanks! Fixed it. Thanks for updating the patch! It basically looks good to me. + <indexterm zone="view-pg-backend-memory-contexts"> + <primary>backend memory contexts</primary> + </indexterm> Do we need this indexterm? > >> >> +{ oid => '2282', descr => 'statistics: information about all memory >> contexts of local backend', >> >> Isn't it better to remove "statistics: " from the above description? > > Yeah, it's my oversight. > >> >> + <row> >> + <entry role="catalog_table_entry"><para role="column_definition"> >> + <structfield>parent</structfield> <type>text</type> >> >> There can be multiple memory contexts with the same name. So I'm afraid >> that it's difficult to identify the actual parent memory context from this >> "parent" column. This is ok when logging memory contexts by calling >> MemoryContextStats() via gdb. Because child memory contexts are printed >> just under their parent, with indents. But this doesn't work in the view. >> To identify the actual parent memory or calculate the memory contexts tree >> from the view, we might need to assign unique ID to each memory context >> and display it. But IMO this is overkill. So I'm fine with current "parent" >> column. Thought? Do you have any better idea? > > Indeed. > I also feel it's not usual to assign a unique ID, which > can vary every time the view displayed. Agreed. Displaying such ID would be more confusing to users. Ok, let's leave the code as it is. Another comment about parent column is: dynahash can be parent? If yes, its indent instead of name should be displayed in parent column? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi, I think this is an incredibly useful feature. On 2020-07-07 22:02:10 +0900, torikoshia wrote: > > There can be multiple memory contexts with the same name. So I'm afraid > > that it's difficult to identify the actual parent memory context from > > this > > "parent" column. This is ok when logging memory contexts by calling > > MemoryContextStats() via gdb. Because child memory contexts are printed > > just under their parent, with indents. But this doesn't work in the > > view. > > To identify the actual parent memory or calculate the memory contexts > > tree > > from the view, we might need to assign unique ID to each memory context > > and display it. But IMO this is overkill. So I'm fine with current > > "parent" > > column. Thought? Do you have any better idea? > > Indeed. > I also feel it's not usual to assign a unique ID, which > can vary every time the view displayed. Hm. I wonder if we just could include the address of the context itself. There might be reasons not to do so (e.g. security concerns about leaked pointers making attacks easier), but I think it's worth considering. > We show each context using a recursive function and this is > a kind of depth-first search. So, as far as I understand, > we can identify its parent using both the "parent" column > and the order of the rows. > > Documenting these things may worth for who want to identify > the relation between parents and children. > > Of course, in the relational model, the order of relation > does not have meaning so it's also unusual in this sense.. It makes it pretty hard to write summarizing queries, so I am not a huge fan of just relying on the order. > +/* > + * PutMemoryContextsStatsTupleStore > + * One recursion level for pg_get_backend_memory_contexts. > + */ > +static void > +PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore, > + TupleDesc tupdesc, MemoryContext context, > + MemoryContext parent, int level) > +{ > +#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9 > + Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS]; > + bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS]; > + MemoryContextCounters stat; > + MemoryContext child; > + const char *name = context->name; > + const char *ident = context->ident; > + > + if (context == NULL) > + return; > + > + /* > + * To be consistent with logging output, we label dynahash contexts > + * with just the hash table name as with MemoryContextStatsPrint(). > + */ > + if (ident && strcmp(name, "dynahash") == 0) > + { > + name = ident; > + ident = NULL; > + } > + > + /* Examine the context itself */ > + memset(&stat, 0, sizeof(stat)); > + (*context->methods->stats) (context, NULL, (void *) &level, &stat); > + > + memset(values, 0, sizeof(values)); > + memset(nulls, 0, sizeof(nulls)); > + > + values[0] = CStringGetTextDatum(name); > + > + if (ident) > + { > + int idlen = strlen(ident); > + char clipped_ident[MEMORY_CONTEXT_IDENT_DISPLAY_SIZE]; > + > + /* > + * Some identifiers such as SQL query string can be very long, > + * truncate oversize identifiers. > + */ > + if (idlen >= MEMORY_CONTEXT_IDENT_DISPLAY_SIZE) > + idlen = pg_mbcliplen(ident, idlen, MEMORY_CONTEXT_IDENT_DISPLAY_SIZE - 1); > + Why? Greetings, Andres Freund
On 2020-07-08 22:12, Fujii Masao wrote: > Thanks for updating the patch! It basically looks good to me. > > + <indexterm zone="view-pg-backend-memory-contexts"> > + <primary>backend memory contexts</primary> > + </indexterm> > > Do we need this indexterm? Thanks! it's not necessary. I remove this indexterm. >>> >>> +{ oid => '2282', descr => 'statistics: information about all memory >>> contexts of local backend', >>> >>> Isn't it better to remove "statistics: " from the above description? >> >> Yeah, it's my oversight. >> >>> >>> + <row> >>> + <entry role="catalog_table_entry"><para >>> role="column_definition"> >>> + <structfield>parent</structfield> <type>text</type> >>> >>> There can be multiple memory contexts with the same name. So I'm >>> afraid >>> that it's difficult to identify the actual parent memory context from >>> this >>> "parent" column. This is ok when logging memory contexts by calling >>> MemoryContextStats() via gdb. Because child memory contexts are >>> printed >>> just under their parent, with indents. But this doesn't work in the >>> view. >>> To identify the actual parent memory or calculate the memory contexts >>> tree >>> from the view, we might need to assign unique ID to each memory >>> context >>> and display it. But IMO this is overkill. So I'm fine with current >>> "parent" >>> column. Thought? Do you have any better idea? >> >> Indeed. >> I also feel it's not usual to assign a unique ID, which >> can vary every time the view displayed. > > Agreed. Displaying such ID would be more confusing to users. > Ok, let's leave the code as it is. > > Another comment about parent column is: dynahash can be parent? > If yes, its indent instead of name should be displayed in parent > column? I'm not sure yet, but considering the changes in the future, it seems better to do so. But if we add information for identifying parent-child relation like the memory address suggested from Andres, it seems not necessary. So I'd like to go back to this point. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2020-07-09 02:03, Andres Freund wrote: > Hi, > > I think this is an incredibly useful feature. Thanks for your kind comments and suggestion! > On 2020-07-07 22:02:10 +0900, torikoshia wrote: >> > There can be multiple memory contexts with the same name. So I'm afraid >> > that it's difficult to identify the actual parent memory context from >> > this >> > "parent" column. This is ok when logging memory contexts by calling >> > MemoryContextStats() via gdb. Because child memory contexts are printed >> > just under their parent, with indents. But this doesn't work in the >> > view. >> > To identify the actual parent memory or calculate the memory contexts >> > tree >> > from the view, we might need to assign unique ID to each memory context >> > and display it. But IMO this is overkill. So I'm fine with current >> > "parent" >> > column. Thought? Do you have any better idea? >> >> Indeed. >> I also feel it's not usual to assign a unique ID, which >> can vary every time the view displayed. > > Hm. I wonder if we just could include the address of the context > itself. There might be reasons not to do so (e.g. security concerns > about leaked pointers making attacks easier), but I think it's worth > considering. I tried exposing addresses of each context and their parent. Attached a poc patch. =# SELECT name, address, parent_address, total_bytes FROM pg_backend_memory_contexts ; name | address | parent_address | total_bytes --------------------------+-----------+----------------+------------- TopMemoryContext | 0x1280da0 | | 80800 TopTransactionContext | 0x1309040 | 0x1280da0 | 8192 Prepared Queries | 0x138a480 | 0x1280da0 | 16384 Type information cache | 0x134b8c0 | 0x1280da0 | 24624 ... CacheMemoryContext | 0x12cb390 | 0x1280da0 | 1048576 CachedPlanSource | 0x13c47f0 | 0x12cb390 | 4096 CachedPlanQuery | 0x13c9ae0 | 0x13c47f0 | 4096 CachedPlanSource | 0x13c7310 | 0x12cb390 | 4096 CachedPlanQuery | 0x13c1230 | 0x13c7310 | 4096 ... Now it's possible to identify the actual parent memory context even when there are multiple memory contexts with the same name. I'm not sure, but I'm also worrying about this might incur some security related problems.. I'd like to hear more opinions about: - whether information for identifying parent-child relation is necessary or it's an overkill - if this information is necessary, memory address is suitable or other means like assigning unique numbers are required >> +/* >> + * PutMemoryContextsStatsTupleStore >> + * One recursion level for pg_get_backend_memory_contexts. >> + */ >> +static void >> +PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore, >> + TupleDesc tupdesc, MemoryContext context, >> + MemoryContext parent, int level) >> +{ >> +#define PG_GET_BACKEND_MEMORY_CONTEXTS_COLS 9 >> + Datum values[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS]; >> + bool nulls[PG_GET_BACKEND_MEMORY_CONTEXTS_COLS]; >> + MemoryContextCounters stat; >> + MemoryContext child; >> + const char *name = context->name; >> + const char *ident = context->ident; >> + >> + if (context == NULL) >> + return; >> + >> + /* >> + * To be consistent with logging output, we label dynahash contexts >> + * with just the hash table name as with MemoryContextStatsPrint(). >> + */ >> + if (ident && strcmp(name, "dynahash") == 0) >> + { >> + name = ident; >> + ident = NULL; >> + } >> + >> + /* Examine the context itself */ >> + memset(&stat, 0, sizeof(stat)); >> + (*context->methods->stats) (context, NULL, (void *) &level, &stat); >> + >> + memset(values, 0, sizeof(values)); >> + memset(nulls, 0, sizeof(nulls)); >> + >> + values[0] = CStringGetTextDatum(name); >> + >> + if (ident) >> + { >> + int idlen = strlen(ident); >> + char clipped_ident[MEMORY_CONTEXT_IDENT_DISPLAY_SIZE]; >> + >> + /* >> + * Some identifiers such as SQL query string can be very long, >> + * truncate oversize identifiers. >> + */ >> + if (idlen >= MEMORY_CONTEXT_IDENT_DISPLAY_SIZE) >> + idlen = pg_mbcliplen(ident, idlen, >> MEMORY_CONTEXT_IDENT_DISPLAY_SIZE - 1); >> + > > Why? As described below[1], too long messages caused problems in the past and now MemoryContextStatsPrint() truncates ident, so I decided to truncate it also here. Do you think it's not necessary here? [1] https://www.postgresql.org/message-id/12319.1521999065@sss.pgh.pa.us Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/07/10 17:32, torikoshia wrote: > On 2020-07-09 02:03, Andres Freund wrote: >> Hi, >> >> I think this is an incredibly useful feature. > > Thanks for your kind comments and suggestion! > > >> On 2020-07-07 22:02:10 +0900, torikoshia wrote: >>> > There can be multiple memory contexts with the same name. So I'm afraid >>> > that it's difficult to identify the actual parent memory context from >>> > this >>> > "parent" column. This is ok when logging memory contexts by calling >>> > MemoryContextStats() via gdb. Because child memory contexts are printed >>> > just under their parent, with indents. But this doesn't work in the >>> > view. >>> > To identify the actual parent memory or calculate the memory contexts >>> > tree >>> > from the view, we might need to assign unique ID to each memory context >>> > and display it. But IMO this is overkill. So I'm fine with current >>> > "parent" >>> > column. Thought? Do you have any better idea? >>> >>> Indeed. >>> I also feel it's not usual to assign a unique ID, which >>> can vary every time the view displayed. >> >> Hm. I wonder if we just could include the address of the context >> itself. There might be reasons not to do so (e.g. security concerns >> about leaked pointers making attacks easier), but I think it's worth >> considering. > > > I tried exposing addresses of each context and their parent. > Attached a poc patch. > > =# SELECT name, address, parent_address, total_bytes FROM pg_backend_memory_contexts ; > > name | address | parent_address | total_bytes > --------------------------+-----------+----------------+------------- > TopMemoryContext | 0x1280da0 | | 80800 > TopTransactionContext | 0x1309040 | 0x1280da0 | 8192 > Prepared Queries | 0x138a480 | 0x1280da0 | 16384 > Type information cache | 0x134b8c0 | 0x1280da0 | 24624 > ... > CacheMemoryContext | 0x12cb390 | 0x1280da0 | 1048576 > CachedPlanSource | 0x13c47f0 | 0x12cb390 | 4096 > CachedPlanQuery | 0x13c9ae0 | 0x13c47f0 | 4096 > CachedPlanSource | 0x13c7310 | 0x12cb390 | 4096 > CachedPlanQuery | 0x13c1230 | 0x13c7310 | 4096 > ... > > > Now it's possible to identify the actual parent memory context even when > there are multiple memory contexts with the same name. > > I'm not sure, but I'm also worrying about this might incur some security > related problems.. > > I'd like to hear more opinions about: > > - whether information for identifying parent-child relation is necessary or it's an overkill > - if this information is necessary, memory address is suitable or other means like assigning unique numbers are required To consider this, I'd like to know what security issue can actually happen when memory addresses are exposed. I have no idea about this.. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi, On Fri, Jul 10, 2020 at 5:32 PM torikoshia <torikoshia@oss.nttdata.com> wrote: > - whether information for identifying parent-child relation is necessary > or it's an overkill I think it's important to understand the parent-child relationship of the context. Personally, I often want to know the following two things .. - In which life cycle is the target context? (Remaining as long as the process is living? per query?) - Does the target context belong to the correct (parent) context? > - if this information is necessary, memory address is suitable or other > means like assigning unique numbers are required IMO, If each context can be uniquely identified (or easily guessed) by "name" and "ident", then I don't think the address information is necessary. Instead, I like the way that directly shows the context name of the parent, as in the 0005 patch. Best regards -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
On 2020-07-30 15:13, Kasahara Tatsuhito wrote: > Hi, > > On Fri, Jul 10, 2020 at 5:32 PM torikoshia <torikoshia@oss.nttdata.com> > wrote: >> - whether information for identifying parent-child relation is >> necessary >> or it's an overkill > I think it's important to understand the parent-child relationship of > the context. > Personally, I often want to know the following two things .. > > - In which life cycle is the target context? (Remaining as long as the > process is living? per query?) > - Does the target context belong to the correct (parent) context? > >> - if this information is necessary, memory address is suitable or >> other >> means like assigning unique numbers are required > IMO, If each context can be uniquely identified (or easily guessed) by > "name" and "ident", > then I don't think the address information is necessary. > Instead, I like the way that directly shows the context name of the > parent, as in the 0005 patch. Thanks for your opinion! I also feel it'll be sufficient to know not the exact memory context of the parent but the name of the parent context. And as Fujii-san told me in person, exposing memory address seems not preferable considering there are security techniques like address space layout randomization. > On 2020-07-10 08:30:22 +0900, torikoshia wrote: >> On 2020-07-08 22:12, Fujii Masao wrote: >> Another comment about parent column is: dynahash can be parent? >> If yes, its indent instead of name should be displayed in parent >> column? > I'm not sure yet, but considering the changes in the future, it seems > better to do so. Attached a patch which displays ident as parent when dynahash is a parent. I could not find the case when dynahash can be a parent so I tested it using attached test purposed patch. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On Fri, Jul 31, 2020 at 4:25 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > And as Fujii-san told me in person, exposing memory address seems > not preferable considering there are security techniques like > address space layout randomization. Yeah, exactly. ASLR wouldn't do anything to improve security if there were no other security bugs, but there are, and some of those bugs are harder to exploit if you don't know the precise memory addresses of certain data structures. Similarly, exposing the addresses of our internal data structures is harmless if we have no other security bugs, but if we do, it might make those bugs easier to exploit. I don't think this information is useful enough to justify taking that risk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: tested, passed I tested the latest patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) and test was passed. It looks good to me. The new status of this patch is: Ready for Committer
On Fri, Jul 31, 2020 at 03:23:52PM -0400, Robert Haas wrote: > On Fri, Jul 31, 2020 at 4:25 AM torikoshia <torikoshia@oss.nttdata.com> wrote: >> And as Fujii-san told me in person, exposing memory address seems >> not preferable considering there are security techniques like >> address space layout randomization. > > Yeah, exactly. ASLR wouldn't do anything to improve security if there > were no other security bugs, but there are, and some of those bugs are > harder to exploit if you don't know the precise memory addresses of > certain data structures. Similarly, exposing the addresses of our > internal data structures is harmless if we have no other security > bugs, but if we do, it might make those bugs easier to exploit. I > don't think this information is useful enough to justify taking that > risk. FWIW, this is the class of issues where it is possible to print some areas of memory, or even manipulate the stack so as it was possible to pass down a custom pointer, so exposing the pointer locations is a real risk, and this has happened in the past. Anyway, it seems to me that if this part is done, we could just make it superuser-only with restrictive REVOKE privileges, but I am not sure that we have enough user cases to justify this addition. -- Michael
Attachment
On 2020-08-08 10:44, Michael Paquier wrote: > On Fri, Jul 31, 2020 at 03:23:52PM -0400, Robert Haas wrote: >> On Fri, Jul 31, 2020 at 4:25 AM torikoshia >> <torikoshia@oss.nttdata.com> wrote: >>> And as Fujii-san told me in person, exposing memory address seems >>> not preferable considering there are security techniques like >>> address space layout randomization. >> >> Yeah, exactly. ASLR wouldn't do anything to improve security if there >> were no other security bugs, but there are, and some of those bugs are >> harder to exploit if you don't know the precise memory addresses of >> certain data structures. Similarly, exposing the addresses of our >> internal data structures is harmless if we have no other security >> bugs, but if we do, it might make those bugs easier to exploit. I >> don't think this information is useful enough to justify taking that >> risk. > > FWIW, this is the class of issues where it is possible to print some > areas of memory, or even manipulate the stack so as it was possible to > pass down a custom pointer, so exposing the pointer locations is a > real risk, and this has happened in the past. Anyway, it seems to me > that if this part is done, we could just make it superuser-only with > restrictive REVOKE privileges, but I am not sure that we have enough > user cases to justify this addition. Thanks for your comments! I convinced that exposing pointer locations introduce security risks and it seems better not to do so. And I now feel identifying exact memory context by exposing memory address or other means seems overkill. Showing just the context name of the parent would be sufficient and 0007 pattch takes this way. On 2020-08-07 16:38, Kasahara Tatsuhito wrote: > The following review has been posted through the commitfest > application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: not tested > Documentation: tested, passed > > I tested the latest > patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) > with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) > and test was passed. > It looks good to me. > > The new status of this patch is: Ready for Committer Thanks for your testing! Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2020/08/11 15:24, torikoshia wrote: > On 2020-08-08 10:44, Michael Paquier wrote: >> On Fri, Jul 31, 2020 at 03:23:52PM -0400, Robert Haas wrote: >>> On Fri, Jul 31, 2020 at 4:25 AM torikoshia <torikoshia@oss.nttdata.com> wrote: >>>> And as Fujii-san told me in person, exposing memory address seems >>>> not preferable considering there are security techniques like >>>> address space layout randomization. >>> >>> Yeah, exactly. ASLR wouldn't do anything to improve security if there >>> were no other security bugs, but there are, and some of those bugs are >>> harder to exploit if you don't know the precise memory addresses of >>> certain data structures. Similarly, exposing the addresses of our >>> internal data structures is harmless if we have no other security >>> bugs, but if we do, it might make those bugs easier to exploit. I >>> don't think this information is useful enough to justify taking that >>> risk. >> >> FWIW, this is the class of issues where it is possible to print some >> areas of memory, or even manipulate the stack so as it was possible to >> pass down a custom pointer, so exposing the pointer locations is a >> real risk, and this has happened in the past. Anyway, it seems to me >> that if this part is done, we could just make it superuser-only with >> restrictive REVOKE privileges, but I am not sure that we have enough >> user cases to justify this addition. > > > Thanks for your comments! > > I convinced that exposing pointer locations introduce security risks > and it seems better not to do so. > > And I now feel identifying exact memory context by exposing memory > address or other means seems overkill. > Showing just the context name of the parent would be sufficient and > 0007 pattch takes this way. Agreed. > > > On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >> The following review has been posted through the commitfest application: >> make installcheck-world: tested, passed >> Implements feature: tested, passed >> Spec compliant: not tested >> Documentation: tested, passed >> >> I tested the latest >> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >> with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) >> and test was passed. >> It looks good to me. >> >> The new status of this patch is: Ready for Committer > > Thanks for your testing! Thanks for updating the patch! Here are the review comments. + <row> + <entry><link linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> + <entry>backend memory contexts</entry> + </row> The above is located just after pg_matviews entry. But it should be located just after pg_available_extension_versions entry. Because the rows in the table "System Views" should be located in alphabetical order. + <sect1 id="view-pg-backend-memory-contexts"> + <title><structname>pg_backend_memory_contexts</structname></title> Same as above. + The view <structname>pg_backend_memory_contexts</structname> displays all + the local backend memory contexts. This description seems a bit confusing because maybe we can interpret this as "... displays the memory contexts of all the local backends" wrongly. Thought? What about the following description, instead? The view <structname>pg_backend_memory_contexts</structname> displays all the memory contexts of the server process attached to the current session. + const char *name = context->name; + const char *ident = context->ident; + + if (context == NULL) + return; The above check "context == NULL" is useless? If "context" is actually NULL, "context->name" would cause segmentation fault, so ISTM that the check will never be performed. If "context" can be NULL, the check should be performed before accessing to "contect". OTOH, if "context" must not be NULL per the specification of PutMemoryContextStatsTupleStore(), assertion test checking "context != NULL" should be used here, instead? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/08/17 21:14, Fujii Masao wrote: > > > On 2020/08/11 15:24, torikoshia wrote: >> On 2020-08-08 10:44, Michael Paquier wrote: >>> On Fri, Jul 31, 2020 at 03:23:52PM -0400, Robert Haas wrote: >>>> On Fri, Jul 31, 2020 at 4:25 AM torikoshia <torikoshia@oss.nttdata.com> wrote: >>>>> And as Fujii-san told me in person, exposing memory address seems >>>>> not preferable considering there are security techniques like >>>>> address space layout randomization. >>>> >>>> Yeah, exactly. ASLR wouldn't do anything to improve security if there >>>> were no other security bugs, but there are, and some of those bugs are >>>> harder to exploit if you don't know the precise memory addresses of >>>> certain data structures. Similarly, exposing the addresses of our >>>> internal data structures is harmless if we have no other security >>>> bugs, but if we do, it might make those bugs easier to exploit. I >>>> don't think this information is useful enough to justify taking that >>>> risk. >>> >>> FWIW, this is the class of issues where it is possible to print some >>> areas of memory, or even manipulate the stack so as it was possible to >>> pass down a custom pointer, so exposing the pointer locations is a >>> real risk, and this has happened in the past. Anyway, it seems to me >>> that if this part is done, we could just make it superuser-only with >>> restrictive REVOKE privileges, but I am not sure that we have enough >>> user cases to justify this addition. >> >> >> Thanks for your comments! >> >> I convinced that exposing pointer locations introduce security risks >> and it seems better not to do so. >> >> And I now feel identifying exact memory context by exposing memory >> address or other means seems overkill. >> Showing just the context name of the parent would be sufficient and >> 0007 pattch takes this way. > > Agreed. > > >> >> >> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>> The following review has been posted through the commitfest application: >>> make installcheck-world: tested, passed >>> Implements feature: tested, passed >>> Spec compliant: not tested >>> Documentation: tested, passed >>> >>> I tested the latest >>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>> with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) >>> and test was passed. >>> It looks good to me. >>> >>> The new status of this patch is: Ready for Committer >> >> Thanks for your testing! > > Thanks for updating the patch! Here are the review comments. > > + <row> > + <entry><link linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> > + <entry>backend memory contexts</entry> > + </row> > > The above is located just after pg_matviews entry. But it should be located > just after pg_available_extension_versions entry. Because the rows in the table > "System Views" should be located in alphabetical order. > > > + <sect1 id="view-pg-backend-memory-contexts"> > + <title><structname>pg_backend_memory_contexts</structname></title> > > Same as above. > > > + The view <structname>pg_backend_memory_contexts</structname> displays all > + the local backend memory contexts. > > This description seems a bit confusing because maybe we can interpret this > as "... displays the memory contexts of all the local backends" wrongly. Thought? > What about the following description, instead? > > The view <structname>pg_backend_memory_contexts</structname> displays all > the memory contexts of the server process attached to the current session. > > > + const char *name = context->name; > + const char *ident = context->ident; > + > + if (context == NULL) > + return; > > The above check "context == NULL" is useless? If "context" is actually NULL, > "context->name" would cause segmentation fault, so ISTM that the check > will never be performed. > > If "context" can be NULL, the check should be performed before accessing > to "contect". OTOH, if "context" must not be NULL per the specification of > PutMemoryContextStatsTupleStore(), assertion test checking > "context != NULL" should be used here, instead? Here is another comment. + if (parent == NULL) + nulls[2] = true; + else + /* + * We labeled dynahash contexts with just the hash table name. + * To make it possible to identify its parent, we also display + * parent's ident here. + */ + if (parent->ident && strcmp(parent->name, "dynahash") == 0) + values[2] = CStringGetTextDatum(parent->ident); + else + values[2] = CStringGetTextDatum(parent->name); PutMemoryContextsStatsTupleStore() doesn't need "parent" memory context, but uses only the name of "parent" memory context. So isn't it better to use "const char *parent" instead of "MemoryContext parent", as the argument of the function? If we do that, we can simplify the above code. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020-08-17 21:19, Fujii Masao wrote: > On 2020/08/17 21:14, Fujii Masao wrote: >>> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>>> The following review has been posted through the commitfest >>>> application: >>>> make installcheck-world: tested, passed >>>> Implements feature: tested, passed >>>> Spec compliant: not tested >>>> Documentation: tested, passed >>>> >>>> I tested the latest >>>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>>> with the latest PG-version >>>> (199cec9779504c08aaa8159c6308283156547409) >>>> and test was passed. >>>> It looks good to me. >>>> >>>> The new status of this patch is: Ready for Committer >>> >>> Thanks for your testing! >> >> Thanks for updating the patch! Here are the review comments. Thanks for reviewing! >> >> + <row> >> + <entry><link >> linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> >> + <entry>backend memory contexts</entry> >> + </row> >> >> The above is located just after pg_matviews entry. But it should be >> located >> just after pg_available_extension_versions entry. Because the rows in >> the table >> "System Views" should be located in alphabetical order. >> >> >> + <sect1 id="view-pg-backend-memory-contexts"> >> + <title><structname>pg_backend_memory_contexts</structname></title> >> >> Same as above. Modified both. >> >> >> + The view <structname>pg_backend_memory_contexts</structname> >> displays all >> + the local backend memory contexts. >> >> This description seems a bit confusing because maybe we can interpret >> this >> as "... displays the memory contexts of all the local backends" >> wrongly. Thought? >> What about the following description, instead? >> The view <structname>pg_backend_memory_contexts</structname> >> displays all >> the memory contexts of the server process attached to the current >> session. Thanks! it seems better. >> + const char *name = context->name; >> + const char *ident = context->ident; >> + >> + if (context == NULL) >> + return; >> >> The above check "context == NULL" is useless? If "context" is actually >> NULL, >> "context->name" would cause segmentation fault, so ISTM that the check >> will never be performed. >> >> If "context" can be NULL, the check should be performed before >> accessing >> to "contect". OTOH, if "context" must not be NULL per the >> specification of >> PutMemoryContextStatsTupleStore(), assertion test checking >> "context != NULL" should be used here, instead? Yeah, "context" cannot be NULL because "context" must be TopMemoryContext or it is already checked as not NULL as follows(child != NULL). I added the assertion check. | for (child = context->firstchild; child != NULL; child = child->nextchild) | { | ... | PutMemoryContextsStatsTupleStore(tupstore, tupdesc, | child, parentname, level + 1); | } > Here is another comment. > > + if (parent == NULL) > + nulls[2] = true; > + else > + /* > + * We labeled dynahash contexts with just the hash table > name. > + * To make it possible to identify its parent, we also > display > + * parent's ident here. > + */ > + if (parent->ident && strcmp(parent->name, "dynahash") == > 0) > + values[2] = CStringGetTextDatum(parent->ident); > + else > + values[2] = CStringGetTextDatum(parent->name); > > PutMemoryContextsStatsTupleStore() doesn't need "parent" memory > context, > but uses only the name of "parent" memory context. So isn't it better > to use > "const char *parent" instead of "MemoryContext parent", as the argument > of > the function? If we do that, we can simplify the above code. Thanks, the attached patch adopted the advice. However, since PutMemoryContextsStatsTupleStore() used not only the name but also the ident of the "parent", I could not help but adding similar codes before calling the function. The total amount of codes and complexity seem not to change so much. Any thoughts? Am I misunderstanding something? Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/08/18 18:41, torikoshia wrote: > On 2020-08-17 21:19, Fujii Masao wrote: >> On 2020/08/17 21:14, Fujii Masao wrote: >>>> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>>>> The following review has been posted through the commitfest application: >>>>> make installcheck-world: tested, passed >>>>> Implements feature: tested, passed >>>>> Spec compliant: not tested >>>>> Documentation: tested, passed >>>>> >>>>> I tested the latest >>>>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>>>> with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) >>>>> and test was passed. >>>>> It looks good to me. >>>>> >>>>> The new status of this patch is: Ready for Committer >>>> >>>> Thanks for your testing! >>> >>> Thanks for updating the patch! Here are the review comments. > > Thanks for reviewing! > >>> >>> + <row> >>> + <entry><link linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> >>> + <entry>backend memory contexts</entry> >>> + </row> >>> >>> The above is located just after pg_matviews entry. But it should be located >>> just after pg_available_extension_versions entry. Because the rows in the table >>> "System Views" should be located in alphabetical order. >>> >>> >>> + <sect1 id="view-pg-backend-memory-contexts"> >>> + <title><structname>pg_backend_memory_contexts</structname></title> >>> >>> Same as above. > > Modified both. > >>> >>> >>> + The view <structname>pg_backend_memory_contexts</structname> displays all >>> + the local backend memory contexts. >>> >>> This description seems a bit confusing because maybe we can interpret this >>> as "... displays the memory contexts of all the local backends" wrongly. Thought? >>> What about the following description, instead? > >>> The view <structname>pg_backend_memory_contexts</structname> displays all >>> the memory contexts of the server process attached to the current session. > > Thanks! it seems better. > >>> + const char *name = context->name; >>> + const char *ident = context->ident; >>> + >>> + if (context == NULL) >>> + return; >>> >>> The above check "context == NULL" is useless? If "context" is actually NULL, >>> "context->name" would cause segmentation fault, so ISTM that the check >>> will never be performed. >>> >>> If "context" can be NULL, the check should be performed before accessing >>> to "contect". OTOH, if "context" must not be NULL per the specification of >>> PutMemoryContextStatsTupleStore(), assertion test checking >>> "context != NULL" should be used here, instead? > > Yeah, "context" cannot be NULL because "context" must be TopMemoryContext > or it is already checked as not NULL as follows(child != NULL). > > I added the assertion check. Isn't it better to add AssertArg(MemoryContextIsValid(context)), instead? > > | for (child = context->firstchild; child != NULL; child = child->nextchild) > | { > | ... > | PutMemoryContextsStatsTupleStore(tupstore, tupdesc, > | child, parentname, level + 1); > | } > >> Here is another comment. >> >> + if (parent == NULL) >> + nulls[2] = true; >> + else >> + /* >> + * We labeled dynahash contexts with just the hash table name. >> + * To make it possible to identify its parent, we also display >> + * parent's ident here. >> + */ >> + if (parent->ident && strcmp(parent->name, "dynahash") == 0) >> + values[2] = CStringGetTextDatum(parent->ident); >> + else >> + values[2] = CStringGetTextDatum(parent->name); >> >> PutMemoryContextsStatsTupleStore() doesn't need "parent" memory context, >> but uses only the name of "parent" memory context. So isn't it better to use >> "const char *parent" instead of "MemoryContext parent", as the argument of >> the function? If we do that, we can simplify the above code. > > Thanks, the attached patch adopted the advice. > > However, since PutMemoryContextsStatsTupleStore() used not only the name > but also the ident of the "parent", I could not help but adding similar > codes before calling the function. > The total amount of codes and complexity seem not to change so much. > > Any thoughts? Am I misunderstanding something? I was thinking that we can simplify the code as follows. That is, we can just pass "name" as the argument of PutMemoryContextsStatsTupleStore() since "name" indicates context->name or ident (if name is "dynahash"). for (child = context->firstchild; child != NULL; child = child->nextchild) { - const char *parentname; - - /* - * We labeled dynahash contexts with just the hash table name. - * To make it possible to identify its parent, we also use - * the hash table as its context name. - */ - if (context->ident && strcmp(context->name, "dynahash") == 0) - parentname = context->ident; - else - parentname = context->name; - PutMemoryContextsStatsTupleStore(tupstore, tupdesc, - child, parentname, level + 1); + child, name, level + 1); Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020-08-18 22:54, Fujii Masao wrote: > On 2020/08/18 18:41, torikoshia wrote: >> On 2020-08-17 21:19, Fujii Masao wrote: >>> On 2020/08/17 21:14, Fujii Masao wrote: >>>>> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>>>>> The following review has been posted through the commitfest >>>>>> application: >>>>>> make installcheck-world: tested, passed >>>>>> Implements feature: tested, passed >>>>>> Spec compliant: not tested >>>>>> Documentation: tested, passed >>>>>> >>>>>> I tested the latest >>>>>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>>>>> with the latest PG-version >>>>>> (199cec9779504c08aaa8159c6308283156547409) >>>>>> and test was passed. >>>>>> It looks good to me. >>>>>> >>>>>> The new status of this patch is: Ready for Committer >>>>> >>>>> Thanks for your testing! >>>> >>>> Thanks for updating the patch! Here are the review comments. >> >> Thanks for reviewing! >> >>>> >>>> + <row> >>>> + <entry><link >>>> linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> >>>> + <entry>backend memory contexts</entry> >>>> + </row> >>>> >>>> The above is located just after pg_matviews entry. But it should be >>>> located >>>> just after pg_available_extension_versions entry. Because the rows >>>> in the table >>>> "System Views" should be located in alphabetical order. >>>> >>>> >>>> + <sect1 id="view-pg-backend-memory-contexts"> >>>> + >>>> <title><structname>pg_backend_memory_contexts</structname></title> >>>> >>>> Same as above. >> >> Modified both. >> >>>> >>>> >>>> + The view <structname>pg_backend_memory_contexts</structname> >>>> displays all >>>> + the local backend memory contexts. >>>> >>>> This description seems a bit confusing because maybe we can >>>> interpret this >>>> as "... displays the memory contexts of all the local backends" >>>> wrongly. Thought? >>>> What about the following description, instead? >> >>>> The view <structname>pg_backend_memory_contexts</structname> >>>> displays all >>>> the memory contexts of the server process attached to the >>>> current session. >> >> Thanks! it seems better. >> >>>> + const char *name = context->name; >>>> + const char *ident = context->ident; >>>> + >>>> + if (context == NULL) >>>> + return; >>>> >>>> The above check "context == NULL" is useless? If "context" is >>>> actually NULL, >>>> "context->name" would cause segmentation fault, so ISTM that the >>>> check >>>> will never be performed. >>>> >>>> If "context" can be NULL, the check should be performed before >>>> accessing >>>> to "contect". OTOH, if "context" must not be NULL per the >>>> specification of >>>> PutMemoryContextStatsTupleStore(), assertion test checking >>>> "context != NULL" should be used here, instead? >> >> Yeah, "context" cannot be NULL because "context" must be >> TopMemoryContext >> or it is already checked as not NULL as follows(child != NULL). >> >> I added the assertion check. > > Isn't it better to add AssertArg(MemoryContextIsValid(context)), > instead? Thanks, that's better. >> >> | for (child = context->firstchild; child != NULL; child = >> child->nextchild) >> | { >> | ... >> | PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >> | child, >> parentname, level + 1); >> | } >> >>> Here is another comment. >>> >>> + if (parent == NULL) >>> + nulls[2] = true; >>> + else >>> + /* >>> + * We labeled dynahash contexts with just the hash >>> table name. >>> + * To make it possible to identify its parent, we also >>> display >>> + * parent's ident here. >>> + */ >>> + if (parent->ident && strcmp(parent->name, "dynahash") >>> == 0) >>> + values[2] = CStringGetTextDatum(parent->ident); >>> + else >>> + values[2] = CStringGetTextDatum(parent->name); >>> >>> PutMemoryContextsStatsTupleStore() doesn't need "parent" memory >>> context, >>> but uses only the name of "parent" memory context. So isn't it better >>> to use >>> "const char *parent" instead of "MemoryContext parent", as the >>> argument of >>> the function? If we do that, we can simplify the above code. >> >> Thanks, the attached patch adopted the advice. >> >> However, since PutMemoryContextsStatsTupleStore() used not only the >> name >> but also the ident of the "parent", I could not help but adding >> similar >> codes before calling the function. >> The total amount of codes and complexity seem not to change so much. >> >> Any thoughts? Am I misunderstanding something? > > I was thinking that we can simplify the code as follows. > That is, we can just pass "name" as the argument of > PutMemoryContextsStatsTupleStore() > since "name" indicates context->name or ident (if name is "dynahash"). > > for (child = context->firstchild; child != NULL; child = > child->nextchild) > { > - const char *parentname; > - > - /* > - * We labeled dynahash contexts with just the hash table name. > - * To make it possible to identify its parent, we also use > - * the hash table as its context name. > - */ > - if (context->ident && strcmp(context->name, "dynahash") == 0) > - parentname = context->ident; > - else > - parentname = context->name; > - > PutMemoryContextsStatsTupleStore(tupstore, tupdesc, > - child, parentname, level + 1); > + child, name, level + 1); I got it, thanks for the clarification! Attached a revised patch. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/08/19 9:43, torikoshia wrote: > On 2020-08-18 22:54, Fujii Masao wrote: >> On 2020/08/18 18:41, torikoshia wrote: >>> On 2020-08-17 21:19, Fujii Masao wrote: >>>> On 2020/08/17 21:14, Fujii Masao wrote: >>>>>> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>>>>>> The following review has been posted through the commitfest application: >>>>>>> make installcheck-world: tested, passed >>>>>>> Implements feature: tested, passed >>>>>>> Spec compliant: not tested >>>>>>> Documentation: tested, passed >>>>>>> >>>>>>> I tested the latest >>>>>>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>>>>>> with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) >>>>>>> and test was passed. >>>>>>> It looks good to me. >>>>>>> >>>>>>> The new status of this patch is: Ready for Committer >>>>>> >>>>>> Thanks for your testing! >>>>> >>>>> Thanks for updating the patch! Here are the review comments. >>> >>> Thanks for reviewing! >>> >>>>> >>>>> + <row> >>>>> + <entry><link linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> >>>>> + <entry>backend memory contexts</entry> >>>>> + </row> >>>>> >>>>> The above is located just after pg_matviews entry. But it should be located >>>>> just after pg_available_extension_versions entry. Because the rows in the table >>>>> "System Views" should be located in alphabetical order. >>>>> >>>>> >>>>> + <sect1 id="view-pg-backend-memory-contexts"> >>>>> + <title><structname>pg_backend_memory_contexts</structname></title> >>>>> >>>>> Same as above. >>> >>> Modified both. >>> >>>>> >>>>> >>>>> + The view <structname>pg_backend_memory_contexts</structname> displays all >>>>> + the local backend memory contexts. >>>>> >>>>> This description seems a bit confusing because maybe we can interpret this >>>>> as "... displays the memory contexts of all the local backends" wrongly. Thought? >>>>> What about the following description, instead? >>> >>>>> The view <structname>pg_backend_memory_contexts</structname> displays all >>>>> the memory contexts of the server process attached to the current session. >>> >>> Thanks! it seems better. >>> >>>>> + const char *name = context->name; >>>>> + const char *ident = context->ident; >>>>> + >>>>> + if (context == NULL) >>>>> + return; >>>>> >>>>> The above check "context == NULL" is useless? If "context" is actually NULL, >>>>> "context->name" would cause segmentation fault, so ISTM that the check >>>>> will never be performed. >>>>> >>>>> If "context" can be NULL, the check should be performed before accessing >>>>> to "contect". OTOH, if "context" must not be NULL per the specification of >>>>> PutMemoryContextStatsTupleStore(), assertion test checking >>>>> "context != NULL" should be used here, instead? >>> >>> Yeah, "context" cannot be NULL because "context" must be TopMemoryContext >>> or it is already checked as not NULL as follows(child != NULL). >>> >>> I added the assertion check. >> >> Isn't it better to add AssertArg(MemoryContextIsValid(context)), instead? > > Thanks, that's better. > >>> >>> | for (child = context->firstchild; child != NULL; child = child->nextchild) >>> | { >>> | ... >>> | PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >>> | child, parentname, level + 1); >>> | } >>> >>>> Here is another comment. >>>> >>>> + if (parent == NULL) >>>> + nulls[2] = true; >>>> + else >>>> + /* >>>> + * We labeled dynahash contexts with just the hash table name. >>>> + * To make it possible to identify its parent, we also display >>>> + * parent's ident here. >>>> + */ >>>> + if (parent->ident && strcmp(parent->name, "dynahash") == 0) >>>> + values[2] = CStringGetTextDatum(parent->ident); >>>> + else >>>> + values[2] = CStringGetTextDatum(parent->name); >>>> >>>> PutMemoryContextsStatsTupleStore() doesn't need "parent" memory context, >>>> but uses only the name of "parent" memory context. So isn't it better to use >>>> "const char *parent" instead of "MemoryContext parent", as the argument of >>>> the function? If we do that, we can simplify the above code. >>> >>> Thanks, the attached patch adopted the advice. >>> >>> However, since PutMemoryContextsStatsTupleStore() used not only the name >>> but also the ident of the "parent", I could not help but adding similar >>> codes before calling the function. >>> The total amount of codes and complexity seem not to change so much. >>> >>> Any thoughts? Am I misunderstanding something? >> >> I was thinking that we can simplify the code as follows. >> That is, we can just pass "name" as the argument of >> PutMemoryContextsStatsTupleStore() >> since "name" indicates context->name or ident (if name is "dynahash"). >> >> for (child = context->firstchild; child != NULL; child = child->nextchild) >> { >> - const char *parentname; >> - >> - /* >> - * We labeled dynahash contexts with just the hash table name. >> - * To make it possible to identify its parent, we also use >> - * the hash table as its context name. >> - */ >> - if (context->ident && strcmp(context->name, "dynahash") == 0) >> - parentname = context->ident; >> - else >> - parentname = context->name; >> - >> PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >> - child, parentname, level + 1); >> + child, name, level + 1); > > I got it, thanks for the clarification! > > Attached a revised patch. Thanks for updating the patch! I pushed it. BTW, I guess that you didn't add the regression test for this view because the contents of the view are not stable. Right? But isn't it better to just add the "stable" test like SELECT name, ident, parent, level, total_bytes >= free_bytes FROM pg_backend_memory_contexts WHERE level = 0; rather than adding nothing? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020-08-19 15:48, Fujii Masao wrote: > On 2020/08/19 9:43, torikoshia wrote: >> On 2020-08-18 22:54, Fujii Masao wrote: >>> On 2020/08/18 18:41, torikoshia wrote: >>>> On 2020-08-17 21:19, Fujii Masao wrote: >>>>> On 2020/08/17 21:14, Fujii Masao wrote: >>>>>>> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>>>>>>> The following review has been posted through the commitfest >>>>>>>> application: >>>>>>>> make installcheck-world: tested, passed >>>>>>>> Implements feature: tested, passed >>>>>>>> Spec compliant: not tested >>>>>>>> Documentation: tested, passed >>>>>>>> >>>>>>>> I tested the latest >>>>>>>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>>>>>>> with the latest PG-version >>>>>>>> (199cec9779504c08aaa8159c6308283156547409) >>>>>>>> and test was passed. >>>>>>>> It looks good to me. >>>>>>>> >>>>>>>> The new status of this patch is: Ready for Committer >>>>>>> >>>>>>> Thanks for your testing! >>>>>> >>>>>> Thanks for updating the patch! Here are the review comments. >>>> >>>> Thanks for reviewing! >>>> >>>>>> >>>>>> + <row> >>>>>> + <entry><link >>>>>> linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> >>>>>> + <entry>backend memory contexts</entry> >>>>>> + </row> >>>>>> >>>>>> The above is located just after pg_matviews entry. But it should >>>>>> be located >>>>>> just after pg_available_extension_versions entry. Because the rows >>>>>> in the table >>>>>> "System Views" should be located in alphabetical order. >>>>>> >>>>>> >>>>>> + <sect1 id="view-pg-backend-memory-contexts"> >>>>>> + >>>>>> <title><structname>pg_backend_memory_contexts</structname></title> >>>>>> >>>>>> Same as above. >>>> >>>> Modified both. >>>> >>>>>> >>>>>> >>>>>> + The view <structname>pg_backend_memory_contexts</structname> >>>>>> displays all >>>>>> + the local backend memory contexts. >>>>>> >>>>>> This description seems a bit confusing because maybe we can >>>>>> interpret this >>>>>> as "... displays the memory contexts of all the local backends" >>>>>> wrongly. Thought? >>>>>> What about the following description, instead? >>>> >>>>>> The view <structname>pg_backend_memory_contexts</structname> >>>>>> displays all >>>>>> the memory contexts of the server process attached to the >>>>>> current session. >>>> >>>> Thanks! it seems better. >>>> >>>>>> + const char *name = context->name; >>>>>> + const char *ident = context->ident; >>>>>> + >>>>>> + if (context == NULL) >>>>>> + return; >>>>>> >>>>>> The above check "context == NULL" is useless? If "context" is >>>>>> actually NULL, >>>>>> "context->name" would cause segmentation fault, so ISTM that the >>>>>> check >>>>>> will never be performed. >>>>>> >>>>>> If "context" can be NULL, the check should be performed before >>>>>> accessing >>>>>> to "contect". OTOH, if "context" must not be NULL per the >>>>>> specification of >>>>>> PutMemoryContextStatsTupleStore(), assertion test checking >>>>>> "context != NULL" should be used here, instead? >>>> >>>> Yeah, "context" cannot be NULL because "context" must be >>>> TopMemoryContext >>>> or it is already checked as not NULL as follows(child != NULL). >>>> >>>> I added the assertion check. >>> >>> Isn't it better to add AssertArg(MemoryContextIsValid(context)), >>> instead? >> >> Thanks, that's better. >> >>>> >>>> | for (child = context->firstchild; child != NULL; child = >>>> child->nextchild) >>>> | { >>>> | ... >>>> | PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >>>> | child, >>>> parentname, level + 1); >>>> | } >>>> >>>>> Here is another comment. >>>>> >>>>> + if (parent == NULL) >>>>> + nulls[2] = true; >>>>> + else >>>>> + /* >>>>> + * We labeled dynahash contexts with just the hash >>>>> table name. >>>>> + * To make it possible to identify its parent, we >>>>> also display >>>>> + * parent's ident here. >>>>> + */ >>>>> + if (parent->ident && strcmp(parent->name, "dynahash") >>>>> == 0) >>>>> + values[2] = >>>>> CStringGetTextDatum(parent->ident); >>>>> + else >>>>> + values[2] = >>>>> CStringGetTextDatum(parent->name); >>>>> >>>>> PutMemoryContextsStatsTupleStore() doesn't need "parent" memory >>>>> context, >>>>> but uses only the name of "parent" memory context. So isn't it >>>>> better to use >>>>> "const char *parent" instead of "MemoryContext parent", as the >>>>> argument of >>>>> the function? If we do that, we can simplify the above code. >>>> >>>> Thanks, the attached patch adopted the advice. >>>> >>>> However, since PutMemoryContextsStatsTupleStore() used not only the >>>> name >>>> but also the ident of the "parent", I could not help but adding >>>> similar >>>> codes before calling the function. >>>> The total amount of codes and complexity seem not to change so much. >>>> >>>> Any thoughts? Am I misunderstanding something? >>> >>> I was thinking that we can simplify the code as follows. >>> That is, we can just pass "name" as the argument of >>> PutMemoryContextsStatsTupleStore() >>> since "name" indicates context->name or ident (if name is >>> "dynahash"). >>> >>> for (child = context->firstchild; child != NULL; child = >>> child->nextchild) >>> { >>> - const char *parentname; >>> - >>> - /* >>> - * We labeled dynahash contexts with just the hash table >>> name. >>> - * To make it possible to identify its parent, we also use >>> - * the hash table as its context name. >>> - */ >>> - if (context->ident && strcmp(context->name, "dynahash") == >>> 0) >>> - parentname = context->ident; >>> - else >>> - parentname = context->name; >>> - >>> PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >>> - child, parentname, level + 1); >>> + child, name, level + 1); >> >> I got it, thanks for the clarification! >> >> Attached a revised patch. > > Thanks for updating the patch! I pushed it. Thanks a lot! > > BTW, I guess that you didn't add the regression test for this view > because > the contents of the view are not stable. Right? But isn't it better to > just > add the "stable" test like > > SELECT name, ident, parent, level, total_bytes >= free_bytes FROM > pg_backend_memory_contexts WHERE level = 0; > > rather than adding nothing? Yes, I didn't add regression tests because of the unstability of the output. I thought it would be OK since other views like pg_stat_slru and pg_shmem_allocations didn't have tests for their outputs. I don't have strong objections for adding tests like you proposed, but I'm not sure whether there are special reasons to add tests compared with these existing views. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2020/08/19 17:40, torikoshia wrote: > On 2020-08-19 15:48, Fujii Masao wrote: >> On 2020/08/19 9:43, torikoshia wrote: >>> On 2020-08-18 22:54, Fujii Masao wrote: >>>> On 2020/08/18 18:41, torikoshia wrote: >>>>> On 2020-08-17 21:19, Fujii Masao wrote: >>>>>> On 2020/08/17 21:14, Fujii Masao wrote: >>>>>>>> On 2020-08-07 16:38, Kasahara Tatsuhito wrote: >>>>>>>>> The following review has been posted through the commitfest application: >>>>>>>>> make installcheck-world: tested, passed >>>>>>>>> Implements feature: tested, passed >>>>>>>>> Spec compliant: not tested >>>>>>>>> Documentation: tested, passed >>>>>>>>> >>>>>>>>> I tested the latest >>>>>>>>> patch(0007-Adding-a-function-exposing-memory-usage-of-local-backend.patch) >>>>>>>>> with the latest PG-version (199cec9779504c08aaa8159c6308283156547409) >>>>>>>>> and test was passed. >>>>>>>>> It looks good to me. >>>>>>>>> >>>>>>>>> The new status of this patch is: Ready for Committer >>>>>>>> >>>>>>>> Thanks for your testing! >>>>>>> >>>>>>> Thanks for updating the patch! Here are the review comments. >>>>> >>>>> Thanks for reviewing! >>>>> >>>>>>> >>>>>>> + <row> >>>>>>> + <entry><link linkend="view-pg-backend-memory-contexts"><structname>pg_backend_memory_contexts</structname></link></entry> >>>>>>> + <entry>backend memory contexts</entry> >>>>>>> + </row> >>>>>>> >>>>>>> The above is located just after pg_matviews entry. But it should be located >>>>>>> just after pg_available_extension_versions entry. Because the rows in the table >>>>>>> "System Views" should be located in alphabetical order. >>>>>>> >>>>>>> >>>>>>> + <sect1 id="view-pg-backend-memory-contexts"> >>>>>>> + <title><structname>pg_backend_memory_contexts</structname></title> >>>>>>> >>>>>>> Same as above. >>>>> >>>>> Modified both. >>>>> >>>>>>> >>>>>>> >>>>>>> + The view <structname>pg_backend_memory_contexts</structname> displays all >>>>>>> + the local backend memory contexts. >>>>>>> >>>>>>> This description seems a bit confusing because maybe we can interpret this >>>>>>> as "... displays the memory contexts of all the local backends" wrongly. Thought? >>>>>>> What about the following description, instead? >>>>> >>>>>>> The view <structname>pg_backend_memory_contexts</structname> displays all >>>>>>> the memory contexts of the server process attached to the current session. >>>>> >>>>> Thanks! it seems better. >>>>> >>>>>>> + const char *name = context->name; >>>>>>> + const char *ident = context->ident; >>>>>>> + >>>>>>> + if (context == NULL) >>>>>>> + return; >>>>>>> >>>>>>> The above check "context == NULL" is useless? If "context" is actually NULL, >>>>>>> "context->name" would cause segmentation fault, so ISTM that the check >>>>>>> will never be performed. >>>>>>> >>>>>>> If "context" can be NULL, the check should be performed before accessing >>>>>>> to "contect". OTOH, if "context" must not be NULL per the specification of >>>>>>> PutMemoryContextStatsTupleStore(), assertion test checking >>>>>>> "context != NULL" should be used here, instead? >>>>> >>>>> Yeah, "context" cannot be NULL because "context" must be TopMemoryContext >>>>> or it is already checked as not NULL as follows(child != NULL). >>>>> >>>>> I added the assertion check. >>>> >>>> Isn't it better to add AssertArg(MemoryContextIsValid(context)), instead? >>> >>> Thanks, that's better. >>> >>>>> >>>>> | for (child = context->firstchild; child != NULL; child = child->nextchild) >>>>> | { >>>>> | ... >>>>> | PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >>>>> | child, parentname, level + 1); >>>>> | } >>>>> >>>>>> Here is another comment. >>>>>> >>>>>> + if (parent == NULL) >>>>>> + nulls[2] = true; >>>>>> + else >>>>>> + /* >>>>>> + * We labeled dynahash contexts with just the hash table name. >>>>>> + * To make it possible to identify its parent, we also display >>>>>> + * parent's ident here. >>>>>> + */ >>>>>> + if (parent->ident && strcmp(parent->name, "dynahash") == 0) >>>>>> + values[2] = CStringGetTextDatum(parent->ident); >>>>>> + else >>>>>> + values[2] = CStringGetTextDatum(parent->name); >>>>>> >>>>>> PutMemoryContextsStatsTupleStore() doesn't need "parent" memory context, >>>>>> but uses only the name of "parent" memory context. So isn't it better to use >>>>>> "const char *parent" instead of "MemoryContext parent", as the argument of >>>>>> the function? If we do that, we can simplify the above code. >>>>> >>>>> Thanks, the attached patch adopted the advice. >>>>> >>>>> However, since PutMemoryContextsStatsTupleStore() used not only the name >>>>> but also the ident of the "parent", I could not help but adding similar >>>>> codes before calling the function. >>>>> The total amount of codes and complexity seem not to change so much. >>>>> >>>>> Any thoughts? Am I misunderstanding something? >>>> >>>> I was thinking that we can simplify the code as follows. >>>> That is, we can just pass "name" as the argument of >>>> PutMemoryContextsStatsTupleStore() >>>> since "name" indicates context->name or ident (if name is "dynahash"). >>>> >>>> for (child = context->firstchild; child != NULL; child = child->nextchild) >>>> { >>>> - const char *parentname; >>>> - >>>> - /* >>>> - * We labeled dynahash contexts with just the hash table name. >>>> - * To make it possible to identify its parent, we also use >>>> - * the hash table as its context name. >>>> - */ >>>> - if (context->ident && strcmp(context->name, "dynahash") == 0) >>>> - parentname = context->ident; >>>> - else >>>> - parentname = context->name; >>>> - >>>> PutMemoryContextsStatsTupleStore(tupstore, tupdesc, >>>> - child, parentname, level + 1); >>>> + child, name, level + 1); >>> >>> I got it, thanks for the clarification! >>> >>> Attached a revised patch. >> >> Thanks for updating the patch! I pushed it. > > Thanks a lot! > >> >> BTW, I guess that you didn't add the regression test for this view because >> the contents of the view are not stable. Right? But isn't it better to just >> add the "stable" test like >> >> SELECT name, ident, parent, level, total_bytes >= free_bytes FROM >> pg_backend_memory_contexts WHERE level = 0; >> >> rather than adding nothing? > > Yes, I didn't add regression tests because of the unstability of the output. > I thought it would be OK since other views like pg_stat_slru and pg_shmem_allocations > didn't have tests for their outputs. You're right. > I don't have strong objections for adding tests like you proposed, but I'm not sure > whether there are special reasons to add tests compared with these existing views. Ok, understood. So I'd withdraw my suggestion. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Wed, Aug 19, 2020 at 06:12:02PM +0900, Fujii Masao wrote: > On 2020/08/19 17:40, torikoshia wrote: >> Yes, I didn't add regression tests because of the unstability of the output. >> I thought it would be OK since other views like pg_stat_slru and pg_shmem_allocations >> didn't have tests for their outputs. > > You're right. If you can make a test with something minimal and with a stable output, adding a test is helpful IMO, or how can you make easily sure that this does not get broken, particularly in the event of future refactorings, or even with platform-dependent behaviors? By the way, I was looking at the code that has been committed, and I think that it is awkward to have a SQL function in mcxt.c, which is a rather low-level interface. I think that this new code should be moved to its own file, one suggestion for a location I have being src/backend/utils/adt/mcxtfuncs.c. -- Michael
Attachment
Hadn't been paying attention to this thread up till now, but ... Michael Paquier <michael@paquier.xyz> writes: > By the way, I was looking at the code that has been committed, and I > think that it is awkward to have a SQL function in mcxt.c, which is a > rather low-level interface. I think that this new code should be > moved to its own file, one suggestion for a location I have being > src/backend/utils/adt/mcxtfuncs.c. I agree with that, but I think this patch has a bigger problem: why bother at all? It seems like a waste of code space and future maintenance effort, because there is no use-case. In the situations where you need to know where the memory went, you are almost never in a position to leisurely execute a query and send the results over to your client. This certainly would be useless to figure out why an already-running query is eating space, for instance. The only situation I could imagine where this would have any use is where there is long-term (cross-query) bloat in, say, CacheMemoryContext --- but it's not even very helpful for that, since you can't examine anything finer-grain than a memory context. Plus you need to be running an interactive session, or else be willing to hack up your application to try to get it to inspect the view (and log the results somewhere) at useful times. On top of all that, the functionality has Heisenberg problems, because simply using it changes what you are trying to observe, in complex and undocumented ways (not that the documentation would be of any use to non-experts anyway). My own thoughts about improving the debugging situation would've been to create a way to send a signal to a session to make it dump its current memory map to the postmaster log (not the client, since the client is unlikely to be prepared to receive anything extraneous). But this is nothing like that. Given the lack of clear use-case, and the possibility (admittedly not strong) that this is still somehow a security hazard, I think we should revert it. If it stays, I'd like to see restrictions on who can read the view. regards, tom lane
Hi, On 2020-08-19 11:01:37 -0400, Tom Lane wrote: > Hadn't been paying attention to this thread up till now, but ... > > Michael Paquier <michael@paquier.xyz> writes: > > By the way, I was looking at the code that has been committed, and I > > think that it is awkward to have a SQL function in mcxt.c, which is a > > rather low-level interface. I think that this new code should be > > moved to its own file, one suggestion for a location I have being > > src/backend/utils/adt/mcxtfuncs.c. > > I agree with that, but I think this patch has a bigger problem: > why bother at all? It seems like a waste of code space and future > maintenance effort, because there is no use-case. In the situations > where you need to know where the memory went, you are almost never > in a position to leisurely execute a query and send the results over > to your client. This certainly would be useless to figure out why > an already-running query is eating space, for instance. I don't agree with this at all. I think there's plenty use cases. It's e.g. very common to try to figure out why the memory usage of a process is high. Is it memory not returned to the OS? Is it caches that have grown too much etc. I agree it's not perfect: > Plus you need to be running an interactive session, or else be willing > to hack up your application to try to get it to inspect the view (and > log the results somewhere) at useful times. and that we likely should address that by *also* allowing to view the memory usage of another process. Which obviously isn't entirely trivial. But some infrastructure likely could be reused. > My own thoughts about improving the debugging situation would've been > to create a way to send a signal to a session to make it dump its > current memory map to the postmaster log (not the client, since the > client is unlikely to be prepared to receive anything extraneous). > But this is nothing like that. That doesn't really work in a large number of environments, I'm afraid. Many many users don't have access to the server log. > If it stays, I'd like to see restrictions on who can read the view. As long as access is grantable rather than needing a security definer wrapper I'd be fine with that. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2020-08-19 11:01:37 -0400, Tom Lane wrote: >> I agree with that, but I think this patch has a bigger problem: >> why bother at all? It seems like a waste of code space and future >> maintenance effort, because there is no use-case. > I don't agree with this at all. I think there's plenty use cases. It's > e.g. very common to try to figure out why the memory usage of a process > is high. Is it memory not returned to the OS? Is it caches that have > grown too much etc. Oh, I agree completely that there are lots of use-cases for finding out what a process' memory map looks like. But this patch fails to address any of them in a usable way. >> My own thoughts about improving the debugging situation would've been >> to create a way to send a signal to a session to make it dump its >> current memory map to the postmaster log (not the client, since the >> client is unlikely to be prepared to receive anything extraneous). > That doesn't really work in a large number of environments, I'm > afraid. Many many users don't have access to the server log. My rationale for that was (a) it can be implemented without a lot of impact on the memory map, and (b) requiring access to the server log eliminates questions about whether you have enough privilege to examine the map. I'm prepared to compromise about (b), but less so about (a). Having to run a SQL query to find this out is a mess. regards, tom lane
Hi, On 2020-08-19 21:29:06 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2020-08-19 11:01:37 -0400, Tom Lane wrote: > >> I agree with that, but I think this patch has a bigger problem: > >> why bother at all? It seems like a waste of code space and future > >> maintenance effort, because there is no use-case. > > > I don't agree with this at all. I think there's plenty use cases. It's > > e.g. very common to try to figure out why the memory usage of a process > > is high. Is it memory not returned to the OS? Is it caches that have > > grown too much etc. > > Oh, I agree completely that there are lots of use-cases for finding > out what a process' memory map looks like. But this patch fails to > address any of them in a usable way. Even just being able to see the memory usage in a queryable way is a huge benefit. The difference over having to parse the log, then parse the memory usage dump, and then aggregate the data in there in a meaningful way is *huge*. We've been slacking around lowering our memory usage, and I think the fact that it's annoying to analyze is a partial reason for that. I totally agree that it's not *enough*, but in contrast to you I think it's a good step. Subsequently we should add a way to get any backends memory usage. It's not too hard to imagine how to serialize it in a way that can be easily deserialized by another backend. I am imagining something like sending a procsignal that triggers (probably at CFR() time) a backend to write its own memory usage into pg_memusage/<pid> or something roughly like that. Greetings, Andres Freund
On 2020/08/20 0:01, Tom Lane wrote: > Hadn't been paying attention to this thread up till now, but ... > > Michael Paquier <michael@paquier.xyz> writes: >> By the way, I was looking at the code that has been committed, and I >> think that it is awkward to have a SQL function in mcxt.c, which is a >> rather low-level interface. I think that this new code should be >> moved to its own file, one suggestion for a location I have being >> src/backend/utils/adt/mcxtfuncs.c. > > I agree with that, but I think this patch has a bigger problem: > why bother at all? It seems like a waste of code space and future > maintenance effort, because there is no use-case. In the situations > where you need to know where the memory went, you are almost never > in a position to leisurely execute a query and send the results over > to your client. This certainly would be useless to figure out why > an already-running query is eating space, for instance. > > The only situation I could imagine where this would have any use is > where there is long-term (cross-query) bloat in, say, CacheMemoryContext Yes, this feature is useful to check a cross-query memory bloat like the bloats of relcache, prepared statements, PL/pgSQL cache, SMgrRelationHash, etc. For example, several years before, my colleague investigated the cause of the memory bloat by using the almost same feature that pg_cheat_funcs extension provides, and then found that the cause was that the application forgot to release lots of SAVEPONT. > --- but it's not even very helpful for that, since you can't examine > anything finer-grain than a memory context. Yes, but even that information can be good hint when investigating the memory bloat. > Plus you need to be > running an interactive session, or else be willing to hack up your > application to try to get it to inspect the view (and log the > results somewhere) at useful times. > > On top of all that, the functionality has Heisenberg problems, > because simply using it changes what you are trying to observe, > in complex and undocumented ways (not that the documentation > would be of any use to non-experts anyway). > > My own thoughts about improving the debugging situation would've been > to create a way to send a signal to a session to make it dump its > current memory map to the postmaster log (not the client, since the > client is unlikely to be prepared to receive anything extraneous). > But this is nothing like that. I agree that we need something like this, i.e., the way to monitor the memory usage of any process even during query running. OTOH, I think that the added feature has a use case and is good as the first step. > Given the lack of clear use-case, and the possibility (admittedly > not strong) that this is still somehow a security hazard, I think > we should revert it. If it stays, I'd like to see restrictions > on who can read the view. For example, allowing only the role with pg_monitor to see this view? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/08/20 10:43, Andres Freund wrote: > Hi, > > On 2020-08-19 21:29:06 -0400, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >>> On 2020-08-19 11:01:37 -0400, Tom Lane wrote: >>>> I agree with that, but I think this patch has a bigger problem: >>>> why bother at all? It seems like a waste of code space and future >>>> maintenance effort, because there is no use-case. >> >>> I don't agree with this at all. I think there's plenty use cases. It's >>> e.g. very common to try to figure out why the memory usage of a process >>> is high. Is it memory not returned to the OS? Is it caches that have >>> grown too much etc. >> >> Oh, I agree completely that there are lots of use-cases for finding >> out what a process' memory map looks like. But this patch fails to >> address any of them in a usable way. > > Even just being able to see the memory usage in a queryable way is a > huge benefit. +1 > The difference over having to parse the log, then parse > the memory usage dump, and then aggregate the data in there in a > meaningful way is *huge*. We've been slacking around lowering our > memory usage, and I think the fact that it's annoying to analyze is a > partial reason for that. Agreed. > I totally agree that it's not *enough*, but in contrast to you I think > it's a good step. Subsequently we should add a way to get any backends > memory usage. > It's not too hard to imagine how to serialize it in a way that can be > easily deserialized by another backend. I am imagining something like > sending a procsignal that triggers (probably at CFR() time) a backend to > write its own memory usage into pg_memusage/<pid> or something roughly > like that. Sounds good. Maybe we can also provide the SQL-callable function or view to read pg_memusage/<pid>, to make the analysis easier. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/08/20 0:01, Tom Lane wrote: > The only situation I could imagine where this would have any use is > where there is long-term (cross-query) bloat in, say, CacheMemoryContext Yeah, in cases where a very large number of sessions are connected to the DB for very long periods of time, the memory consumption of the back-end processes may increase slowly, and such a feature is useful for analysis. And ,as Fujii said, this feature very useful to see which contexts are consuming a lot of memory and to narrow down the causes. On Thu, Aug 20, 2020 at 11:18 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > On 2020/08/20 10:43, Andres Freund wrote: > > Hi, > > Even just being able to see the memory usage in a queryable way is a > > huge benefit. > > +1 +1 I think this feature is very useful in environments where gdb is not available or access to server logs is limited. And it is cumbersome to extract and analyze the memory information from the very large server logs. > > I totally agree that it's not *enough*, but in contrast to you I think > > it's a good step. Subsequently we should add a way to get any backends > > memory usage. > > It's not too hard to imagine how to serialize it in a way that can be > > easily deserialized by another backend. I am imagining something like > > sending a procsignal that triggers (probably at CFR() time) a backend to > > write its own memory usage into pg_memusage/<pid> or something roughly > > like that. > > Sounds good. Maybe we can also provide the SQL-callable function > or view to read pg_memusage/<pid>, to make the analysis easier. +1 Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito _at_ gmail.com
Thanks for all your comments! Thankfully it seems that this feature is regarded as not meaningless one, I'm going to do some improvements. On Wed, Aug 19, 2020 at 10:56 PM Michael Paquier <michael@paquier.xyz> wrote: > On Wed, Aug 19, 2020 at 06:12:02PM +0900, Fujii Masao wrote: >> On 2020/08/19 17:40, torikoshia wrote: >>> Yes, I didn't add regression tests because of the unstability of the >>> output. >>> I thought it would be OK since other views like pg_stat_slru and >>> pg_shmem_allocations >>> didn't have tests for their outputs. >> >> You're right. > > If you can make a test with something minimal and with a stable > output, adding a test is helpful IMO, or how can you make easily sure > that this does not get broken, particularly in the event of future > refactorings, or even with platform-dependent behaviors? OK. Added a regression test on sysviews.sql. (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) Fujii-san gave us an example, but I added more simple one considering the simplicity of other tests on that. On Thu, Aug 20, 2020 at 12:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael@paquier.xyz> writes: > > By the way, I was looking at the code that has been committed, and I > > think that it is awkward to have a SQL function in mcxt.c, which is a > > rather low-level interface. I think that this new code should be > > moved to its own file, one suggestion for a location I have being > > src/backend/utils/adt/mcxtfuncs.c. > > I agree with that, Thanks for pointing out. Added a patch for relocating the codes to mcxtfuncs.c. (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) > On Thu, Aug 20, 2020 at 11:09 AM Fujii Masao > <masao.fujii@oss.nttdata.com> wrote: > On 2020/08/20 0:01, Tom Lane wrote: >> Given the lack of clear use-case, and the possibility (admittedly >> not strong) that this is still somehow a security hazard, I think >> we should revert it. If it stays, I'd like to see restrictions >> on who can read the view. > For example, allowing only the role with pg_monitor to see this view? Attached a patch adding that restriction. (0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch) Of course, this restriction makes pg_backend_memory_contexts hard to use when the user of the target session is not granted pg_monitor because the scope of this view is session local. In this case, I imagine additional operations something like temporarily granting pg_monitor to that user. Thoughts? Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: > OK. Added a regression test on sysviews.sql. > (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) > > Fujii-san gave us an example, but I added more simple one considering > the simplicity of other tests on that. What you have sent in 0001 looks fine to me. A small test is much better than nothing. > Added a patch for relocating the codes to mcxtfuncs.c. > (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) The same code is moved around line-by-line. > Of course, this restriction makes pg_backend_memory_contexts hard to use > when the user of the target session is not granted pg_monitor because the > scope of this view is session local. > > In this case, I imagine additional operations something like temporarily > granting pg_monitor to that user. Hmm. I am not completely sure either that pg_monitor is the best fit here, because this view provides information about a bunch of internal structures. Something that could easily be done though is to revoke the access from public, and then users could just set up GRANT permissions post-initdb, with pg_monitor as one possible choice. This is the safest path by default, and this stuff is of a caliber similar to pg_shmem_allocations in terms of internal contents. It seems to me that you are missing one "REVOKE ALL on pg_backend_memory_contexts FROM PUBLIC" in patch 0003. By the way, if that was just for me, I would remove used_bytes, which is just a computation from the total and free numbers. I'll defer that point to Fujii-san. -- Michael
Attachment
On 2020-08-22 21:18, Michael Paquier wrote: Thanks for reviewing! > On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: >> OK. Added a regression test on sysviews.sql. >> (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) >> >> Fujii-san gave us an example, but I added more simple one considering >> the simplicity of other tests on that. > > What you have sent in 0001 looks fine to me. A small test is much > better than nothing. > >> Added a patch for relocating the codes to mcxtfuncs.c. >> (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) > > The same code is moved around line-by-line. > >> Of course, this restriction makes pg_backend_memory_contexts hard to >> use >> when the user of the target session is not granted pg_monitor because >> the >> scope of this view is session local. >> >> In this case, I imagine additional operations something like >> temporarily >> granting pg_monitor to that user. > > Hmm. I am not completely sure either that pg_monitor is the best fit > here, because this view provides information about a bunch of internal > structures. Something that could easily be done though is to revoke > the access from public, and then users could just set up GRANT > permissions post-initdb, with pg_monitor as one possible choice. This > is the safest path by default, and this stuff is of a caliber similar > to pg_shmem_allocations in terms of internal contents. I think this is a better way than what I did in 0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch. Attached a patch. > > It seems to me that you are missing one "REVOKE ALL on > pg_backend_memory_contexts FROM PUBLIC" in patch 0003. > > By the way, if that was just for me, I would remove used_bytes, which > is just a computation from the total and free numbers. I'll defer > that point to Fujii-san. > -- > Michael On 2020/08/20 2:59, Kasahara Tatsuhito wrote: >>> I totally agree that it's not *enough*, but in contrast to you I >>> think >>> it's a good step. Subsequently we should add a way to get any >>> backends >>> memory usage. >>> It's not too hard to imagine how to serialize it in a way that can be >>> easily deserialized by another backend. I am imagining something like >>> sending a procsignal that triggers (probably at CFR() time) a backend >>> to >>> write its own memory usage into pg_memusage/<pid> or something >>> roughly >>> like that. >> >> Sounds good. Maybe we can also provide the SQL-callable function >> or view to read pg_memusage/<pid>, to make the analysis easier. > +1 I'm thinking about starting a new thread to discuss exposing other backend's memory context. Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/08/24 13:01, torikoshia wrote: > On 2020-08-22 21:18, Michael Paquier wrote: > > Thanks for reviewing! > >> On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: >>> OK. Added a regression test on sysviews.sql. >>> (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) >>> >>> Fujii-san gave us an example, but I added more simple one considering >>> the simplicity of other tests on that. >> >> What you have sent in 0001 looks fine to me. A small test is much >> better than nothing. +1 But as I proposed upthread, what about a bit complicated test as follows, e.g., to confirm that the internal logic for level works expectedly? SELECT name, ident, parent, level, total_bytes >= free_bytes FROM pg_backend_memory_contexts WHERE level = 0; >> >>> Added a patch for relocating the codes to mcxtfuncs.c. >>> (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) Thanks for the patch! Looks good to me. Barring any objection, I will commit this patch at first. >> >> The same code is moved around line-by-line. >> >>> Of course, this restriction makes pg_backend_memory_contexts hard to use >>> when the user of the target session is not granted pg_monitor because the >>> scope of this view is session local. >>> >>> In this case, I imagine additional operations something like temporarily >>> granting pg_monitor to that user. >> >> Hmm. I am not completely sure either that pg_monitor is the best fit >> here, because this view provides information about a bunch of internal >> structures. Something that could easily be done though is to revoke >> the access from public, and then users could just set up GRANT >> permissions post-initdb, with pg_monitor as one possible choice. This >> is the safest path by default, and this stuff is of a caliber similar >> to pg_shmem_allocations in terms of internal contents. > > I think this is a better way than what I did in > 0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch. You mean 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch? > Attached a patch. Thanks for updating the patch! This also looks good to me. >> It seems to me that you are missing one "REVOKE ALL on >> pg_backend_memory_contexts FROM PUBLIC" in patch 0003. >> >> By the way, if that was just for me, I would remove used_bytes, which >> is just a computation from the total and free numbers. I'll defer >> that point to Fujii-san. Yeah, I was just thinking that displaying also used_bytes was useful, but this might be inconsistent with the other views' ways. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/08/24 13:13, Fujii Masao wrote: > > > On 2020/08/24 13:01, torikoshia wrote: >> On 2020-08-22 21:18, Michael Paquier wrote: >> >> Thanks for reviewing! >> >>> On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: >>>> OK. Added a regression test on sysviews.sql. >>>> (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) >>>> >>>> Fujii-san gave us an example, but I added more simple one considering >>>> the simplicity of other tests on that. >>> >>> What you have sent in 0001 looks fine to me. A small test is much >>> better than nothing. > > +1 > > But as I proposed upthread, what about a bit complicated test as follows, > e.g., to confirm that the internal logic for level works expectedly? > > SELECT name, ident, parent, level, total_bytes >= free_bytes FROM pg_backend_memory_contexts WHERE level = 0; > > >>> >>>> Added a patch for relocating the codes to mcxtfuncs.c. >>>> (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) > > Thanks for the patch! Looks good to me. > Barring any objection, I will commit this patch at first. As far as I know, utils/adt is the directory to basically include the files for a particular type or operator. So ISTM that mcxtfuncs.c doesn't fit to this directory. Isn't it better to put that in utils/mmgr ? The copyright line in new file mcxtfuncs.c should be changed as follows because it contains new code? - * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group - * Portions Copyright (c) 1994, Regents of the University of California + * Portions Copyright (c) 2020, PostgreSQL Global Development Group Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Mon, Aug 24, 2020 at 02:48:50PM +0900, Fujii Masao wrote: > As far as I know, utils/adt is the directory to basically include the files > for a particular type or operator. So ISTM that mcxtfuncs.c doesn't > fit to this directory. Isn't it better to put that in utils/mmgr ? We have also stuff like ruleutils.c, dbsize.c, genfile.c there which is rather generic, so I would rather leave utils/mmgr/ out of the business of this thread, and just keep inside all the lower-level APIs for memory context handling. I don't have a strong feeling for one being better than the other, so if you prefer more one way than the other, that's fine by me as long as the split is done as the new functions depend on nothing static in mcxt.c. And you are the committer of this feature. > The copyright line in new file mcxtfuncs.c should be changed as follows > because it contains new code? > - * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group > - * Portions Copyright (c) 1994, Regents of the University of California > + * Portions Copyright (c) 2020, PostgreSQL Global Development Group FWIW, I usually choose what's proposed in the patch as a matter of consistency, because it is a no-brainer and because you don't have to think about past references when it comes to structures or such. -- Michael
Attachment
On 2020-08-24 13:13, Fujii Masao wrote: > On 2020/08/24 13:01, torikoshia wrote: >> On 2020-08-22 21:18, Michael Paquier wrote: >> >> Thanks for reviewing! >> >>> On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: >>>> OK. Added a regression test on sysviews.sql. >>>> (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) >>>> >>>> Fujii-san gave us an example, but I added more simple one >>>> considering >>>> the simplicity of other tests on that. >>> >>> What you have sent in 0001 looks fine to me. A small test is much >>> better than nothing. > > +1 > > But as I proposed upthread, what about a bit complicated test as > follows, > e.g., to confirm that the internal logic for level works expectedly? > > SELECT name, ident, parent, level, total_bytes >= free_bytes FROM > pg_backend_memory_contexts WHERE level = 0; OK! Attached an updated patch. > > >>> >>>> Added a patch for relocating the codes to mcxtfuncs.c. >>>> (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) > > Thanks for the patch! Looks good to me. > Barring any objection, I will commit this patch at first. > > >>> >>> The same code is moved around line-by-line. >>> >>>> Of course, this restriction makes pg_backend_memory_contexts hard to >>>> use >>>> when the user of the target session is not granted pg_monitor >>>> because the >>>> scope of this view is session local. >>>> >>>> In this case, I imagine additional operations something like >>>> temporarily >>>> granting pg_monitor to that user. >>> >>> Hmm. I am not completely sure either that pg_monitor is the best fit >>> here, because this view provides information about a bunch of >>> internal >>> structures. Something that could easily be done though is to revoke >>> the access from public, and then users could just set up GRANT >>> permissions post-initdb, with pg_monitor as one possible choice. >>> This >>> is the safest path by default, and this stuff is of a caliber similar >>> to pg_shmem_allocations in terms of internal contents. >> >> I think this is a better way than what I did in >> 0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch. > > You mean > 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch? Oops, I meant 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch. > > >> Attached a patch. > > Thanks for updating the patch! This also looks good to me. > > >>> It seems to me that you are missing one "REVOKE ALL on >>> pg_backend_memory_contexts FROM PUBLIC" in patch 0003. >>> >>> By the way, if that was just for me, I would remove used_bytes, which >>> is just a computation from the total and free numbers. I'll defer >>> that point to Fujii-san. > > Yeah, I was just thinking that displaying also used_bytes was useful, > but this might be inconsistent with the other views' ways. > > Regards, Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
Attachment
On 2020/08/24 17:09, Michael Paquier wrote: > On Mon, Aug 24, 2020 at 02:48:50PM +0900, Fujii Masao wrote: >> As far as I know, utils/adt is the directory to basically include the files >> for a particular type or operator. So ISTM that mcxtfuncs.c doesn't >> fit to this directory. Isn't it better to put that in utils/mmgr ? > > We have also stuff like ruleutils.c, dbsize.c, genfile.c there which > is rather generic, so I would rather leave utils/mmgr/ out of the > business of this thread, and just keep inside all the lower-level APIs > for memory context handling. Understood. So I will commit the latest patch 0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/08/24 21:56, torikoshia wrote: > On 2020-08-24 13:13, Fujii Masao wrote: >> On 2020/08/24 13:01, torikoshia wrote: >>> On 2020-08-22 21:18, Michael Paquier wrote: >>> >>> Thanks for reviewing! >>> >>>> On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: >>>>> OK. Added a regression test on sysviews.sql. >>>>> (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) >>>>> >>>>> Fujii-san gave us an example, but I added more simple one considering >>>>> the simplicity of other tests on that. >>>> >>>> What you have sent in 0001 looks fine to me. A small test is much >>>> better than nothing. >> >> +1 >> >> But as I proposed upthread, what about a bit complicated test as follows, >> e.g., to confirm that the internal logic for level works expectedly? >> >> SELECT name, ident, parent, level, total_bytes >= free_bytes FROM >> pg_backend_memory_contexts WHERE level = 0; > > OK! > Attached an updated patch. Thanks for updating the patch! Looks good to me. Barring any objection, I will commit this patch. > >> >> >>>> >>>>> Added a patch for relocating the codes to mcxtfuncs.c. >>>>> (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) >> >> Thanks for the patch! Looks good to me. >> Barring any objection, I will commit this patch at first. >> >> >>>> >>>> The same code is moved around line-by-line. >>>> >>>>> Of course, this restriction makes pg_backend_memory_contexts hard to use >>>>> when the user of the target session is not granted pg_monitor because the >>>>> scope of this view is session local. >>>>> >>>>> In this case, I imagine additional operations something like temporarily >>>>> granting pg_monitor to that user. >>>> >>>> Hmm. I am not completely sure either that pg_monitor is the best fit >>>> here, because this view provides information about a bunch of internal >>>> structures. Something that could easily be done though is to revoke >>>> the access from public, and then users could just set up GRANT >>>> permissions post-initdb, with pg_monitor as one possible choice. This >>>> is the safest path by default, and this stuff is of a caliber similar >>>> to pg_shmem_allocations in terms of internal contents. >>> >>> I think this is a better way than what I did in >>> 0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch. >> >> You mean 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch? > > Oops, I meant 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch. This patch also looks good to me. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/08/25 11:39, Fujii Masao wrote: > > > On 2020/08/24 21:56, torikoshia wrote: >> On 2020-08-24 13:13, Fujii Masao wrote: >>> On 2020/08/24 13:01, torikoshia wrote: >>>> On 2020-08-22 21:18, Michael Paquier wrote: >>>> >>>> Thanks for reviewing! >>>> >>>>> On Fri, Aug 21, 2020 at 11:27:06PM +0900, torikoshia wrote: >>>>>> OK. Added a regression test on sysviews.sql. >>>>>> (0001-Added-a-regression-test-for-pg_backend_memory_contex.patch) >>>>>> >>>>>> Fujii-san gave us an example, but I added more simple one considering >>>>>> the simplicity of other tests on that. >>>>> >>>>> What you have sent in 0001 looks fine to me. A small test is much >>>>> better than nothing. >>> >>> +1 >>> >>> But as I proposed upthread, what about a bit complicated test as follows, >>> e.g., to confirm that the internal logic for level works expectedly? >>> >>> SELECT name, ident, parent, level, total_bytes >= free_bytes FROM >>> pg_backend_memory_contexts WHERE level = 0; >> >> OK! >> Attached an updated patch. > > Thanks for updating the patch! Looks good to me. > Barring any objection, I will commit this patch. > >> >>> >>> >>>>> >>>>>> Added a patch for relocating the codes to mcxtfuncs.c. >>>>>> (patches/0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch) >>> >>> Thanks for the patch! Looks good to me. >>> Barring any objection, I will commit this patch at first. >>> >>> >>>>> >>>>> The same code is moved around line-by-line. >>>>> >>>>>> Of course, this restriction makes pg_backend_memory_contexts hard to use >>>>>> when the user of the target session is not granted pg_monitor because the >>>>>> scope of this view is session local. >>>>>> >>>>>> In this case, I imagine additional operations something like temporarily >>>>>> granting pg_monitor to that user. >>>>> >>>>> Hmm. I am not completely sure either that pg_monitor is the best fit >>>>> here, because this view provides information about a bunch of internal >>>>> structures. Something that could easily be done though is to revoke >>>>> the access from public, and then users could just set up GRANT >>>>> permissions post-initdb, with pg_monitor as one possible choice. This >>>>> is the safest path by default, and this stuff is of a caliber similar >>>>> to pg_shmem_allocations in terms of internal contents. >>>> >>>> I think this is a better way than what I did in >>>> 0001-Rellocated-the-codes-for-pg_backend_memory_contexts-.patch. >>> >>> You mean 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch? >> >> Oops, I meant 0001-Restrict-the-access-to-pg_backend_memory_contexts-to.patch. > > This patch also looks good to me. Thanks! I pushed the proposed three patches. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION