Re: Enhancing Memory Context Statistics Reporting - Mailing list pgsql-hackers

From torikoshia
Subject Re: Enhancing Memory Context Statistics Reporting
Msg-id ef750e59c0da6c8cfb8a39e1b7a788be@oss.nttdata.com
In response to Re: Enhancing Memory Context Statistics Reporting  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers
On 2025-08-20 06:42, Rahila Syed wrote:
> PFA the fix.

Thanks for updating the patch!

Specifying a very small timeout value (such as 0 or 0.0001) and 
repeatedly executing the function seems to cause unexpected behavior. In 
some cases, it even leads to a crash.

For example:

   (session1)=# select pg_backend_pid();
    pg_backend_pid
   ----------------
             50917

   (session2)=# select pg_get_process_memory_contexts(50917, true, 0.0001);
    pg_get_process_memory_contexts
   --------------------------------
   (0 rows)

   (session2)=# \watch 0.01

    pg_get_process_memory_contexts
   --------------------------------
   (,,???,,0,0,0,0,0,0,0)
   ...
   (21 rows)

   (session2)=# \watch 0.01

    pg_get_process_memory_contexts
   --------------------------------
   (0 rows)

   ...

   server closed the connection unexpectedly
          This probably means the server terminated abnormally
          before or while processing the request.
   The connection to the server was lost. Attempting reset: Failed.


This issue occurs on my M1 Mac, but I couldn’t reproduce it on Ubuntu, 
so it might be environment-dependent.


Looking at the logs, Assert() is failing:

   2025-10-07 08:48:26.766 JST [local] psql [23626] WARNING:  01000: server process 23646 is processing previous request
   2025-10-07 08:48:26.766 JST [local] psql [23626] LOCATION:  pg_get_process_memory_contexts, mcxtfuncs.c:476
   TRAP: failed Assert("victim->magic == FREE_PAGE_SPAN_LEADER_MAGIC"), File: "freepage.c", Line: 1379, PID: 23626
   0   postgres                            0x000000010357fdf4 ExceptionalCondition + 216
   1   postgres                            0x00000001035cbe18 FreePageManagerGetInternal + 684
   2   postgres                            0x00000001035cbb18 FreePageManagerGet + 40
   3   postgres                            0x00000001035c84cc dsa_allocate_extended + 788
   4   postgres                            0x0000000103453af0 pg_get_process_memory_contexts + 992
   5   postgres                            0x0000000103007e94 ExecMakeFunctionResultSet + 616
   6   postgres                            0x00000001030506b8 ExecProjectSRF + 304
   7   postgres                            0x0000000103050434 ExecProjectSet + 268
   8   postgres                            0x0000000103003270 ExecProcNodeFirst + 92
   9   postgres                            0x0000000102ffa398 ExecProcNode + 60
   10  postgres                            0x0000000102ff5050 ExecutePlan + 244
   11  postgres                            0x0000000102ff4ee0 standard_ExecutorRun + 456
   12  postgres                            0x0000000102ff4d08 ExecutorRun + 84
   13  postgres                            0x0000000103341c84 PortalRunSelect + 296
   14  postgres                            0x0000000103341694 PortalRun + 656
   15  postgres                            0x000000010333c4bc exec_simple_query + 1388
   16  postgres                            0x000000010333b5d0 PostgresMain + 3252
   17  postgres                            0x0000000103332750 BackendInitialize + 0
   18  postgres                            0x0000000103209e48 postmaster_child_launch + 456
   19  postgres                            0x00000001032118c8 BackendStartup + 304
   20  postgres                            0x000000010320f72c ServerLoop + 372
   21  postgres                            0x000000010320e1e4 PostmasterMain + 6448
   22  postgres                            0x0000000103094b0c main + 924
   23  dyld                                0x0000000199dc2b98 start + 6076


Could you please check whether you can reproduce this crash in your 
environment?


And a few minor comments on the patch itself:

> +        <parameter>stats_timestamp</parameter> <type>timestamptz</type> )

As discussed earlier, I believe we decided to remove stats_timestamp,
but it seems it’s still mentioned here.


> + * Update timestamp and signal all the waiting client backends after copying
> + * all the statistics.
> + */
> +static void
> +end_memorycontext_reporting(MemoryStatsDSHashEntry *entry, MemoryContext oldcontext, HTAB *context_id_lookup)

Should “Update timestamp” in this comment also be removed for 
consistency?


The column order differs slightly from pg_backend_memory_contexts: path 
comes before level here, whereas the view lists level before path.
If there’s no strong reason for the difference, aligning the order 
would improve consistency:

   =# select * from pg_get_process_memory_contexts(pg_backend_pid(), true, 1) ;
   name             | TopMemoryContext
   ident            | [NULL]
   type             | AllocSet
   path             | {1}
   level            | 1
   total_bytes      | 222400

   =# select * from pg_backend_memory_contexts;
   name          | TopMemoryContext
   ident         | [NULL]
   type          | AllocSet
   level         | 1
   path          | {1}
   total_bytes   | 99232
   ...
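
To make the difference concrete, here is a small illustrative sketch in 
Python. The two lists are copied by hand from the excerpts above and cover 
only the columns shown there; order_mismatches is a hypothetical helper, 
not something in the patch.

```python
# Column orders as they appear in the two excerpts above.
# Assumption: only the columns visible in both excerpts are compared;
# any columns elided from the output are ignored.
proc_cols = ["name", "ident", "type", "path", "level", "total_bytes"]
view_cols = ["name", "ident", "type", "level", "path", "total_bytes"]


def order_mismatches(a, b):
    """Return (position, col_in_a, col_in_b) wherever the two orders differ.

    Positions are 1-based to match how columns are usually counted.
    """
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


mismatches = order_mismatches(proc_cols, view_cols)
for pos, func_col, view_col in mismatches:
    print(f"column {pos}: function has {func_col}, view has {view_col}")
```

For these six columns the only mismatches are at positions 4 and 5, so 
swapping path and level in the function's output would be enough to match 
the view as far as this excerpt shows.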


Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.


