Re: Parent/child context relation in pg_get_backend_memory_contexts() - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Parent/child context relation in pg_get_backend_memory_contexts()
Date
Msg-id ZTA3umHA9cArtuGv@tamriel.snowman.net
Whole thread Raw
In response to Re: Parent/child context relation in pg_get_backend_memory_contexts()  (Melih Mutlu <m.melihmutlu@gmail.com>)
Responses Re: Parent/child context relation in pg_get_backend_memory_contexts()
List pgsql-hackers
Greetings,

* Melih Mutlu (m.melihmutlu@gmail.com) wrote:
> Melih Mutlu <m.melihmutlu@gmail.com>, 16 Haz 2023 Cum, 17:03 tarihinde şunu
> yazdı:
>
> > With this change, here's a query to find how much space used by each
> > context including its children:
> >
> > > WITH RECURSIVE cte AS (
> > >     SELECT id, total_bytes, id as root, name as root_name
> > >     FROM memory_contexts
> > > UNION ALL
> > >     SELECT r.id, r.total_bytes, cte.root, cte.root_name
> > >     FROM memory_contexts r
> > >     INNER JOIN cte ON r.parent_id = cte.id
> > > ),
> > > memory_contexts AS (
> > >     SELECT * FROM pg_backend_memory_contexts
> > > )
> > > SELECT root as id, root_name as name, sum(total_bytes)
> > > FROM cte
> > > GROUP BY root, root_name
> > > ORDER BY sum DESC;
>
> Given that the above query to get total bytes including all children is
> still a complex one, I decided to add an additional info in
> pg_backend_memory_contexts.
> The new "path" field displays an integer array that consists of ids of all
> parents for the current context. This way it's easier to tell whether a
> context is a child of another context, and we don't need to use recursive
> queries to get this info.

Nice, this does seem quite useful.

> Here how pg_backend_memory_contexts would look like with this patch:
>
> postgres=# SELECT name, id, parent, parent_id, path
> FROM pg_backend_memory_contexts
> ORDER BY total_bytes DESC LIMIT 10;
>           name           | id  |      parent      | parent_id |     path
> -------------------------+-----+------------------+-----------+--------------
>  CacheMemoryContext      |  27 | TopMemoryContext |         0 | {0}
>  Timezones               | 124 | TopMemoryContext |         0 | {0}
>  TopMemoryContext        |   0 |                  |           |
>  MessageContext          |   8 | TopMemoryContext |         0 | {0}
>  WAL record construction | 118 | TopMemoryContext |         0 | {0}
>  ExecutorState           |  18 | PortalContext    |        17 | {0,16,17}
>  TupleSort main          |  19 | ExecutorState    |        18 | {0,16,17,18}
>  TransactionAbortContext |  14 | TopMemoryContext |         0 | {0}
>  smgr relation table     |  10 | TopMemoryContext |         0 | {0}
>  GUC hash table          | 123 | GUCMemoryContext |       122 | {0,122}
> (10 rows)
>
> An example query to calculate the total_bytes including its children for a
> context (say CacheMemoryContext) would look like this:
>
> WITH contexts AS (
> SELECT * FROM pg_backend_memory_contexts
> )
> SELECT sum(total_bytes)
> FROM contexts
> WHERE ARRAY[(SELECT id FROM contexts WHERE name = 'CacheMemoryContext')] <@
> path;

I wonder if we should perhaps just include
"total_bytes_including_children" as another column?  Certainly seems
like a very useful thing that folks would like to see.  We could do that
either with C, or even something as simple as changing the view to do
something like:

WITH contexts AS MATERIALIZED (
  SELECT * FROM pg_get_backend_memory_contexts()
)
SELECT
  *,
  coalesce
  (
    (
      (SELECT sum(total_bytes) FROM contexts WHERE ARRAY[a.id] <@ path)
      + total_bytes
    ),
    total_bytes
  ) AS total_bytes_including_children
FROM contexts a;

> We still need to use cte since ids are not persisted and might change in
> each run of pg_backend_memory_contexts. Materializing the result can
> prevent any inconsistencies due to id change. Also it can be even good for
> performance reasons as well.

I don't think we really want this to be materialized, do we?  Where this
is particularly interesting is when it's being dumped to the log ( ...
though I wish we could do better than that and hope we do in the future)
while something is ongoing in a given backend and if we do that a few
times we are able to see what's changing in terms of allocations,
whereas if we materialized it (when?  transaction start?  first time
it's asked for?) then we'd only ever get the one view from whenever the
snapshot was taken.

> Any thoughts?

Generally +1 from me for working on improving this.

Thanks!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Mikhail Gribkov
Date:
Subject: Avoid race condition for event_triggers regress test
Next
From: Stephen Frost
Date:
Subject: Re: pg_dump needs SELECT privileges on irrelevant extension table