Thread: Do we still need parent column in pg_backend_memory_context?

Do we still need parent column in pg_backend_memory_context?

From
Melih Mutlu
Date:
Hi hackers,

After the patch [1] that adds a path column to pg_backend_memory_context, the parent context can also be found in the path array. Since there are currently two ways to retrieve information related to the parent of a context, I wonder whether we still want to keep the parent column.

The path column represents the path from TopMemoryContext to the current memory context. There is always "level" number of elements in a path array for any memory context. The first element in the array is TopMemoryContext, and the last element (path[level]) is the current memory context. The path[level-1] element will simply show us the parent context ID. 

I understand that having the parent name instead of the transient parent context ID can be easier to use in some cases. While I suspect that the memory contexts most users are interested in are close to TopMemoryContext—which means their context IDs are much less likely to change with each execution—it's still not guaranteed.

I'm also unsure how common it is to use or rely on the parent column. I quickly searched here [2] to see how pg_backend_memory_context is used. There are a few places where the parent column is used in extensions. I believe these places should be easy to update if we decide to remove the parent column.

Attached is a patch to remove parent from the view.



Regards,
--
Melih Mutlu
Microsoft
Attachment

Re: Do we still need parent column in pg_backend_memory_context?

From
David Rowley
Date:
On Wed, 31 Jul 2024 at 05:19, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> After the patch [1] that adds a path column to pg_backend_memory_context, the parent context can also be found in the
patharray. Since there are currently two ways to retrieve information related to the parent of a context, I wonder
whetherwe still want to keep the parent column. 

My vote is to remove it.

I think the parent column is only maybe useful as a rough visual
indication of what the parent is.  It's dangerous to assume using it
is a reliable way to write a recursive query:

with recursive contexts as (
  select name, ident, level, path, parent from pg_backend_memory_contexts
),
c as (
  select path[level] as context_id, NULL::int as parent_id,* from
contexts where parent is null
  union all
  select c1.path[c1.level], c.context_id,c1.* from contexts c1 inner
join c on c.name = c1.parent
)
select count(*) as all_including_false_dups, count(distinct
context_id) as unique from c;

 all_including_false_dups | unique
--------------------------+--------
                      159 |    150

So, with the backend in the state I had it in during this query, the
recursive query shows 9 additional contexts because the recursive
query joining parent to name found a false parent with a name matching
the actual parent because the names are not unique. Given that I
didn't do anything special to create contexts with duplicate names, it
seems duplicates are not rare.

select name,count(*) from pg_backend_memory_contexts group by 1 order
by 2 desc limit 3;
    name     | count
-------------+-------
 index info  |    94
 dynahash    |    15
 ExprContext |     7
(3 rows)

I think the first two of the above won't have any children, but the
ExprContext ones can.

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Wed, 31 Jul 2024 at 05:19, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
>> After the patch [1] that adds a path column to pg_backend_memory_context, the parent context can also be found in
thepath array. Since there are currently two ways to retrieve information related to the parent of a context, I wonder
whetherwe still want to keep the parent column. 

> My vote is to remove it.

While it's certainly somewhat redundant now, removing it would break
any application queries that are using the column.  Simply adding
a column in a system view is a much easier sell than replacing or
removing one.

Perhaps you can make an argument that nobody would be depending
on that column, but I fear that's wishful thinking.  Or maybe you
can argue that any query using it is already broken --- but I
think that's only true if someone tries to do the specific sort
of recursive traversal that you illustrated.

            regards, tom lane



Re: Do we still need parent column in pg_backend_memory_context?

From
David Rowley
Date:
On Wed, 31 Jul 2024 at 12:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Perhaps you can make an argument that nobody would be depending
> on that column, but I fear that's wishful thinking.  Or maybe you
> can argue that any query using it is already broken --- but I
> think that's only true if someone tries to do the specific sort
> of recursive traversal that you illustrated.

It's true that people could be using it, I certainly don't dispute
that.  It's just we don't have any rule that we can't do this sort of
thing. Take f66e8bf87, for example. It removed relhaspkey from
pg_class.  It's true that doing that did upset at least one person
[1], but our response to that complaint was to reiterate that the
example query was broken.

I feel the bar is a bit lower for pg_backend_memory_contexts as it was
only added in v14, so it's not been around as long as pg_class had
been around in 2018 when we removed relhaspkey.  My concern here is
that the longer we leave the parent column in, the higher the bar gets
to remove it.  That's why I feel like it is worth considering this
now.

One thing we could do is remove it and see if anyone complains.  If we
did that today, there's about a year-long window for people to
complain where we could still reverse the decision.  Now is probably
the best time where we can consider this so I'd be sad if this
discussion ended on "someone might be using it.".

David

[1] https://postgr.es/m/CANu8Fiy2RZL+uVnnrzaCTJxMgcKBDOnAR7bDx3n0P=KycbSNhA@mail.gmail.com



David Rowley <dgrowleyml@gmail.com> writes:
> I feel the bar is a bit lower for pg_backend_memory_contexts as it was
> only added in v14, so it's not been around as long as pg_class had
> been around in 2018 when we removed relhaspkey.

Yeah, and also it's very much a developer-focused view with a limited
audience.  It's certainly possible that we could remove the column
and nobody would complain.  I just wanted to point out that there is
a compatibility worry here.

> One thing we could do is remove it and see if anyone complains.  If we
> did that today, there's about a year-long window for people to
> complain where we could still reverse the decision.

Seems like a plausible compromise.

            regards, tom lane



Re: Do we still need parent column in pg_backend_memory_context?

From
David Rowley
Date:
On Wed, 31 Jul 2024 at 13:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
> > One thing we could do is remove it and see if anyone complains.  If we
> > did that today, there's about a year-long window for people to
> > complain where we could still reverse the decision.
>
> Seems like a plausible compromise.

Does anyone object to making this happen? i.e. remove
pg_backend_memory_contexts.parent column and see if anyone complains?

If nobody comes up with any reasons against it, then I propose making
this happen.

David



Re: Do we still need parent column in pg_backend_memory_context?

From
David Rowley
Date:
On Tue, 6 Aug 2024 at 17:48, David Rowley <dgrowleyml@gmail.com> wrote:
> Does anyone object to making this happen? i.e. remove
> pg_backend_memory_contexts.parent column and see if anyone complains?
>
> If nobody comes up with any reasons against it, then I propose making
> this happen.

I made a few adjustments and pushed the patch. Let's see if anyone complains.

David



Re: Do we still need parent column in pg_backend_memory_context?

From
Melih Mutlu
Date:
David Rowley <dgrowleyml@gmail.com>, 12 Ağu 2024 Pzt, 06:44 tarihinde şunu yazdı:
I made a few adjustments and pushed the patch. Let's see if anyone complains.

Thanks David.