Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
Date
Msg-id 3399097.1709501969@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault  (Alexander Lakhin <exclusion@gmail.com>)
Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-bugs
I wrote:
> I find this in [1]:
> 
>   The C language stack growth does an implicit mremap. If you want absolute
>   guarantees and run close to the edge you MUST mmap your stack for the 
>   largest size you think you will need. For typical stack usage this does
>   not matter much but it's a corner case if you really really care
> 
> Seems like we need to do some more work at startup to enforce that
> we have the amount of stack we think we do, if we're on Linux.

After thinking about that some more, I'm really quite unenthused about
trying to remap the stack for ourselves.  It'd be both platform- and
architecture-dependent, and I'm afraid it'd introduce as many failure
modes as it removes.  (Notably, I'm not sure we could guarantee
there's a guard page below the stack.)  Since we've not seen reports
of this failure from the wild, I doubt it's worth the trouble.

I do think it's probably worth reducing MemoryContextDelete's stack
usage to O(1), just to ensure we can't get into stack trouble during
transaction abort.  That's not hard at all, as attached.

I tried to make MemoryContextResetChildren work similarly, but that
doesn't work because if we're not removing child contexts then we
need extra state to tell which ones we've done already.  For the
same reason my idea for bounding the stack space needed by
MemoryContextStats doesn't seem to work.  We could possibly make it
work if we were willing to add a temporary-use pointer field to all
MemoryContext headers, but I'm unconvinced that'd be a good tradeoff.

            regards, tom lane

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 41f2390fb8..7cc220bca0 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -447,14 +447,37 @@ MemoryContextDelete(MemoryContext context)
 void
 MemoryContextDeleteChildren(MemoryContext context)
 {
+    MemoryContext cur,
+                next;
+
     Assert(MemoryContextIsValid(context));
 
     /*
-     * MemoryContextDelete will delink the child from me, so just iterate as
-     * long as there is a child.
+     * We iterate rather than recursing, so that cleaning up a deep nest of
+     * contexts doesn't require unbounded stack space.  (This avoids possible
+     * failure during transaction cleanup, which would be bad.)  This works
+     * because by the time we traverse back up to a parent context, it will
+     * have no remaining children and will be seen as deletable.
      */
-    while (context->firstchild != NULL)
-        MemoryContextDelete(context->firstchild);
+    cur = context->firstchild;
+    while (cur != NULL)
+    {
+        /* Descend until we find a childless context */
+        if (cur->firstchild != NULL)
+        {
+            cur = cur->firstchild;
+            continue;
+        }
+        /* We can delete cur, but first remember the next deletion target */
+        if (cur->nextchild != NULL)
+            next = cur->nextchild;
+        else if (cur->parent != context)
+            next = cur->parent;
+        else
+            next = NULL;
+        MemoryContextDelete(cur);
+        cur = next;
+    }
 }
 
 /*

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
Next
From: Thomas Munro
Date:
Subject: Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker