Re: Memory context can be its own parent and child in replication command - Mailing list pgsql-hackers
From | Anthonin Bonnefoy |
---|---|
Subject | Re: Memory context can be its own parent and child in replication command |
Date | |
Msg-id | CAO6_XqqQxCW7xJvBc=obbrWVVG3DWYJrUg79=YaOmX34XsEmVQ@mail.gmail.com Whole thread Raw |
In response to | Re: Memory context can be its own parent and child in replication command (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Memory context can be its own parent and child in replication command
|
List | pgsql-hackers |
On Tue, Mar 11, 2025 at 1:09 AM Michael Paquier <michael@paquier.xyz> wrote: > It seems to me that you mean 1afe31f03cd2, no? Yes, that was a bad copy/paste from me. On Tue, Mar 11, 2025 at 2:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I dunno about this patch: it seems to me it's doing things exactly > backwards, namely trying to restore the CurrentMemoryContext after > a transaction abort to what it was just beforehand. With only > a slightly different set of suppositions, this is introducing > use of a deleted context rather than removing it. Yeah, I'm not super happy about the fix either. This leaves the issue that CurrentMemoryContext is set to a deleted context, even if temporarily. > I'm inclined to suspect that the real problem is somebody installed > the wrong context as current just before the transaction was started. > I don't have the energy to look closer right now, though. The context installed is the replication command context created at the beginning of exec_replication_command. This context is deleted at the end of the function while the transaction is still running. Maybe a better fix would be to not delete the Replication command context when a transaction was started to handle the export, and switch back to this context at the start of the next exec_replication_command? This way, the memory context referenced by priorContext would still be valid when the transaction is aborted. This would also require the Replication command context to not be attached under the MessageContext to avoid being deleted at the end of the query cycle. > Independently of where the actual bug is there, it seems not > nice that it's so easy to bollix the memory context data > structures. I wonder if we can make that more detectable > at reasonable cost. One thing that made it hard to trace the issue was that there's no way to know if a MemoryContext was deleted or not. Setting the MemoryContext's node type to T_Invalid during reset could be a way to tag the context as deleted. And this will be able to leverage the existing MemoryContextIsValid check and additional MemoryContextIsValid checks could be added before restoring the priorContext. --- a/src/backend/utils/mmgr/mcxt.c +++ b/src/backend/utils/mmgr/mcxt.c @@ -527,6 +527,7 @@ MemoryContextDeleteOnly(MemoryContext context) context->methods->delete_context(context); + context->type = T_Invalid; VALGRIND_DESTROY_MEMPOOL(context); } However, when testing this on my mac, it seems to trigger a heap corruption during initdb. postgres(9057,0x200690840) malloc: Heap corruption detected, free list is damaged at 0x60000322bb10 *** Incorrect guard value: 276707563012096 postgres(9057,0x200690840) malloc: *** set a breakpoint in malloc_error_break to debug While it ran fine on linux. I didn't have time to check why yet.
pgsql-hackers by date: