RE: Memory leak in WAL sender with pgoutput (v10~) - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Memory leak in WAL sender with pgoutput (v10~)
Date
Msg-id OS0PR01MB57166E9B00F42713B0CDB756943E2@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: Memory leak in WAL sender with pgoutput (v10~)  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Memory leak in WAL sender with pgoutput (v10~)
List pgsql-hackers
On Wednesday, December 11, 2024 2:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> On Tue, Dec 10, 2024 at 1:16 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 10, 2024 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Dec 10, 2024 at 8:54 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Tue, 10 Dec 2024 at 04:56, Michael Paquier <michael@paquier.xyz> wrote:
> > > > >
> > > > > On Mon, Dec 09, 2024 at 03:36:15PM +0530, Amit Kapila wrote:
> > > > > > > It couldn't solve the problem completely even in back-branches. The
> > > > > > > SQL API case I mentioned and tested by Hou-San in the email [1] won't
> > > > > > > be solved.
> > > > > > >
> > > > > > > [1] - https://www.postgresql.org/message-id/OS0PR01MB57166A4DA0ABBB94F2FBB28694362%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> > > > >
> > > > > Yeah, exactly (wanted to reply exactly that yesterday but lacked time,
> > > > > thanks!).
> > > >
> > > > Yes, that makes sense. How about something like the attached patch.
> > > >
> > >
> > > > -    oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> > > > -    if (data->publications)
> > > > -    {
> > > > -        list_free_deep(data->publications);
> > > > -        data->publications = NIL;
> > > > -    }
> > > > +    static MemoryContext pubctx = NULL;
> > > > +
> > > > +    if (pubctx == NULL)
> > > > +        pubctx = AllocSetContextCreate(CacheMemoryContext,
> > > > +                                       "logical replication publication list context",
> > > > +                                       ALLOCSET_SMALL_SIZES);
> > > > +    else
> > > > +        MemoryContextReset(pubctx);
> > > > +
> > > > +    oldctx = MemoryContextSwitchTo(pubctx);
> > >
> > > Considering the SQL API case, why is it okay to allocate this context
> > > under CacheMemoryContext?
> > >
> >
> > On further thinking, we can't allocate it under
> > LogicalDecodingContext->context because once that is freed at the end
> > of SQL API pg_logical_slot_get_changes(), pubctx will be pointing to a
> > dangling memory. One idea is that we use
> > MemoryContextRegisterResetCallback() to invoke a reset callback
> > function where we can reset pubctx but not sure if we want to go there
> > in back branches. OTOH, the currently proposed fix won't leak memory
> > on repeated calls to pg_logical_slot_get_changes(), so that might be
> > okay as well.
> >
> > Thoughts?
> 
> Alternative idea is to declare pubctx as a file static variable. And
> we create the memory context under LogicalDecodingContext->context in
> the startup callback and free it in the shutdown callback.

I think that when an ERROR occurs while a pg_logical_slot_xx() API function is
executing, the shutdown callback is not invoked. The static variable would then
not be reset and would be left pointing to freed memory, which, I think, is why
Amit mentioned the use of MemoryContextRegisterResetCallback().
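
To make the concern concrete, here is a small standalone toy model in C. It is
not PostgreSQL code: ToyContext, toy_startup(), toy_shutdown() and
toy_slot_get_changes() are hypothetical stand-ins for a memory context, the
output plugin's startup and shutdown callbacks, and the SQL API call. It also
assumes that the decoding context is still released during error cleanup even
though the shutdown callback is skipped. Under those assumptions it shows a
file-level static being left dangling on the error path, and how a reset
callback registered on the child context could clear it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy callback list entry, loosely modeled on a context reset callback. */
typedef void (*ToyCallbackFn) (void *arg);

typedef struct ToyCallback
{
    ToyCallbackFn func;
    void       *arg;
    struct ToyCallback *next;
} ToyCallback;

/* Toy memory context: tracks liveness, one child, and reset callbacks. */
typedef struct ToyContext
{
    bool        alive;
    struct ToyContext *child;   /* deleted together with this context */
    ToyCallback *callbacks;
} ToyContext;

static ToyContext decoding_storage; /* stands in for the decoding context */
static ToyContext pubctx_storage;   /* stands in for the publication context */
static ToyContext *pubctx = NULL;   /* the file-level static under discussion */
static ToyCallback pubctx_cb;

static void
toy_context_init(ToyContext *cxt, ToyContext *parent)
{
    cxt->alive = true;
    cxt->child = NULL;
    cxt->callbacks = NULL;
    if (parent != NULL)
        parent->child = cxt;
}

/* Deleting a context deletes its live child and runs its callbacks. */
static void
toy_context_delete(ToyContext *cxt)
{
    ToyCallback *cb = cxt->callbacks;

    if (cxt->child != NULL && cxt->child->alive)
        toy_context_delete(cxt->child);
    while (cb != NULL)
    {
        ToyCallback *next = cb->next;

        cb->func(cb->arg);
        cb = next;
    }
    cxt->alive = false;
}

/* Reset callback: forget the static pointer when its context goes away. */
static void
clear_pubctx(void *arg)
{
    (void) arg;
    pubctx = NULL;
}

/* Startup: create pubctx under the decoding context. */
static void
toy_startup(ToyContext *decoding_cxt, bool use_reset_callback)
{
    toy_context_init(&pubctx_storage, decoding_cxt);
    pubctx = &pubctx_storage;
    if (use_reset_callback)
    {
        pubctx_cb.func = clear_pubctx;
        pubctx_cb.arg = NULL;
        pubctx_cb.next = pubctx->callbacks;
        pubctx->callbacks = &pubctx_cb;
    }
}

/* Shutdown: free pubctx and reset the static explicitly. */
static void
toy_shutdown(void)
{
    toy_context_delete(&pubctx_storage);
    pubctx = NULL;
}

/* Returns true if the static was left pointing at a dead context. */
static bool
toy_slot_get_changes(bool raise_error, bool use_reset_callback)
{
    bool        dangling;

    toy_context_init(&decoding_storage, NULL);
    toy_startup(&decoding_storage, use_reset_callback);
    if (!raise_error)
        toy_shutdown();         /* skipped when an ERROR is raised */
    toy_context_delete(&decoding_storage);  /* assumed to run on both paths */

    dangling = (pubctx != NULL && !pubctx->alive);
    pubctx = NULL;              /* reset toy state for the next call */
    return dangling;
}
```

In this model, toy_slot_get_changes(true, false) reports a dangling pointer,
while toy_slot_get_changes(true, true) does not, because the callback fires
when the child context is deleted along with its parent. Whether that extra
machinery is acceptable in back branches is a separate question.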

Best Regards,
Hou zj
