Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder() - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder()
Date
Msg-id 20230821182732.t3qc75i5s5xvovls@awork3.anarazel.de
Whole thread Raw
In response to Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder()  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder()  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-bugs
Hi,

On 2023-08-21 19:07:16 +0530, Amit Kapila wrote:
> On Mon, Aug 21, 2023 at 2:35 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2023-08-18 04:21:53 +0000, Zhijie Hou (Fujitsu) wrote:
> > > From ee1dfccc0306812c356c84bbd7e2558f27d7d348 Mon Sep 17 00:00:00 2001
> > > From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
> > > Date: Thu, 17 Aug 2023 19:29:34 +0800
> > > Subject: [PATCH v4] cleanup decoding context in error cases
> > >
> > > Some of the management functions for logical decoding didn't clean up the
> > > decoding context when an error occurs during decoding. This can
> > > result in accessing unfreed resources and assert failure the next time we call
> > > these functions. Fix it by cleaning up the decoding context and slots in error
> > > cases.
> >
> > I don't think this is the right fix - at all. We shouldn't run arbitrary code
> > after a failure, which we do by calling the shutdown callback.
> >
> 
> But OTOH, it can prevent freeing some global memory like in the case
> of pgoutput_shutdown which frees some memory allocated in
> CacheMemoryContext. Also, as pointed out by Hou-San[1], it can lead to
> unwanted behavior as next time we can access some invalid entries.

To me the code in pgoutput is just bad.  For one, it uses a global variable
for non-global state - instead it should store the data in
LogicalDecodingContext->output_plugin_private. Secondly, it should not use
CacheMemoryContext, given that the cache apparently is local to a individual
pgoutput instance.


But: Is there actually a danger of invalid entries being accessed? It looks to
me like pgoutput uses invalidation callbacks etc, which would prevent the
cache from being stale?


> The other alternatives to fix are (a) Have some Reset* kind of
> function (similar to ResetReindexState) to reset the required
> variables and call at AbortTransaction time.

-1. This makes things global concerns that shouldn't be.

If we really need something to clean this up, I'd look at
MemoryContextRegisterResetCallback().

Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: Andres Freund
Date:
Subject: Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder()
Next
From: Wyatt Alt
Date:
Subject: cache lookup failed dropping public schema with trgm index