Re: Fix crash when non-creator being an iteration on shared radix tree - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Fix crash when non-creator being an iteration on shared radix tree
Date
Msg-id CAD21AoDaNy3YuOtL7+a2seyLM-0NKAu6ZB+Csh4Jg+8DCuamwg@mail.gmail.com
In response to Re: Fix crash when non-creator being an iteration on shared radix tree  (John Naylor <johncnaylorls@gmail.com>)
List pgsql-hackers
On Wed, Dec 18, 2024 at 10:32 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Thu, Dec 19, 2024 at 1:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Dec 17, 2024 at 11:12 PM John Naylor <johncnaylorls@gmail.com> wrote:
> > > +1 in general, but I wonder if instead the iter_context should be
> > > created within RT_BEGIN_ITERATE -- I imagine that would have less
> > > duplication and would be as safe, but I haven't tried it. Is there
> > > some reason not to do that?
> >
> > I agree that it has less duplication. There is no strong reason I
> > didn't do that. I just didn't want to check 'if (!tree->iter_context)'
> > in RT_BEGIN_ITERATE for simplicity. I've changed the patch
> > accordingly.
>
> I see what you mean. For v17, a bit of duplication is probably worth
> it for simplicity, so I'd say v1 is fine there.
>
> However, I think on master we should reconsider some aspects of memory
> management more broadly:
>
> 1. The creator allocates the root of the tree in a new child context,
> but an attaching process allocates it in its current context, and we
> pfree it when the caller wants to detach. It seems like we could
> always allocate this small struct in CurrentMemoryContext for
> consistency.
>
> 2. The iter_context is separate because the creator's new context
> could be a bump context which doesn't support pfree. But above we
> assume we can pfree in the caller's context. Also, IIUC we only
> allocate small iter objects, and it'd be unusual to need more than one
> at a time per backend, so it's a bit strange to have an entire context
> for that. Since we use a standard pattern of "begin; while(iter);
> end;", it seems unlikely that someone will cause a leak because of a
> coding mistake in iteration.
>
> If these tiny admin structs were always, not sometimes, in the caller's
> current context, I think it would be easier to reason about because
> then the creator's passed context would be used only for local memory,
> specifically only for leaves and the inner node child contexts.
> Thoughts?

Fair points. Given that we need only one iterator at a time per
backend, it would be simpler if the caller passed RT_BEGIN_ITERATE() a
pointer to an iterator that is a stack variable. For example,
TidStoreBeginIterate() would look like:

if (TidStoreIsShared(ts))
    shared_ts_begin_iterate(ts->tree.shared, &iter->tree_iter.shared);
else
    local_ts_begin_iterate(ts->tree.local, &iter->tree_iter.local);
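
To illustrate the caller-provided iterator idea in isolation, here is a
minimal standalone sketch. It does not use the real radix tree code: the
toy_* names and the sorted-array "tree" are hypothetical stand-ins, chosen
only to show that "begin" can initialize storage owned by the caller
instead of allocating in some memory context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a radix tree: a sorted array of keys. */
typedef struct toy_tree
{
    const uint64_t *keys;
    size_t      nkeys;
} toy_tree;

/* Iterator state lives wherever the caller puts it, e.g. on the stack. */
typedef struct toy_iter
{
    const toy_tree *tree;
    size_t      pos;
} toy_iter;

/* Initialize caller-provided storage; no allocation happens here. */
static void
toy_begin_iterate(const toy_tree *tree, toy_iter *iter)
{
    assert(tree != NULL && iter != NULL);
    iter->tree = tree;
    iter->pos = 0;
}

/* Return true and set *key while keys remain, false when exhausted. */
static bool
toy_iterate_next(toy_iter *iter, uint64_t *key)
{
    if (iter->pos >= iter->tree->nkeys)
        return false;
    *key = iter->tree->keys[iter->pos++];
    return true;
}

/* Sum all keys using the "begin; while (iter); end;" pattern. */
static uint64_t
toy_sum_keys(const toy_tree *tree)
{
    toy_iter    iter;           /* stack variable: nothing to free later */
    uint64_t    key;
    uint64_t    sum = 0;

    toy_begin_iterate(tree, &iter);
    while (toy_iterate_next(&iter, &key))
        sum += key;
    return sum;
}
```

With this shape there is no iterator context to create lazily, so the
'if (!tree->iter_context)' check discussed above would not be needed, and
"end" has nothing to free.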

>
> Further,
>
> 3. I was never a fan of trying to second-guess the creator's new
> context and instead use slab for fixed-sized leaf allocations. If the
> creator passes a bump context, we say "no, no, no, use slab -- it's
> good for ya". Let's assume the caller knows what they're doing.

That's a valid argument, but how would a user get a slab context for
leaf allocations? If the caller passes an allocset context to
RT_CREATE(), it still makes sense to use a slab context for leaf
allocations in order to avoid wasting space.
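
As a rough illustration of the space-wasting concern: allocset rounds
small requests up to a power-of-two chunk size, whereas slab hands out
chunks of exactly one fixed size. The sketch below is a simplified model
of that rounding only. It ignores chunk headers and block overhead, the
function names are hypothetical, and the 8-byte minimum is an assumption
of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model: round a request up to the next power of two, with an
 * assumed minimum chunk size of 8 bytes.  Per-chunk headers are ignored.
 */
static size_t
pow2_chunk_size(size_t request)
{
    size_t      size = 8;

    assert(request <= SIZE_MAX / 2);    /* keep the shift from overflowing */
    while (size < request)
        size <<= 1;
    return size;
}

/* Bytes lost to rounding for one allocation of the given size. */
static size_t
pow2_waste(size_t request)
{
    return pow2_chunk_size(request) - request;
}
```

Under this model a 40-byte leaf occupies a 64-byte chunk, so 24 bytes per
leaf are lost to rounding, while a fixed-size allocator sized for 40 bytes
would lose none.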

> 4. For local memory, an allocated "control object" serves no real
> purpose and wastes a few cycles on every access. I'm not sure it
> matters that much as a future micro-optimization, but I mention it
> here because if we did start allocating outer structs in the caller's
> context, embedding would also remove the need to pfree it.

Using an allocated "control object" keeps the code for local and shared
trees the same. We cannot embed the control object into RT_RADIX_TREE in
the shared case. I agree with embedding the control data if we can
implement that cleanly.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


