Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From John Naylor
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id CANWCAZYG2XWQtEx1L_+bypwL1k1HGprUEBGmhgGWCrTE9YXP_g@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Wed, Jan 17, 2024 at 8:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 17, 2024 at 9:20 AM John Naylor <johncnaylorls@gmail.com> wrote:
> >
> > On Tue, Jan 16, 2024 at 1:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > Just changing "items" to be the local tidstore struct could make the
> > > code tricky a bit, since max_bytes and num_items are on the shared
> > > memory while "items" is a local pointer to the shared tidstore.
> >
> > Thanks for trying it this way! I like the overall simplification but
> > this aspect is not great.
> > Hmm, I wonder if that's a side-effect of the "create" functions doing
> > their own allocations and returning a pointer. Would it be less tricky
> > if the structs were declared where we need them and passed to "init"
> > functions?
>
> Seems worth trying. The current RT_CREATE() API is also convenient as
> other data structure such as simplehash.h and dshash.c supports a
> similar

I don't happen to know if these paths had to solve similar trickiness
with some values being local, and some shared.

> > That may be a good idea for other reasons. It's awkward that the
> > create function is declared like this:
> >
> > #ifdef RT_SHMEM
> > RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx, Size max_bytes,
> > dsa_area *dsa,
> > int tranche_id);
> > #else
> > RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx, Size max_bytes);
> > #endif
> >
> > An init function wouldn't need these parameters: it could look at the
> > passed struct to know what to do.
>
> But the init function would initialize leaf_ctx etc,no? Initializing
> leaf_ctx needs max_bytes that is not stored in RT_RADIX_TREE.

I was more referring to the parameters that were different above
depending on shared memory. My first thought was that the tricky part
is because of the allocation in local memory, but it's certainly
possible I've misunderstood the problem.

> The same
> is true for dsa. I imagined that an init function would allocate a DSA
> memory for the control object.

Yes:

...
//  embedded in VacDeadItems
  TidStore items;
};

// NULL DSA in local case, etc
dead_items->items.area = dead_items_dsa;
dead_items->items.tranche_id = FOO_ID;

TidStoreInit(&dead_items->items, vac_work_mem);

That's how I imagined it would work (leaving out some details). I
haven't tried it, so not sure how much it helps. Maybe it has other
problems, but I'm hoping it's just a matter of programming.

If we can't make this work nicely, I'd be okay with keeping the tid
store control object. My biggest concern is unnecessary
double-locking.



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: introduce dynamic shared memory registry
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Random pg_upgrade test failure on drongo