Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From John Naylor
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id CAFBsxsH=99oJ81HTF7JbqimA2WZSBGu=dq07zkfUsvikzOJ+hw@mail.gmail.com
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On Thu, Feb 16, 2023 at 11:44 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-02-16 16:22:56 +0700, John Naylor wrote:
> > On Thu, Feb 16, 2023 at 10:24 AM Masahiko Sawada <sawada.mshk@gmail.com>
> > > Right. TidStore is implemented not only for heap, so loading
> > > out-of-order TIDs might be important in the future.
> >
> > That's what I was probably thinking about some weeks ago, but I'm having a
> > hard time imagining how it would come up, even for something like the
> > conveyor-belt concept.
>
> We really ought to replace the tid bitmap used for bitmap heap scans. The
> hashtable we use is a pretty awful data structure for it. And that's not
> filled in-order, for example.

I took a brief look at that and agree we should at some point make it work there as well.

v28 tidstore_add_tids() appears to assume that it's only called once per blocknumber. While the order of offsets doesn't matter there for a single block, calling it again with the same block would wipe out the earlier offsets, IIUC. To do an actual "add tid" where the order doesn't matter, it seems we would need to (acquire lock if needed), read the current bitmap for that block if one exists, OR in the new bit, then write it back out.

That sounds slow, so it might still be good for vacuum to call a function that passes a block and an array of offsets that are assumed ordered (as in v28), but with a more accurate name, like tidstore_set_block_offsets().
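
To make the distinction concrete, here is a toy sketch of the two semantics. It is not code from the patch: the ToyTidStore struct, the toy_* function names, the fixed capacity, and the one-64-bit-word-per-block bitmap are all assumptions made up for illustration, and the real tidstore sits on a radix tree with its own locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_BLOCKS	16		/* hypothetical fixed capacity */
#define TOY_MAX_OFFSETS	64		/* one 64-bit bitmap word per block */

/* Hypothetical stand-in for the store: one offset bitmap per block. */
typedef struct ToyTidStore
{
	bool		used[TOY_MAX_BLOCKS];
	uint64_t	bitmap[TOY_MAX_BLOCKS];
} ToyTidStore;

static void
toy_init(ToyTidStore *ts)
{
	memset(ts, 0, sizeof(*ts));
}

/*
 * "Set" semantics: build the bitmap from the given offsets and overwrite
 * whatever was stored for the block. Offsets are assumed to be ascending,
 * as vacuum would supply them.
 */
static void
toy_set_block_offsets(ToyTidStore *ts, uint32_t blk,
					  const uint16_t *offsets, int noffsets)
{
	uint64_t	bits = 0;

	assert(blk < TOY_MAX_BLOCKS);
	for (int i = 0; i < noffsets; i++)
	{
		assert(offsets[i] < TOY_MAX_OFFSETS);
		assert(i == 0 || offsets[i] > offsets[i - 1]);
		bits |= UINT64_C(1) << offsets[i];
	}
	ts->bitmap[blk] = bits;		/* earlier contents are discarded */
	ts->used[blk] = true;
}

/*
 * "Add" semantics: read the current bitmap if the block already has one,
 * OR in the new bit, and write the result back. A shared store would need
 * to hold its lock around this read-modify-write.
 */
static void
toy_add_tid(ToyTidStore *ts, uint32_t blk, uint16_t off)
{
	uint64_t	bits;

	assert(blk < TOY_MAX_BLOCKS);
	assert(off < TOY_MAX_OFFSETS);
	bits = ts->used[blk] ? ts->bitmap[blk] : 0;
	bits |= UINT64_C(1) << off;
	ts->bitmap[blk] = bits;
	ts->used[blk] = true;
}

static bool
toy_is_member(const ToyTidStore *ts, uint32_t blk, uint16_t off)
{
	assert(blk < TOY_MAX_BLOCKS);
	assert(off < TOY_MAX_OFFSETS);
	return ts->used[blk] && ((ts->bitmap[blk] >> off) & 1) != 0;
}
```

In this sketch, calling toy_set_block_offsets() twice for the same block leaves only the offsets from the second call, while toy_add_tid() keeps whatever was already there at the cost of a lookup before every write.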

--
