Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id CAD21AoCRjnVxsKuyL8QUt8x=qz=MmA2Q-GyOwoCO5KM=HevvWQ@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <john.naylor@enterprisedb.com>)
List pgsql-hackers
On Fri, Apr 7, 2023 at 6:55 PM John Naylor <john.naylor@enterprisedb.com> wrote:
>
> On Thu, Feb 16, 2023 at 11:44 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > We really ought to replace the tid bitmap used for bitmap heap scans. The
> > hashtable we use is a pretty awful data structure for it. And that's not
> > filled in-order, for example.
>
> I spent some time studying tidbitmap.c, and not only does it make sense to use a radix tree there, but since it has
morecomplex behavior and stricter runtime requirements, it should really be the thing driving the design and tradeoffs,
notvacuum: 
>
> - With lazy expansion and single-value leaves, the root of a radix tree can point to a single leaf. That might get
ridof the need to track TBMStatus, since setting a single-leaf tree should be cheap. 
>

Instead of introducing single-value leaves to the radix tree as
another structure, can we store pointers to PagetableEntry as values?

> - Fixed-size PagetableEntry's are pretty large, but the tid compression scheme used in this thread (in addition to
beingcomplex) is not a great fit for tidbitmap because it makes it more difficult to track per-block metadata (see also
nextpoint). With the "combined pointer-value slots" technique, if a page's max tid offset is 63 or less, the offsets
canbe stored directly in the pointer for the exact case. The lowest bit can tag to indicate a pointer to a single-value
leaf.That would complicate operations like union/intersection and tracking "needs recheck", but it would reduce memory
useand node-traversal in common cases. 
>
> - Managing lossy storage. With pure blocknumber keys, replacing exact storage for a range of 256 pages amounts to
replacinga last-level node with a single leaf containing one lossy PagetableEntry. The leader could iterate over the
nodes,and rank the last-level nodes by how much storage they (possibly with leaf children) are using, and come up with
anoptimal lossy-conversion plan. 
>
> The above would address the points (not including better iteration and parallel bitmap index scans) raised in
>
> https://www.postgresql.org/message-id/CAPsAnrn5yWsoWs8GhqwbwAJx1SeLxLntV54Biq0Z-J_E86Fnng@mail.gmail.com
>
> Ironically, by targeting a more difficult use case, it's easier since there is less freedom. There are many ways to
beata binary search, but fewer good ways to improve bitmap heap scan. I'd like to put aside vacuum for some time and
trykilling two birds with one stone, building upon our work thus far. 
>
> Note: I've moved the CF entry to the next CF, and set to waiting on author for now. Since no action is currently
requiredfrom Masahiko, I've added myself as author as well. If tackling bitmap heap scan shows promise, we could RWF
andresurrect at a later time. 

Thanks. I'm going to continue researching the memory limitation and
try lazy path expansion until PG17 development begins.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: longfin missing gssapi_ext.h
Next
From: Stephen Frost
Date:
Subject: Re: longfin missing gssapi_ext.h