Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Date
Msg-id CAD21AoC2eA_OdwMmBvj7XDxwmdHU4_tBac8G1GgF2OFzZZ7SRA@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum
List pgsql-hackers
Hi,

On Thu, Jun 16, 2022 at 4:30 PM John Naylor
<john.naylor@enterprisedb.com> wrote:
>
> On Thu, Jun 16, 2022 at 11:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated version patch that changes the configure
> > script. I'm still studying how to support AVX2 on msvc build. Also,
> > added more regression tests.
>
> Thanks for the update, I will take a closer look at the patch in the
> near future, possibly next week.

Thanks!

> For now, though, I'd like to question
> why we even need to use 32-byte registers in the first place. For one,
> the paper referenced has 16-pointer nodes, but none for 32 (next level
> is 48 and uses a different method to find the index of the next
> pointer). Andres' prototype has 32-pointer nodes, but in a quick read
> of his patch a couple weeks ago I don't recall a reason mentioned for
> it.

I might be wrong but since AVX2 instruction set is introduced in
Haswell microarchitecture in 2013 and the referenced paper is
published in the same year, the art didn't use AVX2 instruction set.
32-pointer nodes are better from a memory perspective as you
mentioned. Andres' prototype supports both 16-pointer nodes and
32-pointer nodes (out of 6 node types). This would provide better
memory usage but on the other hand, it would also bring overhead of
switching the node type. Anyway, it's an important design decision to
support which size of node to support. It should be done based on
experiment results and documented.

> Even if 32-pointer nodes are better from a memory perspective, I
> imagine it should be possible to use two SSE2 registers to find the
> index. It'd be locally slightly more complex, but not much. It might
> not even cost much more in cycles since AVX2 would require indirecting
> through a function pointer. It's much more convenient if we don't need
> a runtime check.

Right.

> There are also thermal and power disadvantages when
> using AXV2 in some workloads. I'm not sure that's the case here, but
> if it is, we'd better be getting something in return.

Good point.

> One more thing in general: In an earlier version, I noticed that
> Andres used the slab allocator and documented why. The last version of
> your patch that I saw had the same allocator, but not the "why".
> Especially in early stages of review, we want to document design
> decisions so it's more clear for the reader.

Indeed. I'll add comments in the next version patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add header support to text format and matching feature
Next
From: Michael Paquier
Date:
Subject: Re: [PATCH] Completed unaccent dictionary with many missing characters