Hi,
On 2021-08-19 14:47:42 +0200, Matthias van de Meent wrote:
> I tried to implement this 'compact attribute access descriptor' a few
> months ago in my effort to improve btree index performance.
Cool.
> The patch allocates an array of 'TupleAttrAlignData'-structs at the
> end of the attrs-array, fills it with the correct data upon
> TupleDesc-creation, and uses this TupleAttrAlign-data for constructing
> and destructing tuples.
> One main difference from what you described was that I used a union
> for storing attbyval and attstorage, as the latter is only applicable
> to attlen < 0, and the first only for attlen >= 0. This keeps the
> whole structure in 8 bytes, whilst also being useable in both tuple
> forming and deforming.
That's why I just talked about the naive way - it's clearly possible to
do better... ;)
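
To make the union idea concrete, here is a minimal sketch of an 8-byte
per-attribute struct. Only the attbyval/attstorage union and the 8-byte
size come from the description above; the struct name, the other fields
(attcacheoff, attlen, attalign) and the helper are assumptions for
illustration, not what the patch actually contains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compact descriptor; names and field choice are illustrative. */
typedef struct CompactAttrSketch
{
    int32_t attcacheoff;    /* assumed: cached offset, -1 if not known */
    int16_t attlen;         /* >= 0: fixed length, < 0: variable length */
    char    attalign;       /* alignment code, e.g. 'c', 's', 'i', 'd' */
    union
    {
        bool attbyval;      /* only meaningful when attlen >= 0 */
        char attstorage;    /* only meaningful when attlen < 0 */
    } u;
} CompactAttrSketch;

/* 4 + 2 + 1 + 1 bytes; holds on common ABIs, checked at compile time. */
_Static_assert(sizeof(CompactAttrSketch) == 8,
               "compact attribute sketch should fit in 8 bytes");

/* Fill the sketch, writing only the union member that applies. */
static CompactAttrSketch
compact_attr_make(int16_t attlen, char attalign, bool attbyval, char attstorage)
{
    CompactAttrSketch a;

    a.attcacheoff = -1;
    a.attlen = attlen;
    a.attalign = attalign;
    if (attlen >= 0)
        a.u.attbyval = attbyval;
    else
        a.u.attstorage = attstorage;
    return a;
}
```

A reader of the struct has to check attlen before deciding which union
member to look at, which is the price of keeping it at 8 bytes.
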
> I hope this is useful, otherwise sorry for the noise.
It is!
I haven't looked at your patch in detail, but I suspect that one reason
that you didn't see performance benefits is that you added overhead as
well. The computation of the "compact" memory location now will need a
few more instructions than before, and I suspect the compiler may not
even be able to optimize out some of the redundant accesses in loops.
It'd be interesting to see what you'd get if you stored the compact
array as the flexible-array and stored a pointer to the "full" attrs
array (while still keeping it allocated together).
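
Roughly what I mean, as a sketch only: the compact array is the
flexible array member, so its address is a constant offset from the
descriptor, and the full attrs are reached through a pointer into the
same allocation. All type and function names here are made up for
illustration (this is not the real TupleDesc), and plain calloc()
stands in for whatever allocator would actually be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the wide per-attribute catalog struct (hypothetical). */
typedef struct FullAttrSketch
{
    char     attname[64];
    uint32_t atttypid;
    int16_t  attlen;
    char     attalign;
    bool     attbyval;
} FullAttrSketch;

/* Stand-in for the small struct used while forming/deforming tuples. */
typedef struct HotAttrSketch
{
    int16_t attlen;
    char    attalign;
    bool    attbyval;
} HotAttrSketch;

typedef struct TupleDescSketch
{
    int             natts;
    FullAttrSketch *attrs;      /* points into this same allocation */
    HotAttrSketch   compact[];  /* flexible array: fixed offset from start */
} TupleDescSketch;

/* Allocate header, compact array and full array in one chunk. */
static TupleDescSketch *
tupdesc_sketch_create(int natts)
{
    size_t n = (size_t) natts;
    size_t compact_end = offsetof(TupleDescSketch, compact)
                         + n * sizeof(HotAttrSketch);
    size_t align = _Alignof(FullAttrSketch);
    /* round up so the full array starts suitably aligned */
    size_t full_off = (compact_end + align - 1) / align * align;
    TupleDescSketch *desc = calloc(1, full_off + n * sizeof(FullAttrSketch));

    if (desc == NULL)
        return NULL;
    desc->natts = natts;
    desc->attrs = (FullAttrSketch *) ((char *) desc + full_off);
    return desc;
}
```

With this layout, desc->compact[i] needs no natts-dependent address
computation in the hot loops; only the colder code that wants the full
attribute pays for the extra pointer dereference.
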
Another reason is that it looks like you didn't touch
slot_deform_heap_tuple(), which I think is the hottest of the deforming
routines...
Greetings,
Andres Freund