Re: NAMEDATALEN increase because of non-latin languages - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: NAMEDATALEN increase because of non-latin languages
Msg-id CAEze2Wh8-metSryZX_Ubj-uv6kb+2YnzHAejmEdubjhmGusBAg@mail.gmail.com
In response to Re: NAMEDATALEN increase because of non-latin languages  (Andres Freund <andres@anarazel.de>)
Responses Re: NAMEDATALEN increase because of non-latin languages
List pgsql-hackers
On Thu, 19 Aug 2021 at 14:58, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-08-19 14:47:42 +0200, Matthias van de Meent wrote:
> > I tried to implement this 'compact attribute access descriptor' a few
> > months ago in my effort to improve btree index performance.
>
> cool
>
>
> > The patch allocates an array of 'TupleAttrAlignData'-structs at the
> > end of the attrs-array, fills it with the correct data upon
> > TupleDesc-creation, and uses this TupleAttrAlign-data for constructing
> > and destructing tuples.
>
> > One main difference from what you described was that I used a union
> > for storing attbyval and attstorage, as the latter is only applicable
> > to attlen < 0, and the first only for attlen >= 0. This keeps the
> > whole structure in 8 bytes, whilst also being useable in both tuple
> > forming and deforming.
>
> That's why I just talked about the naive way - it's clearly possible to
> do better... ;)
>
>
> > I hope this is useful, otherwise sorry for the noise.
>
> It is!

Great!
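
For readers following along, the union idea quoted above can be sketched roughly as follows. This is not the code from the patch: the struct name CompactAttr, the helper functions, and the choice of the other fields (attcacheoff, attlen, attalign) are assumptions made for illustration. The only parts taken from the discussion are the attbyval/attstorage union and the 8-byte total size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compact per-attribute entry; NOT the struct from the patch. */
typedef struct CompactAttr
{
    int32_t attcacheoff;    /* cached offset within the tuple, -1 if unknown */
    int16_t attlen;         /* >= 0: fixed width, < 0: variable length */
    char    attalign;       /* alignment code such as 'c', 's', 'i', 'd' */
    union
    {
        bool attbyval;      /* only meaningful when attlen >= 0 */
        char attstorage;    /* only meaningful when attlen < 0 */
    } u;
} CompactAttr;

/* Assumes sizeof(bool) == 1, which holds on common platforms. */
static_assert(sizeof(CompactAttr) == 8, "compact entry should stay at 8 bytes");

/* Build an entry, storing only the union member that applies to attlen. */
static inline CompactAttr
compact_attr_make(int16_t attlen, char attalign, bool byval, char storage)
{
    CompactAttr a;

    a.attcacheoff = -1;
    a.attlen = attlen;
    a.attalign = attalign;
    if (attlen >= 0)
        a.u.attbyval = byval;
    else
        a.u.attstorage = storage;
    return a;
}

/* Pass-by-value only applies to fixed-width attributes. */
static inline bool
compact_attr_is_byval(const CompactAttr *a)
{
    return a->attlen >= 0 && a->u.attbyval;
}

/* Fixed-width attributes always use plain ('p') storage. */
static inline char
compact_attr_storage(const CompactAttr *a)
{
    return a->attlen < 0 ? a->u.attstorage : 'p';
}
```

The sign of attlen acts as the discriminator for the union, so no separate tag byte is needed and the entry stays usable for both forming and deforming tuples.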

> I haven't looked at your patch in detail, but I suspect that one reason
> that you didn't see performance benefits is that you added overhead as
> well. The computation of the "compact" memory location now will need a
> few more instructions than before, and I suspect the compiler may not
> even be able to optimize out some of the redundant accesses in loops.
>
> It'd be interesting to see what you'd get if you stored the compact
> array as the flexible-array and stored a pointer to the "full" attrs
> array (while still keeping it allocated together).

Yes, I remember testing swapping the order of the compact array with the FormData_pg_attribute array as well, with no clear results.

I think this can partially be attributed to the split access patterns on the data in the attribute descriptor: some accesses are 'give me the name', some are 'does this attribute exist, and what type description does it have?' (atttypid, attnum, atttypmod), and others are only interested in the physical representation information. Prioritizing one group over the others might work, but I think making full use of that would need a lot of work.
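
To make the layout Andres suggested above concrete, here is a minimal sketch in which the compact entries form the flexible array member and the descriptor carries a pointer to the "full" array placed later in the same allocation. All names here (SketchTupleDesc, FullAttr, CompactAttr, sketch_desc_create) are hypothetical stand-ins, not PostgreSQL's TupleDesc or FormData_pg_attribute, and plain calloc is used instead of palloc to keep the example standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the wide catalog row (FormData_pg_attribute in PostgreSQL). */
typedef struct FullAttr
{
    char     attname[64];   /* 64 mirrors the default NAMEDATALEN */
    uint32_t atttypid;
    int32_t  atttypmod;
    int16_t  attnum;
} FullAttr;

/* Stand-in for the 8-byte physical-layout entry. */
typedef struct CompactAttr
{
    int32_t attcacheoff;
    int16_t attlen;
    char    attalign;
    char    byval_or_storage;
} CompactAttr;

typedef struct SketchTupleDesc
{
    int         natts;
    FullAttr   *full;       /* points into the same allocation */
    CompactAttr compact[];  /* hot data sits right after the header */
} SketchTupleDesc;

/* Allocate header, compact array, and full array as one zeroed chunk. */
static SketchTupleDesc *
sketch_desc_create(int natts)
{
    size_t compact_end = offsetof(SketchTupleDesc, compact)
                         + (size_t) natts * sizeof(CompactAttr);
    size_t align = _Alignof(FullAttr);
    /* Round up so the full array starts on a properly aligned boundary. */
    size_t full_off = (compact_end + align - 1) / align * align;
    SketchTupleDesc *desc = calloc(1, full_off + (size_t) natts * sizeof(FullAttr));

    if (desc == NULL)
        return NULL;
    desc->natts = natts;
    desc->full = (FullAttr *) ((char *) desc + full_off);
    return desc;
}
```

With this arrangement, deforming code indexes desc->compact[i] directly from the descriptor pointer, while name and type lookups pay one extra pointer dereference through desc->full[i]. Whether that trade-off helps in practice is exactly the open question in this thread.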

> Another reason is that it looks like you didn't touch
> slot_deform_heap_tuple(), which is I think the hottest of the deforming
> routines...

That might be true for normal operations, but I'm not certain that code is in the hot path for (btree) indexing workloads, given the relatively high number of operations performed on each tuple while sorting, or while finding an insertion point or scan start point.

Anyway, after some digging I found the final state of this patch before I stopped working on it, and after polishing it up a bit with your suggestions it now passes check-world on the latest head (8d2d6ec7).


Kind regards,

Matthias van de Meent