Re: Speeding up GIST index creation for tsvectors - Mailing list pgsql-hackers

From John Naylor
Subject Re: Speeding up GIST index creation for tsvectors
Date
Msg-id CAFBsxsHBJsuCqe1kd2OVm-M4i=0eEQ=h0ESRp3fvrU_S9FXsUA@mail.gmail.com
In response to Re: Speeding up GIST index creation for tsvectors  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers

On Sun, Aug 1, 2021 at 11:41 PM Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> > FWIW, I anticipate some pushback from the community because the optimization relies on statistical phenomena.
>
> I dug into this issue for the tsvector type. It turns out that the way
> the sign array elements are laid out is what causes the pointers to
> be misaligned:
[...]
> If siglen is not a multiple of 8 (say 700), cache[j].sign will in some
> cases point to non-8-byte-aligned addresses, as you can see in the
> above code snippet.
>
> Replacing siglen with MAXALIGN64(siglen) in the above snippet gets rid
> of the misalignment. This change, applied over the 0001-v3 patch, gives
> an additional ~15% benefit. MAXALIGN64(siglen) will use a bit more
> space, but for not-so-small siglens, this looks worth doing. I haven't
> yet checked types other than tsvector.

Sounds good.
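
To make the alignment point quoted above concrete, here is a minimal standalone sketch, not code from the patch. It assumes the signatures are packed back to back from an 8-byte-aligned base, and it uses a hypothetical helper align8_up() as a stand-in for MAXALIGN64() on a platform whose maximum alignment is 8 bytes; count_misaligned() is likewise only an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for MAXALIGN64(), assuming a maximum alignment
 * of 8 bytes: round len up to the next multiple of 8.
 */
static inline uint64_t
align8_up(uint64_t len)
{
    return (len + 7) & ~(uint64_t) 7;
}

/*
 * Count how many of nentries signatures would start at an offset that is
 * not a multiple of 8 when they are stored back to back with the given
 * stride. The base address is assumed to be 8-byte aligned, so only the
 * offset j * stride matters.
 */
static size_t
count_misaligned(size_t stride, size_t nentries)
{
    size_t      misaligned = 0;

    assert(stride > 0);
    for (size_t j = 0; j < nentries; j++)
    {
        if ((j * stride) % 8 != 0)
            misaligned++;
    }
    return misaligned;
}
```

With a stride of 700, every odd-numbered entry starts 4 bytes past an 8-byte boundary, since 700 mod 8 is 4. Rounding the stride up to 704 puts every entry on a boundary, at a cost of 4 extra bytes per signature.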

> Will get back with your other review comments. I thought, meanwhile, I
> can post the above update first.

Thinking some more, my discomfort with inline functions that call a global function doesn't make logical sense, so feel free to do it that way if you like.

--
John Naylor
EDB: http://www.enterprisedb.com
