Re: Speeding up GIST index creation for tsvectors - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: Speeding up GIST index creation for tsvectors
Date
Msg-id CAJ3gD9chdDRpz_1rAwfoF0okEznN9o6O89HdZbB0NXHkFSp7Ag@mail.gmail.com
Whole thread Raw
In response to Re: Speeding up GIST index creation for tsvectors  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Tue, 15 Dec 2020 at 20:34, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On Sun, 13 Dec 2020 at 9:28 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > +1
> > This will make all INSERTs and UPDATES for tsvector's GiSTs.
>
> Oh, I didn't realize that this code is getting used in GIST index
> insertion and creation too. Will check there.

I ran some insert and update tests; they show only marginal
improvement. So looks like the patch is mainly improving index builds.

> > Meanwhile there are at least 4 incarnation of hemdistsign() functions that are quite similar. I'd propose to
refactorthem somehow...
 
>
> Yes, I hope we get the benefit there also. Before that, I thought I
> should post the first use-case to get some early comments. Thanks for
> your encouraging comments :)

The attached v2 version of 0001 patch extends the hemdistsign()
changes to the other use cases like  intarray, ltree and hstore. I see
the same index build improvement for all these types.

Since for the gist index creation of some of these types the default
value for siglen is small (8-20), I tested with small siglens. For
siglens <= 20, particularly for values that are not multiples of 8
(e.g. 10, 13, etc), I see a  1-7 % reduction in speed of index
creation. It's probably because of
an extra function call for pg_xorcount(); and also might be due to the
extra logic in pg_xorcount() which becomes prominent for shorter
traversals. So for siglen less than 32, I kept the existing method
using byte-by-byte traversal.

--
Thanks,
-Amit Khandekar
Huawei Technologies

Attachment

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Key management with tests
Next
From: Heikki Linnakangas
Date:
Subject: Re: Online checksums patch - once again