Re: Speeding up GIST index creation for tsvectors - Mailing list pgsql-hackers

From John Naylor
Subject Re: Speeding up GIST index creation for tsvectors
Date
Msg-id CAFBsxsHNctpF697qz4VgJTuzYLg3UbVdKtSTbxtGWvVa6jNi0g@mail.gmail.com
Whole thread Raw
In response to Re: Speeding up GIST index creation for tsvectors  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Speeding up GIST index creation for tsvectors
List pgsql-hackers
On Fri, Mar 19, 2021 at 8:57 AM Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On Thu, 11 Mar 2021 at 04:25, John Naylor <john.naylor@enterprisedb.com> wrote:
> > Okay, so it's hard-coded in various functions in contrib modules. In that
> > case, let's just keep the existing coding for those. In fact, the comments
> > that got removed by your patch specifically say: /* Using the popcount
> > functions here isn't likely to win */ ...which your testing confirmed. The
> > new function should be used only for Gist and possibly Intarray, whose
> > default is 124. That means we can't get rid of hemdistsign(), but that's
> > fine.
>
> The comment is there for all types. Since I get the performance better
> on all the types, I have kept the pg_xorcount() call for all these
> contrib modules.  I understand that since for some types default
> siglen is small, we won't get benefit. But I think we should call
> pg_xorcount() for the benefit of non-default siglen case.

The 1-7% degradation measured was from an earlier version, when pg_xorcount_long had a lot of finicky branching and computation. Is it still true in v3? We should answer that first. I'm interested in what happens if you now use pg_xorcount_long in the call sites, at least in the worst case 7% test.

> Have replaced hemdistsign() by pg_xorcount() everywhere; but with
> that, the call looks a bit clumsy because of having to type-cast the
> parameters to const unsigned char *.  I noticed that the cast to
> "unsigned char *" is required only when we use the value in the
> pg_number_of_ones array. Elsewhere we can safely use 'a' and 'b' as
> "char *". So I changed the pg_xorcount() parameters from unsigned char
> * to char *.

That matches the style of that file, so +1.

> > I think the way to go is a simplified version of your 0001 (not 0002), with
> > only a single function, for gist and intarray only, and a style that better
> > matches the surrounding code. If you look at my xor functions in the attached
> > text file, you'll get an idea of what it should look like. Note that it got
> > the above performance without ever trying to massage the pointer alignment.
> > I'm a bit uncomfortable with the fact that we can't rely on alignment, but
> > maybe there's a simple fix somewhere in the gist code.
>
> In the attached 0001-v3 patch, I have updated the code to match it
> with surrounding code; specifically the usage of *a++ rather than
> a[i].
>
> Regarding the alignment changes... I have removed the code that
> handled the leading identically unaligned bytes, for lack of evidence
> that percentage of such cases is significant. Like I noted earlier,
> for the tsearch data I used, identically unaligned cases were only 6%.
> If I find scenarios where these cases can be significant after all and
> if we cannot do anything in the gist index code, then we might have to
> bring back the unaligned byte handling. I didn't get a chance to dig
> deeper into the gist index implementation to see why they are not
> always 8-byte aligned.

I find it stranger that something equivalent to char* is not randomly misaligned, but rather only seems to land on 4-byte boundaries.

[thinks] I'm guessing it's because of VARHDRSZ, but I'm not positive.

FWIW, I anticipate some push back from the community because of the fact that the optimization relies on statistical phenomena.

> I have kept the 0002 patch as-is. Due to significant *additional*
> speedup, over and above the 0001 improvement, I think the code
> re-arrangement done is worth it for non-x86 platforms.

For the amount of code uglification involved, we should be getting full asm popcount support on Arm, not an attempt to kluge around the current implementation. I'd be happy to review such an effort for PG15, by the way.

Readability counts, and as it stands I don't feel comfortable asking a committer to read 0002.
--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: James Coleman
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Next
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] Custom compression methods