Re: Reduce build times of pg_trgm GIN indexes - Mailing list pgsql-hackers
| From | Heikki Linnakangas |
|---|---|
| Subject | Re: Reduce build times of pg_trgm GIN indexes |
| Date | |
| Msg-id | 9ac3931a-180e-4283-a7a8-05eb66099206@iki.fi Whole thread Raw |
| In response to | Reduce build times of pg_trgm GIN indexes (David Geier <geidav.pg@gmail.com>) |
| Responses |
Re: Reduce build times of pg_trgm GIN indexes
|
| List | pgsql-hackers |
On 05/01/2026 17:01, David Geier wrote:
> The script I used for testing is attached. I ran CREATE INDEX three
> times and took the fastest run. I'm getting the following results on my
> i9-13950HX dev laptop in release build:
>
> Data set | Patched (ms) | Master (ms) | Speedup
> --------------------|--------------|--------------|----------
> movies(plot) | 3,409 | 10,311 | 3.02x
> lineitem(l_comment) | 161,569 | 256,986 | 1.59x
>
Impressive speedup!
> The attached patches do the following:
>
> - v1-0001-Inline-ginCompareAttEntries.patch: Inline
> ginCompareAttEntries() which is very frequently called by the GIN code.
Looks good.
> - v1-0002-Optimized-comparison-functions.patch: Use FunctionCallInvoke()
> instead of FunctionCall2Coll(). This saves a bunch of per-comparison
> setup code, such as calling InitFunctionCallInfoData().
You lose the check for NULL result with this. That's probably still
worth checking.
> - v1-0003-Use-sort_template.h.patch: Use sort_template.h instead of
> qsort(), to inline calls to the sort comparator. This is an interim step
> that is further improved on by patch 0006.
ok
> - v1-0004-Avoid-dedup-and-sort-in-ginExtractEntries.patch
> ginExtractEntries() deduplicates and sorts the entries returned from the
> extract value function. In case of pg_trgm, that is completely redundant
> because the trigrams are already deduplicated and sorted. The current
> version of this patch is just to demonstrate the potential. We need to
> think about what we want here. Ideally, we would require the extraction
> function to provide the entries deduplicated and sorted. Alternatively,
> we could indicate to ginExtractEntries() if the entries are already
> deduplicated and sorted. If we don't want to alter the signature of the
> extract value function, we could e.g. use the MSB of the nentries argument.
Yeah, this seems wrong as it is. You're assuming that if the extract
function returns nullFlags == NULL, the array is already sorted and deduped.
> - v1-0005-Make-btint4cmp-branchless.patch: Removes branches from
> btint4cmp(), which is heavily called from the GIN code. This might as
> well have benefit in other parts of the code base.
Seems reasonable.
> v1-0006-Use-radix-sort.patch: Replace the sort_template.h-based qsort()
> with radix sort. For the purpose of demonstrating the possible gains,
> I've only replaced the signed variant for now. I've also tried using
> simplehash.h for deduplicating followed by a sort_template.h-based sort.
> But that was slower.
Ok.
> v1-0007-Faster-qunique-comparator.patch: qunique() doesn't require a
> full sort comparator (-1 = less, 0 = equal, 1 = greater) but only a
> equal/unequal comparator (e.g. 0 = equal and 1 = unequal). The same
> optimization can be done in plenty of other places in our code base.
> Likely, in most of them the gains are insignificant.
Makes sense. I'm a little disappointed the compiler won't do that
optimization for us..
Perhaps we should introduce a new qunique_eq() function with a different
callback signature:
/* like qunique(), but the callback function returns true/false rather
than int */
static inline size_t
qunique_eq(void *array, size_t elements, size_t width,
bool (*equal) (const void *, const void *))
> v1-0008-Add-ASCII-fastpath-to-generate_trgm_only.patch: Typically lots
> of text is actually ASCII. Hence, we provide a fast path for this case
> which is exercised if the MSB of the current character is unset.
This uses pg_ascii_tolower() when for ASCII characters when built with
the IGNORECASE. I don't think that's correct, if the proper collation
would do something more complicated for than what pg_ascii_tolower() does.
Did you measure how big is the impact from each individual patch?
Patches 1 and 2 seem pretty much ready to be committed, but I wonder if
they make any difference on their own.
- Heikki
pgsql-hackers by date: