Re: Reduce build times of pg_trgm GIN indexes - Mailing list pgsql-hackers

From David Geier
Subject Re: Reduce build times of pg_trgm GIN indexes
Msg-id 5650bf75-dcb8-446d-8cba-e626eb44594b@gmail.com
In response to Re: Reduce build times of pg_trgm GIN indexes  (David Geier <geidav.pg@gmail.com>)
List pgsql-hackers
On 13.04.2026 17:06, David Geier wrote:
>> I squashed 0002 and 0004 into one commit, and did some more refactoring:
>> I created a trigram_qsort() helper function that calls the signed or
>> unsigned variant, so that that logic doesn't need to be duplicated in
>> the callers. For symmetry, I also added a trigram_qunique() helper
>> function which just calls qunique() with the new, faster CMPTRGM_EQ
>> comparator. Pushed these as commit 9f3755ea07.
> 
> Thanks for committing these patches.

Attached are the remaining patches (previously 0003 and 0005) rebased on
latest master. Currently, there's no radix sort variant for the unsigned
char case. Do we care about this case or is it fine if that case runs
slower?

The following perf profiles show that trigram_qsort() goes from ~34%
down to ~7% with the radix sort optimization. The optimized run also
includes the btint4cmp() optimization. Without that the result would be
even better.
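
For readers unfamiliar with the technique, here is a minimal sketch of an
LSD radix sort over 3-byte trigrams. It is not the code from the attached
patches: the type trgm_sketch, the function name, the caller-supplied
scratch buffer and the signed_order flag are all assumptions made for
illustration. The flag shows one possible way to cover both the signed and
the unsigned char ordering with the same loop, by flipping the top bit of
each key byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_trgm's 3-byte trigram type. */
typedef unsigned char trgm_sketch[3];

/*
 * Sort n trigrams in place with an LSD radix sort: one stable counting-sort
 * pass per byte, least significant byte (index 2) first. "tmp" must have
 * room for n trigrams. If signed_order is nonzero, bytes are ordered as
 * signed chars (0x80..0xFF sort before 0x00..0x7F), which is achieved by
 * flipping the top bit of each key byte.
 */
static void
trgm_radix_sort_sketch(trgm_sketch *a, trgm_sketch *tmp, size_t n,
                       int signed_order)
{
    unsigned char flip = signed_order ? 0x80 : 0x00;

    for (int byte = 2; byte >= 0; byte--)
    {
        size_t count[256] = {0};
        size_t pos = 0;

        /* Histogram of the current key byte. */
        for (size_t i = 0; i < n; i++)
            count[a[i][byte] ^ flip]++;

        /* Turn counts into start offsets (exclusive prefix sum). */
        for (int k = 0; k < 256; k++)
        {
            size_t c = count[k];

            count[k] = pos;
            pos += c;
        }

        /* Stable scatter into tmp, then copy back for the next pass. */
        for (size_t i = 0; i < n; i++)
            memcpy(tmp[count[a[i][byte] ^ flip]++], a[i], 3);
        memcpy(a, tmp, n * sizeof(trgm_sketch));
    }
}
```

Each pass is linear in the number of trigrams and there are always exactly
three passes, so the sort performs no comparator calls at all.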

With that change we could move on and tackle optimizing

1. 41.52% generate_trgm_only() by e.g. using an ASCII fast-path
2. 36.78% ginInsertBAEntries() (32.72% on master) by replacing the
   RB-tree, e.g. with radix sort as well
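
To make item 1 more concrete, here is a rough sketch of what an ASCII
fast path could look like. It is only an assumption about the general
idea, not the design proposed in this thread: both function names are
hypothetical, and the padding of two leading blanks and one trailing blank
follows what the pg_trgm documentation describes for each word. The point
is that pure 7-bit input can skip all multibyte handling and emit trigrams
with a plain 3-byte sliding window.

```c
#include <stddef.h>

/* Return 1 if every byte in s[0..len) is 7-bit ASCII, else 0. */
static int
is_all_ascii_sketch(const char *s, size_t len)
{
    unsigned char acc = 0;

    /* OR all bytes together; any high bit set means non-ASCII input. */
    for (size_t i = 0; i < len; i++)
        acc |= (unsigned char) s[i];
    return (acc & 0x80) == 0;
}

/* Character at position j of the word padded as "  word ", lowercased. */
static char
padded_char_sketch(const char *word, size_t len, size_t j)
{
    char c;

    if (j < 2 || j >= len + 2)
        return ' ';
    c = word[j - 2];
    if (c >= 'A' && c <= 'Z')
        c = (char) (c - 'A' + 'a');
    return c;
}

/*
 * Emit the trigrams of one ASCII word into out (3 bytes each, no
 * terminator) and return how many were written. A word of len bytes
 * yields len + 1 trigrams, so out needs room for that many.
 */
static size_t
ascii_word_trigrams_sketch(const char *word, size_t len, char (*out)[3])
{
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len + 1; i++)
        for (size_t k = 0; k < 3; k++)
            out[i][k] = padded_char_sketch(word, len, i + k);
    return len + 1;
}
```

A caller would check is_all_ascii_sketch() once per input string and fall
back to the existing multibyte-aware code whenever it returns 0.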

master

   - heapam_index_build_range_scan
      - 99.40% ginBuildCallback
         - ginHeapTupleBulkInsert
            - 66.55% ginExtractEntries
               - 65.29% FunctionCall3Coll
                  - gin_extract_value_trgm
                     - 62.80% generate_trgm
                        + 34.33% trigram_qsort (inlined)
                        + 26.20% generate_trgm_only
                        + 2.23% trigram_qunique (inlined)
                     + 1.74% detoast_attr
               + 1.19% qsort_arg_entries
            + 32.72% ginInsertBAEntries

patched

   - heapam_index_build_range_scan
      - 99.42% ginBuildCallback
         - 95.95% ginHeapTupleBulkInsert
            - 59.11% ginExtractEntries
               - 56.93% FunctionCall3Coll
                  - gin_extract_value_trgm
                     - 52.19% generate_trgm
                        + 41.52% generate_trgm_only
                        + 7.14% trigram_qsort (inlined)
                        + 3.53% trigram_qunique (inlined)
                     + 4.08% detoast_attr
               + 2.13% qsort_arg_entries
            + 36.78% ginInsertBAEntries

--
David Geier
