Re: Reduce build times of pg_trgm GIN indexes - Mailing list pgsql-hackers

From David Geier
Subject Re: Reduce build times of pg_trgm GIN indexes
Date
Msg-id 10395cb6-a804-49ac-b2f1-3f98d7204359@gmail.com
Whole thread
In response to Re: Reduce build times of pg_trgm GIN indexes  (David Geier <geidav.pg@gmail.com>)
List pgsql-hackers
On 14.04.2026 16:24, David Geier wrote:
> Attached are the remaining patches (previously 0003 and 0005) rebased on
> latest master. Currently, there's no radix sort variant for the unsigned
> char case. Do we care about this case or is it fine if that case runs
> slower?
> 
> The following perf profiles show that trigram_qsort() goes from ~34%
> down to ~7% with the radix sort optimization. The optimized run also
> includes the btint4cmp() optimization. Without that the result would be
> even better.
> 
> With that change we could move on and tackle optimizing
> 
> 1. 41.52% generate_trgm_only() by e.g. using an ASCII fast-patch
> 2. 32.72% ginInsertBAEntries() by no longer using the RB-tree but
>    e.g. also the radix sort

Attached is the rebased patch set as well as a new patch that optimizes
ginInsertBAEntries(). Performance improvements are as follows, measured
with the same benchmark I used in the first mail of this thread.
Runtimes and deltas are in milliseconds.

Code                               | movies | delta  | lineitem | delta
-----------------------------------|--------|--------|------------------
master                             | 11,160 | -      | 248,146  | -
v7-0001-Make-btint4cmp-branchless  |  9,509 | 1,651  | 236,760  | 11,386
v7-0002-Use-radix-sort             |  6,123 | 3,386  | 214,632  | 22,128
v7-0003-Replace-RB-tree            |  4,755 | 1,368  | 144,252  | 70,380

I optimized ginInsertBAEntries() by replacing the red-black tree by a
hash map (simplehash.h) on the keys (e.g. trigrams), where each hash map
entry contains an item pointer list with the row TIDs the key occurs in.
This is in the spirit of the original red-black tree implementation but
this way the deduplication of each key is handled in O(1), instead of
O(log(num_unique_keys)). This reduces the overall runtime complexity from

O(num_total_keys * log(num_unique_keys))
  ->
O(num_total_keys + num_unique_keys * log(num_unique_keys))

As there are normally a lot less keys than rows, this is complexity-wise
preferable. And even if all keys are unique, the rebalancing operations
are pretty expensive and in reality outweigh the slightly better
complexity in the worst-case.

- The sorting of the item pointer lists for each key stays as is, except
  for that I switched to sort_template.h.

- The red-black tree code in rbtree.c/.h is now completely unused and
  could be removed. Thoughts?

- The size on disk changed very slightly. I think this is because the
  memory accounting is slightly different and with that slightly more or
  less keys / item pointers can be processed in one go.

- FWICS, we can use datumIsEqual() and datum_image_hash() in the hash
  map because as of today, the GIN index code doesn't work properly with
  non-deterministic collations / non-image-equal types, e.g. see [1].

- I also simplified ginInsertBAEntries(). Previously, it used an
  insertion order that minimizes the number of RB-tree rebalancing
  operations for sorted inputs. With the hash map approach, it's better
  to insert in sort order as this increases the likelihood of hitting
  the same hash map entry again for equal keys.

- This patch set also improves parallel GIN index builds as the parallel
  code builds on top of ginInsertBAEntries().

- Regression tests pass and gin_index_check() from amcheck couldn't find
  an error in the data structures.

To be fair: 0001 is less interesting after removing the RB-tree because
a lot less key comparisons are being performed. I still think it makes
sense to merge it as it should also accelerate e.g. B-tree traversals.

With these changes the number one bottleneck becomes the trigram
generation in generate_trgm_only(). The following perf profile nicely
shows that:

-   99.89%     0.00%  postgres  postgres           [.] ginbuild
     ginbuild
     table_index_build_scan (inlined)
   - heapam_index_build_range_scan
      - 94.05% ginBuildCallback
         - 86.48% ginHeapTupleBulkInsert
            - 68.57% ginExtractEntries
               - 64.69% FunctionCall3Coll
                  - gin_extract_value_trgm
                     - 62.75% generate_trgm
                        - 39.57% generate_trgm_only
                           + 18.53% str_tolower
                           + 16.91% find_word (inlined)
                           + 1.36% make_trigrams (inlined)
                             0.66% __strlen_avx2
                        + 18.49% trigram_qsort (inlined)
                        + 4.17% trigram_qunique (inlined)
                       0.79% trgm2int
               + 3.33% qsort_arg_entries
            - 17.28% ginInsertBAEntries
               - ginInsertBAEntry (inlined)
                  - 8.04% ginbuild_insert (inlined)
                     - 4.89% ginbuild_insert_hash_internal (inlined)
                          1.25% gin_equal_key (inlined)
                          0.68% ginbuild_initial_bucket (inlined)
                     + 3.14% gin_hash_key (inlined)
                  + 2.45% AllocSetRealloc
                  + 0.85% asm_exc_page_fault
                    0.52% getDatumCopy (inlined)
         + 6.00% ginEntryInsert
         + 1.19% ginBeginBAScan
      + 2.99% heap_getnext

[1]
https://www.postgresql.org/message-id/flat/8ef4899c4acfebca45cc6c042a6dc611d25ffab1.camel%40cybertec.at

--
David Geier
Attachment

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Reduce build times of pg_trgm GIN indexes
Next
From: Tom Lane
Date:
Subject: Re: [RFC][PATCH] Order qual clauses by combined cost and selectivity