Re: Reduce build times of pg_trgm GIN indexes - Mailing list pgsql-hackers

From David Geier
Subject Re: Reduce build times of pg_trgm GIN indexes
Msg-id a90ebbbd-0d77-49c7-b222-3dbffa4e3b14@gmail.com
In response to Re: Reduce build times of pg_trgm GIN indexes  (David Geier <geidav.pg@gmail.com>)
Responses Re: Reduce build times of pg_trgm GIN indexes
List pgsql-hackers
On 23.01.2026 11:18, David Geier wrote:
> Hi Matthias,
> 
> On 21.01.2026 21:50, Matthias van de Meent wrote:
>> On Wed, 21 Jan 2026 at 16:45, David Geier <geidav.pg@gmail.com> wrote:
>>>
>>> How do we usually go about such backwards-compatibility breaking
>>> changes?
>>
>> When it concerns a bug, we mention the change in the release notes
>> with a warning to reindex affected indexes to be sure no known
>> corruption remains. See e.g. the final entry in the PG18 release
>> notes' migration section here:
>> https://www.postgresql.org/docs/18/release-18.html#RELEASE-18-MIGRATION.
>>
>>> Could we have pg_upgrade reindex all GIN indexes? Would that be
>>> acceptable?
>>
>> No. We'd handle this like any other collation/opclass fixes; we ask
>> users to reindex their indexes in their own time after they've
>> upgraded their cluster. Note that in this case it concerns an issue
>> with just one GIN opclass, not all GIN indexes; so even if we were to
>> address this in pg_upgrade it wouldn't be a correct choice to reindex
>> every GIN index, as only a subset of those would be affected by this
>> issue.
>>
>> Generally speaking, pg_upgrade doesn't concern itself with the
>> validity of the data structures that are described by the catalogs
>> that it upgrades, it only concerns itself with that it correctly
>> transcribes the catalogs from one version to another, and that the
>> data files of the old cluster are transferred correctly without
>> changes.
> 
> Thanks for the clarifications and the link to the release notes. That's
> very helpful. Then I know how to move on and will update the patch
> accordingly.

Attached are the patches rebased on latest master.

I've removed the ASCII fast-path patch 0006 as it turned out to be more
complicated to make work than expected.

I kept the radix sort patch because it gives a decent speedup, but for
now I would like to focus on getting patches 0001 - 0004 merged.
They're all simple and, the way I see it, uncontroversial.

I remeasured the savings of 0001 - 0004. They come on top of the
already committed patch that inlined the comparison function, which gave
another ~5%:

Data set            | Patched (ms) | Master (ms)  | Speedup
--------------------|--------------|--------------|----------
movies(plot)        |   8,058      |  10,311      | 1.27x
lineitem(l_comment) | 223,233      | 256,986      | 1.19x

I've also registered the change in the commitfest:
https://commitfest.postgresql.org/patch/6418/

--
David Geier