On 03/03/2026 19:31, David Geier wrote:
>> Attached are the patches rebased on latest master.
>>
>> I've removed the ASCII fast-path patch 0006 as it turned out to be more
>> complicated to make work than expected.
>>
>> I kept the radix sort patch because it gives a decent speedup but I
>> would like to focus for now on getting patches 0001 - 0004 merged.
>> They're all simple and, the way I see it, uncontroversial.
>>
>> I remeasured the savings of 0001 - 0004. They come on top of the
>> already committed patch that inlined the comparison function, which
>> itself gave another ~5%:
>>
>> Data set            | Patched (ms) | Master (ms) | Speedup
>> --------------------|--------------|-------------|--------
>> movies(plot)        |        8,058 |      10,311 |   1.27x
>> lineitem(l_comment) |      223,233 |     256,986 |   1.19x
>>
>> I've also registered the change at the commit fest, see
>> https://commitfest.postgresql.org/patch/6418/.
>
> Attached is v5 that removes an incorrect assertion from the radix sort code.
>
> v5-0001-Optimize-sort-and-deduplication-in-ginExtractEntr.patch
> v5-0002-Optimize-generate_trgm-with-sort_template.h.patch
> v5-0003-Make-btint4cmp-branchless.patch
> v5-0004-Faster-qunique-comparator-in-generate_trgm.patch
> v5-0005-Optimize-generate_trgm-with-radix-sort.patch

Pushed 0001 as commit 6f5ad00ab7.

I squashed 0002 and 0004 into one commit, and did some more refactoring:
I created a trigram_qsort() helper function that calls the signed or
unsigned variant, so that that logic doesn't need to be duplicated in
the callers. For symmetry, I also added a trigram_qunique() helper
function which just calls qunique() with the new, faster CMPTRGM_EQ
comparator. Pushed these as commit 9f3755ea07.
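
To illustrate the shape of that refactoring, here is a rough standalone sketch. It is not the committed code: trgm_t, the sketch_* names and the char_is_signed flag are all made up for the example, and it uses plain qsort() instead of sort_template.h. It only shows the idea of one entry point that picks the signed or unsigned ordering, plus a dedup pass that needs an equality test rather than a full three-way comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a trigram: three bytes. */
typedef struct
{
	char		c[3];
} trgm_t;

/* Order trigrams byte by byte, treating each byte as signed char. */
static int
cmp_trgm_signed(const void *a, const void *b)
{
	const signed char *x = (const signed char *) a;
	const signed char *y = (const signed char *) b;

	for (int i = 0; i < 3; i++)
	{
		if (x[i] != y[i])
			return (x[i] > y[i]) - (x[i] < y[i]);
	}
	return 0;
}

/* Order trigrams byte by byte, treating each byte as unsigned char. */
static int
cmp_trgm_unsigned(const void *a, const void *b)
{
	return memcmp(a, b, 3);
}

/* Single entry point, so callers don't repeat the signed/unsigned choice. */
static void
sketch_trigram_qsort(trgm_t *arr, size_t n, bool char_is_signed)
{
	qsort(arr, n, sizeof(trgm_t),
		  char_is_signed ? cmp_trgm_signed : cmp_trgm_unsigned);
}

/*
 * Remove adjacent duplicates from a sorted array and return the new length.
 * Only equality matters here, so no ordering comparison is needed.
 */
static size_t
sketch_trigram_qunique(trgm_t *arr, size_t n)
{
	size_t		j = 0;

	if (n <= 1)
		return n;
	for (size_t i = 1; i < n; i++)
	{
		if (memcmp(&arr[i], &arr[j], 3) != 0)
			arr[++j] = arr[i];
	}
	return j + 1;
}
```

A caller would sort first and then deduplicate, passing whatever signedness applies to the index at hand.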

Patch 0003 gives me pause. It's a tiny patch:

> @@ -203,12 +204,7 @@ btint4cmp(PG_FUNCTION_ARGS)
>      int32       a = PG_GETARG_INT32(0);
>      int32       b = PG_GETARG_INT32(1);
>
> -    if (a > b)
> -        PG_RETURN_INT32(A_GREATER_THAN_B);
> -    else if (a == b)
> -        PG_RETURN_INT32(0);
> -    else
> -        PG_RETURN_INT32(A_LESS_THAN_B);
> +    PG_RETURN_INT32(pg_cmp_s32(a, b));
>  }

But the comments on the pg_cmp functions say:

>  * NB: If the comparator function is inlined, some compilers may produce
>  * worse code with these helper functions than with code with the
>  * following form:
>  *
>  *     if (a < b)
>  *         return -1;
>  *     if (a > b)
>  *         return 1;
>  *     return 0;
>  *

So, uh, is that really a universal improvement? Is that comment about
producing worse code outdated?
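
For anyone who wants to look at the generated code, the two forms can be put side by side in a standalone file. This is only a sketch: cmp_branchy and cmp_branchless are names invented for the example, and the branchless body assumes pg_cmp_s32() amounts to (a > b) - (a < b). Both return the same results, so the open question is purely which one a given compiler turns into better code, particularly once it is inlined into a sort loop.

```c
#include <stdint.h>

/* Branchy form, as spelled out in the comment quoted above. */
static inline int
cmp_branchy(int32_t a, int32_t b)
{
	if (a < b)
		return -1;
	if (a > b)
		return 1;
	return 0;
}

/*
 * Branchless form. Assumed to match what pg_cmp_s32() does; each
 * comparison yields 0 or 1, so the difference is -1, 0 or 1 and
 * cannot overflow.
 */
static inline int
cmp_branchless(int32_t a, int32_t b)
{
	return (a > b) - (a < b);
}
```

Compiling this at the optimization level used for the benchmarks and comparing the assembly on a couple of compilers would probably tell us whether the comment still applies.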

- Heikki