Re: Reduce build times of pg_trgm GIN indexes - Mailing list pgsql-hackers
| From | David Geier |
|---|---|
| Subject | Re: Reduce build times of pg_trgm GIN indexes |
| Date | |
| Msg-id | 5dce2293-441e-4034-9c54-da87d2e631f5@gmail.com |
| In response to | Re: Reduce build times of pg_trgm GIN indexes (David Geier <geidav.pg@gmail.com>) |
| List | pgsql-hackers |
On 13.04.2026 17:05, David Geier wrote:
> On 08.04.2026 04:15, John Naylor wrote:
>> On Tue, Apr 7, 2026 at 6:27 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> But the comments on the pg_cmp functions say:
>>>
>>>>  * NB: If the comparator function is inlined, some compilers may produce
>>>>  * worse code with these helper functions than with code with the
>>>>  * following form:
>>>>  *
>>>>  *     if (a < b)
>>>>  *         return -1;
>>>>  *     if (a > b)
>>>>  *         return 1;
>>>>  *     return 0;
>>>>  *
>>>
>>> So, uh, is that really a universal improvement? Is that comment about
>>> producing worse code outdated?
>
> Well spotted. Thanks!
>
>>
>> No, it's quite recent:
>>
>> https://www.postgresql.org/message-id/20240212230423.GA3519%40nathanxps13

FWICS, this would only matter if btint4cmp() would get inlined
somewhere, where the compiler could actually make use of understanding
that parts of the if-cascade are not needed. Andres' example was

    return DO_COMPARE(a, b) < 0 ?
        (DO_COMPARE(b, c) < 0 ? b : (DO_COMPARE(a, c) < 0 ? c : a))
        : (DO_COMPARE(b, c) > 0 ? b : (DO_COMPARE(a, c) < 0 ? a : c));

In the case of btint4cmp(), it's only ever invoked from the function
manager, where it cannot be inlined. Or are there ways to invoke
btint4cmp() that can be inlined, which I'm unaware of?

> In my original benchmarks it was faster. I'll rebase the remaining
> commits and do some more analysis.

Here is the disassembly and the perf top output of master vs patched. I
compiled with GCC 15.2.0. The unpatched version of btint4cmp() contains
a conditional jump, which is mispredicted frequently in the sort. The
patched version is completely branchless.
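For readers following along, the two comparator forms under discussion can be sketched in plain C as below. This is only an illustration: it assumes the patched btint4cmp() boils down to the `(a > b) - (a < b)` idiom, and the names cmp_cascade(), cmp_branchless() and check_cmp_forms() are hypothetical, not taken from the patch or from PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>

/* If-cascade form, as quoted in the pg_cmp comment above. */
int
cmp_cascade(int32_t a, int32_t b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}

/*
 * Branchless form: each comparison evaluates to 0 or 1, so the
 * difference is -1, 0 or 1 and cannot overflow.
 * Assumption: this is what the patched btint4cmp() computes.
 */
int
cmp_branchless(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}

/* Sanity check: both forms agree on a few boundary values. */
void
check_cmp_forms(void)
{
    static const int32_t v[] = {INT32_MIN, -1, 0, 1, INT32_MAX};
    const int n = (int) (sizeof(v) / sizeof(v[0]));

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(cmp_cascade(v[i], v[j]) == cmp_branchless(v[i], v[j]));
}
```

Both functions return the same values for every input; they differ only in the machine code a compiler may choose for them. Whether the cascade form ends up with a conditional jump depends on the compiler and flags, so the disassembly in this mail speaks for GCC 15.2.0 only.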
master
======

Dump of assembler code for function btint4cmp:
   0x00005aa9e33ccdb0 <+0>:     endbr64
   0x00005aa9e33ccdb4 <+4>:     mov    0x20(%rdi),%edx
   0x00005aa9e33ccdb7 <+7>:     mov    $0x1,%eax
   0x00005aa9e33ccdbc <+12>:    cmp    %edx,0x30(%rdi)
   0x00005aa9e33ccdbf <+15>:    jl     0x5aa9e33ccdca <btint4cmp+26>
   0x00005aa9e33ccdc1 <+17>:    setne  %al
   0x00005aa9e33ccdc4 <+20>:    movzbl %al,%eax
   0x00005aa9e33ccdc7 <+23>:    neg    %rax
   0x00005aa9e33ccdca <+26>:    ret

  37.22%  pg_trgm.so  [.] trigram_qsort_signed.constprop.0
   7.99%  postgres    [.] cmpEntryAccumulator
   6.60%  postgres    [.] ginCombineData
   6.03%  postgres    [.] FunctionCall2Coll
   3.19%  postgres    [.] btint4cmp
   2.30%  postgres    [.] rbt_insert
   2.29%  pg_trgm.so  [.] generate_trgm
   2.24%  postgres    [.] pg_mblen_range
   1.77%  libc.so.6   [.] __towlower_l
   1.73%  pg_trgm.so  [.] trigram_qsort_signed_med3
   1.56%  postgres    [.] pg_utf2wchar_with_len

Patched
=======

Dump of assembler code for function btint4cmp:
   0x000055a69e87bdb0 <+0>:     endbr64
   0x000055a69e87bdb4 <+4>:     mov    0x20(%rdi),%eax
   0x000055a69e87bdb7 <+7>:     cmp    %eax,0x30(%rdi)
   0x000055a69e87bdba <+10>:    setl   %al
   0x000055a69e87bdbd <+13>:    setg   %dl
   0x000055a69e87bdc0 <+16>:    movzbl %dl,%edx
   0x000055a69e87bdc3 <+19>:    movzbl %al,%eax
   0x000055a69e87bdc6 <+22>:    sub    %edx,%eax
   0x000055a69e87bdc8 <+24>:    cltq
   0x000055a69e87bdca <+26>:    ret

  38.07%  pg_trgm.so  [.] trigram_qsort_signed.constprop.0
   7.69%  postgres    [.] cmpEntryAccumulator
   6.96%  postgres    [.] ginCombineData
   3.90%  postgres    [.] FunctionCall2Coll
   2.54%  postgres    [.] pg_mblen_range
   2.40%  postgres    [.] btint4cmp
   2.38%  pg_trgm.so  [.] generate_trgm
   1.86%  postgres    [.] rbt_insert
   1.80%  libc.so.6   [.] __towlower_l
   1.73%  pg_trgm.so  [.] trigram_qsort_signed_med3
   1.66%  postgres    [.] pg_utf2wchar_with_len

--
David Geier