Hi,
For long strings, iterate_word_similarity() can run into long-running
tight-loops without honouring interrupts or statement_timeouts. For
example:
postgres=# set statement_timeout='1s';
SET
postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';
?column?
----------
(0 rows)
Time: 29615.842 ms (00:29.616)
The associated perf report:
+ 99.98% 0.00% postgres postgres [.] ExecQual
+ 99.98% 0.00% postgres postgres [.] ExecEvalExprSwitchContext
+ 99.98% 0.00% postgres pg_trgm.so [.] strict_word_similarity_commutator_op
+ 99.98% 0.00% postgres pg_trgm.so [.] calc_word_similarity
+ 99.68% 99.47% postgres pg_trgm.so [.] iterate_word_similarity
0.21% 0.03% postgres postgres [.] pg_qsort
0.16% 0.00% postgres [kernel.kallsyms] [k] asm_sysvec_apic_timer_interrupt
0.16% 0.00% postgres [kernel.kallsyms] [k] sysvec_apic_timer_interrupt
0.16% 0.11% postgres [kernel.kallsyms] [k] __softirqentry_text_start
0.16% 0.00% postgres [kernel.kallsyms] [k] irq_exit_rcu
Adding CHECK_FOR_INTERRUPTS() ensures that such queries respond to
statement_timeout & Ctrl-C signals. With the patch applied, the
above query will interrupt more quickly:
postgres=# select 1 where repeat('1.1',80000) %>> 'Lorem ipsum dolor sit amet';
ERROR: canceling statement due to statement timeout
Time: 1000.768 ms (00:01.001)
Please find the patch attached. The patch does not show any performance
regressions when run against the above use-case. Thanks to SQLSmith
for indirectly leading me to this scenario.
-
Robins Tharakan
Amazon Web Services