Hello Hackers!
I have extended my patch by introducing levenshtein_less_equal function. This function have additional argument max_d and stops calculating when distance exceeds max_d. With low values of max_d function works much faster than original one.
The example of original levenshtein function usage:
test=# select word, levenshtein(word, 'consistent') as dist from words where levenshtein(word, 'consistent') <= 2 order by dist;
word | dist
-------------+------
consistent | 0
insistent | 2
consistency | 2
coexistent | 2
consistence | 2
(5 rows)
test=# explain analyze select word, levenshtein(word, 'consistent') as dist from words where levenshtein(word, 'consistent') <= 2 order by dist;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Sort (cost=2779.13..2830.38 rows=20502 width=8) (actual time=203.652..203.658 rows=5 loops=1)
Sort Key: (levenshtein(word, 'consistent'::text))
Sort Method: quicksort Memory: 25kB
-> Seq Scan on words (cost=0.00..1310.83 rows=20502 width=8) (actual time=19.019..203.601 rows=5 loops=1)
Filter: (levenshtein(word, 'consistent'::text) <= 2)
Total runtime: 203.723 ms
(6 rows)
Example of levenshtein_less_equal usage in this case:
test=# select word, levenshtein_less_equal(word, 'consistent', 2) as dist from words where levenshtein_less_equal(word, 'consistent', 2) <= 2 order by dist;
word | dist
-------------+------
consistent | 0
insistent | 2
consistency | 2
coexistent | 2
consistence | 2
test=# explain analyze select word, levenshtein_less_equal(word, 'consistent', 2) as dist from words where levenshtein_less_equal(word, 'consistent', 2) <= 2 order by dist;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Sort (cost=2779.13..2830.38 rows=20502 width=8) (actual time=42.198..42.203 rows=5 loops=1)
Sort Key: (levenshtein_less_equal(word, 'consistent'::text, 2))
Sort Method: quicksort Memory: 25kB
-> Seq Scan on words (cost=0.00..1310.83 rows=20502 width=8) (actual time=5.391..42.143 rows=5 loops=1)
Filter: (levenshtein_less_equal(word, 'consistent'::text, 2) <= 2)
Total runtime: 42.292 ms
(6 rows)
In the example above levenshtein_less_equal works about 5 times faster.
With best regards,
Alexander Korotkov.