Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)
Date
Msg-id AANLkTikZsCsFnzjzhggbV9brfPrn+nYnZo1D=2i3VktH@mail.gmail.com
In response to Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: levenshtein_less_equal (was: multibyte charater set in levenshtein function)
List pgsql-hackers
2010/10/4 Alexander Korotkov <aekorotkov@gmail.com>:
> I've reworked the patch following your suggestion. This version is
> slightly slower than the previous one:
> SELECT * FROM words WHERE levenshtein_less_equal(a, 'extensize', 2) <= 2;
> 48,069 ms => 57,875 ms
> SELECT * FROM words2 WHERE levenshtein_less_equal(a, 'клубничный', 3) <= 2;
> 100,073 ms => 113,975 ms
> select * from phrases where levenshtein_less_equal('kkkknucklehead
> courtliest   sapphires be coniferous emolument antarctic Laocoon''s deadens
> unseemly', a, 10) <= 10;
> 22,876 ms => 24,721 ms
> test=# select * from phrases2 where levenshtein_less_equal('таяй
> раскупорившийся передислоцируется юлианович праздничный лачужка присыхать
> опппливший ффехтовальный уууудобряющий', a, 10) <= 10;
> 55,405 ms => 57,760 ms
> I think it is caused by the multiplication performed on each bound
> movement. This slowdown is probably negligible, or there may be some
> way to recover the previous performance.
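
For readers following the thread, the general idea behind a bounded
Levenshtein function can be sketched as below. This is an illustrative
Python sketch only, not the C code from the patch: the name
levenshtein_less_equal is borrowed from the SQL function discussed here,
while the parameter names and the early-exit strategy are assumptions.
It computes full rows and gives up once a whole row exceeds the limit,
which is simpler than the moving bounds described above.

```python
def levenshtein_less_equal(s, t, max_d):
    """Return the edit distance if it is <= max_d, otherwise max_d + 1.

    Hypothetical sketch: full-row dynamic programming with an early exit,
    not the band-limited approach used in the patch.
    """
    # The length difference is a lower bound on the edit distance.
    if abs(len(s) - len(t)) > max_d:
        return max_d + 1

    # prev[j] holds the distance between s[:i-1] and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i] + [0] * len(t)
        for j, ct in enumerate(t, start=1):
            curr[j] = min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (cs != ct),  # substitution or match
            )
        # Every alignment passes through this row, so if all cells exceed
        # max_d the final distance must exceed it too.
        if min(curr) > max_d:
            return max_d + 1
        prev = curr

    return min(prev[-1], max_d + 1)
```

Because Python strings are sequences of code points, the same function
handles multibyte text without extra work; the C implementation has to
deal with variable-width characters explicitly.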

This patch doesn't apply cleanly.  It also seems to revert some recent
commits to fuzzystrmatch.c.  Can you please send a corrected version?

[rhaas pgsql]$ patch -p1 < ~/Downloads/levenshtein_less_equal-0.3.patch
patching file contrib/fuzzystrmatch/fuzzystrmatch.c
Reversed (or previously applied) patch detected!  Assume -R? [n]
Apply anyway? [n] y
Hunk #1 FAILED at 5.
Hunk #8 FAILED at 317.
Hunk #9 succeeded at 543 (offset 10 lines).
Hunk #10 succeeded at 567 (offset 10 lines).
Hunk #11 succeeded at 578 (offset 10 lines).
2 out of 11 hunks FAILED -- saving rejects to file
contrib/fuzzystrmatch/fuzzystrmatch.c.rej
patching file contrib/fuzzystrmatch/fuzzystrmatch.sql.in
patching file doc/src/sgml/fuzzystrmatch.sgml

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

