On Tue, Oct 09, 2012 at 02:10:26PM +0200, Willy-Bas Loos wrote:
> Hi,
>
> I need a *language unaware* text comparison algorithm
[. . .]
> (i want to use it for *"did you mean ...?"* for approx 6-10 character codes
> or 8-20 letter words of mixed languages)
I don't think this is going to do what you want, at least from the
user's point of view.
The character codes case probably would work in a language-unaware
way.
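To make the character-codes case concrete, here is a minimal sketch of
one language-unaware approach: plain Levenshtein edit distance computed
over Unicode code points, with no collation or linguistic rules. The
function names, the max_distance threshold, and the sample codes are
illustrative assumptions, not something from the original question.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, compared code point by code point."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(
                prev[j] + 1,         # delete ca
                cur[j - 1] + 1,      # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        prev = cur
    return prev[-1]


def did_you_mean(query, known_codes, max_distance=2):
    """Return known codes within max_distance edits of query, closest first.

    max_distance=2 is an assumed threshold; tune it for the code length.
    """
    scored = sorted((edit_distance(query, code), code) for code in known_codes)
    return [code for dist, code in scored if dist <= max_distance]


if __name__ == "__main__":
    # Hypothetical codes, for illustration only.
    print(did_you_mean("AB12CD", ["AB12CE", "ZZ99ZZ", "AB12CD"]))
```

This treats every code point as an opaque symbol, which is reasonable
for short codes but ignores things like combining characters and
case folding.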
But for the mixed languages case, surely it's not _any_ mixed
language? Are you mixing Arabic, Farsi, Chinese, and Hindi, for
instance?
If not, then you're not really language-unaware, but instead
constrained to a known subset of languages. That is a more tractable
problem (for instance, you may not have to worry about direction
changes, which vastly simplifies the problem).
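As a rough way to see whether direction changes are a concern for a
given data set, the following sketch uses the bidirectional class that
Python's standard unicodedata module reports for each character. The
helper names and the choice to look only at the strong classes (L, R,
AL) are assumptions made for this illustration; it is not a full
implementation of the Unicode bidirectional algorithm.

```python
import unicodedata

# Strong bidirectional classes: "L" is left-to-right; "R" and "AL"
# (Arabic letter) are right-to-left. Weak and neutral classes such as
# digits, spaces, and punctuation are deliberately ignored here.
LTR_CLASSES = {"L"}
RTL_CLASSES = {"R", "AL"}


def directions(text: str) -> set:
    """Return the set of strong directions ("ltr", "rtl") present in text."""
    found = set()
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in LTR_CLASSES:
            found.add("ltr")
        elif bidi in RTL_CLASSES:
            found.add("rtl")
    return found


def has_direction_change(text: str) -> bool:
    """True if text mixes left-to-right and right-to-left characters."""
    return len(directions(text)) > 1


if __name__ == "__main__":
    for sample in ["hello", "\u0633\u0644\u0627\u0645", "abc \u0633\u0644\u0627\u0645"]:
        print(repr(sample), sorted(directions(sample)), has_direction_change(sample))
```

If no word in the data ever reports a direction change, that is one
piece of evidence that the simpler, constrained problem applies.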
Best,
A
--
Andrew Sullivan
ajs@crankycanuck.ca