Tom Lane writes:
> I wonder why you didn't propose Boyer-Moore instead, as that would have
> some advantage for natural language text as well.
> The difficulty with B-M is the need for a table indexed by character
> code, which at first glance looks impractical for wchars. But it seems
> to me that we could use "wchar % 256" as the table index, meaning that
> wchars with the same trailing byte share the same table entry. That
> would lose some efficiency compared to an exact implementation, but the
> limited table size would outweigh that except in the most pathological
> cases.
hash table?
The main difficulty with BM is verification and understanding "good
suffix shift" (the second part of BM) (I don't understand it entirely).
----
Ajtkulov Pavel
ajtkulov@acm.org