Re: strpos() && KMP - Mailing list pgsql-patches

From Pavel Ajtkulov
Subject Re: strpos() && KMP
Date
Msg-id 656957725.20070808235113@acm.org
Whole thread Raw
In response to Re: strpos() && KMP  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: strpos() && KMP  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane writes:

> I wonder why you didn't propose Boyer-Moore instead, as that would have
> some advantage for natural language text as well.

> The difficulty with B-M is the need for a table indexed by character
> code, which at first glance looks impractical for wchars.  But it seems
> to me that we could use "wchar % 256" as the table index, meaning that
> wchars with the same trailing byte share the same table entry.  That
> would lose some efficiency compared to an exact implementation, but the
> limited table size would outweigh that except in the most pathological
> cases.

hash table?

The main difficulty with BM is verification and understanding "good
suffix shift" (the second part of BM) (I don't understand it entirely).

----
Ajtkulov Pavel
ajtkulov@acm.org



pgsql-patches by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Default location for docbook stylesheets in Gentoo
Next
From: "Brendan Jurd"
Date:
Subject: Re: Default location for docbook stylesheets in Gentoo