Ühel kenal päeval, R, 2006-05-19 kell 11:20, kirjutas Jim C. Nasby:
> On Thu, May 18, 2006 at 06:49:38PM -0700, Mark Dilger wrote:
> > > I would think that the worst-case times would be fairly improbable.
> > > I'm disinclined to push something as complicated as Boyer-Moore matching
> > > into this function without considerable evidence that it's a performance
> > > bottleneck for real applications.
> >
> > A common approach in biological data applications is to store nucleic and amino
> > acid sequences as text in a relational database. The smaller alphabet sizes and
> > the tendency for redundancy in these sequences increases the likelihood of a
> > performance problem. I have solved this problem by writing my own data types
> > with their own functions for sequence comparison and alignment, and I used
> > boyer-moore for some of that work. Whether the same technique should be used
> > for the text and varchar types was unclear to me, hence the question.
>
> Perhaps it would be best to add a seperate set of functions that use
> boyer-moore, and reference them in appropriate places in the
> documentation. Unless someone has a better idea on how we can find out
> what people are actually doing in the field...
I guess our regex implementation already uses boyer-moore or similar.
Why not just expose the match position of substring('text' in 'regex')
using some function, called match_position(int searched_text, int
regex, int matchnum) ?
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia
Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com