Re: [PATCHES] TODO item: Implement Boyer-Moore searching (First time hacker) - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCHES] TODO item: Implement Boyer-Moore searching (First time hacker)
Date
Msg-id E737D69A2FDE4F79832CDAF8FC6C1DEC@amd64
Whole thread Raw
In response to Re: [PATCHES] TODO item: Implement Boyer-Moore searching (First time hacker)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCHES] TODO item: Implement Boyer-Moore searching (First time hacker)
List pgsql-hackers
Tom Lane Wrote:
> "David Rowley" <dgrowley@gmail.com> writes:
> > I've made the discussed changes. Also updated the benchmark results.
> > http://www.unixbeast.com/~fat/8.3_test_v1.3.xls

> Applied with revisions; mostly cosmetic except for one point.  I
> realized after studying the code a bit more that B-M cannot possibly win
> for a single-character pattern (needle), since the skip distance must
> always be 1 in that case.  The fact that it seemed to keep up at that
> length has to be because the original coding included a strncmp call
> inside the innermost loop, which likely prevents the compiler from
> optimizing that loop really tightly.  But the strncmp wasn't doing
> anything anyway for the case of pattern length = 1. So what I committed
> special-cases pattern length 1 to be a naive search with a *very* tight
> inner loop.  I think it's worth troubling over this case because a
> common usage is split_to_array and suchlike with single-character
> delimiters.

I had thought about this, but failed to think about string_to_array.
Probably worth the extra code for that.

Well simple enough patch. I've learned quite a bit about the postgresql
source even just for working in varlena.c. I've got a very long way to go
though.

Thanks for all the reviews and suggestions.

David.






pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Noisy CVS updates
Next
From: Simon Riggs
Date:
Subject: Re: Withdraw PL/Proxy from commitfest