Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker) - Mailing list pgsql-patches

From David Rowley
Subject Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)
Msg-id C848A7ECB1E346388B9BD7F7966C0DDC@amd64
In response to Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Heikki Linnakangas wrote:
> The skip table really should be constructed only once in
> text_position_start and stored in TextPositionState. That would make a
> big difference to the performance of those functions that call
> text_position_next repeatedly: replace_text, split_text and text_to_array.

I wrote:
> Of course you are right. That will help for replace and the like. I'll
> update the patch tonight.

I've made and attached the changes Heikki recommended.
I've also updated the benchmark spreadsheet, which is available here:
http://www.unixbeast.com/~fat/8.3_test_v1.2.xls

Previously there was an error with "Test 6", the test that benchmarked
replace(). I kept that test so as not to affect the summary result in the new
sheet, and added the sheet "Replace Test" to show more accurate results.
I had failed to notice that the optimizer was helping me out more than I
wanted it to.

My replace() test script runs in 91% of the time taken by the 8.3 version.
I've not tested against CVS head.

Now that the skip table is a member of TextPositionState, I was not quite
sure if I should #define a size for it. It would certainly look neater, only
the code that defines the skip table size in text_position_start assumes 256
elements.
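
To make the question concrete, here is a rough sketch of the idea. It is
not the attached patch: SKIP_TABLE_SIZE, SearchState, search_start and
search_next are made-up names for illustration only, and it uses the
simpler Horspool variant of Boyer-Moore on plain bytes. The point is only
that the table is built once in the start function, kept in the state
struct, and reused by every later call, with the size named in one place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: one skip entry per possible byte value. */
#define SKIP_TABLE_SIZE 256

/* Illustrative stand-in for a search state; not the real TextPositionState. */
typedef struct
{
    const unsigned char *haystack;
    size_t      haystack_len;
    const unsigned char *needle;
    size_t      needle_len;
    size_t      skiptable[SKIP_TABLE_SIZE];
} SearchState;

/* Build the skip table once, so repeated searches can reuse it. */
static void
search_start(SearchState *state,
             const char *haystack, size_t haystack_len,
             const char *needle, size_t needle_len)
{
    size_t      i;

    state->haystack = (const unsigned char *) haystack;
    state->haystack_len = haystack_len;
    state->needle = (const unsigned char *) needle;
    state->needle_len = needle_len;

    /* Bytes that do not occur in the needle allow a full-length skip. */
    for (i = 0; i < SKIP_TABLE_SIZE; i++)
        state->skiptable[i] = needle_len;

    /* Every needle byte except the last: distance to the needle's end. */
    for (i = 0; i + 1 < needle_len; i++)
        state->skiptable[state->needle[i]] = needle_len - 1 - i;
}

/*
 * Return the 0-based offset of the first match at or after 'start',
 * or -1 if there is none.  No table setup happens here.
 */
static long
search_next(const SearchState *state, size_t start)
{
    size_t      pos = start;

    if (state->needle_len == 0 || state->needle_len > state->haystack_len)
        return -1;

    while (pos + state->needle_len <= state->haystack_len)
    {
        if (memcmp(state->haystack + pos, state->needle,
                   state->needle_len) == 0)
            return (long) pos;

        /* Shift by the skip value of the last byte in the current window. */
        pos += state->skiptable[state->haystack[pos + state->needle_len - 1]];
    }
    return -1;
}
```

A caller such as a replace loop would call search_start once and then
search_next repeatedly with an advancing start offset. With the size behind
a single #define, the struct member and the initialisation loop cannot
drift apart.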

Any thoughts on this?

David.

Attachment
