Re: TODO item: Implement Boyer-Moore searching in LIKE queries - Mailing list pgsql-hackers

From Karan Sikka
Subject Re: TODO item: Implement Boyer-Moore searching in LIKE queries
Date
Msg-id CALkFZpeHnyBZWTWxC+ospSWLumOwV2-QeK=iE=X9zOySGiL=uA@mail.gmail.com
Whole thread Raw
In response to Re: TODO item: Implement Boyer-Moore searching in LIKE queries  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: TODO item: Implement Boyer-Moore searching in LIKE queries  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
<div dir="ltr">I agree, should we remove it from the TODO list?</div><div class="gmail_extra"><br /><div
class="gmail_quote">OnMon, Aug 1, 2016 at 6:13 PM, Robert Haas <span dir="ltr"><<a
href="mailto:robertmhaas@gmail.com"target="_blank">robertmhaas@gmail.com</a>></span> wrote:<br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Mon, Aug 1,
2016at 1:19 PM, Karan Sikka <<a href="mailto:karanssikka@gmail.com">karanssikka@gmail.com</a>> wrote:<br /> >
Followingthe patch to implement strpos with Boyer-Moore-Horspool,<br /> > it was proposed we bring BMH to LIKE as
well.<br/> ><br /> > Original thread:<br /> > <a
href="https://www.postgresql.org/message-id/flat/27645.1220635769%40sss.pgh.pa.us#27645.1220635769@sss.pgh.pa.us"
rel="noreferrer"
target="_blank">https://www.postgresql.org/message-id/flat/27645.1220635769%40sss.pgh.pa.us#27645.1220635769@sss.pgh.pa.us</a><br
/>><br /> > I'm a first time hacker and I found this task on the TODO list. It seemed<br /> > interesting and
isolatedenough. But after looking at the code in<br /> > like_match.c, it's clearly a non-trivial task.<br />
><br/> > How desirable is this feature? To begin answering that question,<br /> > I did some initial
benchmarkingwith an English text corpus to find how much<br /> > faster BMH (Boyer-Moore-Horspool) would be compared
toLIKE, and the results<br /> > were as follows: BMH can be up to 20% faster than LIKE, but for short<br /> >
substrings,it's roughly comparable or slower.<br /> ><br /> > Here are the results visualized:<br /> ><br />
><a href="http://ctrl-c.club/~ksikka/pg/like-strpos-short-1469975400.png" rel="noreferrer"
target="_blank">http://ctrl-c.club/~ksikka/pg/like-strpos-short-1469975400.png</a><br/> > <a
href="http://ctrl-c.club/~ksikka/pg/like-strpos-long-1469975400.png"rel="noreferrer"
target="_blank">http://ctrl-c.club/~ksikka/pg/like-strpos-long-1469975400.png</a><br/><br /></span>Based on these
results,this looks to me like a pretty unexciting<br /> thing upon which to spend time.<br /><span class="HOEnZb"><font
color="#888888"><br/> --<br /> Robert Haas<br /> EnterpriseDB: <a href="http://www.enterprisedb.com" rel="noreferrer"
target="_blank">http://www.enterprisedb.com</a><br/> The Enterprise PostgreSQL Company<br
/></font></span></blockquote></div><br/></div> 

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: PostmasterContext survives into parallel workers!?
Next
From: Alvaro Herrera
Date:
Subject: Re: New version numbering practices