I have created a patch to enable the Boyer-More-Horspool search
algorithm (B-M-H) for LIKE queries.
B-M-H needs to initialize the skip table and keep it during SQL execution.
In this patch, flinfo->fn_extra is used to keep the skip table.
The conditions under which B-M-H can be used are as follows.
(1) B-M-H in LIKE search supports only single-byte character sets and UTF8.
Multibyte character sets does not support, because it may contain another
characters in the byte sequence. For UTF-8, it works fine, because in
UTF-8 the byte sequence of one character cannot contain another character.
(2) The pattern string should be stable parameter, because B-M-H needs to keep
the skip table generated from the pattern string during the execution of
the query.
(3) The pattern string should be at least 4 characters.
For example, '%AB%' can use B-M-H.
(4) The first and last character of the pattern string should be '%'.
(5) Characters other than the first and last of the pattern string
should not be '%', '_'. However, escaped characters such as
'\%', '\_' are available.
Also, this patch changes the collation validity check in functions
(such as textlike) to be performed at the first execution of the query,
instead of each function execution.
I have measured the performance with the following query.
---------- ---------- ---------- ---------- ---------- ---------- ----------
SET client_min_messages TO notice;
\timing
DO $$
DECLARE
cnt integer := 0;
total integer := 0;
BEGIN
FOR i IN 1..500 LOOP
select count(*) into cnt
from pg_catalog.pg_description d
where d.description like '%greater%';
total := total + cnt;
END LOOP;
RAISE NOTICE 'TOTAL: %', total;
END
$$
;
---------- ---------- ---------- ---------- ---------- ---------- ----------
Result
Without patch: 257.504ms
With patch: 191.638ms
Regards,
Atsushi Ogawa