Re: Behavior of a pg_trgm index for 2 (or < 3) character LIKE queries - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Behavior of a pg_trgm index for 2 (or < 3) character LIKE queries
Date
Msg-id CAPpHfdtFuaootX48f1kbLeMcuYOo16JBpO94agKCqF8VJMngLA@mail.gmail.com
Whole thread Raw
In response to Re: Behavior of a pg_trgm index for 2 (or < 3) character LIKE queries  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Fri, May 31, 2013 at 9:41 PM, Amit Langote <amitlangote09@gmail.com> wrote:
On Sat, Jun 1, 2013 at 1:48 AM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> On Fri, May 31, 2013 at 11:16 AM, Amit Langote <amitlangote09@gmail.com> wrote:
>> 2) And if that is so, is there problem in gin_extract_query_trgm(),
>> that is while generating trigrams from a query search term that causes
>> trigrams (stored in the index if answer to (1) is yes) NOT to be used
>> in such a partial matching case?
>
> it means that we can't use trigrams in case of partial matching
> because trigrams (stored in index) are converted to different
> value(CRC).
> right?
>

When I debugged a partial match case such as  " column like '%st%'
", it appears that get_wildcard_trigrams return no trigrams for
wildcard part 'st' since charlen < 3. Hence, GIN_SEARCH_MODE_ALL mode
is used and results in  full index scan instead of trigrams being
used. This happens for multibyte case too. The problem is that for
wildcard part consisting of less than 3 characters,
get_wildcard_trigrams return nothing.

Partial match for LIKE queries is not implemented at all. With current index structure it could be possible to implement it with guarantee that all indexed strings don't contain multibyte characters (i.e. single-byte encoding). 
Also, at it was mentioned, it's possible to implement operator class with text storage type which would support partial match in all the cases. However, I doubt it's reasonable to implement it based on pg_trgm, because of many hard-wired assumptions that trigram is fixed length and compatibility.

------
With best regards,
Alexander Korotkov.

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: detecting binary backup in progress
Next
From: Josh Berkus
Date:
Subject: Re: removing PD_ALL_VISIBLE