Re: String searching - Mailing list pgsql-general

From Andy Colson
Subject Re: String searching
Date
Msg-id 546B5995.1030907@squeakycode.net
Whole thread Raw
In response to Re: String searching  (Jonathan Vanasco <postgres@2xlp.com>)
List pgsql-general
On 11/17/2014 7:54 PM, Jonathan Vanasco wrote:
>
> On Nov 17, 2014, at 12:55 PM, Robert DiFalco wrote:
>
>>      SELECT * FROM MyTable WHERE upper(FullName) LIKE upper('%John%');
>>
>> That said, which would be the best extension module to use? A "gist" index on the uppercased column? Or something
else?Thanks! 
>
> Performance wise, I think a function index would probably be the best:
>
>     CREATE INDEX mytable_lower_fullname_idx ON mytable(lower(fullname));
>
>     SELECT * FROM mytable WHERE lower(fullname) LIKE lower('%john%');
>
> The only reason why I use `lower` and not `upper` is that it's easier to look at when dealing with debugging and
samplequeries. 
>
> I'd bench against GIN and GIST, but I think this will work the best.
>
> The reason is that GIN/GIST use language patterns to simplify the index.  so they work great on "words"
>
>     select plainto_tsquery('doing watching reading programming');
>     'watch' & 'read' & 'program'
>
> but not so great on "names":
>
>     select plainto_tsquery('john doe');
>      'john' & 'doe'
>
>     select plainto_tsquery('jon doe');
>      'jon' & 'doe
>
> So you'll get a bit more overhead on the match and you won't get a smaller index (which is why they're great for
fulltext)
>
> The search execution might turn out to be much faster.  If so, i'd love to know.  But doing a lower() search on a
lower()function index has always been ridiculously fast for me. 
>
> This only goes for names though.  If you're searching other fields, then another search method might be considerably
better.
>

Full Text Search has another awesome benefit.  Aliases.  Bob == Robert.
  I do address searches, and I've created a custom dictionary that says
st == street, n == north, etc.

So when a person searches for 1st ne, they find all combinations of
1 street north east.

Its indexes, so its fast.  Ram and disk are cheap, who cares how big it is.

-Andy


pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: About the tps explanation of pgbench, please help
Next
From: Jonathan Vanasco
Date:
Subject: Re: String searching