Re: Searching for "bare" letters - Mailing list pgsql-general

From Uwe Schroeder
Subject Re: Searching for "bare" letters
Msg-id 201110011920.08784.uwe@oss4u.com
In response to Searching for "bare" letters  ("Reuven M. Lerner" <reuven@lerner.co.il>)
Responses Re: Searching for "bare" letters  ("Reuven M. Lerner" <reuven@lerner.co.il>)
List pgsql-general

> Hi, everyone.  I'm working on a project on PostgreSQL 9.0 (soon to be
> upgraded to 9.1, given that we haven't yet launched).  The project will
> involve numerous text fields containing English, Spanish, and Portuguese.
> Some of those text fields will be searchable by the user.  That's easy
> enough to do; for our purposes, I was planning to use some combination of
> LIKE searches; the database is small enough that this doesn't take very
> much time, and we don't expect the number of searchable records (or
> columns within those records) to be all that large. The thing is, the
> people running the site want searches to work on what I'm calling (for
> lack of a better term) "bare" letters.  That is, if the user searches for
> "n", then the search should also match Spanish words containing "ñ".  I'm
> told by Spanish-speaking members of the team that this is how they would
> expect searches to work.  However, when I just did a quick test using a
> UTF-8 encoded 9.0 database, I found that PostgreSQL didn't see the two
> characters as identical.  (I must say, this is the behavior that I would
> have expected, had the Spanish-speaking team member not said anything on
> the subject.) So my question is whether I can somehow wrangle PostgreSQL
> into thinking that "n" and "ñ" are the same character for search purposes,
> or if I need to do something else -- use regexps, keep a "naked,"
> searchable version of each column alongside the native one, or something
> else entirely -- to get this to work. Any ideas?
> Thanks,
> Reuven



What kind of "client" are the users using?  I assume you will have some kind
of user interface. To me this is a typical job for the user interface. The
number of letters with "equivalents" in different languages is extremely
limited, so a simple matching routine in the user interface should give you a
way to issue the proper query.
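
As a rough illustration of such a matching routine, here is a minimal Python
sketch. The EQUIVALENTS table and the bare_pattern name are hypothetical and
cover only a handful of Spanish and Portuguese letters; the idea is to expand
each "bare" letter of the search term into a bracket expression, so the
resulting string could be bound as a parameter to a regular-expression match
(for example the ~* operator) rather than a plain LIKE.

```python
import re

# Hypothetical equivalence table: bare letter -> accented variants.
# Assumption: only common Spanish/Portuguese letters are needed.
EQUIVALENTS = {
    "a": "áàâã",
    "c": "ç",
    "e": "éê",
    "i": "í",
    "n": "ñ",
    "o": "óôõ",
    "u": "úü",
}


def bare_pattern(term: str) -> str:
    """Build a regex in which each bare letter also matches its accented forms."""
    parts = []
    for ch in term.lower():
        variants = EQUIVALENTS.get(ch)
        if variants:
            # A bracket expression matches the bare letter or any variant.
            parts.append("[" + ch + variants + "]")
        else:
            # Escape everything else so user input cannot inject regex syntax.
            parts.append(re.escape(ch))
    return "".join(parts)


print(bare_pattern("nino"))  # [nñ][ií][nñ][oóôõ]
```

Note that this only widens bare letters: a user who types "ñ" still matches
only "ñ". Whether that is the desired behaviour is a decision for the site.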

Uwe
