> Hi, everyone. I'm working on a project on PostgreSQL 9.0 (soon to be
> upgraded to 9.1, given that we haven't yet launched). The project will
> involve numerous text fields containing English, Spanish, and Portuguese.
> Some of those text fields will be searchable by the user. That's easy
> enough to do; for our purposes, I was planning to use some combination of
> LIKE searches; the database is small enough that this doesn't take very
> much time, and we don't expect the number of searchable records (or
> columns within those records) to be all that large. The thing is, the
> people running the site want searches to work on what I'm calling (for
> lack of a better term) "bare" letters. That is, if the user searches for
> "n", then the search should also match Spanish words containing "ñ". I'm
> told by Spanish-speaking members of the team that this is how they would
> expect searches to work. However, when I just did a quick test using a
> UTF-8 encoded 9.0 database, I found that PostgreSQL didn't see the two
> characters as identical. (I must say, this is the behavior that I would
> have expected, had the Spanish-speaking team member not said anything on
> the subject.) So my question is whether I can somehow wrangle PostgreSQL
> into thinking that "n" and "ñ" are the same character for search purposes,
> or if I need to do something else -- use regexps, keep a "naked,"
> searchable version of each column alongside the native one, or something
> else entirely -- to get this to work. Any ideas?
> Thanks,
> Reuven
What kind of "client" are the users using? I assume you will have some kind
of user interface. To me this is a typical job for the user interface. The
number of letters with "equivalents" in different languages is extremely
limited, so a simple matching routine in the user interface should give you a
way to issue the proper query.
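
A rough sketch of such a matching routine, in Python purely for illustration.
The EQUIVALENTS table and the bare_pattern name are made up for this example,
and the list of letters is an assumption that is probably incomplete. The idea
is to turn each "bare" letter of the search term into a character class that
also accepts its accented forms:

```python
import re

# Hypothetical mapping of bare letters to the variants a search should accept.
# Limited to Spanish and Portuguese; an assumption, not a complete list.
EQUIVALENTS = {
    "a": "aáàâã",
    "c": "cç",
    "e": "eéê",
    "i": "ií",
    "n": "nñ",
    "o": "oóôõ",
    "u": "uúü",
}

def bare_pattern(term: str) -> str:
    """Build a regex in which each bare letter also matches its accented forms."""
    parts = []
    for ch in term.lower():
        variants = EQUIVALENTS.get(ch)
        if variants:
            # Bare letter: accept the letter itself or any listed variant.
            parts.append("[" + variants + "]")
        else:
            # Anything else is matched literally.
            parts.append(re.escape(ch))
    return "".join(parts)

pattern = bare_pattern("nino")
print(pattern)
print(bool(re.search(pattern, "el niño")))
```

The resulting pattern could then be passed as a query parameter to a
regular-expression match (the ~ operator in PostgreSQL) instead of LIKE. Note
that re.escape follows Python's regex rules, so the escaping would need to be
checked against the server's regex syntax before relying on it.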
Uwe