Thread: Searching accented words
Hi! I have, in the same column, accented words and not. But I don´t want to worry about it. Imagine the table Person: CREATE TABLE PERSON (name TEXT) INSERT INTO PERSON VALUES ('José') INSERT INTO PERSON VALUES ('Jose') The following statement SELECT * FROM PERSON WHERE NAME like 'José' would return only the first row, because 'José' is an accented word. How can I perform a query that return the two rows, no matter I pass 'José' or 'Jose' as parameter? Thanks, JP _______________________________________________________________________ Yahoo! Encontros O lugar certo para encontrar a sua alma gêmea. http://br.encontros.yahoo.com/
=?iso-8859-1?q?Jo=E3o=20Paulo=20Batistella?= <batistellabr@yahoo.com.br> writes: > How can I perform a query that return the two rows, no > matter I pass 'Jos�' or 'Jose' as parameter? If your locale is set up correctly, perhaps upper() would return JOSE for both, and then you could search on upper(name) = 'JOSE'. (You can make this fast with an index on upper(name).) regards, tom lane
On Wed, 2002-07-24 at 21:20, João Paulo Batistella wrote: SELECT * FROM PERSON WHERE NAME ilike 'José' make sure your encoding is set to LATIN9 Cheers Tony Grant -- RedHat Linux on Sony Vaio C1XD/S http://www.animaproductions.com/linux2.html Macromedia UltraDev with PostgreSQL http://www.animaproductions.com/ultra.html
On Wed, Jul 24, 2002 at 04:49:09PM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote a message of 14 lines which said: > If your locale is set up correctly, perhaps upper() would return > JOSE for both, I would not say it is set up correctly in that case! In French, upper('Stéphane') is 'STÉPHANE', not 'STEPHANE'.
On Wed, Jul 24, 2002 at 04:20:10PM -0300, João Paulo Batistella <batistellabr@yahoo.com.br> wrote a message of 29 lines which said: > How can I perform a query that return the two rows, no > matter I pass 'José' or 'Jose' as parameter? A la Altavista, uh? I believe you have to write your own function. Case-folding fuzzy matching rules are quite dependent on the language (not on the script, we both use Latin-1, on the language).
João Paulo Batistella dijo: > Hi! > > I have, in the same column, accented words and not. > But I don´t want to worry about it. > > Imagine the table Person: > CREATE TABLE PERSON (name TEXT) > > INSERT INTO PERSON VALUES ('José') > INSERT INTO PERSON VALUES ('Jose') > > The following statement > SELECT * FROM PERSON WHERE NAME like 'José' > would return only the first row, because 'José' is an > accented word. I think you have two ways of solving this: 1. using regular expressions with character classes where an accented letter is found: SELECT * FROM PERSON WHERE name ~* '^Jos[eé]$' (note the anchoring to make it equivalent to the absence of % in LIKE) 2. using a function to convert the accented letters in strings. Then use it like SELECT * FROM PERSON WHERE drop_accents(name) LIKE drop_accents('José') -- Alvaro Herrera (<alvherre[a]atentus.com>) "El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)