Thread: Searching with and without diacritical marks
Hello,

I have a multilingual portal running on PostgreSQL 7.4.2. My clients come mainly from Spain, Portugal, Latin America and Germany. The main feature of the site is a search engine that retrieves bibliographic data, which is stored in my database (Unicode!) with diacritical marks (e.g. Panamá, América). When users enter their search terms with diacritical marks, Postgres finds the requested records, but if a German user enters Panama or America (without diacritical marks), the search fails.

Is there any extension for Postgres that allows for both modes of searching on the same data?

Yours,
Ralf Ullrich

______________________
Ralf Ullrich, M.A.
Virtuelle Fachbibliothek
Ibero-Amerikanisches Institut
Preußischer Kulturbesitz
Potsdamer Str. 37
D-10785 Berlin
Germany
Phone: +49 +30 266-2512
Fax: +49 +30 266-2503
E-Mail: ullrich@iai.spk-berlin.de
http://www.iai.spk-berlin.de
______________________
Automatically, I don't believe so. You can modify your queries, though:

    IF default search returns 0 records THEN
        SET CLIENT_ENCODING TO 'some_other_encoding';
        SELECT ...;
    END IF;

Or maybe check the search term for those special characters before issuing the SELECT statement.

HTH.

On Tue, Jul 13, 2004 at 02:20:48PM +0200, Ullrich Ralf wrote:
> [...] when users enter their search terms with diacritical marks, Postgres
> finds the requested records, but if a German user enters Panama or America
> (without diacritical marks), the search fails.
> Is there any extension for Postgres that allows for both modes of searching
> on the same data?
on 7/13/04 8:20 AM, Ullrich Ralf at Ullrich@iai.spk-berlin.de wrote:
> [...] when users enter their search terms with diacritical marks, Postgres
> finds the requested records, but if a German user enters Panama or America
> (without diacritical marks), the search fails.
> Is there any extension for Postgres that allows for both modes of searching
> on the same data?

I've been wrestling with this issue too. I'm working on an art gallery database which includes work from a number of French-Canadian artists, plus a few from other countries where names and image titles typically involve accents as well.

What I've been tentatively planning to do is to handle it in the PHP frontend rather than in the database itself, by setting up a function around strtr() (string translate) that strips out the accents while searching, so that results come up regardless of whether users entered the right accent, the wrong accent or no accent at all. The strtr() function lets you specify a number of pairs of strings to translate, so I could make up a list of all the commonly used accented characters and have it translate all search text with those.

I'd apply it to both the search terms entered and the text found, so that any "a" would match any other "a", regardless of whether it was really an à, á, ä, â or just plain a (let's see if those accents show up in anyone's e-mail...). It's much the same way I'm handling case sensitivity now.

Lynna

--
Resource Centre Database Coordinator
Gallery 44: www.gallery44.org
Database Project: www.gallery44db.org
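The translate-both-sides idea described above can be sketched as follows. This is only a minimal illustration, written in Python rather than PHP, with str.translate() standing in for strtr(). The names ACCENT_MAP, fold and matches are hypothetical, the character list is an assumed and incomplete sample, and it assumes the text is already available as Unicode strings.

```python
import unicodedata

# Hypothetical, incomplete list of pairs, in the spirit of the pairs one
# would hand to PHP's strtr(). Only lower-case forms are needed because
# fold() case-folds the text before translating it.
ACCENT_MAP = str.maketrans({
    "à": "a", "á": "a", "ä": "a", "â": "a", "ã": "a",
    "è": "e", "é": "e", "ë": "e", "ê": "e",
    "ì": "i", "í": "i", "ï": "i", "î": "i",
    "ò": "o", "ó": "o", "ö": "o", "ô": "o", "õ": "o",
    "ù": "u", "ú": "u", "ü": "u", "û": "u",
    "ñ": "n", "ç": "c",
})


def fold(text: str) -> str:
    """Return text with case and the listed accents removed."""
    # Compose first, so "a" + combining acute becomes the single
    # character "á" and is found in the table.
    composed = unicodedata.normalize("NFC", text)
    return composed.casefold().translate(ACCENT_MAP)


def matches(term: str, stored: str) -> bool:
    """Substring match after folding BOTH the search term and the stored text."""
    return fold(term) in fold(stored)


if __name__ == "__main__":
    print(fold("Panamá"))
    print(matches("America", "Historia de América Latina"))
```

Folding the stored text on every search becomes expensive as the data grows, so one option would be to keep a pre-folded copy of each searchable field next to the original and compare the folded search term against that copy.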