unaccent - Mailing list pgsql-hackers

From nngodinh@tiscali.it
Subject unaccent
Date
Msg-id 3D6DC6360001FB32@mail-1.tiscalinet.it
Responses Re: unaccent  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-hackers
Greetings,

Since I use the txtidx data structure together with GiST indexing to build
a word index of a very large Unicode database, I've implemented a PostgreSQL
function that uses libunac to unaccent TEXT fields.

The resulting text is in UTF-8, but you can change this in the sources by
setting an appropriate value (any iconv charset name).
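
To illustrate what "unaccenting" UTF-8 text means, here is a minimal,
self-contained sketch. It is not the code from the attachment and not how
libunac works internally: the function name toy_unaccent_utf8 and the lookup
table are hypothetical, and it only handles the accented letters in
U+00C0..U+00FF, whereas libunac covers Unicode much more broadly.

```c
#include <stddef.h>
#include <string.h>

/* Base letters for U+00C0..U+00FF; '.' means "leave the character as is".
 * Hypothetical table for illustration only (no AE/ss expansions). */
static const char LATIN1_BASE[] =
    "AAAAAA.CEEEEIIII.NOOOOO..UUUUY..aaaaaa.ceeeeiiii.nooooo..uuuuy.y";

/*
 * Copy the NUL-terminated UTF-8 string `in` to `out`, replacing accented
 * letters from U+00C0..U+00FF with their unaccented base letter.
 * Returns the output length in bytes, or (size_t)-1 if `out`
 * (of `out_size` bytes) is too small.
 */
size_t toy_unaccent_utf8(const char *in, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *) in;
    size_t n = 0;

    if (out_size == 0)
        return (size_t) -1;

    while (*p != '\0') {
        /* Each step writes one byte; keep room for the final NUL. */
        if (n + 1 >= out_size)
            return (size_t) -1;

        /* U+00C0..U+00FF are encoded in UTF-8 as 0xC3 0x80..0xBF. */
        if (p[0] == 0xC3 && p[1] >= 0x80 && p[1] <= 0xBF
            && LATIN1_BASE[p[1] - 0x80] != '.') {
            out[n++] = LATIN1_BASE[p[1] - 0x80];
            p += 2;
        } else {
            out[n++] = (char) *p++;   /* copy every other byte unchanged */
        }
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, the UTF-8 string "été" comes out as "ete", while characters
without an entry in the table (for example "ß") are copied through unchanged.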

Get libunac from: http://www.nongnu.org/unac/ (it uses iconv)

Extract the archive and compile it (make), then move pg_unac.so to your
PostgreSQL shared libraries directory.

Register it in PostgreSQL:

CREATE FUNCTION unac(TEXT) RETURNS TEXT AS 'path_to_pg_unac.so' LANGUAGE C;

What about integrating an unaccent library directly into tsearch? It would be
useful for French search engines, for instance.

Bye.

Nhan NGO DINH



Attachment
