Re: Full text index without accents - Mailing list pgsql-general
From | Jonathan Bond-Caron |
---|---|
Subject | Re: Full text index without accents |
Date | |
Msg-id | 005d01c8ed9d$138fd1e0$3aaf75a0$@com Whole thread Raw |
In response to | Re: Full text index without accents ("Fco. Mario Barcala" Rodríguez<lbarcala@freeresearch.org>) |
Responses |
Re: Full text index without accents
|
List | pgsql-general |
This would probably help: CREATE OR REPLACE FUNCTION norm_text_latin(character varying) RETURNS character varying AS $BODY$ declare p_str alias for $1; v_str varchar; begin select translate(p_str, 'ÀÁÂÃÄÅ', 'AAAAAA') into v_str; select translate(v_str, 'ÉÈËÊ', 'EEEE') into v_str; select translate(v_str, 'ÌÍÎÏ', 'IIII') into v_str; select translate(v_str, 'ÌÍÎÏ', 'IIII') into v_str; select translate(v_str, 'ÒÓÔÕÖ', 'OOOOO') into v_str; select translate(v_str, 'ÙÚÛÜ', 'UUUU') into v_str; select translate(v_str, 'àáâãäå', 'aaaaaa') into v_str; select translate(v_str, 'èéêë', 'eeee') into v_str; select translate(v_str, 'ìíîï', 'iiii') into v_str; select translate(v_str, 'òóôõö', 'ooooo') into v_str; select translate(v_str, 'ùúûü', 'uuuu') into v_str; select translate(v_str, 'Çç', 'Cc') into v_str; return v_str; end;$BODY$ LANGUAGE 'plpgsql' VOLATILE; There's also o useful functions here: http://www.project-open.org/doc/intranet-search-pg/intranet-search-pg-create.sql -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Fco. Mario Barcala Rodríguez Sent: July 24, 2008 4:47 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Full text index without accents Finally I create a function like: CREATE OR REPLACE FUNCTION nonsensible (text) RETURNS text AS $$ DECLARE var1 varchar; BEGIN var1=replace($1, 'á', 'a'); var1=replace(var1, 'é', 'e'); var1=replace(var1, 'í', 'i'); var1=replace(var1, 'ó', 'o'); var1=replace(var1, 'ú', 'u'); var1=replace(var1, 'Á', 'A'); var1=replace(var1, 'É', 'E'); var1=replace(var1, 'Í', 'I'); var1=replace(var1, 'Ó', 'O'); var1=replace(var1, 'Ú', 'U'); return var1; END $$LANGUAGE plpgsql immutable; Then, create text indexes, one for sensible queries and other for unsensible ones: CREATE INDEX textindex ON document USING gin(to_tsvector('spanish',text)); CREATE INDEX textindexn ON document USING gin(to_tsvector('spanish',nonsensible(text))); And then make a query sensible or unsensible to accents doing: SELECT id FROM document WHERE to_tsvector('spanish',text) @@ to_tsquery('spanish','word_with_accent'); or: SELECT id FROM document WHERE to_tsvector('spanish',nonsensible(text)) @@ to_tsquery('spanish',nonsensible('word_with_accent')); respectively. I think postgreSQL uses both indexes as necessary. I believe to remember reading something about it in the documentation. Thank you very much, Mario Barcala > Here is an example > > CREATE FUNCTION dropatsymbol(text) RETURNS text > AS 'select replace($1, ''@'', '' '');' > LANGUAGE SQL; > > arxiv=# select to_tsvector('english',dropatsymbol('oleg@sai.msu.su')); > to_tsvector > ------------------------- > 'oleg':1 'sai.msu.su':2 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
pgsql-general by date: