Re: Accent insensitive search - Mailing list pgsql-general

From	PFC
Subject	Re: Accent insensitive search
Date	June 21, 2007 06:57:40
Msg-id	op.tt9m8upxcigqcu@apollo13
In response to	Accent insensitive search (Diego Manilla Suárez <diego.manilla@xeridia.com>)
Responses	Re: Accent insensitive search
List	pgsql-general

> Hi. I have a few databases created with UNICODE encoding, and I would
> like to be able to search with accent insensitivity. There's something
> in Oracle (NLS_COMP, NLS_SORT) and SQL Server (don't remember) to do
> this, but I found nothing in PostgreSQL, just the 'to_ascii' function,
> which AFAIK, doesn't work with UNICODE.

    The easiest way is to create an extra column that holds a copy of
your text with all accents removed. You can also convert it to lowercase
and strip apostrophes, punctuation, etc. That column is kept up to date
with a trigger.
    Python is suitable for this (use unicodedata.normalize).
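
    As a rough sketch of the accent-stripping step with unicodedata.normalize
(the function name, the choice of NFD, and the lowercasing are assumptions made
for illustration, not necessarily what the attached file does):

```python
import unicodedata

def remove_accents(text):
    """Return text lowercased with combining accents removed (illustrative helper)."""
    # NFD splits an accented character into its base character followed by
    # one or more combining marks, e.g. "á" -> "a" + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so keeping only class-0 characters drops the accents.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

print(remove_accents("Suárez"))
```

    Letters that have no canonical decomposition (for example "ø" or "ß") pass
through unchanged with this approach, so they would need an explicit mapping if
they matter for your data.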
    Keeping a copy of the processed data will make searches faster than
WHERE remove_accents( blah ) = 'text', even with a functional index.
    Note that this function could also be written in C and use a lookup
table covering the first 64K Unicode code points for speed.
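
    The lookup-table idea can be sketched in Python rather than C, just to show
the shape of it. The names build_accent_table and remove_accents_fast are
hypothetical, and the table is built from NFD decomposition, which is an
assumption about how the mapping would be derived:

```python
import unicodedata

def build_accent_table(limit=0x10000):
    """Precompute code point -> accent-free string for the first 64K code points."""
    table = {}
    for cp in range(limit):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip the surrogate range, which holds no real characters
        decomposed = unicodedata.normalize("NFD", chr(cp))
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        # Only store entries where a combining mark was actually removed, so
        # characters that merely decompose (e.g. Hangul syllables) stay intact.
        if stripped != decomposed:
            table[cp] = stripped
    return table

# Built once at load time; each lookup afterwards is a plain dictionary access.
ACCENT_TABLE = build_accent_table()

def remove_accents_fast(text):
    # str.translate maps each code point through the table and leaves
    # unmapped code points untouched.
    return text.translate(ACCENT_TABLE)

print(remove_accents_fast("Diego Manilla Suárez"))
```

    Code points above the first 64K are left untouched here, which matches the
table size suggested above; a C version would presumably use a flat array
instead of a dictionary.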

    See attached file.

Attachment

create_ft_functions.sql
