Thread: Accent-insensitive

Accent-insensitive

From
"Pedro Igor"
Date:
Does have PostgreSQL some option to allow me execute selects accent-insensitive ?
I can´t find any reference, including the manual ....
 
Pedro Igor

Re: Accent-insensitive

From
Cédric Dufour (public)
Date:
Use the 'to_ascii' function to convert your string to the "no accent" ASCII equivalent before accomplishing your comparison
 
SELECT foo FROM table WHERE ( to_ascii(foo) ILIKE to_ascii('caractères accentués') );
 
This does not work with all database locale (LATIN1 is OK, but LATIN9 is not).
 
I was actually wondering if this is efficient enough or if there is any more efficient method to accomplish this...
 
    Cédric D.
-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Pedro Igor
Sent: Friday, December 06, 2002 14:31
To: pgsql-sql@postgresql.org
Subject: [SQL] Accent-insensitive

Does have PostgreSQL some option to allow me execute selects accent-insensitive ?
I can´t find any reference, including the manual ....
 
Pedro Igor

Re: Accent-insensitive

From
Joel Burton
Date:
On Fri, Dec 06, 2002 at 09:33:10PM +0100, Cédric Dufour (public) wrote:
> Use the 'to_ascii' function to convert your string to the "no accent" ASCII
> equivalent before accomplishing your comparison
> 
> SELECT foo FROM table WHERE ( to_ascii(foo) ILIKE to_ascii('caractères
> accentués') );
> 
> This does not work with all database locale (LATIN1 is OK, but LATIN9 is
> not).
> 
> I was actually wondering if this is efficient enough or if there is any more
> efficient method to accomplish this...

I'd think that something like:

CREATE FUNCTION lower_ascii (text) RETURNS text AS ' BEGIN   RETURN lower(to_ascii($1)); END
' language 'plpgsql';

CREATE INDEX table_lower_ascii ON table(lower_ascii(field));

would perform better, since we can now use this index, whereas we
couldn't do this with ILIKE to_ascii(...).

Also, not sure it's a good idea to use ILIKE simply to get
lower-case-matching. If the user string ends with '%', for instance, it
will match everything-starting-with, which is probably not what the user
meant. Better the check against lower().

There might be a better way specifically-oriented toward
de-accentuation; this is just generic PG advice.

- J.
-- 

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant


Re: Accent-insensitive

From
Joel Burton
Date:
On Sat, Dec 07, 2002 at 07:06:45PM -0300, Pedro Igor wrote:
> Thanks, you know if some possible release would have some internal to deal
> with this ?
> 
> Abraços
> Pedro Igor
> >
> > CREATE FUNCTION lower_ascii (text) RETURNS text AS '
> >   BEGIN
> >     RETURN lower(to_ascii($1));
> >   END
> > ' language 'plpgsql';
> >
> > CREATE INDEX table_lower_ascii ON table(lower_ascii(field));

Pedro --

Please keep conversations on the list -- other people may know things I
don't (actually, they certainly will!), and it allows other people to
follow the conversation.

As for this being internal, I have no idea -- you could submit it as a
suggestion.

Given how easy it is to implement in plpgsql, I suspect this kind of thing will
stay out of the internals. Keep in mind that if you define this function
in your template1 database, you can have it created automatically in all
new databases you create, so it's one less thing to worry about.

HTH.

-- 

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant