Thread: ilike and utf-8

ilike and utf-8

From
"Raphael Bauduin"
Date:
Hi,

Does the ilike operator work fine with cyrillic text put in a UTF-8
encoded database?
A user of http://myowndb.com (a web database) whose text is in Cyrillic
has reported that his searches are not case insensitive, although I use
the ilike operator in the code. It works perfectly for my own data
(which is not in Cyrillic).

Thanks

Raph

Re: ilike and utf-8

From
Martijn van Oosterhout
Date:
On Fri, Apr 14, 2006 at 03:16:01PM +0200, Raphael Bauduin wrote:
> Hi,
>
> Does the ilike operator work fine with cyrillic text put in a UTF-8
> encoded database?
> A user of http://myowndb.com (a web database) whose text is in Cyrillic
> has reported that his searches are not case insensitive, although I use
> the ilike operator in the code. It works perfectly for my own data
> (which is not in Cyrillic).

UTF-8 support for case comparison is operating-system dependent. What
systems are we comparing here?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: ilike and utf-8

From
"Tomi NA"
Date:

On 4/14/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Fri, Apr 14, 2006 at 03:16:01PM +0200, Raphael Bauduin wrote:
> > Hi,
> >
> > Does the ilike operator work fine with cyrillic text put in a UTF-8
> > encoded database?
> > A user of http://myowndb.com (a web database) whose text is in Cyrillic
> > has reported that his searches are not case insensitive, although I use
> > the ilike operator in the code. It works perfectly for my own data
> > (which is not in Cyrillic).
>
> UTF-8 support for case comparison is operating-system dependent. What
> systems are we comparing here?

I'd like to know the same thing. I'm using GNU/Linux and ISO-8859-2 (when UTF-8 isn't an option).

Tomislav

Re: ilike and utf-8

From
"Raphael Bauduin"
Date:
It's Debian GNU/Linux, with a self-compiled PostgreSQL 8.1.3.

Raph

On 4/14/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Fri, Apr 14, 2006 at 03:16:01PM +0200, Raphael Bauduin wrote:
> > Hi,
> >
> > Does the ilike operator work fine with cyrillic text put in a UTF-8
> > encoded database?
> > A user of http://myowndb.com (a web database) whose text is in Cyrillic
> > has reported that his searches are not case insensitive, although I use
> > the ilike operator in the code. It works perfectly for my own data
> > (which is not in Cyrillic).
>
> UTF-8 support for case comparison is operating-system dependent. What
> systems are we comparing here?
>
> Have a nice day,
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > tool for doing 5% of the work and then sitting around waiting for someone
> > else to do the other 95% so you can sue them.

Re: ilike and utf-8

From
Tom Lane
Date:
"Raphael Bauduin" <rblists@gmail.com> writes:
> Does the ilike operator work fine with cyrillic text put in a UTF-8
> encoded database?

If you've initdb'd in an appropriate locale (probably named something
like ru_RU.utf8) then it should work.  I wouldn't expect a random
non-Russian locale to necessarily know about Cyrillic case conversions,
however.

Martijn's nearby comment about OS dependency really boils down to the
fact that different OSes may have different definitions for similarly
named locales.  We need to know what locale you're using (try "SHOW
LC_CTYPE") as well as the OS.

            regards, tom lane
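
A rough illustration of the point above about locales that may not know
Cyrillic case conversions. This is only a model in Python, not PostgreSQL's
implementation: the helper names (ascii_only_lower, unicode_lower,
contains_ci) are hypothetical, and the "ASCII-only" function is an assumed
stand-in for a locale whose case tables cover nothing beyond A-Z. It shows
why a case-insensitive match can work for Latin text yet fail for Cyrillic
text when the lowercasing step does not cover Cyrillic.

```python
import string

# Hypothetical stand-in for a locale whose case tables cover only ASCII
# (an assumption for illustration; not PostgreSQL's actual code).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_only_lower(text: str) -> str:
    """Lowercase A-Z only, leaving every other character untouched."""
    return text.translate(_ASCII_LOWER)


def unicode_lower(text: str) -> str:
    """Lowercase using Python's built-in Unicode case mappings."""
    return text.lower()


def contains_ci(haystack: str, needle: str, lower) -> bool:
    """Case-insensitive substring test built on the given lowercasing function."""
    return lower(needle) in lower(haystack)


if __name__ == "__main__":
    samples = (("Hello World", "WORLD"), ("Привет Мир", "МИР"))
    for haystack, needle in samples:
        print(
            needle,
            "ascii-only:", contains_ci(haystack, needle, ascii_only_lower),
            "unicode-aware:", contains_ci(haystack, needle, unicode_lower),
        )
```

With the ASCII-only function the Latin search matches but the Cyrillic one
does not, which resembles the symptom reported at the start of the thread.
Whether that is what actually happens on a given server depends on the
locale it was initialized with and on how the OS defines that locale.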

Re: ilike and utf-8

From
"Raphael Bauduin"
Date:
On 4/14/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Raphael Bauduin" <rblists@gmail.com> writes:
> > Does the ilike operator work fine with cyrillic text put in a UTF-8
> > encoded database?
>
> If you've initdb'd in an appropriate locale (probably named something
> like ru_RU.utf8) then it should work.  I wouldn't expect a random
> non-Russian locale to necessarily know about Cyrillic case conversions,
> however.

The problem is that the system serves content for several different
locales at the same time, so I can't set a single locale at the
environment level. Maybe I should add a per-user setting so that each
user can choose which locale to use.

Thanks for the help!

Raph

>
> Martijn's nearby comment about OS dependency really boils down to the
> fact that different OSes may have different definitions for similarly
> named locales.  We need to know what locale you're using (try "SHOW
> LC_CTYPE") as well as the OS.
>
>                         regards, tom lane
>

Re: ilike and utf-8

From
Balazs.Klein@t-online.hu
Date:
I have a similar problem that I raised here (see link) but I don't have
the solution yet.
I received several ideas, but so far not a solution that would actually
work for me.
You may want to give the function that you find in this thread a try.
It didn't work for me, but maybe it will for you - let me know please
if it does, I am still looking for an answer.

http://groups.google.com/group/pgsql.general/browse_thread/thread/20aed89ab0e19e3d/4771fb1be397afea#4771fb1be397afea

Regards,

Balázs