Thread: ilike and utf-8
Hi,

Does the ilike operator work correctly with Cyrillic text stored in a UTF-8 encoded database? A user of http://myowndb.com (a web database) whose text is in Cyrillic has reported that his searches are not case insensitive, although I use the ilike operator in the code. It works perfectly for my own data, which is not in Cyrillic.

Thanks,
Raph
On Fri, Apr 14, 2006 at 03:16:01PM +0200, Raphael Bauduin wrote:
> Hi,
>
> Does the ilike operator work fine with cyrillic text put in a UTF-8
> encoded database?
> I've had remarks of a user (of http://myowndb.com, a web database)
> with text in cyrillic that his searches are not case insensitive,
> although I use the ilike operator in the code. And it works perfectly
> for my data (that are not in cyrillic).

UTF-8 support for case-comparison is operating system dependent. What systems are we comparing here?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
On 4/14/06, Martijn van Oosterhout <kleptog@svana.org> wrote:
UTF-8 support for case-comparison is operating system dependent. What
systems are we comparing here?
I'd like to know the same thing. I'm using GNU/Linux and ISO-8859-2 (when UTF-8 isn't an option).
Tomislav
It's Debian GNU/Linux, with a self-compiled PostgreSQL 8.1.3.

Raph
"Raphael Bauduin" <rblists@gmail.com> writes:
> Does the ilike operator work fine with cyrillic text put in a UTF-8
> encoded database?

If you've initdb'd in an appropriate locale (probably named something like ru_RU.utf8) then it should work. I wouldn't expect a random non-Russian locale to necessarily know about Cyrillic case conversions, however.

Martijn's nearby comment about OS dependency really boils down to the fact that different OSes may have different definitions for similarly named locales. We need to know what locale you're using (try "SHOW LC_CTYPE") as well as the OS.

regards, tom lane
On 4/14/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If you've initdb'd in an appropriate locale (probably named something
> like ru_RU.utf8) then it should work. I wouldn't expect a random
> non-Russian locale to necessarily know about Cyrillic case conversions,
> however.

The problem is that the system serves content for several locales at the same time, so I can't set it at the environment level. Maybe I should add a user setting so that each user can choose which locale to use.

Thanks for the help!

Raph
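
Editor's illustration: Tom Lane's point that a locale has to "know about Cyrillic case conversions" can be sketched outside the database. The Python snippet below is an analogy only, not PostgreSQL's implementation. It assumes that ASCII-only folding of the UTF-8 bytes is a fair stand-in for a locale without Cyrillic case tables, and that Python's Unicode-aware str.lower() is a fair stand-in for a locale such as ru_RU.utf8. The sample words are made up for the example.

```python
# Analogy only: this is not how PostgreSQL implements ILIKE.
latin_upper, latin_lower = "HELLO", "hello"
cyr_upper, cyr_lower = "ПРИВЕТ", "привет"

# bytes.lower() folds ASCII letters only. This stands in (by assumption) for a
# locale that has no Cyrillic case tables.
latin_match = latin_upper.encode("utf-8").lower() == latin_lower.encode("utf-8")
ascii_only_match = cyr_upper.encode("utf-8").lower() == cyr_lower.encode("utf-8")

# str.lower() is Unicode-aware. This stands in (by assumption) for a locale
# that does know how to convert Cyrillic case.
unicode_match = cyr_upper.lower() == cyr_lower

print(latin_match)       # True: Latin text matches even with ASCII-only folding
print(ascii_only_match)  # False: the Cyrillic bytes are left unchanged
print(unicode_match)     # True: Cyrillic matches once case rules are known
```

This mirrors the symptom in the original report: Latin data compares case-insensitively while Cyrillic data does not. Whether a given server behaves this way depends on the locale reported by SHOW LC_CTYPE and on the operating system's definition of that locale.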
I have a similar problem, which I raised in the thread linked below, but I don't have a solution yet. I received several ideas, but so far none that actually works for me. You may want to try the function posted in that thread. It didn't work for me, but maybe it will for you. Please let me know if it does; I am still looking for an answer.

http://groups.google.com/group/pgsql.general/browse_thread/thread/20aed89ab0e19e3d/4771fb1be397afea#4771fb1be397afea

Regards,
Balázs