Thread: unicode and =
= is not working on a char(30) column for me.
I want to find rows with equal name.
I have my database set to unicode.

SQL1
SELECT h1.key, h1.name, h2.key, h2.name
FROM table1 AS h1, table1 AS h2
WHERE h1.name = h2.name AND h1.OID = 730716

produces result rows where name does not match. name contains multibyte UTF-8 values.

SQL2
SELECT h1.key, h1.name, h2.key, h2.name
FROM table1 AS h1, table1 AS h2
WHERE h1.key = h2.key AND h1.OID = 730716

produces correct results. key contains single-byte UTF-8 values only (digits only).

I have a hash index on name; I dropped it and got a different but still wrong result.
key is part of a multicolumn primary key.

version 8.0.3 - gcc 3.4.3
fedora 3

Any suggestions on how to match multibyte characters?
Do I need to use a different comparison operator?

Thanks,
Grant

"Grant Morgan" <grant@ryuuguu.com> writes:
> = is not working on a char(30) column for me.
> I want to find rows with equal name.
> I have my database set to unicode.

I'll bet you are running the postmaster in a locale that isn't expecting
utf-8 encoding. The locale and encoding have to match or you're going
to get very strange behavior.

regards, tom lane

I am not sure what locale I was running, as I had not set it when doing initdb.
I created a new DB with
--locale=en_US.utf8 -E UNICODE
and imported my data from the original source (not copied from the old DB), and I still have the same problem: UNICODE strings with double-byte characters that are not equal get selected as equal.

To test things further:
md5(h1.name)=md5(h2.name) works and only matches equal values.
h1.name=h2.name matches unequal values.

Does anyone have any other ideas? Or is en_US.utf8 not a proper utf8 locale? (I got the name by doing locale -a.)
I am not so concerned about sorting on this project, just equality, but a general solution would be appreciated.

Thanks,
Grant

On Mon, 20 Jun 2005 10:13:39 +0900, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Grant Morgan" <grant@ryuuguu.com> writes:
>> = is not working on a char(30) column for me.
>> I want to find rows with equal name.
>> I have my database set to unicode.
>
> I'll bet you are running the postmaster in a locale that isn't expecting
> utf-8 encoding. The locale and encoding have to match or you're going
> to get very strange behavior.
>
> regards, tom lane
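
A possible reason the md5() comparison behaves differently from = is that md5() hashes the stored bytes and so does not depend on the locale's comparison rules. That is an assumption about the cause, not something confirmed in this thread. The Python sketch below only illustrates the byte-level idea outside the database; the sample names and the helper functions md5_hex and bytes_equal are hypothetical.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hex digest of the UTF-8 bytes of `text`."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def bytes_equal(a: str, b: str) -> bool:
    """Compare two strings by their UTF-8 bytes, ignoring any locale rules."""
    return a.encode("utf-8") == b.encode("utf-8")


# Hypothetical sample values standing in for two different multibyte names.
name_a = "東京"
name_b = "京都"

# Different names have different bytes and different digests.
print(bytes_equal(name_a, name_b))          # False
print(md5_hex(name_a) == md5_hex(name_b))   # False

# The same name always produces the same bytes and digest.
print(md5_hex(name_a) == md5_hex("東京"))    # True
```

Comparing digests is effectively a byte-for-byte equality test, which matches the observation that md5(h1.name)=md5(h2.name) only matches equal values.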