Re: UTF-8 and LIKE vs = - Mailing list pgsql-general

From David Wheeler
Subject Re: UTF-8 and LIKE vs =
Msg-id F5FA9A60-F557-11D8-990D-000A95972D84@kineticode.com
In response to Re: UTF-8 and LIKE vs =  (Markus Bertheau <twanger@bluetwanger.de>)
List pgsql-general

On Aug 23, 2004, at 3:46 PM, Markus Bertheau wrote:

> The collation rules of your (and my) locale say that these strings are
> the same:
>
> [markus@teetnang markus]$ cat > t
> 국방비
> 북한의
> [markus@teetnang markus]$ uniq t
> 국방비
> [markus@teetnang markus]$

Interesting.

> Make sure that you have initdb'd the database under the right locale.
> There's not much PostgreSQL can do if strcoll() says that the strings
> are equal.

Well, I have data from a number of different locales in the same
database. I'm hoping that setting the locale to "C" will do the trick.
It seems to work properly on my Mac:

sharky=# select * from keyword where name = '국방비';
  id |  name  | screen_name | sort_name | active
----+--------+-------------+-----------+--------
   0 | 국방비 | 국방비      | 국방비    |      1
(1 row)

sharky=# select * from keyword where name = '북한의';
  id | name | screen_name | sort_name | active
----+------+-------------+-----------+--------
(0 rows)

sharky=# select * from keyword where name like '북한의';
  id | name | screen_name | sort_name | active
----+------+-------------+-----------+--------
(0 rows)

sharky=# select * from keyword where lower(name) like '국방비';
  id |  name  | screen_name | sort_name | active
----+--------+-------------+-----------+--------
   0 | 국방비 | 국방비      | 국방비    |      1
(1 row)
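
A minimal sketch of the strcoll() point from the quoted message, assuming Python's standard locale module as a stand-in for the C library call that PostgreSQL relies on. The helper name collate_equal is hypothetical and only illustrates the idea: under the "C" locale the collation comparison should tell the two keywords apart, whereas a locale with incomplete rules for Hangul may report them as equal. Which other locales are installed varies by system, so only "C" is exercised here.

```python
import locale


def collate_equal(a: str, b: str, locale_name: str = "C") -> bool:
    """Return True if strcoll() treats a and b as equal under locale_name.

    Hypothetical helper for illustration; it is not part of PostgreSQL.
    """
    # Calling setlocale() with only a category queries the current setting.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return locale.strcoll(a, b) == 0
    finally:
        # Restore the previous collation locale so callers are unaffected.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    first, second = "국방비", "북한의"
    print("distinct keywords equal under C:", collate_equal(first, second, "C"))
    print("same keyword equal under C:", collate_equal(first, first, "C"))
```

Passing another locale name to the helper (for example the one the database was initdb'd under, if it is installed) would show whether that locale's strcoll() collapses the two strings, which is the behaviour Markus describes.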

Regards,

David