Re: UTF-8 and LIKE vs = - Mailing list pgsql-general

From Ian Barwick
Subject Re: UTF-8 and LIKE vs =
Date
Msg-id 1d581afe040823163456af8598@mail.gmail.com
In response to UTF-8 and LIKE vs =  (David Wheeler <david@kineticode.com>)
List pgsql-general
On Tue, 24 Aug 2004 00:46:50 +0200, Markus Bertheau
<twanger@bluetwanger.de> wrote:
>
>
> On Mon, 23.08.2004, at 23:04, David Wheeler wrote:
> > On Aug 23, 2004, at 1:58 PM, Ian Barwick wrote:
> >
> > > er, the characters in "name" don't seem to match the characters in the
> > > query - '국방비' vs. '북한의' - does that have any bearing?
> >
> > Yes, it means that = is doing the wrong thing!!
>
> The collation rules of your (and my) locale say that these strings are
> the same:
>
> [markus@teetnang markus]$ cat > t
> 국방비
> 북한의
> [markus@teetnang markus]$ uniq t
> 국방비
> [markus@teetnang markus]$

Wild speculation in need of a Korean speaker, but:

ian@linux:~/tmp> cat j.txt
テスト
환경설
전검색
웹문서
국방비
북한의
てすと
ian@linux:~/tmp> uniq  j.txt
テスト
환경설
てすと

All but the first and last lines are random Korean (Hangul) strings,
and uniq kept only the first of the five. Evidently our respective
locales treat all Hangul strings of the same length as identical,
which they very probably are not...
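
For anyone who wants to poke at this without a shell, here is a rough Python sketch of the same experiment. The helper names (uniq, codepoint_equal, locale_equal, broken_equal) are mine, and broken_equal is only a hypothetical model of the suspected defect ("all Hangul strings of equal length compare equal"), not what the C library actually does. What locale_equal reports depends entirely on the locale installed on your machine, so no output is claimed for it.

```python
import locale

# The seven lines from j.txt above.
LINES = ["テスト", "환경설", "전검색", "웹문서", "국방비", "북한의", "てすと"]


def uniq(lines, same):
    """Drop every line that `same` considers equal to the last kept line."""
    kept = []
    for line in lines:
        if not kept or not same(kept[-1], line):
            kept.append(line)
    return kept


def codepoint_equal(a, b):
    # Plain character-by-character equality, independent of any locale.
    return a == b


def locale_equal(a, b):
    # strcoll() returns 0 when the active LC_COLLATE ranks the strings equal.
    return locale.strcoll(a, b) == 0


def is_hangul(s):
    # Precomposed Hangul syllables occupy U+AC00..U+D7A3.
    return all("\uac00" <= ch <= "\ud7a3" for ch in s)


def broken_equal(a, b):
    # Hypothetical model of the suspected defect (an assumption, see above).
    return a == b or (is_hangul(a) and is_hangul(b) and len(a) == len(b))


by_codepoint = uniq(LINES, codepoint_equal)
by_broken = uniq(LINES, broken_equal)

if __name__ == "__main__":
    print("code point equality:", by_codepoint)
    print("hypothetical defect:", by_broken)
    try:
        # Use whatever locale the environment specifies (LANG, LC_ALL, ...).
        locale.setlocale(locale.LC_COLLATE, "")
        print("current locale:     ", uniq(LINES, locale_equal))
    except locale.Error as exc:
        print("could not set locale:", exc)
```

With code point equality all seven lines survive. The hypothetical model reproduces the three lines that uniq printed above, which is at least consistent with the guess.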

Ian Barwick
