Re: UTF-8 and LIKE vs = - Mailing list pgsql-general

From Joel
Subject Re: UTF-8 and LIKE vs =
Date
Msg-id 20040824105027.1B8D.REES@ddcom.co.jp
In response to Re: UTF-8 and LIKE vs =  (Ian Barwick <barwick@gmail.com>)
List pgsql-general
On Tue, 24 Aug 2004 01:34:46 +0200
Ian Barwick <barwick@gmail.com> wrote

> ...
> wild speculation in need of a Korean speaker, but:
>
> ian@linux:~/tmp> cat j.txt
> 繝㋚せ繝\x88
> 紇俾イス牕、
> 琊⁇イ\x80lラ
> 珖ケ橖ク牕\x9C
> 弶ュ𣝣ゥ欄\x84
> 櫤≶復珣\x98
> 縺ヲ縺吶→
> ian@linux:~/tmp> uniq  j.txt
> 繝㋚せ繝\x88
> 紇俾イス牕、
> 縺ヲ縺吶→
>
> All but the first and last lines are random Korean (Hangul)
> characters. Evidently our respective locales think all Hangul strings
> of the same length are identical, which is very probably not the
> case...

My browser just nicely botched those lines in the reply, but in Ian's
original post the first and last lines looked like "test" written in
Japanese: the first line in katakana, the last line in hiragana.

The following should end up posted as Shift-JIS:

テスト
and
てすと

These should collate the same in some contexts, since the difference
between katakana and hiragana is more or less equivalent to a difference
in case.
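
To make the case analogy concrete, here is a minimal Python sketch of a
"kana-insensitive" comparison. It is only an illustration, not how any
locale or PostgreSQL actually collates; fold_kana and
kana_insensitive_equal are hypothetical names. It relies on the fact that
each katakana from U+30A1 to U+30F6 sits exactly 0x60 above its hiragana
counterpart in Unicode.

```python
import unicodedata

KATAKANA_START, KATAKANA_END = 0x30A1, 0x30F6  # small a .. small ke
KANA_OFFSET = 0x60  # distance from a katakana to the matching hiragana


def fold_kana(text: str) -> str:
    """Fold katakana to hiragana, much as lower() folds case."""
    # NFKC first, so half-width katakana become full-width katakana.
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_START <= ord(ch) <= KATAKANA_END
        else ch
        for ch in normalized
    )


def kana_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring the katakana/hiragana distinction."""
    return fold_kana(a) == fold_kana(b)


print("テスト" == "てすと")                         # plain equality: False
print(kana_insensitive_equal("テスト", "てすと"))   # folded comparison: True
```

Plain equality keeps the two spellings distinct, which is what = should
do; whether a given collation folds them together is a separate question.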

--
Joel <rees@ddcom.co.jp>

