Re: utf-8 and cultural sensitive sorting - Mailing list pgsql-general

From	Alex Stapleton
Subject	Re: utf-8 and cultural sensitive sorting
Date	July 12, 2005 12:22:05
Msg-id	D0EEABC7-E3BF-45D6-BF02-341CAC1DE632@advfn.com
In response to	utf-8 and cultural sensitive sorting (<sknipe@tucows.com>)
Responses	Re: utf-8 and cultural sensitive sorting
List	pgsql-general

It depends on which language you want to sort. Many languages do not
have a single agreed alphabetical order; Japanese is one example, and
text like this can be quite difficult to sort. I am not aware of any
standard technique for sorting Japanese text other than keeping an
arbitrarily ordered dictionary (perhaps taken from whatever the most
popular Japanese dictionary happens to be at the time) and then doing
hash lookups in it to get the values used for indexing. As you can
imagine, this is not particularly fast. I have not actually tried
this, but I expect PostgreSQL will simply sort in a fairly binary
fashion. That is, text gets sorted according to the binary values of
the characters, or the UTF-8 byte values, or something like that.
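
To make the two ideas above concrete, here is a minimal sketch in
Python. It is only an illustration and not how PostgreSQL does it
internally: the function names, the sample words and the reference
list are all made up for the example. binary_sort orders strings by
their raw UTF-8 bytes, and dictionary_sort orders them by their
position in a reference list, found with a hash lookup.

```python
# Illustrative sketch only; function names and sample data are hypothetical.


def binary_sort(words):
    """Sort by the raw UTF-8 bytes of each string, ignoring language rules."""
    return sorted(words, key=lambda w: w.encode("utf-8"))


def dictionary_sort(words, reference_order):
    """Sort by each word's position in a reference list.

    The position is found with a hash (dict) lookup. Words that are not
    in the reference list are placed last, in UTF-8 byte order.
    """
    rank = {word: i for i, word in enumerate(reference_order)}
    unknown = len(rank)  # rank shared by every word missing from the list
    return sorted(words, key=lambda w: (rank.get(w, unknown), w.encode("utf-8")))


if __name__ == "__main__":
    # Upper case sorts before lower case, and the accented word goes last.
    print(binary_sort(["zebra", "Zebra", "apple", "éclair"]))
    # The reference list decides the order; "c" is unknown, so it goes last.
    print(dictionary_sort(["c", "a", "b"], ["b", "a"]))
```

With the byte-order sort, "Zebra" comes before "apple" and "éclair"
comes after "zebra", which is usually not what a reader of either
English or French expects. The dictionary approach gives whatever
order the reference list defines, at the cost of maintaining that
list and doing a lookup per value.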

On 12 Jul 2005, at 15:48, <sknipe@tucows.com> wrote:

> Our product will be storing its character data in utf-8 format
> (unicode encoding).
>
> What is the best way to achieve culturally sensitive sorting using the
> utf-8 data?
>
> Is it possible to have the locale apply to a connection?
>
> If so, is the cultural sorting support mature in PostgreSQL?
>
> What type of performance can be expected as compared with the
> normal c locale sorting?
>
> Thanks very much,
>
> Steve.
