Thread: ORDER BY 'criteria'

ORDER BY 'criteria'

From
Dennis Gearon
Date:
Scenario:
    A table that has recoreds in every language under the sun in it,
    in UTF-8 format,
    A column which identifies which language a record is in,
    An INT4 column indentifying the items which are stored in
        multiple languages.

    Examples:
        CtryID    Country    Language
        1        Russia    US-EN
        1        Руссйя    RU        <russian spelled in Russian, sort of>

I want to pull out groups of records by language and the country, and then
    SORT THE RECORDS IN THE LANGUAGE SELECTED.

Is there any way to say which language to ORDER BY in an SQL statement? Is there in the SQL
standard? If not, this is a serious deficiency for global companies.



Re: ORDER BY 'criteria'

From
Ian Barwick
Date:
On Thursday 23 January 2003 18:02, Dennis Gearon wrote:
> Scenario:
>     A table that has recoreds in every language under the sun in it,
>     in UTF-8 format,
>     A column which identifies which language a record is in,
>     An INT4 column indentifying the items which are stored in
>         multiple languages.
>
>     Examples:
>         CtryID    Country    Language
>         1        Russia    US-EN
>         1        Руссйя    RU        <russian spelled in Russian, sort of>
>
> I want to pull out groups of records by language and the country, and then
>     SORT THE RECORDS IN THE LANGUAGE SELECTED.
>
> Is there any way to say which language to ORDER BY in an SQL statement? Is
> there in the SQL standard?

You mean "is there any way to determine the sort order used by
ORDER BY", i.e. for records defined as Russian and containing Cyrillic UTF-8
characters use Russian sorting order etc.?

The short answer is "no" (I am happy to be proven wrong of course ;-).
As far as PostgreSQL is concerned, the locale used to determine sorting
order is predefined at initdb and can't be changed dynamically.
I don't know what the SQL standards say about this.

There is probably no practical way of doing what you are looking for,
especially when you consider the sorting order of some languages
cannot be algorithmically determined. Japanese for example would
require an extra field containing a phonetic key for each record.
For example, in the following two words the first character is the same,
but the first word is pronounce "Kobe" and the second "Kanda":
  神戸
  神田
(the first character can also be pronounced "shin" or "kami").

> If not, this is a serious deficiency for global companies.

I suspect global companies maintain local dependencies which make
sure customer lists or whatever are generated in the correct order...

Ian Barwick
barwick@gmx.net


Re: ORDER BY 'criteria'

From
Dennis Gearon
Date:
The init_db listed, does this init a particular DATABASE, or the WHOLE ENGINE to a particular
language encoding?

If it's the former that takes place, then spreading the schema out over databases would solve
the problem if the application code kept track of everything. (Pain in the butt)

>> I want to pull out groups of records by language and the country, and then
>>     SORT THE RECORDS IN THE LANGUAGE SELECTED.
>>
>> Is there any way to say which language to ORDER BY in an SQL statement? Is
>> there in the SQL standard?
>
>You mean "is there any way to determine the sort order used by
>ORDER BY", i.e. for records defined as Russian and containing Cyrillic UTF-8
>characters use Russian sorting order etc.?
>
>The short answer is "no" (I am happy to be proven wrong of course ;-).
>As far as PostgreSQL is concerned, the locale used to determine sorting
>order is predefined at initdb and can't be changed dynamically.
>I don't know what the SQL standards say about this.
>
>There is probably no practical way of doing what you are looking for,
>especially when you consider the sorting order of some languages
>cannot be algorithmically determined. Japanese for example would
>require an extra field containing a phonetic key for each record.
>For example, in the following two words the first character is the same,
>but the first word is pronounce "Kobe" and the second "Kanda":
>  神戸
>  神田
>(the first character can also be pronounced "shin" or "kami").
>
>> If not, this is a serious deficiency for global companies.




Re: ORDER BY 'criteria'

From
Tom Lane
Date:
Ian Barwick <barwick@gmx.net> writes:
> The short answer is "no" (I am happy to be proven wrong of course ;-).
> As far as PostgreSQL is concerned, the locale used to determine sorting
> order is predefined at initdb and can't be changed dynamically.
> I don't know what the SQL standards say about this.

The SQL spec envisions attaching a locale specification to each
individual column.  We don't support that ... yet.  (There's been
a good deal of talk about it, and we'd like to handle it someday.)
At present, Ian is right: initdb determines the sort order for the
database cluster it creates.

            regards, tom lane

Re: ORDER BY 'criteria'

From
Dennis Gearon
Date:
What's a database cluster, Tom?
--

Carpe Dancem ;-)
-----------------------------------------------------------------
Remember your friends while they are alive
-----------------------------------------------------------------
                         Sincerely, Dennis Gearon

Re: ORDER BY 'criteria'

From
Curt Sampson
Date:
On Thu, 23 Jan 2003, Dennis Gearon wrote:

> What's a database cluster, Tom?

It's all of the the data in a PostgreSQL data directory created with
initdb(1). (You can create many databases in it using the CREATE
DATABASE command.) You might also call this an "instance."

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

Re: ORDER BY 'criteria'

From
Dennis Gearon
Date:
I think I will write a custom C function in the future this year, that
writes a replacement value into a designated query result column that
can be the target of a binary ORDER BY based on translation from a
collation table.

Don't know how the pictogram languages work, (old egyptian, modern
chinese, korean, japanese, others), but the character based languages
would sort fine.

Example

create table ctrys_in_all_languages( -- not syntactically correct below
ID int4 primary key,
iso_lang_code blob,
ctry_name blob );

fill table; -- pseudo code

select ctry_name, sorting_column = ' function here (ctry_name, 'fr')'
from ctrys_in_all_languages
where ( iso_lang_code = 'fr' )
order by sorting_column;

discard ctry_name;
--

Carpe Dancem ;-)
-----------------------------------------------------------------
Remember your friends while they are alive
-----------------------------------------------------------------
                         Sincerely, Dennis Gearon

Re: ORDER BY 'criteria'

From
Bruce Momjian
Date:
Cluster is kind of overloaded, as initdb, replication (clustering), and
CLUSTER command.  Not sure what we can do about that, though.

---------------------------------------------------------------------------

Curt Sampson wrote:
> On Thu, 23 Jan 2003, Dennis Gearon wrote:
>
> > What's a database cluster, Tom?
>
> It's all of the the data in a PostgreSQL data directory created with
> initdb(1). (You can create many databases in it using the CREATE
> DATABASE command.) You might also call this an "instance."
>
> cjs
> --
> Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
>     Don't you know, in this new Dark Age, we're all light.  --XTC
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: ORDER BY 'criteria'

From
Curt Sampson
Date:
On Mon, 27 Jan 2003, Bruce Momjian wrote:

> Cluster is kind of overloaded, as initdb, replication (clustering), and
> CLUSTER command.  Not sure what we can do about that, though.

Well, since we mostly communicate by e-mail, anyway, we can just use
different spellings: "cluster," "kluster," "clustar," "klustar," etc.
With an appropriate amount of creativity, I reckon this will support
many more meanings than we currently have, leaving lots of room for
future expansion.

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

Re: ORDER BY 'criteria'

From
"codeWarrior"
Date:
This sounds like it's a "localization" issue --> aka: L18N --> ANd -- I
believe that you gave yourself the answer in your question...

It occurs to me that you should be able to GROUP BY CtryID, ORDER BY
fieldcontainingthedata;


"Dennis Gearon" <gearond@cvc.net> wrote in message
news:1U72ZWSKFON95EBVQDA2JD7QKPIF.3e302042@cal-lab...
> Scenario:
> A table that has recoreds in every language under the sun in it,
> in UTF-8 format,
> A column which identifies which language a record is in,
> An INT4 column indentifying the items which are stored in
> multiple languages.
>
> Examples:
> CtryID Country Language
> 1 Russia US-EN
> 1 � �fссйя RU <russian spelled in Russian, sort of>
>
> I want to pull out groups of records by language and the country, and then
> SORT THE RECORDS IN THE LANGUAGE SELECTED.
>
> Is there any way to say which language to ORDER BY in an SQL statement? Is
there in the SQL
> standard? If not, this is a serious deficiency for global companies.
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)