Thread: ORDER BY 'criteria'
Scenario: A table that has recoreds in every language under the sun in it, in UTF-8 format, A column which identifies which language a record is in, An INT4 column indentifying the items which are stored in multiple languages. Examples: CtryID Country Language 1 Russia US-EN 1 Руссйя RU <russian spelled in Russian, sort of> I want to pull out groups of records by language and the country, and then SORT THE RECORDS IN THE LANGUAGE SELECTED. Is there any way to say which language to ORDER BY in an SQL statement? Is there in the SQL standard? If not, this is a serious deficiency for global companies.
On Thursday 23 January 2003 18:02, Dennis Gearon wrote: > Scenario: > A table that has recoreds in every language under the sun in it, > in UTF-8 format, > A column which identifies which language a record is in, > An INT4 column indentifying the items which are stored in > multiple languages. > > Examples: > CtryID Country Language > 1 Russia US-EN > 1 Руссйя RU <russian spelled in Russian, sort of> > > I want to pull out groups of records by language and the country, and then > SORT THE RECORDS IN THE LANGUAGE SELECTED. > > Is there any way to say which language to ORDER BY in an SQL statement? Is > there in the SQL standard? You mean "is there any way to determine the sort order used by ORDER BY", i.e. for records defined as Russian and containing Cyrillic UTF-8 characters use Russian sorting order etc.? The short answer is "no" (I am happy to be proven wrong of course ;-). As far as PostgreSQL is concerned, the locale used to determine sorting order is predefined at initdb and can't be changed dynamically. I don't know what the SQL standards say about this. There is probably no practical way of doing what you are looking for, especially when you consider the sorting order of some languages cannot be algorithmically determined. Japanese for example would require an extra field containing a phonetic key for each record. For example, in the following two words the first character is the same, but the first word is pronounce "Kobe" and the second "Kanda": 神戸 神田 (the first character can also be pronounced "shin" or "kami"). > If not, this is a serious deficiency for global companies. I suspect global companies maintain local dependencies which make sure customer lists or whatever are generated in the correct order... Ian Barwick barwick@gmx.net
The init_db listed, does this init a particular DATABASE, or the WHOLE ENGINE to a particular language encoding? If it's the former that takes place, then spreading the schema out over databases would solve the problem if the application code kept track of everything. (Pain in the butt) >> I want to pull out groups of records by language and the country, and then >> SORT THE RECORDS IN THE LANGUAGE SELECTED. >> >> Is there any way to say which language to ORDER BY in an SQL statement? Is >> there in the SQL standard? > >You mean "is there any way to determine the sort order used by >ORDER BY", i.e. for records defined as Russian and containing Cyrillic UTF-8 >characters use Russian sorting order etc.? > >The short answer is "no" (I am happy to be proven wrong of course ;-). >As far as PostgreSQL is concerned, the locale used to determine sorting >order is predefined at initdb and can't be changed dynamically. >I don't know what the SQL standards say about this. > >There is probably no practical way of doing what you are looking for, >especially when you consider the sorting order of some languages >cannot be algorithmically determined. Japanese for example would >require an extra field containing a phonetic key for each record. >For example, in the following two words the first character is the same, >but the first word is pronounce "Kobe" and the second "Kanda": > 神戸 > 神田 >(the first character can also be pronounced "shin" or "kami"). > >> If not, this is a serious deficiency for global companies.
Ian Barwick <barwick@gmx.net> writes: > The short answer is "no" (I am happy to be proven wrong of course ;-). > As far as PostgreSQL is concerned, the locale used to determine sorting > order is predefined at initdb and can't be changed dynamically. > I don't know what the SQL standards say about this. The SQL spec envisions attaching a locale specification to each individual column. We don't support that ... yet. (There's been a good deal of talk about it, and we'd like to handle it someday.) At present, Ian is right: initdb determines the sort order for the database cluster it creates. regards, tom lane
What's a database cluster, Tom? -- Carpe Dancem ;-) ----------------------------------------------------------------- Remember your friends while they are alive ----------------------------------------------------------------- Sincerely, Dennis Gearon
On Thu, 23 Jan 2003, Dennis Gearon wrote: > What's a database cluster, Tom? It's all of the the data in a PostgreSQL data directory created with initdb(1). (You can create many databases in it using the CREATE DATABASE command.) You might also call this an "instance." cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
I think I will write a custom C function in the future this year, that writes a replacement value into a designated query result column that can be the target of a binary ORDER BY based on translation from a collation table. Don't know how the pictogram languages work, (old egyptian, modern chinese, korean, japanese, others), but the character based languages would sort fine. Example create table ctrys_in_all_languages( -- not syntactically correct below ID int4 primary key, iso_lang_code blob, ctry_name blob ); fill table; -- pseudo code select ctry_name, sorting_column = ' function here (ctry_name, 'fr')' from ctrys_in_all_languages where ( iso_lang_code = 'fr' ) order by sorting_column; discard ctry_name; -- Carpe Dancem ;-) ----------------------------------------------------------------- Remember your friends while they are alive ----------------------------------------------------------------- Sincerely, Dennis Gearon
Cluster is kind of overloaded, as initdb, replication (clustering), and CLUSTER command. Not sure what we can do about that, though. --------------------------------------------------------------------------- Curt Sampson wrote: > On Thu, 23 Jan 2003, Dennis Gearon wrote: > > > What's a database cluster, Tom? > > It's all of the the data in a PostgreSQL data directory created with > initdb(1). (You can create many databases in it using the CREATE > DATABASE command.) You might also call this an "instance." > > cjs > -- > Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org > Don't you know, in this new Dark Age, we're all light. --XTC > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Mon, 27 Jan 2003, Bruce Momjian wrote: > Cluster is kind of overloaded, as initdb, replication (clustering), and > CLUSTER command. Not sure what we can do about that, though. Well, since we mostly communicate by e-mail, anyway, we can just use different spellings: "cluster," "kluster," "clustar," "klustar," etc. With an appropriate amount of creativity, I reckon this will support many more meanings than we currently have, leaving lots of room for future expansion. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
This sounds like it's a "localization" issue --> aka: L18N --> ANd -- I believe that you gave yourself the answer in your question... It occurs to me that you should be able to GROUP BY CtryID, ORDER BY fieldcontainingthedata; "Dennis Gearon" <gearond@cvc.net> wrote in message news:1U72ZWSKFON95EBVQDA2JD7QKPIF.3e302042@cal-lab... > Scenario: > A table that has recoreds in every language under the sun in it, > in UTF-8 format, > A column which identifies which language a record is in, > An INT4 column indentifying the items which are stored in > multiple languages. > > Examples: > CtryID Country Language > 1 Russia US-EN > 1 � �fссйя RU <russian spelled in Russian, sort of> > > I want to pull out groups of records by language and the country, and then > SORT THE RECORDS IN THE LANGUAGE SELECTED. > > Is there any way to say which language to ORDER BY in an SQL statement? Is there in the SQL > standard? If not, this is a serious deficiency for global companies. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)