ORDER BY different locales for 8.0 - Mailing list pgsql-hackers

From Honza Pazdziora
Subject ORDER BY different locales for 8.0
Date
Msg-id 20050419071801.GB23363@anxur.fi.muni.cz
Whole thread Raw
Responses Re: ORDER BY different locales for 8.0
List pgsql-hackers
Hello,

the nls_string function that makes it possible to sort by arbitrary
locale has been updated to reflect the changes in error handling in
PostgreSQL 8.0, due to users using the nls_string sorting on 7.4 and
requesting it for 8.0 as well. The distribution can be downloaded from
 http://www.fi.muni.cz/~adelton/l10n/
http://www.fi.muni.cz/~adelton/l10n/postgresql-nls-string/postgresql-nls-string-8.01.tar.gz

I'll appreciate any comments.

The README is as follows:

-----------------
Why this function:

PostgreSQL, at least until version 8.0, has rather weak support for
various collating sequences -- what you get when you do
 select ... order by column.

The sorting is closely tied to indexes used throughout the database
cluster and is specified by locale settings at the initdb time.
Yet, people asked for ways of specifying the collating rules at runtime,
even if the sorting will not use indexes. Just take the records and
sort them. It is reasonable request to want one select to order by
using English rules, another one to run with German rules and yet
another with Czech ones, without having to dump, initdb, restore.

------------
How it works:

In this distribution you will find file nls_string.c. It contains the
definition of function nls_string(text, text) which takes a string
parameter and a locale name and returns string describing the ordering.
So you can run
 select * from table order by nls_string(name, 'en_US.UTF-8')

or
 select * from table order by nls_string(name, 'cs_CZ.UTF-8')

or
 select * from table order by nls_string(name, 'C')

and get what you expect -- the result is sorted the same way as it
would be with LC_COLLATE=locate sort on the command line.

Internally, the function sets the locale for LC_COLLATE category, runs
strxfrm on the first parameter and encodes the result as octal values.
Thus, it depends on your PostgreSQL collate setting (that which you
did upon initdb, you can check it with show lc_collate) to sort
numbers in the natural way. I believe this is reasonable assumption.

------------
Installation:

Please check the INSTALL file.

---------
Versions:

This version of nls_string targets PostgreSQL server in version 8.0+.
To use nls_string on version 7.4, download nls_string 0.53.

-------------
Bugs and ToDo:

If your default collation settings does not sort numbers in the
natural way (eg., 0123 is not sorted before 1234), the nls_string
will not work.

Nonetheless, the function does the work for me.

--------------------
Support, bug reports:

This piece of software is provided as-is, in the hope that you will
find it useful. However, no warranty is provided.

I appreciate any bug reports, enhancement suggestions and patches.
Please, _please_, use a meaningful Subject line and describe the
situation in detail. Also make sure you've read and understood this
README and the PostgreSQL documentation concerning C-language
functions. I will not be helpful with installation problems if you did
not read the documentation.

-------------------
If it works for you:

If the function works for you, I'd appreciate a message from you.
Just curious for what tasks people use the software.

---------
Available:
 http://www.fi.muni.cz/~adelton/l10n/

------
Author:

Copyright: (c) 2004--2005 Jan Pazdziora, adelton@fi.muni.cz. All
rights reserved. Permission to use, distribute, modify, an copy this
software and this documentation for any purpose is hereby granted.

Contributors:
Karel Zak <zakkr@zff.jcu.cz>ftp://ftp2.zf.jcu.cz/users/zakkr/pg/

-- 
------------------------------------------------------------------------Honza Pazdziora | adelton@fi.muni.cz |
http://www.fi.muni.cz/~adelton/.project:Perl, mod_perl, DBI, Oracle, large Web systems, XML/XSL, ...    Only
self-confidentpeople can be simple.
 


pgsql-hackers by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: Problem with PITR recovery
Next
From: Olivier Thauvin
Date:
Subject: Re: SETOF function call