Re: Multilingual application, ORDER BY w/ different locales? - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Multilingual application, ORDER BY w/ different locales?
Date
Msg-id 3BF75BAE.1070905@sid.tm.ee
In response to Re: Multilingual application, ORDER BY w/ different locales?  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-hackers

Stephan Szabo wrote:

>On Sat, 17 Nov 2001, Tom Lane wrote:
>
>>Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>>
>>>Would it be possible to make a function in plpgsql or whatever that
>>>wrapped the collate changes and then order by that and make functional
>>>indexes?  Would the system use it?
>>>
>>IIRC, we were debating whether we should consider collation to be an
>>attribute of the datatype (think typmod) or an attribute of individual
>>values (think field added to values of textual types).  In the former
>>case, a function like this would only work if we allowed its result to
>>be declared as having the right collate attribute.  Which is not
>>impossible, but we don't currently associate any typmod with function
>>arguments or results, and so I'm not sure how painful it would be.
>>With the field-in-data-value approach it's easy to see how it would
>>work.  But another byte or word per text value might be a high price
>>to pay ...
>>
>
>True.  Although I wonder how things like substring would work in the
>model with typmods if the collation isn't attached in any fashion to
>the return values since I think the substring collation is supposed
>to be the same as the input string's, whereas for something like
>convert it's a different collation based on a parameter. I wonder if
>as a temporary thing, you could use a function that did something
>similar to strxfrm as long as you only used that for sorting purposes.
>
That would mean a new datatype for such a function to return:

CREATE FUNCTION text_with_collation(text,collation) RETURNS text_with_collation

Values of that type would be sorted using the rules of that collation.

This can currently be added in contrib, but should eventually go into core.

The function itself is quite easy, but the collation is the part that
can be done by either

a) writing our own library

b) using the system locale (I think that locale switching is slow in default
glibc, so the following can be slow too:
ORDER BY text_with_collation(t1,'et_EE'), text_with_collation(t1,'fr_CA')
but I doubt anybody uses it)

c) using a third-party library - at least IBM has one which is almost as
big as the whole of PostgreSQL ;)

Assuming that one backend mostly needs one locale at a time, I think that
b) will be the easiest to implement, but this will clash with the current
locale support if it is compiled in, so you would have to be rapidly
switching LC_COLLATE between the default and that of the current datum.
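
To make the switching cost concrete, here is a rough sketch of option b) in Python rather than backend C. The name collation_sort_key is made up for illustration, and the demo uses the "C" locale only because it is always present; names such as 'et_EE' or 'fr_CA' work only if the system has those locales installed.

```python
import locale


def collation_sort_key(text, collation):
    """Return a sort key for text under the given LC_COLLATE locale.

    Hypothetical helper: it switches LC_COLLATE, transforms the string
    with strxfrm, and then restores the previous setting.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    locale.setlocale(locale.LC_COLLATE, collation)  # raises locale.Error if unknown
    try:
        return locale.strxfrm(text)
    finally:
        # Switch back so code relying on the default locale is unaffected.
        locale.setlocale(locale.LC_COLLATE, previous)


words = ["b", "a", "B"]
# One locale switch (and one switch back) per value: this is the
# overhead the paragraph above is worried about.
ordered = sorted(words, key=lambda w: collation_sort_key(w, "C"))
print(ordered)
```

Note that setlocale changes process-wide state, so a sketch like this is only safe when nothing else in the process depends on LC_COLLATE at the same time.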

So what we actually need is a system that will _not_ use locale-aware
functions unless specifically told to do so by feeding it
text_with_collation values.
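
As a loose illustration of that opt-in idea (in Python, not as a proposal for the actual type), plain strings below are compared by code point and never touch the locale, while values wrapped in a hypothetical TextWithCollation class are compared through strxfrm under their own locale. The class name, its methods, and the choice to reject mixed collations are all assumptions made for the sketch.

```python
import functools
import locale


@functools.total_ordering
class TextWithCollation:
    """Hypothetical wrapper: a string tagged with the locale used to compare it."""

    def __init__(self, text, collation):
        self.text = text
        self.collation = collation

    def _key(self):
        # Switch LC_COLLATE only while building the key, then restore it.
        previous = locale.setlocale(locale.LC_COLLATE)
        locale.setlocale(locale.LC_COLLATE, self.collation)
        try:
            return locale.strxfrm(self.text)
        finally:
            locale.setlocale(locale.LC_COLLATE, previous)

    def _check(self, other):
        # Keys built under different locales are not comparable.
        if self.collation != other.collation:
            raise ValueError("cannot compare values with different collations")

    def __eq__(self, other):
        if not isinstance(other, TextWithCollation):
            return NotImplemented
        self._check(other)
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, TextWithCollation):
            return NotImplemented
        self._check(other)
        return self._key() < other._key()


words = ["b", "a", "B"]
plain_sorted = sorted(words)  # code point order, no locale calls at all
tagged_sorted = sorted(TextWithCollation(w, "C") for w in words)
print(plain_sorted, [t.text for t in tagged_sorted])
```

With the "C" locale both orderings come out the same; the point is only that the locale machinery is reached solely through the wrapped values.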

---------------
Hannu



