Thread: Localization or an other solution

Localization or an other solution

From
Erol Oz
Date:
Hi,
As everybody knows, or can at least guess, the Turkish-specific characters
are ordered incorrectly unless you have localization support. (For
example, there is a letter like 's' but with a cedilla under it, 'ş'; this
letter must come right after 's', but without localization support it and
the other non-English letters come after 'z'.)
According to the manual, localization causes a loss of performance.
Besides, I am wary of using localization, which is not familiar to me. For
these reasons, I am trying to find another solution. The one I have in mind is
 - to use a separate field for each of the fields that I want to order by;
 - to put a correctly sortable version of the data in that field;
 - and to use these extra fields for sorting purposes.
 
Example:
Original data: 'şimşek'
In extra field: 'szzimszzek'
Finally, my question is:
In a table with 100,000 or more records, which will be faster:
localization or this approach?
Thanks in advance
Erol Oz
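
The extra-field idea in this message could be sketched roughly as follows, in Python for brevity. Only the 'ş' to 'szz' mapping comes from the example in the message; the other entries of `SORTABLE` and the name `make_sort_key` are assumptions made for illustration, and the sketch handles lowercase text only.

```python
# Hypothetical mapping: each Turkish-specific letter becomes the ASCII letter
# that precedes it in the alphabet plus "zz", so plain character-code ordering
# places it right after that letter. Only the 'ş' entry is from the message.
SORTABLE = {
    "ç": "czz",
    "ğ": "gzz",
    "ı": "hzz",  # dotless i sorts between 'h' and 'i'
    "ö": "ozz",
    "ş": "szz",
    "ü": "uzz",
}


def make_sort_key(text):
    """Return the value that would be stored in the extra, sortable field."""
    return "".join(SORTABLE.get(ch, ch) for ch in text)


words = ["zaman", "şimşek", "su", "simit"]
# Sorting on the derived value stands in for ORDER BY on the extra field.
by_extra_field = sorted(words, key=make_sort_key)
```

One caveat of this encoding: a word that really contains 'zz' after one of the base letters could land on the wrong side of a mapped letter, so the scheme is an approximation rather than a true Turkish collation.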



Re: Localization or an other solution

From
"tjk@tksoft.com"
Date:
Erol,

A possible solution would be to create special functions
for sorting each language you need to sort.

E.g. sortturkish() for sorting Turkish text.

You would implement sortturkish by writing a C
function.

You would need to do string comparisons of Turkish
language strings within the function.

As for the specifics, I hope you know more than I do.
I don't know anything about Turkish, so I can't make
a real attempt. If you just have some special 8-bit
characters, then the job is relatively simple.
Read each letter one at a time and use a lookup
table which has a value assigned to all Turkish letters,
for doing the comparison.
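
The lookup-table comparison described above might look something like the sketch below. It is written in Python rather than C to keep it short, and the names `ALPHABET`, `RANK`, `turkish_key` and `compare_turkish` are made up for illustration. It assumes lowercase input and places any character outside the table after all Turkish letters.

```python
# Turkish lowercase alphabet in dictionary order (29 letters).
ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
# The lookup table: each letter is assigned its position in the alphabet.
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def turkish_key(text):
    """Map each character to its value in the lookup table.

    Characters outside the table (digits, punctuation, q/w/x) are placed
    after all Turkish letters; that choice is an assumption of this sketch.
    """
    return [RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in text]


def compare_turkish(a, b):
    """Three-way comparison: negative, zero or positive, like C's strcmp."""
    ka, kb = turkish_key(a), turkish_key(b)
    return (ka > kb) - (ka < kb)


ordered = sorted(["zil", "şal", "su", "çay", "cam", "ılık", "iki"], key=turkish_key)
```

A C version would follow the same shape, with a 256-entry array indexed by the 8-bit character code in place of the dictionary.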

I am curious if anyone else thinks using functions would
be more flexible/convenient. You couldn't handle issues such
as numbers, commas, etc., but sorting alone would be useful.

Not perfect, but convenient, especially when your primary
language is English and you only want to be able to
handle some tables with text in other languages.


Troy





Re: Localization or an other solution

From
"tjk@tksoft.com"
Date:
Followup on my own message.

I just paused for a second and realized you couldn't
do straight sorting, since then all text would be handled
that way.

It seems the only way external functions could be used
to help you out would be to write a turkishhash
function which assigns a value to all strings, based on
their alphabetic order, in Turkish. Then you could
sort the results of a query with something like this:

select username from testtable order by turkishhash(username);

You would run into trouble with long strings and large
tables, since different strings could end up with the same
hash value. Still, the solution might be better than nothing.
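
To make the idea and its limitation concrete, here is a rough Python sketch of what such a function could compute. The name `turkish_hash`, the 8-letter prefix length and the base-30 encoding are all assumptions of this sketch, not something specified in the thread. With these choices the value stays below 30^8, so it fits in a 64-bit integer.

```python
ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
# Rank 0 is reserved for padding so that a shorter string sorts first.
RANK = {letter: position + 1 for position, letter in enumerate(ALPHABET)}
BASE = len(ALPHABET) + 1


def turkish_hash(text, prefix_len=8):
    """Encode the first prefix_len letters as a single base-30 integer."""
    value = 0
    for i in range(prefix_len):
        # Positions past the end of the string, and characters outside
        # the table, count as padding (an assumption of this sketch).
        digit = RANK.get(text[i], 0) if i < len(text) else 0
        value = value * BASE + digit
    return value


ordered = sorted(["zil", "şal", "su", "sus"], key=turkish_hash)
# Two words sharing their first 8 letters get the same value.
collide = turkish_hash("karşılaştırma") == turkish_hash("karşılaştırmak")
```

The collision in the last line is the long-string problem mentioned above: rows whose values tie come back in no particular order relative to each other.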


Troy



Re: Localization or an other solution

From
Peter Eisentraut
Date:
Erol Oz writes:

> According to the manual, localization causes a loss of performance.

This is no longer an issue in 7.0.

> Besides, I am wary of using localization, which is not familiar to me.

What it amounts to is setting a few environment variables to the effect of
"I'm in Turkey" and the rest will be taken care of. Look for a 'locale'
man page on your system.
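
As a rough illustration of the operating-system locale mechanism referred to here (not of how PostgreSQL itself is configured), the Python sketch below asks the C library to collate strings under a Turkish locale. The locale name `tr_TR.UTF-8` and the helper name `sort_with_locale` are assumptions; `locale -a` lists the names actually installed on a given system.

```python
import locale


def sort_with_locale(words, locale_name="tr_TR.UTF-8"):
    """Sort words using the C library's collation rules for locale_name.

    Returns None when the requested locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm turns each string into a key that compares per the locale.
        return sorted(words, key=locale.strxfrm)
    finally:
        # Restore whatever collation setting was active before.
        locale.setlocale(locale.LC_COLLATE, previous)


result = sort_with_locale(["zil", "şal", "su"])
```

Where the Turkish locale is installed and fully supported by the C library, 'şal' should sort between 'su' and 'zil'; the exact result depends on the platform's locale data.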


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden