Thread: Localization or another solution
Hi,

As everybody knows, or can at least guess, Turkish-specific characters are ordered incorrectly unless you have localization support. (For example, there is a letter like 's' but with a small mark (cedilla) under it, 'ş'; this letter must come right after 's', but without localization support it and the other non-English letters come after 'z'.)

According to the manual, localization causes a loss of performance. Besides, I am wary of using localization, which I am not familiar with. For these reasons, I am trying to find another solution. The one I have in mind is:
- to use a separate field for each of the fields that I want to order by;
- to put a correctly sortable version of the data in that field;
- and to use these extra fields for sorting purposes.

Example:
Original data: 'şimşek'
In extra field: 'szzimszzek'

Finally, my question is: in a table which has 100,000 or more records, which will be faster, localization or this approach?

Thanks in advance,
Erol Oz
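For readers following along, the extra-field idea can be sketched in a few lines of Python. Only the 'ş' to 'szz' replacement comes from the example above; the rest of the table, the names SORT_KEY_MAP and make_sort_key, and the handling of dotless 'ı' (which precedes 'i' in Turkish) are assumptions made for illustration. The sketch also assumes lowercase input.

```python
# Hypothetical replacement table, extending the 'ş' -> 'szz' example.
# Each special letter becomes its ASCII neighbour plus 'zz', so it sorts
# right after that neighbour in plain byte order.
SORT_KEY_MAP = {
    "ç": "czz",
    "ğ": "gzz",
    "ı": "hzz",  # assumption: dotless ı precedes i, so key it just after h
    "ö": "ozz",
    "ş": "szz",
    "ü": "uzz",
}


def make_sort_key(text):
    """Return the ASCII string that would be stored in the extra field."""
    return "".join(SORT_KEY_MAP.get(ch, ch) for ch in text)


words = ["zeytin", "şimşek", "su", "simit"]
print(make_sort_key("şimşek"))            # szzimszzek
print(sorted(words, key=make_sort_key))   # ['simit', 'su', 'şimşek', 'zeytin']
```

The scheme is only approximate: an ordinary word that happens to contain a letter followed by 'zz' could be placed incorrectly relative to a word with the special letter.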
Erol,

A possible solution would be to create a special sorting function for each language you need to sort, e.g. sortturkish() for Turkish text. You would implement sortturkish as a C function that does the comparison of Turkish strings itself.

As for the specifics, I hope you are more knowledgeable than I am. I don't know the first thing about Turkish, so I can't make a real attempt. If you just have some special 8-bit characters, then the job is relatively simple: read the strings one letter at a time and compare them using a lookup table that assigns a value to every Turkish letter.

I am curious whether anyone else thinks using functions would be more flexible or convenient. You couldn't handle issues such as numbers, commas, etc., but sorting alone would be useful. Not perfect, but convenient, especially when your primary language is English and you only want to handle some tables with text in other languages.

Troy
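The lookup-table comparison described above might look roughly like the following. This is a Python sketch rather than the C function the message suggests, and it works on Unicode strings rather than 8-bit characters. The names TURKISH_ALPHABET, letter_rank and compare_turkish are hypothetical, and the sketch assumes lowercase input.

```python
from functools import cmp_to_key

# The 29 lowercase letters of the Turkish alphabet, in alphabetical order.
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
RANK = {letter: position for position, letter in enumerate(TURKISH_ALPHABET)}


def letter_rank(ch):
    # Assumption: anything outside the table sorts after every Turkish
    # letter, ordered by code point.
    return RANK.get(ch, len(TURKISH_ALPHABET) + ord(ch))


def compare_turkish(a, b):
    """Return -1, 0 or 1, comparing letter by letter via the lookup table."""
    for x, y in zip(a, b):
        rank_x, rank_y = letter_rank(x), letter_rank(y)
        if rank_x != rank_y:
            return -1 if rank_x < rank_y else 1
    # All shared positions are equal: the shorter string sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))


words = ["zeytin", "şimşek", "su", "simit", "iyi", "ılık"]
print(sorted(words, key=cmp_to_key(compare_turkish)))
# ['ılık', 'iyi', 'simit', 'su', 'şimşek', 'zeytin']
```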
A follow-up on my own message. I just paused for a second and realized you couldn't replace the sorting outright, since then all text would be handled that way. It seems the only way external functions could help you out would be to write a turkishhash function which assigns a value to every string based on its alphabetical order in Turkish. Then you could sort the results of a query with something like this:

select username from testtable order by turkishhash(username);

You would get into trouble with long strings and large tables, as different strings would receive the same hash value. The solution might be better than nothing, though.

Troy
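One way to picture such a function, and the collision problem with long strings, is the sketch below. It is written in Python for illustration, not as the database function itself. It assumes lowercase input and encodes only the first PREFIX_LENGTH letters as digits of a fixed-width number; the names turkish_hash, PREFIX_LENGTH, DIGIT, OTHER and BASE are all hypothetical.

```python
# The 29 lowercase letters of the Turkish alphabet, in alphabetical order.
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
# Digit 0 is reserved for padding, digits 1..29 are the letters.
DIGIT = {letter: position + 1 for position, letter in enumerate(TURKISH_ALPHABET)}
OTHER = len(TURKISH_ALPHABET) + 1   # assumption: any other character sorts last
BASE = OTHER + 1
PREFIX_LENGTH = 8                   # assumption: keeps the value inside 64 bits


def turkish_hash(text):
    """Map a string to an integer that preserves Turkish order of its prefix."""
    value = 0
    for index in range(PREFIX_LENGTH):
        # Short strings are padded with 0 so that 'su' sorts before 'susam'.
        digit = DIGIT.get(text[index], OTHER) if index < len(text) else 0
        value = value * BASE + digit
    return value


print(turkish_hash("su") < turkish_hash("şimşek"))                # True
print(turkish_hash("kitaplıkta") == turkish_hash("kitaplıklar"))  # True
```

The second line of output shows the limitation: two different strings that share their first eight letters receive the same value, so their relative order is left undefined.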
Erol Oz writes:

> According to the manual, localization causes the loose of performance.

This is no longer an issue in 7.0.

> Beside, I scare to use localization which is not familiar to me.

What it amounts to is setting a few environment variables to the effect of "I'm in Turkey" and the rest will be taken care of. Look for a 'locale' man page on your system.

--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
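To get a feel for what the locale environment variables change, the following Python sketch sorts a list using the collation rules of whatever locale the environment selects. It illustrates locale-aware collation in general and is not the database setup itself. The function name sort_with_locale is hypothetical, and the fallback to plain code-point order when the locale is unavailable is an assumption added so the sketch runs anywhere.

```python
import locale


def sort_with_locale(words, locale_name=""):
    """Sort words by the collation rules of locale_name.

    An empty locale_name means "take the locale from the environment"
    (LC_ALL, LC_COLLATE, LANG).
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(words, key=locale.strxfrm)
    except (locale.Error, OSError):
        # Assumption: fall back to plain ordering if the locale is not
        # installed or the platform cannot transform these strings.
        return sorted(words)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


print(sort_with_locale(["zeytin", "şimşek", "su", "simit"]))
```

Running the script with a Turkish locale in the environment, for example LC_COLLATE=tr_TR.UTF-8 (the exact locale name is an assumption and varies by system; `locale -a` lists the installed ones), should place 'şimşek' between 'su' and 'zeytin'.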