Thread: using strxfrm for having multi locale/please vote for adding this function in contribution
using strxfrm for having multi locale/please vote for adding this function in contribution
From
Mahmoud Taghizadeh
Date:
There was a discussion on the PostgreSQL mailing list about using the strxfrm function to add multi-locale support.
Most developers agreed that this method is a correct, though not perfect, solution for supporting multiple locales in PostgreSQL, and some people suggested adding such functions to contrib.
The main problem with such functions is the overhead of setlocale. (That overhead can be avoided when PostgreSQL runs on an OS with a recent glibc that provides the strxfrm_l function.)
I summarized the discussion and added one implementation of such functions. I tried to convince Tom Lane to add this function to contrib, but I failed.
Maybe you are not interested in this subject, but I kindly ask you to give your opinion on the list.
Please state clearly whether you agree or disagree with adding this function to contrib.
Thank you in advance.
With Regards,
--taghi
Attachment
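The attached implementation is not reproduced in this archive. As a rough illustration of the approach described above, and not the posted code, here is a minimal C sketch that switches LC_COLLATE, builds a strxfrm sort key, and switches back. The function name `locale_sort_key` is hypothetical; the two setlocale calls per key are the overhead the message refers to.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the posted patch.
 * Returns a malloc'ed strxfrm() sort key for `src` under `locale_name`,
 * or NULL if the locale cannot be selected or memory runs out.
 * Comparing two keys with strcmp() orders the inputs the way strcoll()
 * would in that locale.
 */
char *
locale_sort_key(const char *src, const char *locale_name)
{
    const char *current;
    char       *saved;
    char       *key = NULL;
    size_t      len;

    assert(src != NULL && locale_name != NULL);

    /* Copy the current setting: later setlocale() calls may overwrite it. */
    current = setlocale(LC_COLLATE, NULL);
    if (current == NULL)
        return NULL;
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return NULL;
    strcpy(saved, current);

    /* First setlocale() call: switch to the requested collation. */
    if (setlocale(LC_COLLATE, locale_name) != NULL)
    {
        /* With a zero-length buffer, strxfrm() only reports the key length. */
        len = strxfrm(NULL, src, 0);
        key = malloc(len + 1);
        if (key != NULL)
            strxfrm(key, src, len + 1);

        /* Second setlocale() call: restore the previous collation. */
        setlocale(LC_COLLATE, saved);
    }

    free(saved);
    return key;
}
```

A caller would sort on the returned keys with strcmp() and free() each key afterwards. Which locale names are accepted depends on what is installed on the system.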
This has been saved for the 8.1 release:

	http://momjian.postgresql.org/cgi-bin/pgpatches2

---------------------------------------------------------------------------

Mahmoud Taghizadeh wrote:
> There was a discussion on the PostgreSQL mailing list about using the strxfrm function to add multi-locale support.
> Most developers agreed that this method is a correct, though not perfect, solution for supporting multiple locales in PostgreSQL, and some people suggested adding such functions to contrib.
>
> The main problem with such functions is the overhead of setlocale. (That overhead can be avoided when PostgreSQL runs on an OS with a recent glibc that provides the strxfrm_l function.)
>
> I summarized the discussion and added one implementation of such functions. I tried to convince Tom Lane to add this function to contrib, but I failed.
>
> Maybe you are not interested in this subject, but I kindly ask you to give your opinion on the list.
> Please state clearly whether you agree or disagree with adding this function to contrib.
>
> Thank you in advance.
>
> With Regards,
> --taghi

Content-Description: nls_sort.tgz

[ Attachment, skipping... ]

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
I think we have concluded that the use of the ICU library is the way we
are going to accomplish multi-locale support in the future.

-- 
  Bruce Momjian
Re: using strxfrm for having multi locale/please vote for adding this function in contribution
From
Greg Stark
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:

> I think we have concluded that the use of the ICU library is the way we
> are going to accomplish multi-locale support in the future.

You did? It really seemed like there was one crowd pushing ICU and hardly
anyone else interested in piling a huge library dependency on Postgres. It
seemed like ICU was only really necessary if you wanted some esoteric
functionality that wasn't entirely explained.

I'm having no trouble handling multi-locale already using the strxfrm
implementation that was posted and refined by several people on the mailing
list.

Yes it's true that on some OSes it wouldn't be tolerably efficient but on
glibc it's more than tolerable. If better solutions (strxfrm_l) become
available at some point in the future then it would be about as efficient as
it could be on platforms where those features are available.

-- 
greg
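For comparison, a minimal sketch of the strxfrm_l variant mentioned above. It assumes a platform that provides the POSIX.1-2008 locale-object interfaces (newlocale, strxfrm_l, freelocale); the function name `locale_sort_key_l` is hypothetical and not part of the implementation discussed on the list. The locale object is created once and passed in, so no global setlocale switch is needed per key.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build a malloc'ed sort key for `src` using an
 * explicit locale object instead of the process-wide locale.
 * Returns NULL if memory runs out.
 */
char *
locale_sort_key_l(const char *src, locale_t loc)
{
    size_t  len;
    char   *key;

    assert(src != NULL);

    /* With a zero-length buffer, strxfrm_l() only reports the key length. */
    len = strxfrm_l(NULL, src, 0, loc);
    key = malloc(len + 1);
    if (key != NULL)
        strxfrm_l(key, src, len + 1, loc);
    return key;
}
```

A caller would obtain `loc` once, for example with newlocale(LC_COLLATE_MASK, name, (locale_t) 0), reuse it for every key, and release it with freelocale() when done.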
Greg Stark wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>
> > I think we have concluded that the use of the ICU library is the way we
> > are going to accomplish multi-locale support in the future.
>
> You did? It really seemed like there was one crowd pushing ICU and hardly
> anyone else interested in piling a huge library dependency on Postgres. It
> seemed like ICU was only really necessary if you wanted some esoteric
> functionality that wasn't entirely explained.

I thought we were willing to require the library for multi-locale builds.

> I'm having no trouble handling multi-locale already using the strxfrm
> implementation that was posted and refined by several people on the mailing
> list.
>
> Yes it's true that on some OSes it wouldn't be tolerably efficient but on
> glibc it's more than tolerable. If better solutions (strxfrm_l) become
> available at some point in the future then it would be about as efficient as
> it could be on platforms where those features are available.

There are some things I think ICU can fix for us like indexing non-C
localed columns.

-- 
  Bruce Momjian
On Mon, Jun 06, 2005 at 10:11:15PM -0400, Bruce Momjian wrote:
> Greg Stark wrote:

> > Yes it's true that on some OSes it wouldn't be tolerably efficient but on
> > glibc it's more than tolerable. If better solutions (strxfrm_l) become
> > available at some point in the future then it would be about as efficient as
> > it could be on platforms where those features are available.
>
> There are some things I think ICU can fix for us like indexing non-C
> localed columns.

Huh, we already do that, don't we?

-- 
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton."
Alvaro Herrera wrote:
> On Mon, Jun 06, 2005 at 10:11:15PM -0400, Bruce Momjian wrote:
> > Greg Stark wrote:
>
> > > Yes it's true that on some OSes it wouldn't be tolerably efficient but on
> > > glibc it's more than tolerable. If better solutions (strxfrm_l) become
> > > available at some point in the future then it would be about as efficient as
> > > it could be on platforms where those features are available.
> >
> > There are some things I think ICU can fix for us like indexing non-C
> > localed columns.
>
> Huh, we already do that, don't we?

Sorry, I meant LIKE index usage for non-C columns. We can do that now
with a special LIKE indexing method, but this would allow normal indexes
to work.

-- 
  Bruce Momjian
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Alvaro Herrera wrote:
>>> There are some things I think ICU can fix for us like indexing non-C
>>> localed columns.
>>
>> Huh, we already do that, don't we?

> Sorry, I meant LIKE index usage for non-C columns. We can do that now
> with a special LIKE indexing method, but this would allow normal indexes
> to work.

Sounds like pie in the sky to me. Exactly how do you think that ICU
will magically mask the fundamental semantic inconsistency?

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Alvaro Herrera wrote:
> >>> There are some things I think ICU can fix for us like indexing non-C
> >>> localed columns.
> >>
> >> Huh, we already do that, don't we?
>
> > Sorry, I meant LIKE index usage for non-C columns. We can do that now
> > with a special LIKE indexing method, but this would allow normal indexes
> > to work.
>
> Sounds like pie in the sky to me. Exactly how do you think that ICU
> will magically mask the fundamental semantic inconsistency?

I am hoping ICU will allow us to see the next greatest value for that
character.

-- 
  Bruce Momjian
Bruce Momjian wrote:
> > Sounds like pie in the sky to me. Exactly how do you think that
> > ICU will magically mask the fundamental semantic inconsistency?
>
> I am hoping ICU will allow us to see the next greatest value for that
> character.

As Tom says, it's a semantic inconsistency, not a lack of features.
Collation (sorting of strings) takes the entire string into account,
pattern matching compares character by character. For example, some
collations compare strings from back to front, whereas a pattern
matching expression could never make sense of that.

The SQL standard actually does not draw that distinction, but, well,
it's broken. Using separate operator classes for separate semantic
interpretations of data seems to be exactly the right solution.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
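To make the "next greatest value" idea concrete, here is a minimal C sketch of the case where it does work: plain byte-wise (C locale) comparison. There a prefix pattern such as 'abc%' corresponds to the range "abc" <= s < "abd", which an ordinary ordered index can scan. Both function names are hypothetical, and this only illustrates the idea; it is not PostgreSQL's code. Under a locale-aware collation that weighs the whole string, no such byte increment is guaranteed to bound the matching strings, which is the inconsistency described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed string that is byte-wise greater
 * than every string starting with `prefix`, or NULL if no such bound exists
 * (empty prefix, or a prefix made only of 0xFF bytes) or memory runs out.
 */
char *
prefix_upper_bound(const char *prefix)
{
    size_t  len;
    char   *bound;

    assert(prefix != NULL);
    len = strlen(prefix);

    /* Trailing 0xFF bytes cannot be incremented, so drop them. */
    while (len > 0 && (unsigned char) prefix[len - 1] == 0xFF)
        len--;
    if (len == 0)
        return NULL;

    bound = malloc(len + 1);
    if (bound == NULL)
        return NULL;
    memcpy(bound, prefix, len);

    /* Increment the last remaining byte to get the exclusive upper bound. */
    bound[len - 1] = (char) ((unsigned char) bound[len - 1] + 1);
    bound[len] = '\0';
    return bound;
}

/* Byte-wise range test: prefix <= s < bound, using strcmp() ordering. */
int
in_prefix_range(const char *s, const char *prefix, const char *bound)
{
    return strcmp(s, prefix) >= 0 && strcmp(s, bound) < 0;
}
```

The range test relies on strcmp(), which compares bytes as unsigned values. Replacing it with strcoll() under a non-C locale would no longer guarantee that exactly the strings with the given prefix fall inside the range.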