Thread: Czech2ASCII with --mb=Latin2
Hi,

I have a database in Latin2 encoding (Czech stuff) and Latin2/Win1250 on-the-fly recoding with 'set client_encoding' works smoothly. Now, when I set client encoding to SQL_ASCII, accented characters are converted to (hexa) codes. Is there any (simple) way to make this recoding convert accented characters to just the chars themselves but without accents? Thanks in advance.

- Robert

P.S. Moreover, non-Czech speakers tend to search the database with words without accents, so it would be useful to make this conversion work in the other direction: name LIKE 'ceske%' would also return names starting with the accented version.

P.S.2 I could do this quite easily in Perl at the application level, but I don't want to start programming before I'm sure there's no standard postgres solution.
On Wed, 15 Dec 1999, Robert wrote:

> Hi,
>
> I have a database in Latin2 encoding (Czech stuff) and Latin2/Win1250
> on-the-fly recoding with 'set client_encoding' works smoothly. Now, when
> I set client encoding to SQL_ASCII, accented characters are converted to
> (hexa) codes. Is there any (simple) way to make this recoding convert
> accented characters to just the chars themselves but without accents?
> Thanks in advance.
>
> - Robert

Ahoj :-)

If I remember correctly, PgSQL has no routine for this (IMHO it is language-specific, and writing a general routine that covers all languages, encodings, etc. is a problem). But you can easily write this in C or Tcl.

Karel
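As a rough illustration of the kind of routine Karel has in mind, here is a minimal sketch in Python rather than C or Tcl. It assumes the text has already been decoded from Latin2 into a normal string, and it only covers the Czech accented letters. The names `strip_czech_accents` and `starts_with_unaccented` are made up for this example and are not part of PostgreSQL.

```python
# Hypothetical helper names; a sketch, not a PostgreSQL feature.
# Czech accented lowercase letters and their unaccented counterparts,
# listed position by position.
CZECH_ACCENTED = "áčďéěíňóřšťúůýž"
ASCII_BASE = "acdeeinorstuuyz"

# One translation table covering both lowercase and uppercase letters.
_TABLE = str.maketrans(
    CZECH_ACCENTED + CZECH_ACCENTED.upper(),
    ASCII_BASE + ASCII_BASE.upper(),
)


def strip_czech_accents(text):
    """Replace Czech accented letters with their plain ASCII base letters."""
    return text.translate(_TABLE)


def starts_with_unaccented(name, prefix):
    """Accent- and case-insensitive prefix match, similar in spirit to
    name LIKE 'ceske%' matching names that start with the accented form."""
    return strip_czech_accents(name).lower().startswith(
        strip_czech_accents(prefix).lower()
    )


if __name__ == "__main__":
    print(strip_czech_accents("České Budějovice"))
    print(starts_with_unaccented("České Budějovice", "ceske"))
```

The second function addresses the search direction from Robert's P.S.: both the stored name and the search term are stripped before comparing, so the application does not need to guess where the accents belong.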
On 1999-12-15, Robert mentioned:

> I have a database in Latin2 encoding (Czech stuff) and Latin2/Win1250
> on-the-fly recoding with 'set client_encoding' works smoothly. Now, when
> I set client encoding to SQL_ASCII, accented characters are converted to
> (hexa) codes. Is there any (simple) way to make this recoding convert
> accented characters to just the chars themselves but without accents?

I think this sort of thing has been the dream of many folks using internationalized software, but it's not that easy. Perhaps one could write a function that does this sort of conversion, which would have to keep a gigantic table internally.

However, perhaps in your language it's customary to just leave off the diacritic marks if they're not available, but in other languages such as Swedish or German there are rules about converting those to sequences of other letters. And if you start encoding rules of natural languages into software, oh boy ...

--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
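To make Peter's point concrete, here is a small Python sketch of how language-specific rules might sit on top of a generic "drop the diacritic" fallback. Everything in it is an assumption made for illustration: the function name `to_ascii_like`, the `lang` parameter, and the tiny German rule table (ä to ae, ö to oe, ü to ue, ß to ss). The fallback uses Unicode decomposition from Python's standard `unicodedata` module instead of a hand-written gigantic table.

```python
import unicodedata

# Hypothetical per-language rules; only German is filled in here.
GERMAN_RULES = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}
LANGUAGE_RULES = {"de": GERMAN_RULES}


def to_ascii_like(text, lang=None):
    """Apply language-specific replacements first, then strip any
    remaining combining marks (accents) from the decomposed text."""
    rules = LANGUAGE_RULES.get(lang, {})
    replaced = "".join(rules.get(ch, ch) for ch in text)
    # NFD splits a letter such as 'ř' into 'r' plus a combining caron.
    decomposed = unicodedata.normalize("NFD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(to_ascii_like("Müller", "de"))
    print(to_ascii_like("Müller"))
    print(to_ascii_like("Dvořák", "cs"))
```

The same input gives different results depending on the language passed in, which is exactly why a single recoding rule inside the server would be hard to get right for everyone. Letters that have no Unicode decomposition pass through the fallback unchanged, so a real routine would still need extra table entries for them.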