Thread: convert function
I have added a new function called "convert", similar to SQL99's convert.
Convert converts encoding according to its parameters. For example, if you
have a table named "unicode" in a Unicode database,

  SELECT convert(text_field, 'LATIN1') FROM unicode;

will return text in ISO-8859-1 representation. This is not very impressive
by itself, because PGCLIENTENCODING could do the same thing. However,
consider a table:

  CREATE TABLE mixed (
      french TEXT,  -- ISO-8859-1
      czech  TEXT   -- ISO-8859-2
  );

These columns hold text in different languages. Since we do not have NCHAR
yet, we cannot extract data from "mixed" very well. (Note that we could
store french and czech data by INSERTing french and then UPDATEing czech,
and so on. Actually we could even store them at once, since they are
"binary compatible.") However, using convert (and if you have a
Unicode-aware terminal), you can view them:

  SELECT convert(french, 'LATIN1', 'UNICODE'),
         convert(czech, 'LATIN2', 'UNICODE')
  FROM mixed;

Convert is especially useful if you want to sort Unicode data according to
a specific locale. For example:

  SELECT * FROM unicode ORDER BY convert(text_field, 'LATIN1');

should return data in the proper sort order if you enable a European
locale.

Note that to make the above possible, you will need to turn on all of the
--enable-multibyte, --enable-locale, and --enable-unicode-conversion
options.

See Table 4-7 in the string function section of the docs for more details.

Testing is welcome. I only understand Japanese and English!
--
Tatsuo Ishii
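Putting the pieces above together, a complete session might look like the following sketch. The sample French and Czech strings are hypothetical; it assumes a database created with Unicode encoding and a backend built with the three configure options mentioned above:

```sql
-- Requires a backend built with --enable-multibyte, --enable-locale,
-- and --enable-unicode-conversion, and a database created with
-- Unicode encoding.

CREATE TABLE mixed (
    french TEXT,  -- stored as ISO-8859-1 (LATIN1) bytes
    czech  TEXT   -- stored as ISO-8859-2 (LATIN2) bytes
);

-- Hypothetical sample data, inserted in each column's own encoding:
INSERT INTO mixed VALUES ('garçon', 'čaj');

-- Convert each column from its own encoding to Unicode so both can
-- be viewed together on a Unicode-aware terminal:
SELECT convert(french, 'LATIN1', 'UNICODE'),
       convert(czech,  'LATIN2', 'UNICODE')
FROM mixed;
```

The two-argument form (`convert(text_field, 'LATIN1')`) converts from the database encoding to the named encoding; the three-argument form names both the source and destination encodings explicitly.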
> I have added a new function called "convert", similar to SQL99's convert.
> Convert converts encoding according to its parameters. For example, if you
> have a table named "unicode" in a Unicode database,

Forgot to mention that anyone who wants to try the new function should run
initdb. I did not increment the version id, so that people who are not
interested in the function are not forced to do an initdb.
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I have added a new function called "convert", similar to SQL99's convert.
> Convert converts encoding according to its parameters. For example, if you
> have a table named "unicode" in a Unicode database,
>
> SELECT convert(text_field, 'LATIN1') FROM unicode;
>
> will return text in ISO-8859-1 representation.

I don't understand how this works. If you have a multibyte-enabled
backend, won't backend libpq try to convert all outgoing text to whatever
PGCLIENTENCODING says? How can it know that one particular column of a
result set is not in the regular encoding of this database, but something
else? Seems like libpq is going to mess up the results by applying an
inappropriate multibyte conversion.

			regards, tom lane
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > I have added a new function called "convert", similar to SQL99's convert.
> > Convert converts encoding according to its parameters. For example, if you
> > have a table named "unicode" in a Unicode database,
> >
> > SELECT convert(text_field, 'LATIN1') FROM unicode;
> >
> > will return text in ISO-8859-1 representation.
>
> I don't understand how this works. If you have a multibyte-enabled
> backend, won't backend libpq try to convert all outgoing text to
> whatever PGCLIENTENCODING says? How can it know that one particular
> column of a result set is not in the regular encoding of this database,
> but something else? Seems like libpq is going to mess up the results
> by applying an inappropriate multibyte conversion.

If the encodings of the frontend and backend are the same, no conversion
is applied by libpq.
--
Tatsuo Ishii
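To illustrate that point, here is a hypothetical session in which the client encoding is set to match the database encoding, so libpq performs no conversion of its own and the bytes produced by convert() reach the client untouched:

```sql
-- The database here is assumed to have Unicode encoding. Setting the
-- client encoding to the same value means the backend applies no
-- automatic client-encoding conversion on outgoing data.
SET CLIENT_ENCODING TO 'UNICODE';

-- convert() produces ISO-8859-1 bytes, which pass through to the
-- client byte-for-byte rather than being re-converted:
SELECT convert(text_field, 'LATIN1') FROM unicode;
```

If the client encoding differed from the database encoding, the backend would apply its usual conversion on top of convert()'s output, which is the mismatch Tom is asking about.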