Thread: convert function

convert function

From
Tatsuo Ishii
Date:
I have added a new function called "convert", similar to SQL99's
convert. Convert converts text from one encoding to another according to
its parameters. For example, if you have a table named "unicode" in a
Unicode database,

SELECT convert(text_field, 'LATIN1') FROM unicode;

will return the text in ISO-8859-1 representation. This is not very
impressive by itself, because PGCLIENTENCODING could do the same thing.
However, consider a table:

CREATE TABLE mixed (
    french TEXT,    -- ISO-8859-1
    czech TEXT      -- ISO-8859-2
);

Those columns hold text in different languages. Since we do not have
NCHAR yet, we cannot extract data from "mixed" very well. (Note that we
could store the French and Czech data by INSERTing the french value and
then UPDATEing the czech value, and so on; a sketch of this follows the
query below. Actually we could even store them at once, since they are
"binary compatible.") However, using convert (and a Unicode-aware
terminal), you could view them:

SELECT convert(french, 'LATIN1', 'UNICODE'),
       convert(czech, 'LATIN2', 'UNICODE')
FROM mixed;
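
For completeness, here is a minimal sketch of populating "mixed" in the
way described above. The sample values are hypothetical; the point is
that each literal has to reach the backend as raw ISO-8859-1 or
ISO-8859-2 bytes, with no client-side conversion applied:

-- store the French value first, then fill in the Czech column
INSERT INTO mixed (french) VALUES ('café');                -- ISO-8859-1 bytes
UPDATE mixed SET czech = 'děkuji' WHERE french = 'café';   -- ISO-8859-2 bytes

-- or, since TEXT accepts any byte sequence, both at once
INSERT INTO mixed (french, czech) VALUES ('café', 'děkuji');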

Convert is especially useful if you want to sort Unicode data according
to a specific locale. For example:

SELECT * FROM unicode order by convert(text_field, 'LATIN1');

should return data in the proper sort order if you have enabled a
European locale. Note that to make the above possible, you will need to
build with all of the --enable-multibyte, --enable-locale, and
--enable-unicode-conversion options.
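
To illustrate why the conversion matters for sorting, here is a
hypothetical comparison (the data is made up, and the behaviour assumes
a LATIN1-based European locale):

-- ordering applied to the raw UTF-8 bytes; accented words typically
-- end up out of place
SELECT text_field FROM unicode ORDER BY text_field;

-- ordering applied to the ISO-8859-1 bytes, which is what the European
-- locale expects, so accented words sort where you would expect them
SELECT text_field FROM unicode ORDER BY convert(text_field, 'LATIN1');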

See Table 4-7 in the string function section in the docs for more
details.

Testing is welcome. I only understand Japanese and English!
--
Tatsuo Ishii



Re: convert function

From
Tatsuo Ishii
Date:
> I have added a new function called "convert", similar to SQL99's
> convert. Convert converts text from one encoding to another according to
> its parameters. For example, if you have a table named "unicode" in a
> Unicode database,

Forgot to mention that anyone who wants to try the new function should
run initdb. I did not increment the version id, so people who are not
interested in the function are not forced to do an initdb.
--
Tatsuo Ishii


Re: convert function

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I have added a new function called "convert", similar to SQL99's
> convert. Convert converts text from one encoding to another according to
> its parameters. For example, if you have a table named "unicode" in a
> Unicode database,

> SELECT convert(text_field, 'LATIN1') FROM unicode;

> will return the text in ISO-8859-1 representation.

I don't understand how this works.  If you have a multibyte-enabled
backend, won't backend libpq try to convert all outgoing text to
whatever PGCLIENTENCODING says?  How can it know that one particular
column of a result set is not in the regular encoding of this database,
but something else?  Seems like libpq is going to mess up the results
by applying an inappropriate multibyte conversion.
        regards, tom lane


Re: convert function

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > I have added a new function called "convert", similar to SQL99's
> > convert. Convert converts text from one encoding to another according to
> > its parameters. For example, if you have a table named "unicode" in a
> > Unicode database,
> 
> > SELECT convert(text_field, 'LATIN1') FROM unicode;
> 
> > will return the text in ISO-8859-1 representation.
> 
> I don't understand how this works.  If you have a multibyte-enabled
> backend, won't backend libpq try to convert all outgoing text to
> whatever PGCLIENTENCODING says?  How can it know that one particular
> column of a result set is not in the regular encoding of this database,
> but something else?  Seems like libpq is going to mess up the results
> by applying an inappropriate multibyte conversion.

If the encodings of the frontend and backend are the same, no conversion
is applied by libpq.
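
As a sketch of that case (assuming a multibyte-enabled build, where SET
CLIENT_ENCODING is available): once the client encoding matches the
database encoding, libpq passes the result bytes through untouched, and
convert alone determines the representation of each column:

-- make the frontend encoding match the UNICODE backend, so libpq does
-- not re-convert the result
SET CLIENT_ENCODING TO 'UNICODE';

-- the bytes produced by convert() now reach the client as-is
SELECT convert(french, 'LATIN1', 'UNICODE'),
       convert(czech, 'LATIN2', 'UNICODE')
FROM mixed;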
--
Tatsuo Ishii