Re: convert string function and built-in conversions - Mailing list pgsql-general

From culley harrelson
Subject Re: convert string function and built-in conversions
Msg-id 20031019190627.751BB6FC9E@smtp.us2.messagingengine.com
In response to Re: convert string function and built-in conversions  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Responses Re: convert string function and built-in conversions  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
List pgsql-general
It is one of the extended characters in ISO-8859-1.  This data was taken
from a text field in a SQL_ASCII database.  Basically what I am trying to
do is migrate data from a SQL_ASCII database to a UNICODE database by
running all the data through an external script that does something like:

select convert(my_field using ascii_to_utf_8) from my_table;

then inserts the selected text into an identical table in the UNICODE
database.  All the data goes across, but extended characters such as ñ
are getting munged.  The docs indicate that ascii_to_utf_8 is for
SQL_ASCII -> UNICODE...  Are you saying that ñ isn't really an ASCII
character even though it is valid in a SQL_ASCII database?  I have found
that all extended characters of the various LATIN encodings will work
just fine in my SQL_ASCII database.
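
For illustration only, the re-encoding step of an external script like the
one described above could be sketched in Python as follows.  This assumes
the bytes stored in the SQL_ASCII database really are ISO-8859-1, which is
a guess about the data and not something the database records.  The
function name latin1_bytes_to_utf8 is hypothetical, and this is not a
description of what ascii_to_utf_8 does.

```python
def latin1_bytes_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes that are ASSUMED to be ISO-8859-1 as UTF-8."""
    # Every byte value 0x00-0xFF maps to a code point in ISO-8859-1,
    # so this decode cannot fail; it can only be wrong if the
    # assumption about the source encoding is wrong.
    text = raw.decode("iso-8859-1")
    return text.encode("utf-8")


if __name__ == "__main__":
    # 0xF1 is the single ISO-8859-1 byte for ñ.
    raw = b"lydia eugenia trevi\xf1o"
    print(latin1_bytes_to_utf8(raw))
```

In UTF-8 the ñ becomes the two bytes 0xC3 0xB1, while the plain ASCII
letters around it are unchanged.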

This project is a big can of worms...  Every 6 months I open the can,
stir the worms around a bit, wrinkle my nose, then promptly close the can
again and stuff it away for another 6 months. :)  Wish I could figure it
out.



On Sun, 19 Oct 2003 00:31:43 -0700 (PDT), "Stephan Szabo"
<sszabo@megazone.bigpanda.com> said:
> On Sun, 19 Oct 2003, culley harrelson wrote:
>
> > It seems to me that these values should be the same:
> >
> > select 'lydia eugenia treviño', convert('lydia eugenia treviño' using
> > ascii_to_utf_8);
> >
> > but they seem to be different.  What am I missing?
>
> I don't think the marked n is a valid ascii character (it might be
> extended ascii, but that's different and not really standard afaik).
> You're probably getting the character associated with the lower 7 bits.
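
A small Python sketch of what "the character associated with the lower 7
bits" would mean for ñ, assuming the input byte is the ISO-8859-1 encoding
0xF1.  Whether ascii_to_utf_8 actually masks the high bit this way is the
guess made in the quoted reply, not something confirmed here, and
strip_high_bit is a made-up helper name.

```python
def strip_high_bit(raw: bytes) -> bytes:
    """Clear the top bit of every byte, keeping only the lower 7 bits."""
    return bytes(b & 0x7F for b in raw)


if __name__ == "__main__":
    # "treviño" in ISO-8859-1; ñ is the single byte 0xF1.
    raw = b"trevi\xf1o"
    # 0xF1 & 0x7F == 0x71, which is ASCII 'q'.
    print(strip_high_bit(raw).decode("ascii"))
```

Under that assumption the name would come out as "treviqo": still valid
7-bit ASCII, but no longer the original character.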
