Re: MSSQL to PostgreSQL : Encoding problem - Mailing list pgsql-general

From Magnus Hagander
Subject Re: MSSQL to PostgreSQL : Encoding problem
Date
Msg-id 6BCB9D8A16AC4241919521715F4D8BCEA35977@algol.sollentuna.se
In response to Re: MSSQL to PostgreSQL : Encoding problem  (Bruce Momjian <bruce@momjian.us>)
List pgsql-general
> > > I thought Win1252 was supposed to be almost the same as Latin1.
> > > While I'd expect certain differences, I wouldn't expect it to use
> > > 0x00 as data!
> > >
> > > Maybe you could have DTS export Unicode, which would presumably be
> > > UTF-16, then recode that to something else (possibly UTF-8) with GNU
> > > iconv.
> >
> > UTF-16 ! That's something I haven't tried !
> > I'll try an iconv conversion tomorrow from UTF16 to UTF8 !
>
> Right!  To clarify, Unicode is the character set, and UTF8
> and UTF16 are ways of representing that character set in
> 8-bit and 16-bit segments, respectively.  PostgreSQL only
> supports UTF8, and Win32 only supports UTF16 in the
> operating system.  And 0x00 is not a valid value in any of
> those, that I know of, but perhaps it is in UTF16.

Actually, Win32 supports UTF8 as well. There are a few operations that
aren't supported on it, but you can certainly read and write files in it
from most builtin apps.

One other problem is that most (all) Win32 documentation says UNICODE
when it means UTF16 (in <= NT4, UCS-2). And PostgreSQL used to say
UNICODE when we meant UTF8. That adds to the confusion.

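Finally, UTF-8 does not represent each character in 8 bits - it can use
anything from 8 to 32 bits per character. UTF-16 works in 16-bit units
(one unit for most characters, two for characters outside the Basic
Multilingual Plane). This also means that you can't talk about "0x00
being valid" in UTF-16, because every unit is 16 bits wide. The NUL
character would be "0x0000" or "0x00 0x00", and an ordinary ASCII
character such as 'A' is 0x0041, which contains a 0x00 byte. Handling
that requires an application that knows UTF16, which PostgreSQL doesn't,
so it reports an error on the first 0x00 byte.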
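
To make the byte-level point concrete, here is a minimal Python sketch. It
is only an illustration: the sample string and the helper name
recode_utf16_to_utf8 are made up for this example, and the helper does in
Python roughly what the iconv recoding suggested earlier in the thread
would do.

```python
# Illustrative sketch: the sample string and helper name are hypothetical.
sample = "A\u00e9\u20ac"  # 'A', e-acute, euro sign

utf8 = sample.encode("utf-8")
utf16 = sample.encode("utf-16-le")  # little-endian, no byte-order mark

# UTF-8 is variable width: these three characters take 1, 2 and 3 bytes.
utf8_lengths = [len(ch.encode("utf-8")) for ch in sample]

# In UTF-16 the ASCII letter 'A' (U+0041) becomes the bytes 0x41 0x00,
# so a UTF-16 export of ordinary text is full of 0x00 bytes.
has_nul_utf16 = b"\x00" in utf16
has_nul_utf8 = b"\x00" in utf8

# A character outside the Basic Multilingual Plane needs two 16-bit units.
astral_utf16_len = len("\U0001F600".encode("utf-16-le"))


def recode_utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 bytes (byte-order mark honoured if present)
    and re-encode the text as UTF-8."""
    return data.decode("utf-16").encode("utf-8")
```

After recoding, text that has no NUL characters has no 0x00 bytes either,
which is why converting the export to UTF-8 before loading should avoid
the error described above.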

//Magnus
