Thread: Encoding Conversion

Encoding Conversion

From
"beer"
Date:
Hello All

So I have an old database that is ASCII_SQL encoded.  For a variety of reasons I need to convert the database to
UNICODE. I did some googling on this but have yet to find anything that looked like a viable option, so i thought I'd
postto the group and see what sort of advice might arise. :) 

TIA

-b


Re: Encoding Conversion

From
jef peeraer
Date:
beer schreef:
 > Hello All
 >
 > So I have an old database that is ASCII_SQL encoded.  For a variety
of reasons I need to convert the database to UNICODE.  I did some
googling on this but have yet to find anything that looked like a viable
option, so i thought I'd post to the group and see what sort of advice
might arise. :)
well i recently struggled with the same problem. After a lot of trial
and error and reading, it seems that an ascii encoded database can't use
its client encoding capabilities ( set client_encoding to utf8 ).
i think the easist solution is to do a dump, recreate the database with
a proper encoding, and restore the dump.

jef peeraer
 >
 > TIA
 >
 > -b
 >
 >
 > ---------------------------(end of broadcast)---------------------------
 > TIP 1: if posting/reading through Usenet, please send an appropriate
 >        subscribe-nomail command to majordomo@postgresql.org so that your
 >        message can get through to the mailing list cleanly
 >

Re: Encoding Conversion

From
Alan Hodgson
Date:
On May 9, 2006 01:03 pm, jef peeraer <jef.peeraer@telenet.be> wrote:
> well i recently struggled with the same problem. After a lot of trial
> and error and reading, it seems that an ascii encoded database can't use
> its client encoding capabilities ( set client_encoding to utf8 ).
> i think the easist solution is to do a dump, recreate the database with
> a proper encoding, and restore the dump.

You also need to convert any non-ASCII encoded characters present in the
dump to UTF-8, prior to restoring it.  If you're lucky, and they're all the
same (ie. LATIN1 or something), you can use iconv to easily do that.
Remember to change the set client_encoding line in the dump file, too.

--
Alan

Re: Encoding Conversion

From
jef peeraer
Date:
beer schreef:
> Hello All
>
> So I have an old database that is ASCII_SQL encoded.  For a variety of reasons I need to convert the database to
UNICODE. I did some googling on this but have yet to find anything that looked like a viable option, so i thought I'd
postto the group and see what sort of advice might arise. :) 
well i recently struggled with the same problem. After a lot of trial
and error and reading, it seems that an ascii encoded database can't use
its client encoding capabilities ( set client_encoding to utf8 ).
i think the easist solution is to do a dump, recreate the database with
a proper encoding, and restore the dump.

jef peeraer
>
> TIA
>
> -b
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>


What's wrong with this SQL?

From
"Adam"
Date:
I'm trying to create a table and I'm getting this error:

      SQL error:


ERROR:  syntax error at or near "(" at character 39
      In statement:
      CREATE TABLE "users" ("user_ID" SERIAL(12), "first_name" character
varying(40) NOT NULL, "last_name" character varying(40) NOT NULL, "password"
character varying(16) NOT NULL, "email" character varying(100) NOT NULL,
"privilege" integer(2) NOT NULL, PRIMARY KEY ("user_ID"))



      What's wrong with this SQL?



Re: What's wrong with this SQL?

From
Tom Lane
Date:
"Adam" <adam@spatialsystems.org> writes:
> I'm trying to create a table and I'm getting this error:
> ERROR:  syntax error at or near "(" at character 39
>       In statement:
>       CREATE TABLE "users" ("user_ID" SERIAL(12),

SERIAL doesn't take a parameter.

            regards, tom lane

Re: What's wrong with this SQL?

From
Chris
Date:
Adam wrote:
> I'm trying to create a table and I'm getting this error:
>
>      SQL error:
>
>
> ERROR:  syntax error at or near "(" at character 39
>      In statement:
>      CREATE TABLE "users" ("user_ID" SERIAL(12), "first_name" character
> varying(40) NOT NULL, "last_name" character varying(40) NOT NULL,
> "password" character varying(16) NOT NULL, "email" character
> varying(100) NOT NULL, "privilege" integer(2) NOT NULL, PRIMARY KEY
> ("user_ID"))

"privilege" integer(2) NOT NULL

Postgres doesn't support integers of different sizes like this.

http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-INT

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: What's wrong with this SQL?

From
John DeSoi
Date:
On May 10, 2006, at 12:06 AM, Adam wrote:

> CREATE TABLE "users" ("user_ID" SERIAL(12), "first_name" character
> varying(40) NOT NULL, "last_name" character varying(40) NOT NULL,
> "password" character varying(16) NOT NULL, "email" character varying
> (100) NOT NULL, "privilege" integer(2) NOT NULL, PRIMARY KEY
> ("user_ID"))

PostgreSQL integers don't have a size property. So SERIAL(12) and
integer(2) are wrong. See the type choices here:

http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-
INT



John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL


Re: Encoding Conversion

From
Rick Gigger
Date:
jef peeraer wrote:
> beer schreef:
>> Hello All
>>
>> So I have an old database that is ASCII_SQL encoded.  For a variety
>> of reasons I need to convert the database to UNICODE.  I did some
>> googling on this but have yet to find anything that looked like a
>> viable option, so i thought I'd post to the group and see what sort
>> of advice might arise. :)
> well i recently struggled with the same problem. After a lot of trial
> and error and reading, it seems that an ascii encoded database can't
> use its client encoding capabilities ( set client_encoding to utf8 ).
> i think the easist solution is to do a dump, recreate the database
> with a proper encoding, and restore the dump.
>
> jef peeraer
>>
>> TIA
>>
>> -b
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>        subscribe-nomail command to majordomo@postgresql.org so that your
>>        message can get through to the mailing list cleanly
>>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>
>

In my experience ASCII_SQL will let you put anything in there.  You need
to figure out the actual encoding of the data.  Is it LATIN1?  Is it
UTF-8?  UTF-16?  I found that my old ASCII_SQL dbs, before they were
converted to unicode, contained 99.9% LATIN1 chars but also had a few
random weird characters thrown in from people copying and pasting from
office.  For instance MS Word uses these non-ascii standard characters
to implement it's "magic quotes" or whatever they call it where the
quotes curl in towards each other.

I had to identify what the bad chars were.  I think that viewing the
dump in vi showed me the hex codes for the non-ascii chars.  Then I
changed the encoding specified at the top of the dump as LATIN1.  Then I
used sed to remove them as I piped it into a postgres unicode db.

Rick