Re: Change server encoding after the fact - Mailing list pgsql-general

From Cody Caughlan
Subject Re: Change server encoding after the fact
Date
Msg-id CAPVp=gYBrYqo2tegVHNYFE+jUJBx83wHe__3nt39TAAn8bM1QQ@mail.gmail.com
Whole thread Raw
In response to Re: Change server encoding after the fact  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses Re: Change server encoding after the fact  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: Change server encoding after the fact  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Please see below.

On Fri, Sep 30, 2011 at 11:38 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Fri, Sep 30, 2011 at 12:26 PM, Cody Caughlan <toolbag@gmail.com> wrote:
> Thanks Scott. See below:
>
> On Fri, Sep 30, 2011 at 11:05 AM, Scott Marlowe <scott.marlowe@gmail.com>
> wrote:
>>
>> On Fri, Sep 30, 2011 at 11:25 AM, Cody Caughlan <toolbag@gmail.com> wrote:
>> > I would like to change my server_encoding which is currently SQL_ASCII
>> > to UTF8.
>> >
>> > I have existing data that I would like to keep.
>> >
>> > From my understanding of the steps I need to:
>> >
>> > 1) alter the template1 database encoding via
>> >
>> > UPDATE pg_database SET encoding = 6 where datname IN ('template0',
>> > 'template1');
>>
>> Just create database using template0 as template and you can skip this
>> step ^^
>
>
> Wouldn't this only work if my template0 was UTF8 itself?
> => select datname, pg_encoding_to_char(encoding) from pg_database;
>        datname        | pg_encoding_to_char
> ----------------------+---------------------
>  template1            | SQL_ASCII
>  template0            | SQL_ASCII
>  postgres             | SQL_ASCII
>
> So it appears both template0 & template1 are SQL_ASCII, so how would
> creating from a new DB from template0 be any different than template1?

Well, let's try, shall we?  From a freshly created cluster on my
laptop, running 8.4:

smarlowe=# select datname, pg_encoding_to_char(encoding) from pg_database;
 datname  | pg_encoding_to_char
-----------+---------------------
 template1 | SQL_ASCII
 template0 | SQL_ASCII
 postgres  | SQL_ASCII
 smarlowe  | SQL_ASCII
(4 rows)

smarlowe=# create database j template template0 encoding 'UTF8';
CREATE DATABASE

Seems to work.

P.s. I'm not sure why it works, I just know that it does. :)


Ok, I see what you mean. This would create a new DB with the proper encoding. Which is "fine", and probably what I will do. I guess I see an ideal scenario being one where we permanently convert the template encoding to UTF8 so going forward I dont have to worry about forgetting to adding the encoding= 'UTF8' for every new DB I create.
 
>> > Are these the correct steps to perform or is there an easier / in-place
>> > way?
>>
>> > Also, when I dump my old DB and restore it, will it be converted
>> > appropriately (e.g. it came from am SQL_ASCII encoding and its going into a
>> > UTF-8 database)?
>>
>> You might need to set client encoding when restoring.  Or use iconv to
>> convert from one encoding to another, which is what I usually do.
>> Note that it's VERY likely you'll have data in a SQL_ASCII db that
>> won't go into a UTF8 database without some lossiness.
>
>
> Yes, I see this might be the case. From my playing around with iconv I
> cannot even properly do the conversion:
> $ pg_dump -Fp foo > foo.sql
> $ file -i foo.sql
> foo.sql: text/plain; charset=us-ascii
> $ iconv -f utf-8 foo.sql > utf8.sql
> iconv: illegal input sequence at position 2512661

I think you got it backwards, the -f should be somthing other than
utf-8 right?  That's what the -t should be right?  Try iconv without a
-f switch and a -t of utf-8 and see what happens...

You're right, I had -f when I needed -t. I tried it again with the same error:

$ iconv -t utf-8 foo.sql > utf.sql
iconv: illegal input sequence at position 2512661

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Change server encoding after the fact
Next
From: Scott Marlowe
Date:
Subject: Re: Change server encoding after the fact