Thread: Covert database from ASCII to UTF-8

Covert database from ASCII to UTF-8

From
Jeff Amiel
Date:
It's come time to bite the bullet and convert a half-terraybyte database from ASCII to UTF8.  Have gone through a bit
ofeffort to track down the unclean ascii text and repair it but would like to avoid the outage of a many-many hour
dump-restore.

Using Postgres 8.4.X.

Are there any other magic options open to me?
Any way to do an in-place conversion?
I assume slony replication is an option.....
What about some sort of wal log shipping replication?

Any thoughts would be appreciated.





Re: Covert database from ASCII to UTF-8

From
Vibhor Kumar
Date:
On Feb 22, 2011, at 10:23 PM, Jeff Amiel wrote:

> It's come time to bite the bullet and convert a half-terraybyte database from ASCII to UTF8.  Have gone through a bit
ofeffort to track down the unclean ascii text and repair it but would like to avoid the outage of a many-many hour
dump-restore.
>
> Using Postgres 8.4.X.
>
> I assume slony replication is an option.....

Right! Replication would help. You can also try Bucardo.

> What about some sort of wal log shipping replication?


WAL Log shipping won't help.


Thanks & Regards,
Vibhor Kumar

Re: Covert database from ASCII to UTF-8

From
Vick Khera
Date:
On Tue, Feb 22, 2011 at 11:53 AM, Jeff Amiel <becauseimjeff@yahoo.com> wrote:
> I assume slony replication is an option.....

this is my plan, once i finish cleaning up the code and the DB data.
you have to ensure that whatever the original DB emits (in the form of
COPY and individual updates later on) will import correctly into the
new DB without error, else you're hosed.

also, I'd do it one table at a time rather than all at once, to
minimize failure cases if there is a problem with one table.


> What about some sort of wal log shipping replication?
>
I don't think you can do that.  This is a binary replication that
copies disk pages.

Re: Covert database from ASCII to UTF-8

From
Geoffrey Myers
Date:
Vibhor Kumar wrote:
> On Feb 22, 2011, at 10:23 PM, Jeff Amiel wrote:
>
>> It's come time to bite the bullet and convert a half-terraybyte database from ASCII to UTF8.  Have gone through a
bitof effort to track down the unclean ascii text and repair it but would like to avoid the outage of a many-many hour
dump-restore.
>>
>> Using Postgres 8.4.X.
>>
>> I assume slony replication is an option.....
>
> Right! Replication would help. You can also try Bucardo.

I'm assuming you're saying you can replicate from an ASCII database to
UTF8?  What happens to the data that is not UTF8 'friendly?'

>
>> What about some sort of wal log shipping replication?
>
>
> WAL Log shipping won't help.
>
>
> Thanks & Regards,
> Vibhor Kumar
>


--
Geoffrey Myers
Myers Consulting Inc.
770.592.1651

Re: Covert database from ASCII to UTF-8

From
Vick Khera
Date:
On Wed, Feb 23, 2011 at 9:06 AM, Geoffrey Myers
<geof@serioustechnology.com> wrote:
> I'm assuming you're saying you can replicate from an ASCII database to UTF8?
>  What happens to the data that is not UTF8 'friendly?'
>

The assumption up-thread was that the data was already made UTF8
friendly in the US_ASCII database.

Re: Covert database from ASCII to UTF-8

From
Dimitri Fontaine
Date:
Jeff Amiel <becauseimjeff@yahoo.com> writes:
> It's come time to bite the bullet and convert a half-terraybyte database
> from ASCII to UTF8.  Have gone through a bit of effort to track down the
> unclean ascii text and repair it but would like to avoid the outage of a
> many-many hour dump-restore.

Those blog articles of mine might be of interest to you:

  http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_1.html
  http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_2.html

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support