Re: Significance of Database Encoding - Mailing list pgsql-sql

From Rajesh Mallah
Subject Re: Significance of Database Encoding
Date
Msg-id 20050516021650.78342.qmail@web31014.mail.mud.yahoo.com
In response to Significance of Database Encoding  (Rajesh Mallah <mallah_rajesh@yahoo.com>)
List pgsql-sql
--- PFC <lists@boutiquenumerique.com> wrote:
> 
> > $ iconv -f US-ASCII -t UTF-8  < test.sql > out.sql
> > iconv: illegal input sequence at position 114500
> >
> > Any ideas how the job can be accomplished reliably?
> >
> > Also my database may contain data in multiple encodings
> > like WINDOWS-1251 and WINDOWS-1256 in various places
> > as data has been inserted by different people using
> > different sources and client software.
> 
>     You could use a simple program like this (in Python):
> 
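> # Python 3 version: read and write bytes so each line can be decoded explicitly.
> with open("your dump", "rb") as dump, open("unidump", "wb") as output:
>     for line in dump:
>         # Try each candidate encoding in order; the first that decodes wins.
>         for encoding in "utf-8", "iso-8859-15", "whatever":
>             try:
>                 output.write(line.decode(encoding).encode("utf-8"))
>                 break
>             except UnicodeError:
>                 pass
>         else:
>             print("No suitable encoding for line...")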


This may not work, because the conversion to UTF-8 can succeed (no runtime error)
even for an incorrect guess of the original encoding, but the result will be
incorrect UTF-8 text.
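
To illustrate the concern, here is a minimal sketch in Python 3. The sample
string is an assumption made up for the demonstration (it stands in for text
stored by a WINDOWS-1251 client); it is not taken from the actual dump. A wrong
guess of iso-8859-15 decodes without raising, while a strict UTF-8 decode of
the same bytes does fail.

```python
# Hypothetical sample: Russian text as a WINDOWS-1251 client might have stored it.
original = "Привет"
raw = original.encode("windows-1251")

# Strict UTF-8 decoding rejects these bytes, so trying UTF-8 first is a useful check.
try:
    raw.decode("utf-8")
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True

# A wrong single-byte guess still "succeeds": Python's iso-8859-15 codec maps every byte.
wrong = raw.decode("iso-8859-15")
converted = wrong.encode("utf-8")  # valid UTF-8, but not the original text

# Only the correct source encoding recovers the text.
right = raw.decode("windows-1251")

print("utf-8 rejected the bytes:", utf8_rejected)
print("wrong guess:  ", repr(wrong))
print("correct guess:", repr(right))
```

So a loop that stops at the first encoding that decodes cleanly cannot tell
WINDOWS-1251 from iso-8859-15; some other signal (per-table knowledge, or
inspection of the result) would be needed.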

Regds
Rajesh Kumar Mallah


> 
>     I'd say this might work, if UTF-8 cannot absorb an apostrophe inside a
> multibyte character. Can it?
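
On the apostrophe question: in UTF-8, every byte of a multibyte sequence has
its high bit set (0x80 or above), so the apostrophe byte 0x27 cannot occur
inside one. A minimal sketch that checks this over all encodable code points;
the function name is made up for this illustration, and the claim covers UTF-8
only, not other multibyte encodings.

```python
def ascii_bytes_inside_multibyte(code_points):
    """Return code points whose multibyte UTF-8 form contains a byte below 0x80."""
    offenders = []
    for cp in code_points:
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8
        encoded = chr(cp).encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            offenders.append(cp)
    return offenders

# Every non-ASCII code point, U+0080 through U+10FFFF.
offenders = ascii_bytes_inside_multibyte(range(0x80, 0x110000))
print("code points with an ASCII byte in their encoding:", len(offenders))

# The apostrophe itself is always the single byte 0x27.
assert "'".encode("utf-8") == b"\x27"
```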
> 
>     Or you could do that to all your table using SELECTs but it's going to be  
> painful...
> 

    

