Re: Error while loading sql file - Mailing list pgsql-general
From | Adarsh Sharma |
---|---|
Subject | Re: Error while loading sql file |
Date | |
Msg-id | 4EF95879.6080406@orkash.com |
In response to | Re: Error while loading sql file (Alban Hertroys <haramrae@gmail.com>) |
Responses | Re: Error while loading sql file (three replies) |
List | pgsql-general |
Thanks for the explanation,
I find it hard to work out how to store data that arrives in different encodings in PostgreSQL. Below is a sample of the data:
INSERT INTO conceptnet_frame VALUES(3884,'ja','{1}は{2}を持っている。',16,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3885,'ja','{1}は{2}と同じくらい大きい。',31,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3886,'ja','{1}は{2}と同じくらいの大きさである。',31,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3887,'ja','{1}は{2}と同じくらい小さい。',31,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3888,'ja','{1}の痛みの強度は、{2}と同じくらい。',29,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3889,'ja','{1}の痛み方は、{2}に似ている。',28,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3890,'ja','{1}は、{2}のインスタンスである。',14,3,2140,NULL,NULL,NULL);
INSERT INTO conceptnet_frame VALUES(3891,'ja','{2}に成功したい時、{1}は一般的だ。',9,3,2140,NULL,NULL,NULL);
The link below explains the problem:
http://www.depesz.com/index.php/2010/03/07/error-invalid-byte-sequence-for-encoding/
According to the link above, this data is in UTF-16, which PostgreSQL 8.4 does not support.
Is there any way to store data from a different encoding in a UTF-8 database?
Happy Holidays!
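
As a rough illustration of the question above: a UTF-8 database can hold Japanese text as long as the bytes sent to it are valid UTF-8, so one option is to transcode the dump before loading it. The sketch below assumes the source encoding of the file is known and is the same for the whole file. The function name `transcode_to_utf8` is hypothetical, and the UTF-16 input is simulated here rather than taken from the real dump.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes strictly from a known source encoding, then re-encode as UTF-8.

    Strict decoding raises UnicodeDecodeError instead of silently producing
    garbage when the bytes are not valid in source_encoding.
    """
    text = raw.decode(source_encoding, errors="strict")
    return text.encode("utf-8")


# One statement from the sample data, simulated as UTF-16 (an assumption:
# the real file's encoding has to be established first).
sample = (
    "INSERT INTO conceptnet_frame VALUES"
    "(3884,'ja','{1}は{2}を持っている。',16,3,2140,NULL,NULL,NULL);"
)
utf16_bytes = sample.encode("utf-16")

utf8_bytes = transcode_to_utf8(utf16_bytes, "utf-16")

# The Japanese text survives the round trip unchanged.
assert utf8_bytes.decode("utf-8") == sample
```

If the whole file is in a single encoding that PostgreSQL accepts as a client encoding, setting `client_encoding` for the session before loading may be an alternative to converting the file. UTF-16 is not one of those encodings, so a file in UTF-16 would still need converting first.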
Alban Hertroys wrote:
On 26 Dec 2011, at 8:22, Adarsh Sharma wrote:

> Dear all,
> I am facing a unique issue when I try to load an sql into a postgresql database :-

Actually, your issue isn't unique at all. You'll find it reoccurs on this list regularly, although perhaps less frequent these days.

> I faced an issue some days ago & I solved the issue by the below command :
> cat backup.sql | recode iso-8859-1..u8 > backup.sql

That command assumes that every string in the sql file is encoded as iso-8859-1 (unless it already is unicode).

> But this time the byte sequence changes to Japanese , & I fail to solve the issue. Please let me know how to solve the issue as typing the error in Google shows only one link: ( http://blog.e-shell.org/134 )

The above recode command works for the guys in the blog post you linked, as they were converting a database with Spanish data to UTF-8. They knew what encoding they were coming from. In your case, you have a mixed bag of encodings, going all the way from latin encodings to japanese. I'm not sure what recode would do to data that's in a different encoding than the specified source encoding - I expect that it will just assume it's in the specified source encoding (it cannot know that this isn't the case for a particular string) and attempt to convert it to UTF-8 _using that encoding_.

Chances are you just converted valid data in a different encoding (than the source encoding you specified) into garbage in UTF-8...

I seem to recall that if recode runs into problems recoding a string to UTF-8 it will leave it untouched, but that will NOT happen in all cases. Sometimes it will succeed, even though the result has no meaning to a human.

That's a nasty problem you ran into, I hope the archives provide the wisdom you need.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.
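
The failure mode described above can be reproduced with Python's codecs. This is only an analogy for the general principle, not a model of what recode itself does, and both function names are hypothetical. The first function converts while blindly assuming a source encoding. The second leaves data alone when it already decodes as UTF-8 and falls back to an assumed encoding otherwise.

```python
def recode_assuming(raw: bytes, assumed_source: str) -> bytes:
    """Convert to UTF-8 while trusting the caller's claim about the source encoding."""
    return raw.decode(assumed_source).encode("utf-8")


def to_utf8_if_needed(raw: bytes, fallback_source: str = "iso-8859-1") -> bytes:
    """Keep bytes that are already valid UTF-8; otherwise convert from fallback_source.

    This is a heuristic: bytes in some other encoding can happen to be valid
    UTF-8 too, so it reduces damage but does not prove the data is correct.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback_source).encode("utf-8")


japanese = "は".encode("utf-8")  # already correct UTF-8

# Wrong assumption: every byte is valid iso-8859-1, so the conversion
# "succeeds" and yields well-formed UTF-8 that is no longer Japanese.
garbled = recode_assuming(japanese, "iso-8859-1")
assert garbled.decode("utf-8") != "は"

# Checking for valid UTF-8 first leaves the Japanese text untouched,
# while genuine iso-8859-1 input is still converted.
assert to_utf8_if_needed(japanese) == japanese
assert to_utf8_if_needed("ñ".encode("iso-8859-1")) == "ñ".encode("utf-8")
```

For a file with mixed encodings, a check like this would have to run per string or per line rather than over the whole file, and the fallback encoding remains a guess for anything that is neither UTF-8 nor iso-8859-1.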