Thread: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding
BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding
From
exclusion@gmail.com
Date:
The following bug has been logged on the website:

Bug reference: 6742
Logged by: Alexander LAW
Email address: exclusion@gmail.com
PostgreSQL version: 9.1.4
Operating system: Windows
Description:

When I try to dump a database with UTF-8 encoding on Windows, I get unreadable object names. Please look at the screenshot (http://oi50.tinypic.com/2lw6ipf.jpg). In the left window all the pg_dump messages are displayed correctly (except for the password prompt (bug #6510)), but the non-ASCII object name is gibberish. In the right window (where the dump is done with the Windows 1251 encoding, the OS encoding for the Russian locale) everything is right.

It seems that pg_dump doesn't do the necessary encoding conversion for the object names. For example, there is this code in pg_dump.c:

    write_msg(NULL, "finding the columns and types of table \"%s\"\n",
              tbinfo->dobj.name);

or in backup_archiver.c:

    ahlog(AH, 1, "setting owner and privileges for %s %s\n",
          te->desc, te->tag);

Both calls then reach the following function in dumputils.c:

    void
    vwrite_msg(const char *modulename, const char *fmt, va_list ap)
    {
        ...
        vfprintf(stderr, _(fmt), ap);
    }

So the format string goes through translation and encoding conversion (for the current OS locale), but tbinfo->dobj.name (or te->tag) does not.

I think it would be appropriate to convert all the object names with some function like dump_output_encoding_to_OS_encoding.

Best regards,
Alexander
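For illustration only, the kind of conversion proposed above might look roughly like the sketch below. It is not pg_dump code: the helper name convert_name_for_console is hypothetical, and it assumes a POSIX iconv() is available. On Windows itself one would more likely go through MultiByteToWideChar/WideCharToMultiByte with the console code page. The caller passes the dump encoding (e.g. "UTF-8") and the OS encoding (e.g. "CP1251") explicitly.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pg_dump): convert a NUL-terminated
 * string from from_enc to to_enc using POSIX iconv().
 * Returns a malloc'd string the caller must free, or NULL on failure
 * (unknown encoding, unconvertible character, or out of memory).
 */
char *
convert_name_for_console(const char *src, const char *from_enc,
                         const char *to_enc)
{
    iconv_t cd;
    size_t  in_left, out_size, out_left;
    char   *in_ptr, *out, *out_ptr;

    assert(src != NULL);

    cd = iconv_open(to_enc, from_enc);
    if (cd == (iconv_t) -1)
        return NULL;

    in_left = strlen(src);
    /* Assumption: 4 output bytes per input byte is enough for common encodings. */
    out_size = in_left * 4 + 1;
    out = malloc(out_size);
    if (out == NULL)
    {
        iconv_close(cd);
        return NULL;
    }

    in_ptr = (char *) src;      /* iconv() does not modify the input bytes */
    out_ptr = out;
    out_left = out_size - 1;    /* keep room for the terminating NUL */

    /* Convert the text, then flush any pending shift state. */
    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t) -1 ||
        iconv(cd, NULL, NULL, &out_ptr, &out_left) == (size_t) -1)
    {
        free(out);
        iconv_close(cd);
        return NULL;
    }

    *out_ptr = '\0';
    iconv_close(cd);
    return out;
}
```

A message routine could then pass the converted copy of tbinfo->dobj.name to write_msg() and free it afterwards, falling back to the original string when the helper returns NULL.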
Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding
From
Thomas Kellerer
Date:
exclusion@gmail.com, 18.07.2012 09:17:
> The following bug has been logged on the website:
>
> Bug reference: 6742
> Logged by: Alexander LAW
> Email address: exclusion@gmail.com
> PostgreSQL version: 9.1.4
> Operating system: Windows
> Description:
>
> When I try to dump database with UTF-8 encoding in Windows, I get unreadable
> object names.
> Please look at the screenshot (http://oi50.tinypic.com/2lw6ipf.jpg). On the
> left window all the pg_dump messages displayed correctly (except for the
> prompt password (bug #6510)), but the non-ASCII object name is gibberish. On
> the right window (where dump is done with the Windows 1251 encoding (OS
> Encoding for Russian locale)) everything is right.
>

Did you check the dump file using an editor that can handle UTF-8?
The Windows console is not known for properly handling that encoding.

Thomas