Thread: Backup messages displayed with wrong encoding

Backup messages displayed with wrong encoding

From
Alexander LAW
Date:
Hello,

I am using pgAdmin3 1.14.1 on Windows 2008 R2 (Russian locale, SBCS
Win1251) with PostgreSQL 9.1.2.
I have a database with UTF-8 encoding containing objects whose names
include non-ASCII (Russian) characters, and I get unreadable object
names when performing a backup in pgAdmin3.

I think this happens because pgAdmin assumes that pg_dump always
streams its output in the current locale encoding. But that is not the
case, because the output can be in the database encoding (by default)
or in the encoding specified explicitly in frmBackup.

Thanks,
Alexander



Re: Backup messages displayed with wrong encoding

From
Guillaume Lelarge
Date:
On Mon, 2011-12-12 at 09:49 +0400, Alexander LAW wrote:
> Hello,
> 
> I am using pgAdmin3 1.14.1 on Windows 2008 R2 (Russian locale, SBCS
> Win1251) with PostgreSQL 9.1.2.
> I have a database with UTF-8 encoding containing objects whose names
> include non-ASCII (Russian) characters, and I get unreadable object
> names when performing a backup in pgAdmin3.
> 

You mean when you restore it? pgAdmin is UTF-8 only, but it lets you
use other encodings for the dump.

> I think this happens because pgAdmin assumes that pg_dump always
> streams its output in the current locale encoding. But that is not the
> case, because the output can be in the database encoding (by default)
> or in the encoding specified explicitly in frmBackup.
> 

pgAdmin doesn't assume anything. It simply launches pg_dump with the
options you set in the frmBackup dialog. Right now, I cannot say where
the issue is. I would need more info to guess that.


-- 
Guillaume http://blog.guillaume.lelarge.info http://www.dalibo.com



Re: Backup messages displayed with wrong encoding

From
Alexander LAW
Date:
Hi,
To make it clear, I am posting two screenshots.
ss_backup_win1251 shows a valid table name (which is "Test" in Russian),
but in ss_backup_utf8 you can see the name in the wrong encoding.

When I said "pgAdmin assumes", I meant that it converts the pg_dump
output stream to a string as if it were ANSI-encoded, but that is not
always the case.
In fact, the opposite is common on Windows with a Russian locale (and
non-ASCII object names), because UTF-8 is the default encoding for a
database, while the locale encoding (SBCS) is Win1251. When you do a
backup with the default encoding, you get an unreadable log.

Best regards,
Alexander
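
A minimal sketch of the effect described above, in plain C++ and not taken from pgAdmin: it reads bytes as Windows-1251, which is roughly what happens when UTF-8 output is converted with the locale encoding on this system. The names decodeCp1251 and kCp1251High are invented for this illustration, and mapping the unassigned byte 0x98 to U+FFFD is an assumption.

```cpp
#include <cassert>
#include <string>

// Windows-1251 code points for bytes 0x80..0xBF.
// 0x98 is unassigned in the code page; it is mapped to U+FFFD here (assumption).
static const char32_t kCp1251High[64] = {
    0x0402, 0x0403, 0x201A, 0x0453, 0x201E, 0x2026, 0x2020, 0x2021,
    0x20AC, 0x2030, 0x0409, 0x2039, 0x040A, 0x040C, 0x040B, 0x040F,
    0x0452, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0xFFFD, 0x2122, 0x0459, 0x203A, 0x045A, 0x045C, 0x045B, 0x045F,
    0x00A0, 0x040E, 0x045E, 0x0408, 0x00A4, 0x0490, 0x00A6, 0x00A7,
    0x0401, 0x00A9, 0x0404, 0x00AB, 0x00AC, 0x00AD, 0x00AE, 0x0407,
    0x00B0, 0x00B1, 0x0406, 0x0456, 0x0491, 0x00B5, 0x00B6, 0x00B7,
    0x0451, 0x2116, 0x0454, 0x00BB, 0x0458, 0x0405, 0x0455, 0x0457
};

// Interpret every byte as one Windows-1251 character and return code points.
std::u32string decodeCp1251(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(b);                       // ASCII range is identical
        } else if (b < 0xC0) {
            out.push_back(kCp1251High[b - 0x80]);   // irregular block, use table
        } else {
            // 0xC0..0xFF map contiguously onto U+0410..U+044F (Cyrillic A..ya)
            out.push_back(static_cast<char32_t>(0x0410 + (b - 0xC0)));
        }
    }
    assert(out.size() == bytes.size());  // single-byte encoding: one char per byte
    return out;
}
```

Feeding it the eight UTF-8 bytes of the Russian word for "Test" (D0 A2 D0 B5 D1 81 D1 82) yields eight code points, U+0420 U+045E U+0420 U+00B5 U+0421 U+0403 U+0421 U+201A, so each Cyrillic letter turns into two unrelated characters in the log.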




Attachment

Re: Backup messages displayed with wrong encoding

From
Guillaume Lelarge
Date:
On Tue, 2011-12-13 at 08:15 +0400, Alexander LAW wrote:
> Hi,
> To make it clear, I am posting two screenshots.
> ss_backup_win1251 shows a valid table name (which is "Test" in Russian),
> but in ss_backup_utf8 you can see the name in the wrong encoding.
>
> When I said "pgAdmin assumes", I meant that it converts the pg_dump
> output stream to a string as if it were ANSI-encoded, but that is not
> always the case.
> In fact, the opposite is common on Windows with a Russian locale (and
> non-ASCII object names), because UTF-8 is the default encoding for a
> database, while the locale encoding (SBCS) is Win1251. When you do a
> backup with the default encoding, you get an unreadable log.
> 

pgAdmin simply displays what pg_dump gives it. If it's in the right
encoding, you'll see your table names correctly. I'm not sure it would
be a good idea to grab every line and convert it to whatever encoding
pgAdmin would like, if that's possible at all.


-- 
Guillaume http://blog.guillaume.lelarge.info http://www.dalibo.com



Re: Backup messages displayed with wrong encoding

From
Alexander LAW
Date:
Hello,
I don't think that the Win1251 encoding is more "right" than UTF-8.
IMHO, a program should either know what encoding the text is in or let
me choose the encoding used to read it. If you consider this behavior a
problem (under the conditions described), you could solve it by
providing a combobox in the Messages tab that lets me choose the
encoding of the log. But I would choose there the same encoding as I
chose before in the File Options tab, or the encoding of the database.
So pgAdmin already knows which encoding to use when reading the pg_dump
output stream.
As for the need for and possibility of conversion, I believe that every
line of the log is converted anyway. Please look at
sysProcess:ReadStream, where strings are read from the input and
appended to txtMessages.

str.Append(wxString::Format(wxT("%s"), wxString(buffer, 
wxConvLibc).c_str()));

As I understand it, wxConvLibc here specifies that the input strings
are always in the OS locale encoding, but IMO this should depend on the
backup encoding.

Best regards
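
To illustrate the suggestion above, here is a minimal sketch in plain C++ (no wxWidgets, and not pgAdmin code) of choosing the decoder from the encoding that was requested for the backup instead of always using the locale one. The function names are hypothetical, only two encodings are handled to keep it short, and the UTF-8 decoder is deliberately simplified. In pgAdmin itself the analogous change would presumably be to pass a converter built from the backup encoding (wxWidgets has wxCSConv for named charsets) where wxConvLibc is used now; that is an assumption, not a tested patch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 into code points. Simplified: malformed bytes become U+FFFD,
// and overlong forms or surrogates are not rejected.
std::u32string decodeUtf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        char32_t cp = 0;
        bool ok = true;
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else                            { ok = false; }
        ok = ok && (i + extra < bytes.size());  // sequence must not be truncated
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; }
        else    { out.push_back(0xFFFD); ++i; }
    }
    return out;
}

// Decode ISO-8859-1: every byte is its own code point.
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) out.push_back(b);
    assert(out.size() == bytes.size());
    return out;
}

// Pick the decoder from the backup encoding name (PostgreSQL-style names).
std::u32string decodeProcessOutput(const std::string& bytes,
                                   const std::string& encoding) {
    if (encoding == "UTF8")   return decodeUtf8(bytes);
    if (encoding == "LATIN1") return decodeLatin1(bytes);
    throw std::invalid_argument("unsupported encoding: " + encoding);
}
```

With the encoding name "UTF8", the bytes D0 A2 D0 B5 D1 81 D1 82 decode to the four code points U+0422 U+0435 U+0441 U+0442, which is the table name from the screenshots.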




Re: Backup messages displayed with wrong encoding

From
Alexander LAW
Date:
Hi,

I would like to clarify the issue with the encoding.

Now that we have PostgreSQL localized (with Russian messages), I can see
that the encoding of pg_dump's output messages does not change depending
on the database setting or the pg_dump option.
It seems that pg_dump changes only the encoding of database object
names (see the screenshot).
So it looks like a pg_dump bug, not a feature.
Thanks for your feedback.

Best regards,
Alexander
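
If the output really does mix encodings as described above (messages in the locale encoding, object names in the dump encoding), a reader of the stream could at least check each chunk before decoding it. Below is a minimal sketch in plain C++ of such a check. It is a heuristic, not pgAdmin or pg_dump code; the names isValidUtf8, LineEncoding and guessLineEncoding are invented for this example, and the validation is simplified.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Structural UTF-8 check. Simplified: it verifies lead and continuation
// bytes but does not reject every overlong form or surrogate code points.
bool isValidUtf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                       extra = 0;
        else if (lead >= 0xC2 && lead <= 0xDF) extra = 1;
        else if (lead >= 0xE0 && lead <= 0xEF) extra = 2;
        else if (lead >= 0xF0 && lead <= 0xF4) extra = 3;
        else return false;                     // stray continuation or bad lead
        if (i + extra >= s.size()) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    assert(i == s.size());  // every byte was consumed by a complete sequence
    return true;
}

enum class LineEncoding { Utf8, Locale };

// Guess how to decode one chunk: UTF-8 if it validates, else the locale encoding.
LineEncoding guessLineEncoding(const std::string& line) {
    return isValidUtf8(line) ? LineEncoding::Utf8 : LineEncoding::Locale;
}
```

The Win1251 bytes of the Russian word for "Test" (D2 E5 F1 F2) fail the check, while its UTF-8 bytes pass. Pure ASCII passes too, which is harmless because ASCII reads the same in both encodings. Short Win1251 strings can occasionally happen to be valid UTF-8, so this can only be a fallback and not a replacement for knowing the encoding.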



Attachment