Thread: BUG #6510: A simple prompt is displayed using wrong charset
The following bug has been logged on the website:

Bug reference: 6510
Logged by: Alexander LAW
Email address: exclusion@gmail.com
PostgreSQL version: 9.1.3
Operating system: Windows
Description:

I'm using PostgreSQL on Windows with the Russian locale and get unreadable messages when the postgres utilities prompt me for input. Please look at the screenshot: http://oi44.tinypic.com/aotje8.jpg (psql writes an unreadable message prompting for the password.) But at the same time the following message (WARNING) is displayed right.

I believe it's related to setlocale and the difference between the OEM and ANSI encodings, which we have in Windows with the Russian locale. The startup code of psql sets the locale with the call setlocale(LC_ALL, "") and the MSDN documentation says that the call: Sets the locale to the default, which is the user-default ANSI code page obtained from the operating system. After the call, all the strings printed with printf(stdout) go through the ANSI->OEM conversion. But in the simple_prompt function strings are written to con, and such writes go without conversion.

I've made a little test to illustrate this:

#include "stdafx.h"
#include <locale.h>

int _tmain(int argc, _TCHAR* argv[])
{
	printf("ОК\n");
	setlocale(0, "");
	fprintf(stdout, "ОК\n");
	FILE * termin = fopen("con", "w");
	fprintf(termin, "ОК\n");
	fflush(termin);
	return 0;
}

where "ОК" is "OK" with Russian letters. This test gives the following result: http://oi39.tinypic.com/35jgljs.jpg The second line is readable, while the others are not.

If it can be helpful to understand the issue, I can perform other tests.

Thanks in advance,
Alexander
Excerpts from exclusion's message of sáb mar 03 15:44:37 -0300 2012: > I'm using PostgreSQL on Windows with the Russian locale and get unreadable > messages when the postgres utilities prompt me for input. > Please look at the screenshot: > http://oi44.tinypic.com/aotje8.jpg > (psql writes an unreadable message prompting for the password.) > But at the same time the following message (WARNING) is displayed right. > > I believe it's related to setlocale and the difference between the OEM and ANSI > encodings, which we have in Windows with the Russian locale. > The startup code of psql sets the locale with the call setlocale(LC_ALL, "") and > the MSDN documentation says that the call: > Sets the locale to the default, which is the user-default ANSI code page > obtained from the operating system. > > After the call all the strings printed with printf(stdout) will go > through the ANSI->OEM conversion. > > But in the simple_prompt function strings are written to con, and such writes go > without conversion. Were you able to come up with some way to make this work? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
I see two ways to resolve the issue.
First is to use CharToOemBuff when writing a string to the "con" and OemToCharBuff when reading an input from it.
The other is to always use stderr/stdin for Win32 as it was done for msys before. I think it's more straightforward.
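For illustration only, here is a rough sketch of what the first option could look like. The helper name write_console_oem is hypothetical and not part of the attached patch; it assumes the output stream was opened with fopen("con", "w") as simple_prompt() does today, and uses the Win32 CharToOemBuffA()/OemToCharBuffA() pair for the ANSI<->OEM conversion that such writes currently skip.

#ifdef WIN32
#include <windows.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not from the patch: convert an ANSI string to the
 * OEM code page before writing it to the console stream, because writes
 * to "con" bypass the CRT's ANSI->OEM conversion.  Reading input back
 * would use OemToCharBuffA() in the opposite direction.
 */
static void
write_console_oem(FILE *termout, const char *s)
{
	char	buf[1024];
	size_t	len = strlen(s);

	if (len >= sizeof(buf))
		len = sizeof(buf) - 1;
	CharToOemBuffA(s, buf, (DWORD) len);
	buf[len] = '\0';
	fputs(buf, termout);
	fflush(termout);
}
#endif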
I tested the attached patch (built the source with MSVC) and it fixes the issue. If it looks acceptable, then probably DEVTTY should not be used on Windows at all.
I found two other references of DEVTTY at
psql/command.c
success = saveHistory(fname ? fname : DEVTTY, -1, false, false);
and
contrib/pg_upgrade/option.c
log_opts.debug_fd = fopen(DEVTTY, "w");
By the way, is there any reason to use stderr for the prompt output, not stdout?
Regards,
Alexander
16.03.2012 23:13, Alvaro Herrera writes:
Excerpts from exclusion's message of sáb mar 03 15:44:37 -0300 2012: I'm using PostgreSQL on Windows with the Russian locale and get unreadable messages when the postgres utilities prompt me for input. Please look at the screenshot: http://oi44.tinypic.com/aotje8.jpg (psql writes an unreadable message prompting for the password.) But at the same time the following message (WARNING) is displayed right. I believe it's related to setlocale and the difference between the OEM and ANSI encodings, which we have in Windows with the Russian locale. The startup code of psql sets the locale with the call setlocale(LC_ALL, "") and the MSDN documentation says that the call: Sets the locale to the default, which is the user-default ANSI code page obtained from the operating system. After the call all the strings printed with printf(stdout) will go through the ANSI->OEM conversion. But in the simple_prompt function strings are written to con, and such writes go without conversion. Were you able to come up with some way to make this work?
Attachment
Excerpts from Alexander LAW's message of dom mar 18 06:04:51 -0300 2012: > I see two ways to resolve the issue. > First is to use CharToOemBuff when writing a string to the "con" and > OemToCharBuff when reading an input from it. > The other is to always use stderr/stdin for Win32 as it was done for > msys before. I think it's more straightforward. Using the console directly instead of stdin/out/err is more appropriate when asking for passwords and reading them back, because you can redirect the rest of the output to/from files or pipes, without the prompt interfering with that. This also explains why stderr is used instead of stdout. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Thanks, I've understood your point. Please look at the patch. It implements the first way and it makes psql work too. Regards, Alexander 20.03.2012 00:05, Alvaro Herrera writes: > Excerpts from Alexander LAW's message of dom mar 18 06:04:51 -0300 2012: >> I see two ways to resolve the issue. >> First is to use CharToOemBuff when writing a string to the "con" and >> OemToCharBuff when reading an input from it. >> The other is to always use stderr/stdin for Win32 as it was done for >> msys before. I think it's more straightforward. > Using the console directly instead of stdin/out/err is more appropriate when > asking for passwords and reading them back, because you can redirect the > rest of the output to/from files or pipes, without the prompt > interfering with that. This also explains why stderr is used instead of > stdout. >
Attachment
Excerpts from Alexander LAW's message of mar mar 20 16:50:14 -0300 2012: > Thanks, I've understood your point. > Please look at the patch. It implements the first way and it makes psql > work too. Great, thanks. Hopefully somebody with Windows-compile abilities will have a look at this. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding
From: Alexander Law
Hello, The dump file itself is correct. The issue is only with the non-ASCII object names in pg_dump messages. The message text (which is non-ASCII too) is displayed consistently in the right encoding (i.e. in the OS encoding, thanks to libintl/gettext), but the encoding of DB object names depends on the dump encoding, so they become unreadable when a different encoding is used. The same can be reproduced on Linux (where the console encoding is UTF-8) when dumping with Windows-1251 or Latin1 (for Western European languages). Thanks, Alexander

The following bug has been logged on the website: Bug reference: 6742 Logged by: Alexander LAW Email address: exclusion(at)gmail(dot)com PostgreSQL version: 9.1.4 Operating system: Windows Description: When I try to dump a database with UTF-8 encoding on Windows, I get unreadable object names. Please look at the screenshot (http://oi50.tinypic.com/2lw6ipf.jpg). In the left window all the pg_dump messages are displayed correctly (except for the password prompt (bug #6510)), but the non-ASCII object name is gibberish. In the right window (where the dump is done with the Windows-1251 encoding (the OS encoding for the Russian locale)) everything is right.

Did you check the dump file using an editor that can handle UTF-8? The Windows console is not known for properly handling that encoding. Thomas
Hello! May I propose a solution and step up? I've read the discussion of bug #5800 and here are my 2 cents. To make things clear, let me give an example. I am a PostgreSQL hosting provider and I let my customers create any databases they wish. I have clients all over the world (so they can create databases with different encodings). The question is - what do I (as admin) want to see in my postgresql log, which contains errors from all the databases? IMHO we should consider two requirements for the log. First, the file should be readable with a generic text viewer. Second, it should be as useful and complete as possible.

Now I see the following solutions.
A. We have different logfiles for each database with different encodings. Then all our logs will be readable, but we have to look at them one by one and it's inconvenient at least. Moreover, our log reader should understand what encoding to use for each file.
B. We have one logfile with the operating system encoding. The first downside is that the logs can be different for different OSes. The second is that Windows has a non-Unicode system encoding, and such an encoding can't represent all the national characters. So at best I will get ??? in the log.
C. We have one logfile with UTF-8. Pros: Log messages of all our clients can fit in it. We can use any generic editor/viewer to open it. Nothing changes for Linux (and other OSes with UTF-8 encoding). Cons: All the strings written to the log file have to go through some conversion function.

I think that the last solution is the solution. What is your opinion? In fact the problem exists even with a simple installation on Windows when you use a non-English locale. So the solution would be useful for many of us. Best regards, Alexander P.S. sorry for the wrong subject in my previous message sent to pgsql-general

On 05/23/2012 09:15 AM, yi huang wrote: > I'm using postgresql 9.1.3 from debian squeeze-backports with > zh_CN.UTF-8 locale, i find my main log (which is > "/var/log/postgresql/postgresql-9.1-main.log") contains "???" which > indicate some sort of charset encoding problem. It's a known issue, I'm afraid. The PostgreSQL postmaster logs in the system locale, and the PostgreSQL backends log in whatever encoding their database is in. They all write to the same log file, producing a log file full of mixed encoding data that'll choke many text editors. If you force your editor to re-interpret the file according to the encoding your database(s) are in, this may help. In the future it's possible that this may be fixed by logging output to different files on a per-database basis or by converting the text encoding of log messages, but no agreement has been reached on the correct approach and nobody has stepped up to implement it. -- Craig Ringer
> C. We have one logfile with UTF-8. > Pros: Log messages of all our clients can fit in it. We can use any > generic editor/viewer to open it. > Nothing changes for Linux (and other OSes with UTF-8 encoding). > Cons: All the strings written to the log file have to go through some > conversion function. > > I think that the last solution is the solution. What is your opinion? I am thinking about a variant of C. The problem with C is, converting from other encodings to UTF-8 is not cheap because it requires huge conversion tables. This may be a serious problem with a busy server. Also it is possible some information is lost in this conversion. This is because there's no guarantee that there is a one-to-one mapping between UTF-8 and other encodings. Another problem with UTF-8 is, you have to choose *one* locale when using your editor. This may or may not affect handling of strings in your editor. My idea is using the mule-internal encoding for the log file instead of UTF-8. There are several advantages: 1) Conversion to the mule-internal encoding is cheap because no conversion table is required. Also no information loss happens in this conversion. 2) The mule-internal encoding can be handled by emacs, one of the most popular editors in the world. 3) No need to worry about locale. The mule-internal encoding has enough information about language. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Tatsuo Ishii <ishii@postgresql.org> writes: > My idea is using mule-internal encoding for the log file instead of > UTF-8. There are several advantages: > 1) Converion to mule-internal encoding is cheap because no conversion > table is required. Also no information loss happens in this > conversion. > 2) Mule-internal encoding can be handled by emacs, one of the most > popular editors in the world. > 3) No need to worry about locale. Mule-internal encoding has enough > information about language. Um ... but ... (1) nothing whatsoever can read MULE, except emacs and xemacs. (2) there is more than one version of MULE (emacs versus xemacs, not to mention any possible cross-version discrepancies). (3) from a log volume standpoint, this could be pretty disastrous. I'm not for a write-only solution, which is pretty much what this would be. regards, tom lane
> Tatsuo Ishii <ishii@postgresql.org> writes: >> My idea is using the mule-internal encoding for the log file instead of >> UTF-8. There are several advantages: > >> 1) Conversion to the mule-internal encoding is cheap because no conversion >> table is required. Also no information loss happens in this >> conversion. > >> 2) The mule-internal encoding can be handled by emacs, one of the most >> popular editors in the world. > >> 3) No need to worry about locale. The mule-internal encoding has enough >> information about language. > > Um ... but ... > > (1) nothing whatsoever can read MULE, except emacs and xemacs. > > (2) there is more than one version of MULE (emacs versus xemacs, > not to mention any possible cross-version discrepancies). > > (3) from a log volume standpoint, this could be pretty disastrous. > > I'm not for a write-only solution, which is pretty much what this > would be. I'm not sure how long xemacs will survive (the last stable release of xemacs was in 2009). Anyway, I'm not too worried about your points, since it's easy to convert back from mule-internal-encoded log files to the original encoding-mixed log file. No information will be lost. Even converting to UTF-8 should be possible. My point is, once the log file is converted to UTF-8, there's no way to convert back to the original-encoding log file. Probably we would treat mule-internal-encoded log files as an internal format, and have a utility which does the conversion from mule-internal to UTF-8. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 07/18/2012 11:16 PM, Alexander Law wrote: > Hello! > > May I to propose a solution and to step up? > > I've read a discussion of the bug #5800 and here is my 2 cents. > To make things clear let me give an example. > I am a PostgreSQL hosting provider and I let my customers to create > any databases they wish. > I have clients all over the world (so they can create databases with > different encoding). > > The question is - what I (as admin) want to see in my postgresql log, > containing errors from all the databases? > IMHO we should consider two requirements for the log. > First, The file should be readable with a generic text viewer. Second, > It should be useful and complete as possible. > > Now I see following solutions. > A. We have different logfiles for each database with different encodings. > Then all our logs will be readable, but we have to look at them one by > onе and it's inconvenient at least. > Moreover, our log reader should understand what encoding to use for > each file. > > B. We have one logfile with the operating system encoding. > First downside is that the logs can be different for different OSes. > The second is that Windows has non-Unicode system encoding. > And such an encoding can't represent all the national characters. So > at best I will get ??? in the log. > > C. We have one logfile with UTF-8. > Pros: Log messages of all our clients can fit in it. We can use any > generic editor/viewer to open it. > Nothing changes for Linux (and other OSes with UTF-8 encoding). > Cons: All the strings written to log file should go through some > conversation function. > > I think that the last solution is the solution. What is your opinion? Implementing any of these isn't trivial - especially making sure messages emitted to stderr from things like segfaults and dynamic linker messages are always correct. Ensuring that the logging collector knows when setlocale() has been called to change the encoding and translation of system messages, handling the different logging output methods, etc - it's going to be fiddly. I have some performance concerns about the transcoding required for (b) or (c), but realistically it's already the norm to convert all the data sent to and from clients. Conversion for logging should not be a significant additional burden. Conversion can be short-circuited out when source and destination encodings are the same for the common case of logging in utf-8 or to a dedicated file. I suspect the eventual choice will be "all of the above": - Default to (b) or (c), both have pros and cons. I favour (c) with a UTF-8 BOM to warn editors, but (b) is nice for people whose DBs are all in the system locale. - Allow (a) for people who have many different DBs in many different encodings, do high volume logging, and want to avoid conversion overhead. Let them deal with the mess, just provide an additional % code for the encoding so they can name their per-DB log files to indicate the encoding. The main issue is just that code needs to be prototyped, cleaned up, and submitted. So far nobody's cared enough to design it, build it, and get it through patch review. I've just foolishly volunteered myself to work on an automated crash-test system for virtual plug-pull testing, so I'm not stepping up. -- Craig Ringer
Hello,
I believe that postgres has such conversion functions anyway. And they are used for data conversion when we have clients (and databases) with different encodings. So if they can be used for data, why not use them for the relatively small amount of log messages?

>> C. We have one logfile with UTF-8. Pros: Log messages of all our clients can fit in it. We can use any generic editor/viewer to open it. Nothing changes for Linux (and other OSes with UTF-8 encoding). Cons: All the strings written to the log file have to go through some conversion function. I think that the last solution is the solution. What is your opinion?
> I am thinking about a variant of C. The problem with C is, converting from other encodings to UTF-8 is not cheap because it requires huge conversion tables. This may be a serious problem with a busy server. Also it is possible some information is lost in this conversion. This is because there's no guarantee that there is a one-to-one mapping between UTF-8 and other encodings. Another problem with UTF-8 is, you have to choose *one* locale when using your editor. This may or may not affect handling of strings in your editor. My idea is using the mule-internal encoding for the log file instead of UTF-8. There are several advantages: 1) Conversion to the mule-internal encoding is cheap because no conversion table is required. Also no information loss happens in this conversion. 2) The mule-internal encoding can be handled by emacs, one of the most popular editors in the world. 3) No need to worry about locale. The mule-internal encoding has enough information about language. --
And regarding mule internal encoding - reading about Mule http://www.emacswiki.org/emacs/UnicodeEncoding I found:
In future (probably Emacs 22), Mule will use an internal encoding which is a UTF-8 encoding of a superset of Unicode.
So I still see UTF-8 as a common denominator for all the encodings.
I am not aware of any characters absent in Unicode. Can you please provide some examples of those that can result in lossy conversion?
Choosing UTF-8 in a viewer/editor is no big deal either. Most of them detect UTF-8 automagically, and for the others a BOM can be added.
Best regards,
Alexander
Hello, > > Implementing any of these isn't trivial - especially making sure > messages emitted to stderr from things like segfaults and dynamic > linker messages are always correct. Ensuring that the logging > collector knows when setlocale() has been called to change the > encoding and translation of system messages, handling the different > logging output methods, etc - it's going to be fiddly. > > I have some performance concerns about the transcoding required for > (b) or (c), but realistically it's already the norm to convert all the > data sent to and from clients. Conversion for logging should not be a > significant additional burden. Conversion can be short-circuited out > when source and destination encodings are the same for the common case > of logging in utf-8 or to a dedicated file. > The initial issue was that the log file contains messages in different encodings. So transcoding is performed already, but it's not consistent, and in my opinion this is the main problem. > I suspect the eventual choice will be "all of the above": > > - Default to (b) or (c), both have pros and cons. I favour (c) with a > UTF-8 BOM to warn editors, but (b) is nice for people whose DBs are > all in the system locale. As I understand it, UTF-8 is the default encoding for databases. And even when a database is in the system encoding, translated postgres messages still come in UTF-8 and will go through a UTF-8 -> system locale conversion within gettext. > > - Allow (a) for people who have many different DBs in many different > encodings, do high volume logging, and want to avoid conversion > overhead. Let them deal with the mess, just provide an additional % > code for the encoding so they can name their per-DB log files to > indicate the encoding. > I think that solution (a) can be an evolution of the logging mechanism if there is a need for it. > The main issue is just that code needs to be prototyped, cleaned up, > and submitted. So far nobody's cared enough to design it, build it, > and get it through patch review. I've just foolishly volunteered > myself to work on an automated crash-test system for virtual plug-pull > testing, so I'm not stepping up. > I see your point and I can prepare a prototype if the proposed (c) solution seems reasonable enough and can be accepted. Best regards, Alexander
>> I am thinking about a variant of C. >> >> The problem with C is, converting from other encodings to UTF-8 is not >> cheap because it requires huge conversion tables. This may be a >> serious problem with a busy server. Also it is possible some information >> is lost in this conversion. This is because there's no >> guarantee that there is a one-to-one mapping between UTF-8 and other >> encodings. Another problem with UTF-8 is, you have to choose *one* >> locale when using your editor. This may or may not affect handling of >> strings in your editor. >> >> My idea is using the mule-internal encoding for the log file instead of >> UTF-8. There are several advantages: >> >> 1) Conversion to the mule-internal encoding is cheap because no conversion >> table is required. Also no information loss happens in this >> conversion. >> >> 2) The mule-internal encoding can be handled by emacs, one of the most >> popular editors in the world. >> >> 3) No need to worry about locale. The mule-internal encoding has enough >> information about language. >> -- >> > I believe that postgres has such conversion functions anyway. And they > are used for data conversion when we have clients (and databases) with > different encodings. So if they can be used for data, why not use > them for the relatively small amount of log messages? Frontend/backend encoding conversion only happens when they are different, while conversion for logs *always* happens. A busy database could produce tons of logs (it is not unusual to log all SQL statements for auditing purposes). > And regarding mule internal encoding - reading about Mule > http://www.emacswiki.org/emacs/UnicodeEncoding I found: > /In future (probably Emacs 22), Mule will use an internal encoding > which is a UTF-8 encoding of a superset of Unicode. / > So I still see UTF-8 as a common denominator for all the encodings. > I am not aware of any characters absent in Unicode. Can you please > provide some examples of those that can result in lossy conversion? You can google for "encoding "EUC_JP" has no equivalent in "UTF8"" or some such to find such an example. In this case PostgreSQL just throws an error. For frontend/backend encoding conversion this is fine. But what should we do for logs? Apparently we cannot throw an error here. "Unification" is another problem. Some kanji characters of CJK are "unified" in Unicode. The idea of unification is, if kanji A in China, B in Japan and C in Korea look "similar", unify A, B and C to D. This is a great space saving :-) The price of this is the inability to do a round-trip conversion. You can convert A, B or C to D, but you cannot convert D back to A/B/C. BTW, I'm not wedded to the mule-internal encoding. What we need here is a "super" encoding which could include any existing encoding without information loss. For this purpose, I think we could even invent a new encoding (maybe something like the very first proposal of ISO/IEC 10646?). However, using UTF-8 for this purpose seems to be just a disaster to me. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
> Hello, >> >> Implementing any of these isn't trivial - especially making sure >> messages emitted to stderr from things like segfaults and dynamic >> linker messages are always correct. Ensuring that the logging >> collector knows when setlocale() has been called to change the >> encoding and translation of system messages, handling the different >> logging output methods, etc - it's going to be fiddly. >> >> I have some performance concerns about the transcoding required for >> (b) or (c), but realistically it's already the norm to convert all the >> data sent to and from clients. Conversion for logging should not be a >> significant additional burden. Conversion can be short-circuited out >> when source and destination encodings are the same for the common case >> of logging in utf-8 or to a dedicated file. >> > The initial issue was that log file contains messages in different > encodings. So transcoding is performed already, but it's not This is not true. Transcoding happens only when PostgreSQL is built with --enable-nls option (default is no nls). > consistent and in my opinion this is the main problem. > >> I suspect the eventual choice will be "all of the above": >> >> - Default to (b) or (c), both have pros and cons. I favour (c) with a >> - UTF-8 BOM to warn editors, but (b) is nice for people whose DBs are >> - all in the system locale. > As I understand UTF-8 is the default encoding for databases. And even > when a database is in the system encoding, translated postgres > messages still come in UTF-8 and will go through UTF-8 -> System > locale conversion within gettext. Again, this is not always true. >> - Allow (a) for people who have many different DBs in many different >> - encodings, do high volume logging, and want to avoid conversion >> - overhead. Let them deal with the mess, just provide an additional % >> - code for the encoding so they can name their per-DB log files to >> - indicate the encoding. >> > I think that (a) solution can be an evolvement of the logging > mechanism if there will be a need for it. >> The main issue is just that code needs to be prototyped, cleaned up, >> and submitted. So far nobody's cared enough to design it, build it, >> and get it through patch review. I've just foolishly volunteered >> myself to work on an automated crash-test system for virtual plug-pull >> testing, so I'm not stepping up. >> > I see you point and I can prepare a prototype if the proposed (c) > solution seems reasonable enough and can be accepted. > > Best regards, > Alexander > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs
>> The initial issue was that the log file contains messages in different >> encodings. So transcoding is performed already, but it's not > This is not true. Transcoding happens only when PostgreSQL is built > with --enable-nls option (default is no nls). I'll restate the initial issue as I see it. I have Windows and I'm installing PostgreSQL for Windows (the latest version, downloaded from EnterpriseDB). Then I create a database with default settings (with UTF-8 encoding), do something wrong in my DB and get such a log file with the two different encodings (UTF-8 and Windows-1251 (ANSI)) and with localized postgres messages.
>> And regarding mule internal encoding - reading about Mule >> http://www.emacswiki.org/emacs/UnicodeEncoding I found: >> /In future (probably Emacs 22), Mule will use an internal encoding >> which is a UTF-8 encoding of a superset of Unicode. / >> So I still see UTF-8 as a common denominator for all the encodings. >> I am not aware of any characters absent in Unicode. Can you please >> provide some examples of those that can result in lossy conversion? > You can google for "encoding "EUC_JP" has no equivalent in "UTF8"" or > some such to find such an example. In this case PostgreSQL just throws > an error. For frontend/backend encoding conversion this is fine. But > what should we do for logs? Apparently we cannot throw an error here. > > "Unification" is another problem. Some kanji characters of CJK are > "unified" in Unicode. The idea of unification is, if kanji A in China, > B in Japan and C in Korea look "similar", unify A, B and C to D. This is a great > space saving :-) The price of this is the inability to do a > round-trip conversion. You can convert A, B or C to D, but you cannot > convert D back to A/B/C. > > BTW, I'm not wedded to the mule-internal encoding. What we need here is a > "super" encoding which could include any existing encoding without > information loss. For this purpose, I think we could even invent a new > encoding (maybe something like the very first proposal of ISO/IEC > 10646?). However, using UTF-8 for this purpose seems to be just a > disaster to me. > Ok, maybe the time of a real universal encoding has not yet come. Then maybe we should just add a new parameter "log_encoding" (UTF-8 by default) to postgresql.conf, and use this encoding consistently within the logging collector. If this encoding is not available then fall back to 7-bit ASCII.
>> You can google by "encoding "EUC_JP" has no equivalent in "UTF8"" or >> some such to find such an example. In this case PostgreSQL just throw >> an error. For frontend/backend encoding conversion this is fine. But >> what should we do for logs? Apparently we cannot throw an error here. >> >> "Unification" is another problem. Some kanji characters of CJK are >> "unified" in Unicode. The idea of unification is, if kanji A in China, >> B in Japan, C in Korea looks "similar" unify ABC to D. This is a great >> space saving:-) The price of this is inablity of >> round-trip-conversion. You can convert A, B or C to D, but you cannot >> convert D to A/B/C. >> >> BTW, I'm not stick with mule-internal encoding. What we need here is a >> "super" encoding which could include any existing encodings without >> information loss. For this purpose, I think we can even invent a new >> encoding(maybe something like very first prposal of ISO/IEC >> 10646?). However, using UTF-8 for this purpose seems to be just a >> disaster to me. >> > Ok, maybe the time of real universal encoding has not yet come. Then > we maybe just should add a new parameter "log_encoding" (UTF-8 by > default) to postgresql.conf. And to use this encoding consistently > within logging_collector. > If this encoding is not available then fall back to 7-bit ASCII. What do you mean by "not available"? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
>> Ok, maybe the time of a real universal encoding has not yet come. Then >> maybe we should just add a new parameter "log_encoding" (UTF-8 by >> default) to postgresql.conf, and use this encoding consistently >> within the logging collector. >> If this encoding is not available then fall back to 7-bit ASCII. > What do you mean by "not available"? Sorry, it was an inaccurate phrase. I mean "if the conversion to this encoding is not available". For example, when we have a database in EUC_JP and log_encoding set to Latin1. I think that we can even fall back to UTF-8, as we can convert all encodings to it (with some exceptions that you noticed).
> Sorry, it was an inaccurate phrase. I mean "if the conversion to this > encoding is not available". For example, when we have a database in > EUC_JP and log_encoding set to Latin1. I think that we can even fall > back to UTF-8, as we can convert all encodings to it (with some > exceptions that you noticed). So, what you wanted to say here is: "If the conversion to this encoding is not available then fall back to UTF-8" Am I correct? Also, is it possible to completely disable the feature? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 19 July 2012 10:40, Alexander Law <exclusion@gmail.com> wrote: >>> Ok, maybe the time of real universal encoding has not yet come. Then >>> we maybe just should add a new parameter "log_encoding" (UTF-8 by >>> default) to postgresql.conf. And to use this encoding consistently >>> within logging_collector. >>> If this encoding is not available then fall back to 7-bit ASCII. >> >> What do you mean by "not available"? > > Sorry, it was inaccurate phrase. I mean "if the conversion to this encoding > is not avaliable". For example, when we have database in EUC_JP and > log_encoding set to Latin1. I think that we can even fall back to UTF-8 as > we can convert all encodings to it (with some exceptions that you noticed). I like Craig's idea of adding the client encoding to the log lines. A possible problem with that (I'm not an encoding expert) is that a log line like that will contain data about the database server meta-data (log time, client encoding, etc) in the database default encoding and database data (the logged query and user-supplied values) in the client encoding. One option would be to use the client encoding for the entire log line, but would that result in legible meta-data in every encoding? It appears that the primarly here is that SQL statements and user-supplied data are being logged, while the log-file is a text file in a fixed encoding. Perhaps another solution would be to add the ability to log certain types of information (not the core database server log info, of course!) to a database/table so that each record can be stored in its own encoding? That way the transcoding doesn't have to take place until someone is reading the log, you'd know what to transcode the data to (namely the client_encoding of the reading session) and there isn't any issue of transcoding errors while logging statements. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
Yikes, messed up my grammar a bit I see! On 19 July 2012 10:58, Alban Hertroys <haramrae@gmail.com> wrote: > I like Craig's idea of adding the client encoding to the log lines. A > possible problem with that (I'm not an encoding expert) is that a log > line like that will contain data about the database server meta-data > (log time, client encoding, etc) in the database default encoding and ...will contain meta-data about the database server (log time... > It appears that the primarly here is that SQL statements and It appears the primary issue here... -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
>> Sorry, it was an inaccurate phrase. I mean "if the conversion to this >> encoding is not available". For example, when we have a database in >> EUC_JP and log_encoding set to Latin1. I think that we can even fall >> back to UTF-8, as we can convert all encodings to it (with some >> exceptions that you noticed). > So, what you wanted to say here is: > > "If the conversion to this encoding is not available then fall back to > UTF-8" > > Am I correct? > > Also, is it possible to completely disable the feature? > Yes, you are. I think it could be disabled by setting log_encoding='', but if the parameter is missing then the feature should be enabled (with UTF-8).
> I like Craig's idea of adding the client encoding to the log lines. A > possible problem with that (I'm not an encoding expert) is that a log > line like that will contain data about the database server meta-data > (log time, client encoding, etc) in the database default encoding and > database data (the logged query and user-supplied values) in the > client encoding. One option would be to use the client encoding for > the entire log line, but would that result in legible meta-data in > every encoding? I think then we get non-human-readable logs. We will need one more tool to open and convert the log (and to omit the excessive encoding specification on each line). > It appears that the primarly here is that SQL statements and > user-supplied data are being logged, while the log-file is a text file > in a fixed encoding. Yes, and in my opinion there is nothing unusual about it. XML/HTML are examples of text files with a fixed encoding that can contain multi-language strings. UTF-8 is the default encoding for XML. And when it's not good enough (as Tatsuo noticed), you can still switch to another. > Perhaps another solution would be to add the ability to log certain > types of information (not the core database server log info, of > course!) to a database/table so that each record can be stored in its > own encoding? > That way the transcoding doesn't have to take place until someone is > reading the log, you'd know what to transcode the data to (namely the > client_encoding of the reading session) and there isn't any issue of > transcoding errors while logging statements. I don't think it would be the simplest solution to the existing problem. It can be another branch of evolution, but it doesn't answer the question - what encoding to use for the core database server log?
On 19 July 2012 13:50, Alexander Law <exclusion@gmail.com> wrote: >> I like Craig's idea of adding the client encoding to the log lines. A >> possible problem with that (I'm not an encoding expert) is that a log >> line like that will contain data about the database server meta-data >> (log time, client encoding, etc) in the database default encoding and >> database data (the logged query and user-supplied values) in the >> client encoding. One option would be to use the client encoding for >> the entire log line, but would that result in legible meta-data in >> every encoding? > > I think then we get non-human readable logs. We will need one more tool to > open and convert the log (and omit excessive encoding specification in each > line). Only the parts that contain user-supplied data in very different encodings would not be "human readable", similar to what we already have. >> It appears that the primarly here is that SQL statements and >> user-supplied data are being logged, while the log-file is a text file >> in a fixed encoding. > > Yes, and in in my opinion there is nothing unusual about it. XML/HTML are > examples of a text files with fixed encoding that can contain multi-language > strings. UTF-8 is the default encoding for XML. And when it's not good > enough (as Tatsou noticed), you still can switch to another. Yes, but in those examples it is acceptable that the application fails to write the output. That, and the output needs to be converted to various different client encodings (namely that of the visitor's browser) anyway, so it does not really add any additional overhead. This doesn't hold true for database server log files. Ideally, writing those has to be reliable (how are you going to catch errors otherwise?) and should not impact the performance of the database server in a significant way (the less the better). The end result will probably be somewhere in the middle. >> Perhaps another solution would be to add the ability to log certain >> types of information (not the core database server log info, of >> course!) to a database/table so that each record can be stored in its >> own encoding? >> That way the transcoding doesn't have to take place until someone is >> reading the log, you'd know what to transcode the data to (namely the >> client_encoding of the reading session) and there isn't any issue of >> transcoding errors while logging statements. > > I don't think it would be the simplest solution of the existing problem. It > can be another branch of evolution, but it doesn't answer the question - > what encoding to use for the core database server log? It makes that problem much easier. If you need the "human-readable" logs, you can write those to a different log (namely one in the database). The result is that the server can use pretty much any encoding (or a mix of multiple!) to write its log files. You'll need a query to read the human-readable logs of course, but since they're in the database, all the tools you need are already available to you. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
On 07/19/2012 03:24 PM, Tatsuo Ishii wrote: > BTW, I'm not stick with mule-internal encoding. What we need here is a > "super" encoding which could include any existing encodings without > information loss. For this purpose, I think we can even invent a new > encoding(maybe something like very first prposal of ISO/IEC > 10646?). However, using UTF-8 for this purpose seems to be just a > disaster to me. Good point re unified chars. That was always a bad idea, and that's just one of the issues it causes. I think these difficult encodings are where logging to dedicated file per-database is useful. I'm not convinced that a weird and uncommon encoding is the answer. I guess as an alternative for people for whom it's useful if it's low cost in terms of complexity/maintenance/etc... -- Craig Ringer
On 07/19/2012 04:58 PM, Alban Hertroys wrote: > On 19 July 2012 10:40, Alexander Law <exclusion@gmail.com> wrote: >>>> Ok, maybe the time of real universal encoding has not yet come. Then >>>> we maybe just should add a new parameter "log_encoding" (UTF-8 by >>>> default) to postgresql.conf. And to use this encoding consistently >>>> within logging_collector. >>>> If this encoding is not available then fall back to 7-bit ASCII. >>> What do you mean by "not available"? >> Sorry, it was inaccurate phrase. I mean "if the conversion to this encoding >> is not avaliable". For example, when we have database in EUC_JP and >> log_encoding set to Latin1. I think that we can even fall back to UTF-8 as >> we can convert all encodings to it (with some exceptions that you noticed). > I like Craig's idea of adding the client encoding to the log lines. Nonono! Log *file* *names* when one-file-per-database is in use. Encoding as a log line prefix is a terrible idea for all sorts of reasons. -- Craig Ringer
Hi Alexander, I was able to reproduce the problem based on your description and test case, and your change does resolve it for me. On Tue, Mar 20, 2012 at 11:50:14PM +0400, Alexander LAW wrote: > Thanks, I've understood your point. > Please look at the patch. It implements the first way and it makes psql > work too. > 20.03.2012 00:05, Alvaro Herrera writes: >> Excerpts from Alexander LAW's message of dom mar 18 06:04:51 -0300 2012: >>> I see two ways to resolve the issue. >>> First is to use CharToOemBuff when writing a string to the "con" and >>> OemToCharBuff when reading an input from it. >>> The other is to always use stderr/stdin for Win32 as it was done for >>> msys before. I think it's more straightforward. >> Using the console directly instead of stdin/out/err is more appropriate when >> asking for passwords and reading them back, because you can redirect the >> rest of the output to/from files or pipes, without the prompt >> interfering with that. This also explains why stderr is used instead of >> stdout. The console output code page will usually match the OEM code page, but this is not guaranteed. For example, one can change it with chcp.exe before starting psql. The conversion should be to the actual console output code page. After "chcp 869", notice how printing to stdout yields question marks while your code yields unrelated characters. It would be nicer still to find a way to make the output routines treat this explicitly-opened console like stdout to a console. I could not find any documentation around this. Digging into the CRT source code, I see that the automatic code page conversion happens in write(). One of the tests write() uses to determine whether the destination is a console is to call GetConsoleMode() on the HANDLE underlying the CRT file descriptor. If that call fails, write() assumes the target is not a console. GetConsoleMode() requires GENERIC_READ access on its subject HANDLE, but the HANDLE resulting from our fopen("con", "w") has only GENERIC_WRITE. Therefore, write() wrongly concludes that it's writing to a non-console. fopen("con", "w+") fails, but fopen("CONOUT$", "w+") seems to give the outcome we need. write() recognizes that it's writing to a console and applies the code page conversion. Let's use that. This gave me occasion to look at the special case for MSYS that you mentioned. I observe the same behavior when running a native psql in a Cygwin xterm; writes to the console succeed but do not appear anywhere. Instead of guessing at console visibility based on an environment variable witnessing a particular platform, let's check IsWindowVisible(GetConsoleWindow()). What do you think of taking that approach? Thanks, nm
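To see the behavior Noah describes, here is a small standalone test (my own sketch, not from the patch; it assumes MSVC and a visible console) that asks the CRT-level question directly: GetConsoleMode() succeeds only on the handle opened with read access, which is why fopen("CONOUT$", "w+") gets the code-page conversion while fopen("con", "w") does not.

#include <windows.h>
#include <stdio.h>
#include <io.h>

static void
report(const char *name, FILE *f)
{
	HANDLE		h;
	DWORD		mode;

	if (f == NULL)
	{
		printf("%s: fopen failed\n", name);
		return;
	}
	/* the same test the CRT's write() applies to detect a console */
	h = (HANDLE) _get_osfhandle(_fileno(f));
	printf("%s: GetConsoleMode %s\n", name,
		   GetConsoleMode(h, &mode) ? "succeeds -> treated as a console"
									 : "fails -> treated as a plain file");
}

int
main(void)
{
	report("con (w)", fopen("con", "w"));			/* GENERIC_WRITE only */
	report("CONOUT$ (w+)", fopen("CONOUT$", "w+"));	/* read + write access */
	return 0;
}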
Hi Noah, Thank you for your review. I agree with you, the CONOUT$ way is much simpler. Please look at the patch. Regarding msys - yes, that check was not correct. In fact you can use "con" with msys, if you run sh.exe, not a graphical terminal. So the issue with con is not related to msys, but to some terminal implementations. Namely, I see that con is not supported by rxvt, mintty and xterm (from the x.cygwin project). (rxvt was the default terminal for msys 1.0.10, so I think such behavior was considered an msys feature because of this.) Your solution to use IsWindowVisible(GetConsoleWindow()) works for these terminals (I've made a simple test and it returns false for all of them), but this check will not work for telnet (a console app running through telnet can use con/conout). Maybe this should be considered a distinct bug with another patch required? (I see no ideal solution for it yet. It's probably possible to detect not "ostype" but these terminals, though that would not be generic either.) And there is another issue with the console charset. When writing a string to the console, the CRT converts it to the console encoding, but when reading input back it doesn't. So it seems there should be a conversion from ConsoleCP() to ACP() and then probably to UTF-8 to make the postgres utilities support national chars in passwords or usernames (with createuser --interactive). I think it can be fixed as another bug too. Best regards, Alexander 10.10.2012 15:05, Noah Misch wrote: > Hi Alexander, > > > The console output code page will usually match the OEM code page, but this is > not guaranteed. For example, one can change it with chcp.exe before starting > psql. The conversion should be to the actual console output code page. After > "chcp 869", notice how printing to stdout yields question marks while your > code yields unrelated characters. > > It would be nicer still to find a way to make the output routines treat this > explicitly-opened console like stdout to a console. I could not find any > documentation around this. Digging into the CRT source code, I see that the > automatic code page conversion happens in write(). One of the tests write() uses > to determine whether the destination is a console is to call GetConsoleMode() > on the HANDLE underlying the CRT file descriptor. If that call fails, write() > assumes the target is not a console. GetConsoleMode() requires GENERIC_READ > access on its subject HANDLE, but the HANDLE resulting from our fopen("con", > "w") has only GENERIC_WRITE. Therefore, write() wrongly concludes that it's > writing to a non-console. fopen("con", "w+") fails, but fopen("CONOUT$", > "w+") seems to give the outcome we need. write() recognizes that it's writing > to a console and applies the code page conversion. Let's use that. > > This gave me occasion to look at the special case for MSYS that you mentioned. > I observe the same behavior when running a native psql in a Cygwin xterm; > writes to the console succeed but do not appear anywhere. Instead of guessing > at console visibility based on an environment variable witnessing a particular > platform, let's check IsWindowVisible(GetConsoleWindow()). > > What do you think of taking that approach? > > Thanks, > nm
Attachment
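A sketch of the visibility test discussed above, with a hypothetical wrapper name (not from the patch); as noted, it correctly reports false under mintty/rxvt/xterm, but it would also report false for a telnet session, so it is not a complete answer on its own.

#ifdef WIN32
#include <windows.h>

/*
 * Hypothetical helper: true only when the process has a console window
 * that the user can actually see.  Hidden consoles (mintty, rxvt, the
 * x.cygwin xterm) make GetConsoleWindow() return a window that is not
 * visible, so the caller would fall back to stderr/stdin instead.
 */
static int
have_visible_console(void)
{
	HWND		console = GetConsoleWindow();

	return console != NULL && IsWindowVisible(console);
}
#endif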
Alexander Law <exclusion@gmail.com> writes: > +#ifdef WIN32 > + termin = fopen("CONIN$", "r"); > + termout = fopen("CONOUT$", "w+"); > +#else > termin = fopen(DEVTTY, "r"); > termout = fopen(DEVTTY, "w"); > +#endif > if (!termin || !termout My immediate reaction to this patch is "that's a horrible kluge, why shouldn't we change the definition of DEVTTY instead?" Is there a similar issue in other places where we use DEVTTY? Also, why did you change the termout output mode, is that important or just randomness? regards, tom lane
On Sun, Oct 14, 2012 at 10:35:04AM +0400, Alexander Law wrote: > I agree with you, CONOUT$ way is much simpler. Please look at the patch. See comments below. > Regarding msys - yes, that check was not correct. > In fact you can use "con" with msys, if you run sh.exe, not a graphical > terminal. > So the issue with con not related to msys, but to some terminal > implementations. > Namely, I see that con is not supported by rxvt, mintty and xterm (from > x.cygwin project). > (rxvt was default terminal for msys 1.0.10, so I think such behavior was > considered as msys feature because of this) > Your solution to use IsWindowVisible(GetConsoleWindow()) works for these > terminals (I've made simple test and it returns false for all of them), > but this check will not work for telnet (console app running through > telnet can use con/conout). Thanks for testing those environments. I can reproduce the distinctive behavior when a Windows telnet client connects to a Windows telnet server. When I connect to a Windows telnet server from a GNU/Linux system, I get the normal invisible-console behavior. I also get the invisible-console behavior in PowerShell ISE. > Maybe this should be considered as a distinct bug with another patch > required? (I see no ideal solution for it yet. Probably it's possible to > detect not "ostype", but these terminals, though it would not be generic > too.) Using stdin/stderr when we could have used the console is a mild loss; use cases involving redirected output will need to account for the abnormality. Interacting with a user-invisible console is a large loss; prompts will hang indefinitely. Therefore, the test should err on the side of stdin/stderr. Since any change here seems to have its own trade-offs, yes, let's leave it for a separate patch. > And there is another issue with a console charset. When writing string > to a console CRT converts it to console encoding, but when reading input > back it doesn't. So it seems, there should be conversion from > ConsoleCP() to ACP() and then probably to UTF-8 to make postgres > utilities support national chars in passwords or usernames (with > createuser --interactive). Yes, that also deserves attention. I do not know whether converting to UTF-8 is correct. Given a username <foo> containing non-ASCII characters, you should be able to input <foo> the same way for both "psql -U <foo>" and the createuser prompt. We should also be thoughtful about backward compatibility. > I think it can be fixed as another bug too. Agreed. > --- a/src/port/sprompt.c > +++ b/src/port/sprompt.c > @@ -60,8 +60,13 @@ simple_prompt(const char *prompt, int maxlen, bool echo) > * Do not try to collapse these into one "w+" mode file. Doesn't work on > * some platforms (eg, HPUX 10.20). > */ > +#ifdef WIN32 > + termin = fopen("CONIN$", "r"); > + termout = fopen("CONOUT$", "w+"); This definitely needs a block comment explaining the behaviors that led us to select this particular implementation. > +#else > termin = fopen(DEVTTY, "r"); > termout = fopen(DEVTTY, "w"); This thread has illustrated that the DEVTTY abstraction does not suffice. I think we should remove it entirely. Remove it from port.h; use literal "/dev/tty" here; re-add it as a local #define near the one remaining use, with an XXX comment indicating that the usage is broken. If it would help, I can prepare a version with the comment changes and refactoring I have in mind. > +#endif > if (!termin || !termout > #ifdef WIN32 > /* See DEVTTY comment for msys */ Thanks, nm
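Roughly what the suggested refactoring might look like at the remaining DEVTTY call site in psql/command.c; the exact comment wording here is my own, not Noah's.

/*
 * XXX: this usage is known to be broken; /dev/tty is not a sensible
 * history target on native Windows.  DEVTTY is now local to this file.
 */
#define DEVTTY	"/dev/tty"
...
success = saveHistory(fname ? fname : DEVTTY, -1, false, false);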
On Sun, Oct 14, 2012 at 12:10:42PM -0400, Tom Lane wrote: > Alexander Law <exclusion@gmail.com> writes: > > > +#ifdef WIN32 > > + termin = fopen("CONIN$", "r"); > > + termout = fopen("CONOUT$", "w+"); > > +#else > > termin = fopen(DEVTTY, "r"); > > termout = fopen(DEVTTY, "w"); > > +#endif > > if (!termin || !termout > > My immediate reaction to this patch is "that's a horrible kluge, why > shouldn't we change the definition of DEVTTY instead?" You could make DEVTTY_IN, DEVTTY_IN_MODE, DEVTTY_OUT and DEVTTY_OUT_MODE to capture all the differences. That doesn't strike me as an improvement, and no other function would use them at present. As I explained in my reply to Alexander, we should instead remove DEVTTY. > Is there a > similar issue in other places where we use DEVTTY? Yes. However, the other use of DEVTTY arises only with readline support, not typical of native Windows builds. > Also, why did you change the termout output mode, is that important > or just randomness? It's essential: http://archives.postgresql.org/message-id/20121010110555.GA21405@tornado.leadboat.com nm
On Mon, Oct 15, 2012 at 05:41:36AM -0400, Noah Misch wrote: > > --- a/src/port/sprompt.c > > +++ b/src/port/sprompt.c > > @@ -60,8 +60,13 @@ simple_prompt(const char *prompt, int maxlen, bool echo) > > * Do not try to collapse these into one "w+" mode file. Doesn't work on > > * some platforms (eg, HPUX 10.20). > > */ > > +#ifdef WIN32 > > + termin = fopen("CONIN$", "r"); > > + termout = fopen("CONOUT$", "w+"); > > This definitely needs a block comment explaining the behaviors that led us to > select this particular implementation. > > > +#else > > termin = fopen(DEVTTY, "r"); > > termout = fopen(DEVTTY, "w"); > > This thread has illustrated that the DEVTTY abstraction does not suffice. I > think we should remove it entirely. Remove it from port.h; use literal > "/dev/tty" here; re-add it as a local #define near the one remaining use, with > an XXX comment indicating that the usage is broken. > > If it would help, I can prepare a version with the comment changes and > refactoring I have in mind. Following an off-list ack from Alexander, here is that version. No functional differences from Alexander's latest version, and I have verified that it still fixes the original test case. I'm marking this Ready for Committer. To test this on an English (United States) copy of Windows 7, I made two configuration changes in the "Region and Language" control panel. On the "Administrative" tab, choose "Change system locale..." and select Russian (Russia). After the reboot, choose "Russian (Russia)" on the "Format" tab. (Neither of these changes will affect the display language of most Windows UI components.) Finally, run "initdb -W testdatadir". Before the patch, the password prompt contained some line-drawing characters and other garbage. Afterward, it matches the string in src/bin/initdb/po/ru.po. Thanks, nm
Attachment
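For readers without the attachment, this is roughly the shape of the change under review, based on the hunks quoted earlier in the thread; the real patch carries a fuller explanatory comment.

	/*
	 * On Windows, open CONIN$/CONOUT$ rather than "con": a CONOUT$ handle
	 * opened with "w+" has read access, so the CRT's write() recognizes it
	 * as a console and converts output to the console code page.  A "con"
	 * handle opened write-only fails that check and the prompt comes out
	 * in the wrong charset.
	 */
#ifdef WIN32
	termin = fopen("CONIN$", "r");
	termout = fopen("CONOUT$", "w+");
#else
	termin = fopen("/dev/tty", "r");
	termout = fopen("/dev/tty", "w");
#endif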
Noah Misch writes: > Following an off-list ack from Alexander, here is that version. No functional > differences from Alexander's latest version, and I have verified that it still > fixes the original test case. I'm marking this Ready for Committer. This seems good to me, but I'm not comfortable committing Windows stuff. Andrew, Magnus, are you able to handle this? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services