Thread: BUG #5800: "corrupted" error messages (encoding problem ?)

BUG #5800: "corrupted" error messages (encoding problem ?)

From
"Carlo Curatolo"
Date:
The following bug has been logged online:

Bug reference:      5800
Logged by:          Carlo Curatolo
Email address:      genamiga@brutele.be
PostgreSQL version: 9.0.2 64bits
Operating system:   Windows 7 64bits
Description:        "corrupted" error messages (encoding problem ?)
Details:

On a new PC I installed only Windows (French) and PostgreSQL 9.0.2 with default
parameters, without creating any objects.

Example of a "corrupted" error message, which occurs in several client programs
(my own Java program, dbVisualizer, EMS SQL Manager):

[Error Code: 0, SQL State: 42601]  ERREUR: erreur de syntaxe � la fin de
l'entr�e

Test 1: Windows 7 64-bit and PostgreSQL 9.0.2 64-bit

... the problem occurs

Test 2: Windows 7 64-bit and PostgreSQL 9.0.2 32-bit

... the problem occurs

Test 3: Windows 7 32-bit and PostgreSQL 9.0.2 32-bit

... NO problem, the error message is correct
[Error Code: 0, SQL State: 42601]  ERREUR: erreur de syntaxe à la fin de
l'entrée

This issue occurs only when PostgreSQL 9.0.2 is installed on Windows 7 64
bits...

Is there a solution?
A workaround?
Will this be fixed in the next release?

Thanking you in advance.
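
For readers trying to picture the symptom: the two error messages above differ exactly as they would if the accented characters were sent as single Windows-1252 bytes and then decoded as UTF-8 by the client. The sketch below reproduces that. It is only an illustration under that assumption; the thread does not confirm that this is what happens between the server and the JDBC driver.

```python
# Assumption: the message leaves the server as Windows-1252 bytes while the
# client decodes it as UTF-8. The thread does not confirm this; the sketch
# only shows that such a mismatch yields the same garbled text.
original = "ERREUR: erreur de syntaxe à la fin de l'entrée"

wire_bytes = original.encode("cp1252")                  # 'à' -> 0xE0, 'é' -> 0xE9
garbled = wire_bytes.decode("utf-8", errors="replace")  # lone 0xE0/0xE9 are invalid UTF-8

print(garbled)
```

Each invalid byte is replaced by U+FFFD, the replacement character that appears in the corrupted message.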

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
On Tue, Dec 21, 2010 at 3:47 PM, Carlo Curatolo <genamiga@brutele.be> wrote:
>
> The following bug has been logged online:
>
> Bug reference:      5800
> Logged by:          Carlo Curatolo
> Email address:      genamiga@brutele.be
> PostgreSQL version: 9.0.2 64bits
> Operating system:   Windows 7 64bits
> Description:        "corrupted" error messages (encoding problem ?)
> Details:
>
> On a new PC I installed only Windows (French) and PostgreSQL 9.0.2 with default
> parameters, without creating any objects.
>
> Example of a "corrupted" error message, which occurs in several client programs
> (my own Java program, dbVisualizer, EMS SQL Manager):
>
> [Error Code: 0, SQL State: 42601]  ERREUR: erreur de syntaxe � la fin de
> l'entr�e
>
> Test 1: Windows 7 64-bit and PostgreSQL 9.0.2 64-bit
>
> ... the problem occurs
>
> Test 2: Windows 7 64-bit and PostgreSQL 9.0.2 32-bit
>
> ... the problem occurs
>
> Test 3: Windows 7 32-bit and PostgreSQL 9.0.2 32-bit
>
> ... NO problem, the error message is correct
> [Error Code: 0, SQL State: 42601]  ERREUR: erreur de syntaxe à la fin de
> l'entrée
>
> This issue occurs only when PostgreSQL 9.0.2 is installed on Windows 7 64
> bits...

FYI, we've been investigating this. However, whilst we can reproduce
the same issue, we see it in different circumstances, which conflict
with yours:

You reported:

64 bit OS, 64 bit PG - corruption
64 bit OS, 32 bit PG - corruption
32 bit OS, 32 bit PG - OK

We see:

64 bit OS, 64 bit PG - OK
64 bit OS, 32 bit PG - corruption
32 bit OS, 32 bit PG - corruption

That implies to us that this is something environmental, rather than a
build or installer bug (between us, both installers work correctly on
their native platforms).

What does the environment look like on both of your servers? Try
running "\! set" from a psql session in each.
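
Comparing two long `set` dumps by eye is error-prone, so a small script can help. The following is a minimal sketch, assuming each dump has been saved as plain text; the function names and the shortened sample values are hypothetical and only illustrate the idea.

```python
def parse_set_output(text: str) -> dict[str, str]:
    """Parse the NAME=value lines printed by the Windows `set` command."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip blank lines and cmd.exe's hidden "=C:=..." drive entries,
        # whose name part is empty.
        if sep and name:
            # Windows variable names are case-insensitive, so normalise them.
            env[name.strip().upper()] = value.strip()
    return env


def diff_environments(a: dict[str, str], b: dict[str, str]) -> dict:
    """Return {name: (value_in_a, value_in_b)} for every variable that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# Hypothetical, shortened samples; paste the real "\! set" output instead.
win64 = parse_set_output(
    "PROCESSOR_ARCHITECTURE=AMD64\nOS=Windows_NT\nProgramW6432=C:\\Program Files"
)
win32 = parse_set_output("PROCESSOR_ARCHITECTURE=x86\nOS=Windows_NT")

for name, (left, right) in diff_environments(win64, win32).items():
    print(f"{name}: {left!r} vs {right!r}")
```

A variable present on only one machine shows up with `None` on the other side.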

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
[Please keep messages on the mailing list]

Hi,

I don't see anything there that gives me any ideas. Anyone else have any ideas?

2011/1/7 Génération Amiga <genamiga@brutele.be>:
> Hello Dave,
>
> Here are the results of "\! set" on the servers. I still have W7-32 and W7-64 (WinXP died).
> **************
> W7-64 and PG9-64
> **************
> postgres=# \! set;
> =::=::\
> =C:=C:\Program Files\PostgreSQL\9.0\scripts
> ALLUSERSPROFILE=C:\ProgramData
> APPDATA=C:\Users\Carlo\AppData\Roaming
> CLIENTENCODING_JP=0
> CommonProgramFiles=C:\Program Files\Common Files
> CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files
> CommonProgramW6432=C:\Program Files\Common Files
> COMPUTERNAME=WIN7-64
> ComSpec=C:\Windows\system32\cmd.exe
> database=postgres
> FP_NO_HOST_CHECK=NO
> HOMEDRIVE=C:
> HOMEPATH=\Users\Carlo
> LOCALAPPDATA=C:\Users\Carlo\AppData\Local
> LOGONSERVER=\\WIN7-64
> NUMBER_OF_PROCESSORS=2
> OS=Windows_NT
> Path=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\
> PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
> PGLOCALEDIR=C:/Program Files/PostgreSQL/9.0/share/locale
> PGSYSCONFDIR=C:/Program Files/PostgreSQL/9.0/etc
> port=5432
> PROCESSOR_ARCHITECTURE=AMD64
> PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 15 Stepping 6, GenuineIntel
> PROCESSOR_LEVEL=6
> PROCESSOR_REVISION=0f06
> ProgramData=C:\ProgramData
> ProgramFiles=C:\Program Files
> ProgramFiles(x86)=C:\Program Files (x86)
> ProgramW6432=C:\Program Files
> PROMPT=$P$G
> PSModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\
> PUBLIC=C:\Users\Public
> server=localhost
> SESSIONNAME=Console
> SystemDrive=C:
> SystemRoot=C:\Windows
> TEMP=C:\Users\Carlo\AppData\Local\Temp
> TMP=C:\Users\Carlo\AppData\Local\Temp
> USERDOMAIN=WIN7-64
> USERNAME=postgres
> USERPROFILE=C:\Users\Carlo
> windir=C:\Windows
> postgres=#
>
> **************
> W7-32 and PG9-32
> **************
> postgres=# \! set;
> =::=::\
> =C:=C:\Program Files\PostgreSQL\9.0\scripts
> ALLUSERSPROFILE=C:\ProgramData
> APPDATA=C:\Users\Carlo\AppData\Roaming
> CLIENTENCODING_JP=0
> CommonProgramFiles=C:\Program Files\Common Files
> COMPUTERNAME=WIN7-32
> ComSpec=C:\Windows\system32\cmd.exe
> database=postgres
> FP_NO_HOST_CHECK=NO
> HOMEDRIVE=C:
> HOMEPATH=\Users\Carlo
> LOCALAPPDATA=C:\Users\Carlo\AppData\Local
> LOGONSERVER=\\WIN7-32
> NUMBER_OF_PROCESSORS=1
> OS=Windows_NT
> Path=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\
> PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
> PGLOCALEDIR=C:/Program Files/PostgreSQL/9.0/share/locale
> PGSYSCONFDIR=C:/Program Files/PostgreSQL/9.0/etc
> port=5432
> PROCESSOR_ARCHITECTURE=x86
> PROCESSOR_IDENTIFIER=x86 Family 15 Model 2 Stepping 4, GenuineIntel
> PROCESSOR_LEVEL=15
> PROCESSOR_REVISION=0204
> ProgramData=C:\ProgramData
> ProgramFiles=C:\Program Files
> PROMPT=$P$G
> PSModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\
> PUBLIC=C:\Users\Public
> server=localhost
> SESSIONNAME=Console
> SystemDrive=C:
> SystemRoot=C:\Windows
> TEMP=C:\Users\Carlo\AppData\Local\Temp
> TMP=C:\Users\Carlo\AppData\Local\Temp
> USERDOMAIN=Win7-32
> USERNAME=postgres
> USERPROFILE=C:\Users\Carlo
> windir=C:\Windows
> postgres=#
> ***********************************************
> I can send you whatever you need; I haven't touched anything on those servers. I use them only for testing and I have image backups.
>
> Best regards.
>
> Curatolo Carlo




Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Susanne Ebrecht
Date:
On 07.01.2011 12:35, Dave Page wrote:
> [Please keep messages on the mailing list]
>
> Hi,
>
> I don't see anything there that gives me any ideas. Anyone else have any ideas?

Hello,

Yes.

I would like to see output of CHCP.
Which Windows codepage is used? 850?

Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
On Fri, Jan 7, 2011 at 12:37 PM, Susanne Ebrecht
<susanne@2ndquadrant.com> wrote:
> On 07.01.2011 12:35, Dave Page wrote:
>>
>> [Please keep messages on the mailing list]
>>
>> Hi,
>>
>> I don't see anything there that gives me any ideas. Anyone else have any
>> ideas?
>
> Hello,
>
> Yes.
>
> I would like to see output of CHCP.
> Which Windows codepage is used? 850?

In our testing, the windows codepage was 1252, and the console
codepage was 850 (giving the normal warning about the mismatch that
we've seen for years). That was the case on installations with, and
without the incorrect formatting.
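
To make the 1252/850 mismatch concrete, here is a minimal sketch of what a console using code page 850 shows when a program writes Windows-1252 bytes. It assumes nothing about PostgreSQL itself and uses only Python's standard codecs.

```python
text = "erreur de syntaxe à la fin de l'entrée"

ansi_bytes = text.encode("cp1252")    # what a Windows-1252 program writes
shown = ansi_bytes.decode("cp850")    # how a code page 850 console renders it

print(shown)
```

Every byte is still a valid character in code page 850, so this mismatch produces wrong letters rather than replacement characters, which is a different symptom from the one in the original report.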


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
2011/1/27 Génération Amiga <genamiga@brutele.be>:
> Hello Dave,
>
> Any news about that encoding problem ?

We've been working with a couple of our friends in Japan, and it looks
like one of them has tracked down an issue in the gettext library. It
looks like we can work around it. We'll try to get it into the next
release.

> Our messages don't appear on the "pgsql-bugs" web page...
> How can I do that ?

Yes they do: http://archives.postgresql.org/pgsql-bugs/2010-12/msg00156.php


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
genamiga
Date:
I tried with 9.0.3...same problem...

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/BUG-5800-corrupted-error-messages-encoding-problem-tp3313951p4248990.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
genamiga
Date:
I tried with 9.1.alpha5...same problem...


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
On Tue, Mar 22, 2011 at 7:07 AM, genamiga <genamiga@brutele.be> wrote:
> I tried with 9.0.3...same problem...

This should be resolved in 9.0.4 btw.




Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Carlo Curatolo
Date:
Just tested 9.0.4... same problem, I am afraid...


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
On Tue, Apr 26, 2011 at 10:07 AM, Carlo Curatolo <genamiga@brutele.be> wrote:
> Just tested 9.0.4... same problem, I am afraid...

Uh, that's odd. I've asked someone to see if we can reproduce it again.



Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Carlo Curatolo
Date:
Last test...

On a new PC I installed, in this order:

- Windows 7 Pro SP1 x64 French (brand new original DVD)
- postgresql-9.0.4-1-windows_x64.exe (installing with Locale French, Belgium
or default locale, same results)
- jre-6u24-windows-x64
- dbvis_windows-x64_7_1_4.exe
- postgresql-9.0-801.jdbc4.jar for use with java test application and
DBVisualizer

Here is the Java test application source :
http://postgresql.1045698.n5.nabble.com/file/n4346044/Main.java Main.java

Here are the results :
http://postgresql.1045698.n5.nabble.com/file/n4346044/java_app_cmd.png
http://postgresql.1045698.n5.nabble.com/file/n4346044/java_app_swing.png
http://postgresql.1045698.n5.nabble.com/file/n4346044/dbvisualizer.png

Result in logfile is correct...
http://postgresql.1045698.n5.nabble.com/file/n4346044/postgresql-2011-04-28_095111.log
postgresql-2011-04-28_095111.log

The result in pgAdmin is correct: client_encoding is also UNICODE, but the
error message is displayed correctly, as in the log file...

So... I first posted about this in December...

I am setting up my new server (database and files) and I am ready to migrate
everything to it.

I have tested my application and, apart from this problem, everything works
fine. I would like to use PG9 in production now.

If I go into production in this state, can the problem be fixed later by
simply installing a new release?
Or will I have to reinstall everything?
Should I wait for a release without this problem?

Please let me know.
Thanks in advance.


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Dave Page
Date:
On Thu, Apr 28, 2011 at 9:53 AM, Carlo Curatolo <genamiga@brutele.be> wrote:
> Last test...
>
> On a new PC I installed, in this order:
>
> - Windows 7 Pro SP1 x64 French (brand new original DVD)
> - postgresql-9.0.4-1-windows_x64.exe (installing with Locale French, Belgium
> or default locale, same results)
> - jre-6u24-windows-x64
> - dbvis_windows-x64_7_1_4.exe
> - postgresql-9.0-801.jdbc4.jar for use with java test application and
> DBVisualizer
>
> Here is the Java test application source :
> http://postgresql.1045698.n5.nabble.com/file/n4346044/Main.java Main.java
>
> Here are the results :
> http://postgresql.1045698.n5.nabble.com/file/n4346044/java_app_cmd.png
> http://postgresql.1045698.n5.nabble.com/file/n4346044/java_app_swing.png
> http://postgresql.1045698.n5.nabble.com/file/n4346044/dbvisualizer.png
>
> Result in logfile is correct...
> http://postgresql.1045698.n5.nabble.com/file/n4346044/postgresql-2011-04-28_095111.log
> postgresql-2011-04-28_095111.log
>
> The result in pgAdmin is correct: client_encoding is also UNICODE, but the
> error message is displayed correctly, as in the log file...
>
> So... I first posted about this in December...
>
> I am setting up my new server (database and files) and I am ready to migrate
> everything to it.
>
> I have tested my application and, apart from this problem, everything works
> fine. I would like to use PG9 in production now.
>
> If I go into production in this state, can the problem be fixed later by
> simply installing a new release?
> Or will I have to reinstall everything?
> Should I wait for a release without this problem?

So you're saying it works for you now in PostgreSQL and pgAdmin?

I can't help with Java apps or dbVisualizer I'm afraid.


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Carlo Curatolo
Date:
My current PG 8.4 production server works perfectly and reports errors
correctly everywhere (pgAdmin, DBVisualizer, Java applications, EMS
SQL Manager Lite)...

If somebody has an idea... a workaround... a solution...


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Carlo Curatolo
Date:
Same test, but with the 32-bit version of PG9.

The problem does NOT occur...

Everything works perfectly everywhere...

Where do I have to post my problem to get a solution?


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Robert Haas
Date:
On Thu, Apr 28, 2011 at 6:31 AM, Carlo Curatolo <genamiga@brutele.be> wrote:
> Same test, but with the 32-bit version of PG9.
>
> The problem does NOT occur...
>
> Everything works perfectly everywhere...
>
> Where do I have to post my problem to get a solution?

The problem isn't that you are posting in the wrong place, or that the
right people aren't listening.  The problem is that after repeated
attempts, we haven't been able to figure out exactly what is going
wrong here.  It's not even clear to me whether this is a PostgreSQL
bug, an installer bug, or expected Windows behavior caused by some
non-obvious aspect of your configuration.  The EnterpriseDB team has
made multiple attempts to reproduce this internally, and while we've
managed to create various weird behaviors, they aren't obviously the
same as what's happening to you.  :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Carlo Curatolo
Date:
Just tried with PG 9.1 64bits...same problem...


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Carlo Curatolo
Date:
Just tried with PG 9.1...same problem...


Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Craig Ringer
Date:
On 09/17/2011 05:10 AM, Carlo Curatolo wrote:
> Just tried with PG 9.1...same problem...
Yep. There appears to be no interest in fixing this bug. All the
alternatives I proposed were rejected, and there doesn't seem to be any
concern about the issue. I'd be willing to have a go at the issue if
there were some indication that a patch might be considered, but if I don't
have an approach that might even be considered for inclusion there isn't
much point.

There are several sources of differently-encoded text that may appear in
logs. These are often set to the same encoding, but need not necessarily
be. The only valid fixes are to log them to different files (with some
way to identify which encoding is used) or convert them all to a single
standard encoding - probably UTF-8 - for logging. When syslog is used,
encoding conversion is the only valid answer.
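
As a rough illustration of the "convert everything to one encoding" option, the sketch below re-encodes each message to UTF-8 before it is appended to a shared log. The helper name and the sample messages are hypothetical, and this is not how PostgreSQL's logging code is written; it only shows the idea.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode one log line from its source encoding to UTF-8."""
    # errors="replace" keeps the logger from failing on undecodable bytes.
    return raw.decode(source_encoding, errors="replace").encode("utf-8")


# Hypothetical sources: one Latin-1 database and one Shift-JIS database.
lines = [
    ("latin-1", "ERREUR: erreur de syntaxe à la fin de l'entrée".encode("latin-1")),
    ("shift_jis", "エラー: 構文エラー".encode("shift_jis")),
]

log = b"\n".join(to_utf8(raw, enc) for enc, raw in lines)
print(log.decode("utf-8"))
```

The writer has to know each message's source encoding, which is why the conversion would need to happen where that information is still available.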

--
Craig Ringer

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Tom Lane
Date:
Craig Ringer <ringerc@ringerc.id.au> writes:
> On 09/17/2011 05:10 AM, Carlo Curatolo wrote:
>> Just tried with PG 9.1...same problem...

> Yep. There appears to be no interest in fixing this bug. All the
> alternatives I proposed were rejected, and there doesn't seem to be any
> concern about the issue.

The problem is to find a cure that's not worse than the disease.
I'm not exactly convinced that forcing all log messages into a common
encoding is a better behavior than allowing backends to log in their
native database encoding.

If you do want a common encoding, there's a very easy way to get it, ie,
standardize on one encoding for all your databases.  People who aren't
doing that already probably have good reasons why they want to stay with
the encoding choices they've made; forcing their logs into some other
encoding isn't necessarily going to improve their lives.

> ... The only valid fixes are to log them to different files (with some
> way to identify which encoding is used)

I don't recall having heard any serious discussion of such a design, but
perhaps doing that would satisfy some use-cases.  One idea that comes to
mind is to provide a %-escape for log_filename that expands to the name
of the database encoding (or more likely, some suitable abbreviation).
The logging collector protocol would have to be expanded to include that
information, but that seems do-able.
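
A sketch of how such an escape might expand, purely for illustration. The `%K` escape, the function name and the abbreviation rule are all hypothetical; no such escape exists in PostgreSQL, and only the strftime-style date escapes shown are ones that log_filename already accepts.

```python
from datetime import datetime

# Hypothetical escape for the database encoding; not a real PostgreSQL feature.
ENCODING_ESCAPE = "%K"


def expand_log_filename(pattern: str, db_encoding: str, when: datetime) -> str:
    """Expand the hypothetical encoding escape, then the strftime escapes."""
    # Assumed abbreviation rule: lower-case, without separators.
    abbreviated = db_encoding.lower().replace("_", "").replace("-", "")
    # Substitute the encoding first so strftime never sees the unknown escape.
    return when.strftime(pattern.replace(ENCODING_ESCAPE, abbreviated))


name = expand_log_filename(
    "postgresql-%Y-%m-%d-%K.log", "UTF8", datetime(2011, 9, 17)
)
print(name)
```

With one file per encoding, every file is internally consistent, although a reader still has to infer the encoding from the file name.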

            regards, tom lane

Re: BUG #5800: "corrupted" error messages (encoding problem ?)

From
Craig Ringer
Date:
First, sorry for the slow reply.

Response inline.

On 09/17/2011 08:34 AM, Tom Lane wrote:
> Craig Ringer <ringerc@ringerc.id.au> writes:
>> On 09/17/2011 05:10 AM, Carlo Curatolo wrote:
>>> Just tried with PG 9.1...same problem...
>
>> Yep. There appears to be no interest in fixing this bug. All the
>> alternatives I proposed were rejected, and there doesn't seem to be any
>> concern about the issue.
>
> The problem is to find a cure that's not worse than the disease.
> I'm not exactly convinced that forcing all log messages into a common
> encoding is a better behavior than allowing backends to log in their
> native database encoding.
>
> If you do want a common encoding, there's a very easy way to get it, ie,
> standardize on one encoding for all your databases.

The postmaster may still emit messages in a different encoding if the
system encoding is not the same as the standard database encoding chosen.

> People who aren't
> doing that already probably have good reasons why they want to stay with
> the encoding choices they've made; forcing their logs into some other
> encoding isn't necessarily going to improve their lives.

I'm not convinced.

Mixing their logs with messages in other encodings makes it *impossible*
for most people to read them at all. A file with (say) mixed UTF-8,
latin-1 and Shift-JIS is effectively hopelessly corrupted as far as most
people are concerned. If lines are differently encoded, the file is a
totally mangled mess. Try it and see what I mean. As such, I disagree:
forcing all their logs into one encoding WILL improve their lives over
the current situation, and won't affect people whose databases are all
already in the system encoding.
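
The "try it and see" point can be checked with a few lines. This is a minimal sketch with made-up sample text: three lines, each in a different encoding, are concatenated and then decoded as a whole.

```python
mixed = b"\n".join([
    "café".encode("utf-8"),
    "café".encode("latin-1"),
    "日本語".encode("shift_jis"),
])


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if the whole byte string is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


for candidate in ("utf-8", "shift_jis"):
    print(candidate, decodes_cleanly(mixed, candidate))

# Latin-1 maps every byte to some character, so it never raises,
# but the UTF-8 and Shift-JIS lines then come out as mojibake.
latin1_view = mixed.decode("latin-1")
print(latin1_view == "café\ncafé\n日本語")
```

No single choice of encoding recovers all three lines, which is the situation described above for a log file shared by differently encoded databases.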

In any case, if the system uses a utf8 encoding and the databases are
latin-1 (for example) the admin might actually prefer to have utf8 logs
for easy reading and processing by system tools, no matter what encoding
the databases are in.

The database encoding is an internal thing. The log encoding is an
external thing. Writing messages to stdout/stderr in an encoding other
than that specified by LC_CTYPE and LC_MESSAGES is wrong as it'll cause
garbage to be shown on a terminal; so IMO is logging in a different
encoding.

Because there's no standard way to flag a file as having a certain
encoding, I contend that the correct default is to write files in the
default encoding used by the system. That is what programs that consume
the logs will expect. The only other correct alternative would be to
write UTF-8 logs with a BOM that lets programs unambiguously identify
the encoding. That said, users probably should be able to override the
log file location and encoding so a particular database's logs go to a
separate file in a user-defined encoding and/or override the default
encoding Pg writes.
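
For reference, this is what a BOM-marked UTF-8 log looks like from the consumer's side. It is a minimal sketch using Python's `utf-8-sig` codec and a hypothetical file name, not a description of anything PostgreSQL does today.

```python
import codecs
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "postgresql.log")  # hypothetical file name

# The "utf-8-sig" codec writes the BOM (EF BB BF) once, at the start of the file.
with open(path, "w", encoding="utf-8-sig", newline="\n") as log:
    log.write("ERREUR: erreur de syntaxe à la fin de l'entrée\n")

# A consumer can detect the encoding by checking the first three bytes.
with open(path, "rb") as raw:
    has_bom = raw.read(3) == codecs.BOM_UTF8

# Reading with the same codec strips the BOM again.
with open(path, encoding="utf-8-sig") as log:
    first_line = log.readline()

print(has_bom, first_line, end="")
```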


>> ... The only valid fixes are to log them to different files (with some
>> way to identify which encoding is used)
>
> I don't recall having heard any serious discussion of such a design, but
> perhaps doing that would satisfy some use-cases.  One idea that comes to
> mind is to provide a %-escape for log_filename that expands to the name
> of the database encoding (or more likely, some suitable abbreviation).
> The logging collector protocol would have to be expanded to include that
> information, but that seems do-able.

That'd work, though it doesn't solve the problem for people logging to
syslog or to a single file.

I think Pg should also be able to convert all messages into a common
encoding for logging to a single file and should default to using the
system encoding as that encoding.

The user could configure a different encoding - for example, they might
want to force utf-8 logging because their databases may have all sorts
of different encodings, but they're logging to syslog so they can't
split logs out to different files.

A special log destination encoding name, say "log_encoding = database"
could be used to bypass all encoding conversion, retaining the current
behaviour of logging in whatever encoding the database happens to use.

I'm willing to implement this setup (or try, at least) if you think it's
a reasonable thing to do. I don't know how I'll go with multi-file
logging in log_filename, but I'm pretty sure I can handle the log
message encoding conversion and associated configuration directives.

There's some overhead to encoding conversion, but it's pretty minimal.
It can be avoided entirely by ensuring that your log destination
encoding is the same as your Pg database encoding, which under this
scheme you can do by setting "log_encoding = database" and sticking to
one encoding or using multi-file logging.
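
To make the proposal concrete, here is a sketch of the intended semantics. `log_encoding` is only the setting proposed in this message, not an existing PostgreSQL parameter, and both function names are hypothetical.

```python
import codecs
import locale


def resolve_log_encoding(log_encoding):
    """Map the proposed setting to a target encoding, or None for pass-through."""
    if log_encoding == "database":
        return None                                # keep the current behaviour
    if log_encoding is None:
        return locale.getpreferredencoding(False)  # default: system encoding
    return log_encoding


def encode_for_log(message: bytes, db_encoding: str, log_encoding=None) -> bytes:
    """Convert a message from the database encoding to the log encoding."""
    target = resolve_log_encoding(log_encoding)
    # Compare normalised codec names so "latin-1" and "iso-8859-1" match.
    if target is None or codecs.lookup(target).name == codecs.lookup(db_encoding).name:
        return message                             # no conversion overhead
    text = message.decode(db_encoding, errors="replace")
    return text.encode(target, errors="replace")


msg = "entrée".encode("latin-1")
print(encode_for_log(msg, "latin-1", "utf-8"))
print(encode_for_log(msg, "latin-1", "database") == msg)
```

The pass-through branch is what keeps the overhead at zero when the log and database encodings already agree.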

Reasonable plan?

--
Craig Ringer