Thread: BUG #6308: Problem w. encoding in client
The following bug has been logged online: Bug reference: 6308 Logged by: Thomas Goerner Email address: tg@clickware.de PostgreSQL version: 9.1.1 Operating system: Windows 7 64-bit Description: Problem w. encoding in client Details: Hi, we have a problem regarding encoding with postgres 9.1.1 and Win7 64-bit Database encoding: UTF-8 active codepage in Windows console: 1252 PGCLIENTENCODING: Win1252 Console font: Lucida console In the above configuration, the following problems occur: 1) Text output from the client applications, e.g. the welcome-prompt of psql or the help page from pg_dump --help is not displayed correctly (especially german Umlauts and characters like "«" ). 2) When we restore a dump in custom format and then try to re-dump the database, we get error messages like Zeichen 0xe28093 in Kodierung »UTF8« hat keine Entsprechung in »Win1252« (character 0xe28093 in UTF-8 cannot be translated to Win1252) The above configuration is our standard configuration and works just fine in Windows XP and even in Windows 7 32-bit. Is there any solution to this problem? Thanks in advance Thomas
On 11/25/2011 08:21 PM, Thomas Goerner wrote: > > The following bug has been logged online: > > Bug reference: 6308 > Logged by: Thomas Goerner > Email address: tg@clickware.de > PostgreSQL version: 9.1.1 > Operating system: Windows 7 64-bit > Description: Problem w. encoding in client > Details: > > Hi, we have a problem regarding encoding with postgres 9.1.1 and Win7 > 64-bit > > Database encoding: UTF-8 > active codepage in Windows console: 1252 > PGCLIENTENCODING: Win1252 > Console font: Lucida console > > In the above configuration, the following problems occur: > > 1) > Text output from the client applications, e.g. the welcome-prompt of psql or > the help page from pg_dump --help is not displayed correctly (especially > german Umlauts and characters like "«" ). That shouldn't be happening. As a workaround, try using a unicode console (see the "chcp" command) and a unicode client encoding. The issue with mismatched chars sounds like a real bug that wants looking into. > When we restore a dump in custom format and then try to re-dump the > database, we get error messages like Zeichen 0xe28093 in Kodierung »UTF8« > hat keine Entsprechung in »Win1252« (character 0xe28093 in UTF-8 cannot be > translated to Win1252) Restore using PgAdmin III or using a unicode console. This is a limitation of using a Win1252 client encoding when restoring data that isn't restricted to Win1252 and cannot be fixed directly. If you don't mind possibly corrupted error and NOTICE messages you can just set a unicode client_encoding for your restore. -- Craig Ringer
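[Editor's note] The byte sequence in the error message can be identified with a few lines of Python. This is only a rough sketch for inspecting the data: the helper names `describe_utf8_bytes` and `unencodable_chars` are made up for illustration, and Python's `cp1252` codec is used as a stand-in that may not match PostgreSQL's own UTF8-to-WIN1252 conversion table in every case.

```python
import unicodedata


def describe_utf8_bytes(hex_string):
    """Decode a hex-encoded UTF-8 byte sequence and name each character.

    Accepts the form used in the PostgreSQL error message, e.g. "0xe28093".
    """
    if hex_string.lower().startswith("0x"):
        hex_string = hex_string[2:]
    text = bytes.fromhex(hex_string).decode("utf-8")
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def unencodable_chars(text, encoding="cp1252"):
    """Return the distinct characters of `text` that `encoding` cannot represent.

    Assumption: Python's codec is only an approximation of what the
    database's conversion routine accepts.
    """
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


if __name__ == "__main__":
    # The sequence quoted in the error message from the bug report.
    print(describe_utf8_bytes("0xe28093"))
    # A sample string containing one character outside Windows-1252.
    print(unencodable_chars("Köln λ"))
```

Running a check like `unencodable_chars` over text columns exported as UTF-8 is one way to see whether the data is really restricted to the client encoding before choosing Win1252 for a dump.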
Hello Craig,

thanks for your answer.

> Restore using PgAdmin III or using a unicode console.
> This is a limitation of using a Win1252 client encoding when restoring
> data that isn't restricted to Win1252 and cannot be fixed directly.

That's new to me. AFAIK pg_restore looks into the dump file and sets the client encoding accordingly (In fact the dump contains the statement SET client_encoding = 'UTF8';). Is this overridden by PGCLIENTENCODING? And if so, should it be?

And as we only encounter both problems in Windows7-64, it seems to me they are closely related.

Regards
Thomas

click:ware Informationstechnik GmbH
Thomas Goerner, Geschäftsführer
fon: 0221 - 13 99 88-0, fax: 0221 - 13 99 88-79
Kamekestraße 19, 50672 Köln
tg@clickware.de, www.clickware.de
On 11/28/2011 08:26 PM, Thomas Goerner wrote:
> Hello Craig,
>
> thanks for your answer.
>
> > Restore using PgAdmin III or using a unicode console.
> > This is a limitation of using a Win1252 client encoding when restoring
> > data that isn't restricted to Win1252 and cannot be fixed directly.
>
> That's new to me. AFAIK pg_restore looks into the dump file and sets
> the client encoding accordingly (In fact the dump contains the
> statement SET client_encoding = 'UTF8';). Is this overridden by
> PGCLIENTENCODING? And if so, should it be?

Nope, pg_restore should be using UTF8 as the client encoding in that case. If there are any errors or notices it won't be able to emit them correctly on the terminal though, as win1252 can't represent everything in UTF8 (and IIRC pg_restore doesn't recode from client_encoding to terminal encoding anyway).

If the restore itself is failing then I agree that something's not working properly, because you should be able to use a client_encoding different to your terminal encoding. I wonder if recent changes intended to get psql to pick up the terminal encoding automatically have had the unintended side-effect of overriding pg_restore's attempt to set the client_encoding?

I'm rather surprised you only see this on x64. You're using the same Windows and Pg version for both x86 and x64 but only the x64 test fails?

-- Craig Ringer
Hello Craig,

it seems as if there were illegal chars in the originally dumped database, so the dump/restore problem might be due to this. At the moment we are doing further investigation on this issue.

But the problem regarding message output from the client applications still persists. We are now setting up another set of 32/64-bit (virtual) Windows 7 machines to verify that the problem occurs only on 64 bit windows.

I will keep you informed.

Regards
Thomas