Thread: BUG #1976: steps to reproduce BUG #1438: Non UTF-8 client encoding problem

From: "Stanislav Sukholet"

The following bug has been logged online:

Bug reference:      1976
Logged by:          Stanislav Sukholet
Email address:      ctac113@mail.ru
PostgreSQL version: 7.4.8.1.FC3.1
Operating system:   2.6.12-1.1378_FC3
Description:        steps to reproduce BUG #1438: Non UTF-8 client encoding problem
Details:

That was really easy to reproduce:
$ export LANG=ru_RU.koi8r
$ createdb -E UNICODE mydb
$ psql -d mydb
mydb=# \encoding KOI8
mydb=# create table a (aa integer);
CREATE TABLE
mydb=# create table b (bb integer primary key);
ERROR:  ignoring unconvertible UTF-8 character 0xd3cf
mydb=# \d
           Список связей
 Схема  | Имя |   Тип   | Владелец
--------+-----+---------+----------
 public | a   | таблица | postgres
(1 запись)

mydb=#

So it is always a problem when I put a PRIMARY KEY modifier after a column
declaration while the client encoding is KOI8.
I have also filed this report in Red Hat Bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=171174

"Stanislav Sukholet" <ctac113@mail.ru> writes:
> mydb=# create table b (bb integer primary key);
> ERROR:  ignoring unconvertible UTF-8 character 0xd3cf

Can't reproduce this here.  What locale settings are you using in the
database?  (Particularly lc_ctype and lc_messages)

            regards, tom lane

Stanislav Sukholet <ctac@osib.so-cdu.ru> writes:
>> Can't reproduce this here.  What locale settings are you using in the
>> database?  (Particularly lc_ctype and lc_messages)

> mydb=> SHOW client_encoding ;
>  client_encoding
> -----------------
>  KOI8
> (1 запись)

> mydb=> show LC_CTYPE;
>   lc_ctype
> -------------
>  ru_RU.koi8r
> (1 запись)

> mydb=> show LC_MESSAGES;
>  lc_messages
> -------------
>  ru_RU.koi8r
> (1 запись)

> mydb=> CREATE TABLE a (b INTEGER PRIMARY KEY);
> ERROR:  ignoring unconvertible UTF-8 character 0xd3cf

OK, with that I can reproduce it in 7.4, but more recent releases
produce a bunch of "WARNING:  ignoring unconvertible UTF-8 character"
notices and then complete the operation successfully.

This is basically the same problem discussed in this thread:
http://archives.postgresql.org/pgsql-patches/2005-08/msg00037.php
namely that gettext() converts the translated error message to the
encoding implied by LC_CTYPE ... but the error reporting machinery
expects the string to be in the encoding specified for the database.
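
The mismatch can be seen on the two bytes named in the error message. The
following is a minimal sketch, assuming an iconv command is installed; the
function names are made up for illustration. Read as KOI8-R, the bytes
0xd3 0xcf are two ordinary Cyrillic letters, which is consistent with text
that came out of a KOI8-R message translation. Read as UTF-8, they are not a
valid sequence.

```shell
#!/bin/sh
# The two bytes from the error message, in octal (0xd3 = \323, 0xcf = \317).
bytes() { printf '\323\317'; }

# Interpreted as KOI8-R, the bytes are plain Cyrillic text.
as_koi8r() { bytes | iconv -f KOI8-R -t UTF-8; }

# Interpreted as UTF-8, they are malformed: 0xd3 announces a two-byte
# character, but 0xcf is not a continuation byte (those are 0x80-0xbf),
# so iconv rejects the input with a non-zero exit status.
valid_utf8() { bytes | iconv -f UTF-8 -t UTF-16 >/dev/null 2>&1; }

as_koi8r; echo
if valid_utf8; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi
```

Any string that is handed over in the LC_CTYPE encoding but treated as being
in the database encoding runs into the same problem once it contains
non-ASCII characters.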

I have applied a minor tweak to the 7.4 branch to make it behave more
like the later releases, i.e. you get a WARNING, not an ERROR.  However
this is certainly not really a solution --- the only reason the behavior
isn't worse is that the ru_RU message catalog doesn't try to translate
"ignoring unconvertible UTF-8 character" and so you don't get into the
recursive failure discussed in the above thread.

The bottom line is that this is one of several reasons why it's a bad
idea to use a database encoding that's incompatible with the underlying
locale settings.  I doubt that we'll really be able to fix that until
we replace all our dependence on the C library's locale facilities
... which is something that will probably happen someday, but don't
hold your breath waiting :-(

In short, if you want to use UTF8 database encoding, specify a
UTF8-based locale setting when you initdb.  Don't try to change
the database encoding via -E.
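
As a rough sketch of that advice: the helper functions below are hypothetical
and not part of PostgreSQL. They only check that a locale name carries a
UTF-8 codeset and then print, without running, an initdb command line. The
locale names and the "$PGDATA" data directory are assumptions; UNICODE is the
encoding name used in this thread.

```shell
#!/bin/sh
# Hypothetical helpers, not part of PostgreSQL.

# Print the codeset part of a locale name: ru_RU.koi8r -> koi8r.
# Prints an empty line when the name has no codeset (for example "C").
codeset_of() {
    case "$1" in
        *.*) cs=${1#*.}; printf '%s\n' "${cs%%@*}" ;;
        *)   printf '\n' ;;
    esac
}

# Succeed only if the locale name carries a UTF-8 codeset.
is_utf8_locale() {
    case "$(codeset_of "$1")" in
        [Uu][Tt][Ff]-8|[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}

# Print (do not run) an initdb command line for a UTF-8 cluster,
# refusing locales whose codeset is not UTF-8.
suggest_initdb() {
    if is_utf8_locale "$1"; then
        printf 'initdb --locale=%s -E UNICODE -D "$PGDATA"\n' "$1"
    else
        echo "locale $1 does not imply UTF-8; pick a UTF-8 locale" >&2
        return 1
    fi
}

suggest_initdb ru_RU.UTF-8
suggest_initdb ru_RU.koi8r || true
```

The first call prints a command line; the second is refused, because a KOI8-R
locale combined with a UTF-8 database is exactly the combination that
produced the error in this report.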

            regards, tom lane

Hi,

I have the following scenario. I have two boxes (one Windows Server 2003
and one Debian sarge Linux).

The Debian box runs the PostgreSQL server, and the Windows box uses a
Chinese character set.

If I want to build an application on Windows (through ODBC), should
I connect to the server with the client encoding set to EUC_CN or UNICODE?

On the server side, should I run initdb -E with EUC_CN or UNICODE?

Also, regarding the locale setting: should I set --locale=zh_ZN.UTF-8?

Thanks.
Bill

On 20/10/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [...]
> In short, if you want to use UTF8 database encoding, specify a
> UTF8-based locale setting when you initdb.  Don't try to change
> the database encoding via -E.

--
Persistence is the twin sister of excellence. One is a matter of
quality; the other, a matter of time.
        Marabel Morgan, The Electric Woman