Thread: BUG #1976: steps to reproduce BUG #1438: Non UTF-8 client encoding problem
From: "Stanislav Sukholet"
The following bug has been logged online:

Bug reference:      1976
Logged by:          Stanislav Sukholet
Email address:      ctac113@mail.ru
PostgreSQL version: 7.4.8.1.FC3.1
Operating system:   2.6.12-1.1378_FC3
Description:        steps to reproduce BUG #1438: Non UTF-8 client encoding problem
Details:

That was really easy to reproduce:

$ export LANG=ru_RU.koi8r
$ createdb -E UNICODE mydb
$ psql -d mydb
mydb=# \encoding KOI8
mydb=# create table a (aa integer);
CREATE TABLE
mydb=# create table b (bb integer primary key);
ERROR: ignoring unconvertible UTF-8 character 0xd3cf
mydb=# \d
         Список связей
 Схема  | Имя |   Тип   | Владелец
--------+-----+---------+----------
 public | a   | таблица | postgres
(1 запись)

mydb=#

So, it's always a problem when I put PRIMARY KEY modifier after column
declaration with KOI8 encoding.

I've put this report to bugzilla@redhat:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=171174

"Stanislav Sukholet" <ctac113@mail.ru> writes:
> mydb=# create table b (bb integer primary key);
> ERROR: ignoring unconvertible UTF-8 character 0xd3cf

Can't reproduce this here. What locale settings are you using in the
database? (Particularly lc_ctype and lc_messages)

			regards, tom lane

Stanislav Sukholet <ctac@osib.so-cdu.ru> writes:
>> Can't reproduce this here.  What locale settings are you using in the
>> database?  (Particularly lc_ctype and lc_messages)

> mydb=> SHOW client_encoding ;
>  client_encoding
> -----------------
>  KOI8
> (1 запись)

> mydb=> show LC_CTYPE;
>   lc_ctype
> -------------
>  ru_RU.koi8r
> (1 запись)

> mydb=> show LC_MESSAGES;
>  lc_messages
> -------------
>  ru_RU.koi8r
> (1 запись)

> mydb=> CREATE TABLE a (b INTEGER PRIMARY KEY);
> ERROR:  ignoring unconvertible UTF-8 character 0xd3cf

OK, with that I can reproduce it in 7.4, but more recent releases
produce a bunch of "WARNING:  ignoring unconvertible UTF-8 character"
notices and then complete the operation successfully.

This is basically the same problem discussed in this thread:
http://archives.postgresql.org/pgsql-patches/2005-08/msg00037.php
namely that gettext() converts the translated error message to the
encoding implied by LC_CTYPE ... but the error reporting machinery
expects the string to be in the encoding specified for the database.

I have applied a minor tweak to the 7.4 branch to make it behave more
like the later releases, ie you get a WARNING not an ERROR.  However
this is certainly not really a solution --- the only reason the behavior
isn't worse is that the ru_RU message catalog doesn't try to translate
"ignoring unconvertible UTF-8 character" and so you don't get into the
recursive failure discussed in the above thread.

The bottom line is that this is one of several reasons why it's a bad
idea to use a database encoding that's incompatible with the underlying
locale settings.  I doubt that we'll really be able to fix that until
we replace all our dependence on the C library's locale facilities
... which is something that will probably happen someday, but don't
hold your breath waiting :-(

In short, if you want to use UTF8 database encoding, specify a
UTF8-based locale setting when you initdb.  Don't try to change
the database encoding via -E.

			regards, tom lane
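
For readers who want to see the mismatch described above in isolation, here is a minimal Python sketch. It is not PostgreSQL's conversion code. The sample word is only an assumption standing in for a translated message that gettext() returned in KOI8-R; it was picked because its first two KOI8-R bytes are d3 cf, the value shown in the error. The helper name decode_as_utf8 is hypothetical.

```python
# Illustrative sketch only; this is not PostgreSQL's conversion code.
# Assumption: "создаст" is just a sample Russian word standing in for a
# translated message that gettext() returned in the LC_CTYPE encoding.
sample = "создаст"

# Bytes as they would look in KOI8-R, the encoding implied by ru_RU.koi8r.
koi8_bytes = sample.encode("koi8_r")
first_two = koi8_bytes[:2]


def decode_as_utf8(raw):
    """Return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# Treating the KOI8-R bytes as UTF-8 (the database encoding) fails:
# 0xd3 looks like the lead byte of a two-byte sequence, but 0xcf is
# not a valid continuation byte.
as_utf8 = decode_as_utf8(koi8_bytes)

# Interpreting the same bytes in the encoding they were produced in works.
round_trip = koi8_bytes.decode("koi8_r")

if __name__ == "__main__":
    print(first_two.hex())
    print(as_utf8)
    print(round_trip == sample)
```

The point of the sketch is only that the same byte string is fine in one encoding and unconvertible in the other, which is why matching the database encoding to the locale avoids the problem.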

Hi,

I have the following scenario. I have two boxes (1 windows server 2003
and 1 linux debian sarge).

The debian box runs the PostgreSQL server and the windows box is using
Chinese character set.

If I want to build an application on windows (through ODBC), should
I connect to the server with client encoding set to EUC_CN or UNICODE?

On the server side, should I initdb -E using EUC_CN or UNICODE?

Also, with the locale setting.
Should I set --locale=zh_ZN.UTF-8?


Thanks.
Bill

On 20/10/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In short, if you want to use UTF8 database encoding, specify a
> UTF8-based locale setting when you initdb.  Don't try to change
> the database encoding via -E.
>
>                         regards, tom lane

--
Persistence is the twin sister of excellence. One is a matter of
quality; the other, a matter of time.
        Marabel Morgan, The Electric Woman