Question about Encoding a Custom Type - Mailing list pgsql-hackers
From | David E. Wheeler |
---|---|
Subject | Question about Encoding a Custom Type |
Date | |
Msg-id | 95FC9074-9199-4214-93B8-51B1264DFDD5@kineticode.com |
Responses | Re: Question about Encoding a Custom Type (Martijn van Oosterhout <kleptog@svana.org>) |
List | pgsql-hackers |
Howdy,

Possibly showing my ignorance here, but as I'm working on updating citext to be locale-aware and to work on 8.3, I've run into this peculiarity:

    try=# \encoding UTF8
    try=# select setting from pg_settings where name = 'lc_collate';
       setting
    -------------
     en_US.UTF-8
    (1 row)

    try=# create table try (name citext);
    try=# insert into try (name) values ('aardvark'), ('AAA');
    try=# select name, name = 'aaa' from try;
       name   | ?column?
    ----------+----------
     aardvark | f
     AAA      | t
    (2 rows)

    try=# insert into try (name) values ('aba'), ('ABC'), ('abc');
    try=# select name, name = 'aaa' from try;
       name   | ?column?
    ----------+----------
     aardvark | f
     AAA      | t
     aba      | f
     ABC      | f
     abc      | f
    (5 rows)

    try=# insert into try (name) values ('AAAA');
    try=# select name, name = 'aaa' from try;
    ERROR:  invalid byte sequence for encoding "UTF8": 0xf6bd
    HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

I've no idea what could be different about 'AAAA' vs. any other value. And if I do either of these:

    select name, name = 'aaa'::text from try;
    select name, name::text = 'aaa' from try;

It just works. I'm mystified.

My casts:

    CREATE CAST (citext AS text) WITHOUT FUNCTION AS IMPLICIT;
    CREATE CAST (citext AS varchar) WITHOUT FUNCTION AS IMPLICIT;
    CREATE CAST (citext AS bpchar) WITHOUT FUNCTION AS IMPLICIT;
    CREATE CAST (text AS citext) WITHOUT FUNCTION AS ASSIGNMENT;
    CREATE CAST (varchar AS citext) WITHOUT FUNCTION AS ASSIGNMENT;
    CREATE CAST (bpchar AS citext) WITHOUT FUNCTION AS ASSIGNMENT;

Question about the code? It's all here (for now):

    https://svn.kineticode.com/citext/trunk/

Hrm.
Fiddling a bit more, I find that this fails, too:

    try=# select citext_smaller( 'aardvark'::citext, 'AARDVARKasdfasdfasdfasdf'::citext );
    ERROR:  invalid byte sequence for encoding "UTF8": 0xc102
    HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

So I guess that something must be up with citext_smaller(). It's quite simple, though:

    PG_FUNCTION_INFO_V1(citext_smaller);

    Datum
    citext_smaller (PG_FUNCTION_ARGS)
    {
        text * left  = PG_GETARG_TEXT_P(0);
        text * right = PG_GETARG_TEXT_P(1);
        PG_RETURN_TEXT_P(citextcmp( PG_ARGS ) < 0 ? left : right );
    }

Context:

    https://svn.kineticode.com/citext/trunk/citext.c

Anyone have any idea? Feedback would be *most* appreciated.

Thanks,

David
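
For reference, the result that citext_smaller() is meant to produce can be sketched in plain C, outside the server. This is only an illustration under loud assumptions: ci_cmp() and ci_smaller() are hypothetical names, they work on NUL-terminated C strings rather than PostgreSQL text values, and the comparison folds case byte by byte with tolower(), so it is not locale-aware and does not reproduce citextcmp(). It does not diagnose the encoding error; it only pins down which argument should come back.

```c
#include <ctype.h>

/*
 * Hypothetical helper (not part of citext): byte-wise, case-insensitive
 * comparison of two NUL-terminated strings. Returns a negative value,
 * zero, or a positive value, in the manner of strcmp().
 */
static int
ci_cmp(const char *a, const char *b)
{
    for (;; a++, b++)
    {
        /* Cast to unsigned char so tolower() never sees a negative value. */
        int ca = tolower((unsigned char) *a);
        int cb = tolower((unsigned char) *b);

        if (ca != cb)
            return ca - cb;     /* first differing byte decides */
        if (ca == '\0')
            return 0;           /* both strings ended together */
    }
}

/*
 * Hypothetical stand-in for the intent of citext_smaller(): return the
 * argument that sorts first, or the right one when they compare equal
 * (the same tie-breaking as the "< 0 ? left : right" expression above).
 */
static const char *
ci_smaller(const char *left, const char *right)
{
    return ci_cmp(left, right) < 0 ? left : right;
}
```

With these definitions, ci_smaller("aardvark", "AARDVARKasdfasdfasdfasdf") returns "aardvark", since the shorter string is a case-insensitive prefix of the longer one. That is the value the failing query above would be expected to return.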