Re: Array access to type "name" - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Array access to type "name"
Msg-id Pine.LNX.4.44.0304271823240.2298-100000@peter.localdomain
In response to Re: Array access to type "name"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane writes:

> I'm not having any luck duplicating that here, but in any case what the
> above suggests to me is lack of robustness in the output conversion
> chain for type "char".  Or do you want to legislate that byte values
> corresponding to the first bytes of multibyte character sequences are
> illegal values for type "char"?  I'd have a problem with that ...

I think it comes down to defining what we really want.  Clearly, "char" is
a byte, not a character, much like in C.  Perhaps we should adopt the
bytea escape mechanism for "char" values above 127.  Otherwise, what gets
stored and what gets printed out both depend on character set conversion
issues, which seems yucky.
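
To make the escape idea concrete, here is a minimal sketch in Python, loosely modeled on the bytea escape output format. The function name escape_char_byte is hypothetical and is not part of PostgreSQL; the sketch also escapes control bytes below 32, which goes slightly beyond the "above 127" case discussed above.

```python
def escape_char_byte(value: int) -> str:
    """Render one byte roughly the way bytea escape output would (illustration only)."""
    if not 0 <= value <= 255:
        raise ValueError('a "char" holds exactly one byte')
    if value == 0x5C:
        return "\\\\"          # a backslash is doubled
    if 0x20 <= value <= 0x7E:
        return chr(value)      # printable ASCII passes through unchanged
    return "\\%03o" % value    # everything else becomes a three-digit octal escape


print(escape_char_byte(ord("a")))   # a
print(escape_char_byte(0xC3))       # \303
```

With output like this, what gets printed no longer depends on the client or server encoding, because only ASCII ever reaches the conversion routines.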

Now you can define name[x] to be the x'th *byte* of name, but that seems
contrived and inconsistent with the original purpose, because whether you
get useful or garbage values depends on the character set encoding.  If
you want to select the x'th character, use substring(); if you want access
to bytes, use bytea.  The character set encoding is an internal matter
that should not be accessible to users.
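
The difference between byte and character indexing can be illustrated outside the server. The following Python sketch assumes the name is stored as UTF-8, as it would be in a database created with -E UNICODE; it only models the indexing semantics and does not reproduce any PostgreSQL code.

```python
name = "\u00e5land"              # "åland", written with an escape to avoid editor encoding issues
raw = name.encode("utf-8")       # the bytes a UTF-8 database would store

print(len(name), len(raw))       # 5 6  (five characters, six bytes)
print(name[0])                   # å    (character-wise access)
print(hex(raw[0]))               # 0xc3 (byte-wise access: lead byte of a two-byte sequence)

# A lone lead byte is not valid UTF-8, so it cannot be converted to another encoding.
try:
    bytes([raw[0]]).decode("utf-8")
    lone_byte_is_valid = True
except UnicodeDecodeError:
    lone_byte_is_valid = False

print(lone_byte_is_valid)        # False
```

Byte-wise access returns 0xC3, which is meaningless on its own, whereas character-wise access (the substring() behaviour) returns the whole character.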

Btw., the issue is even a bit more serious than the example I posted:

$ dropdb test
$ createdb -E UNICODE test
$ psql test
=> create table åland (a int);
=> \d
ERROR:  Could not convert UTF-8 to ISO8859-1

(Latest sources.)

--
Peter Eisentraut   peter_e@gmx.net


