Thread: invalid byte sequence for encoding "UTF8": 0xab
I am having a vexing problem with a script I am writing to populate reference tables in a new database.
I am running PostgreSQL 8.3 with psql 8.3.7.
Psql reads this SQL statement:
INSERT INTO META_AUTH.DOMAIN_META_ASSERTION (TITLE, DESCRIPTION, META_ASSERTION)
VALUES ('Super-User Authorization',
'This allows a super-user to administer all meta-data.',
'UserID «Administer» ()');
and I get this message:
ERROR: invalid byte sequence for encoding "UTF8": 0xab
HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
It is complaining about the ‘«’ character. I do not understand why. The database is created with the commands
CREATE DATABASE mayyou
WITH OWNER=meta_auth ENCODING='UTF8';
ALTER DATABASE mayyou SET client_encoding = 'UTF8';
When I give psql the \encoding command, it replies
UTF8
Why is it complaining about this valid character code?
This e-mail message (including any attachments) is for the sole use of
the intended recipient(s) and may contain confidential and privileged
information. If the reader of this message is not the intended
recipient, you are hereby notified that any dissemination, distribution
or copying of this message (including any attachments) is strictly
prohibited.
If you have received this message in error, please contact
the sender by reply e-mail message and destroy all copies of the
original message (including attachments).
"Grand, Mark D." <mgrand@emory.edu> writes:
> ... I get this message:
> ERROR: invalid byte sequence for encoding "UTF8": 0xab
> HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
> It is complaining about the '<' character. I do not understand why.

The ASCII code for '<' is 0x3c, not 0xab. I am not sure what you are actually typing; although it's suggestive that the LATIN1 code 0xab corresponds to a symbol that looks approximately like '<<'. The most likely bet is that you are typing the wrong thing and using a terminal emulator that is not set to generate UTF8-encoded characters. You should try to make sure that client_encoding is set to match what your keyboard actually generates.

regards, tom lane
On Fri, Jun 5, 2009 at 9:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The ASCII code for '<' is 0x3c, not 0xab. I am not sure what you are
> actually typing; although it's suggestive that the LATIN1 code 0xab
> corresponds to a symbol that looks approximately like '<<'. The most
> likely bet is that you are typing the wrong thing and using a terminal

Must be something with your mail program, because in the version I am reading postgres is complaining about the "approximately like '<<'" symbol.
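To make the 0xab point concrete, here is a minimal Python sketch (Python is not part of the original thread; it is used only because it shows the bytes without needing a server). It assumes the character in the INSERT is U+00AB, the left guillemet, and compares its LATIN1 and UTF-8 encodings. The variable names are illustrative.

```python
# U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK, assumed to be the
# character in the INSERT statement.
GUILLEMET = "\u00ab"

latin1_bytes = GUILLEMET.encode("latin-1")  # one byte
utf8_bytes = GUILLEMET.encode("utf-8")      # two bytes

print(latin1_bytes.hex())  # ab
print(utf8_bytes.hex())    # c2ab

# A lone 0xab is a UTF-8 continuation byte with no lead byte before it,
# so a strict UTF-8 decoder rejects it, much as the server did.
try:
    latin1_bytes.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False

print(lone_byte_is_valid_utf8)  # False
```

So a bare 0xab reaching the server suggests the statement was saved or typed as LATIN1 (or a similar single-byte encoding) while client_encoding said UTF8.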
Mark D. Grand wrote:
> I am having a vexing problem with a script I am writing to
> populate reference tables in a new database.
>
> I am running postgreSQL 8.3 with psql 8.3.7.
>
> Psql reads this SQL statement:
>
> INSERT INTO META_AUTH.DOMAIN_META_ASSERTION (TITLE, DESCRIPTION, META_ASSERTION)
> VALUES ('Super-User Authorization',
> 'This allows a super-user to administer all meta-data.',
> 'UserID «Administer» ()');
>
> and I get this message:
>
> ERROR: invalid byte sequence for encoding "UTF8": 0xab
>
> HINT: This error can also happen if the byte sequence does
> not match the encoding expected by the server, which is
> controlled by "client_encoding".
>
> It is complaining about the '«' character. I do not
> understand why. The database is created the commands
>
> CREATE DATABASE mayyou
> WITH OWNER=meta_auth ENCODING='UTF8';
>
> ALTER DATABASE mayyou SET client_encoding = 'UTF8';
>
> When I give psql the \encoding command, it replies
> UTF8
>
> Why is it complaining about this valid character code?

The database stores characters in UTF-8, and the client expects UTF-8 characters, but presumably the characters you feed into psql are not UTF-8.

If this is some kind of UNIX, it might be instructive to type 'echo "«" | od -t x1' on the command line. Also knowing the current locale might help to determine the problem.

Yours, Laurenz Albe
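In the same spirit as the od check, the bytes of the script itself can be inspected. The following is a rough Python sketch, not something from the thread: the helper name first_invalid_utf8 is hypothetical, and the sample statement is a shortened stand-in for the real INSERT, encoded both ways to imitate an editor saving as LATIN1 versus UTF-8.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


# Shortened stand-in for the statement, as an editor might save it in
# LATIN1 versus UTF-8 (an assumption about what happened here).
statement = "VALUES ('UserID \u00abAdminister\u00bb ()');"
as_latin1 = statement.encode("latin-1")
as_utf8 = statement.encode("utf-8")

print(first_invalid_utf8(as_latin1))  # (16, 171), and 171 == 0xab
print(first_invalid_utf8(as_utf8))    # None
```

For a real script, the bytes would come from reading the file in binary mode, for example open(path, "rb").read(), where path is whatever file psql is being fed.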
It turns out that my problem was that the editor I was using (emacs) does not properly support utf8 encoding.
"Grand, Mark D." <mgrand@emory.edu> writes:
> It turns out that my problem was that the editor I was using (emacs)
> does not properly support utf8 encoding.

Emacs does support utf8 properly.
http://www.emacswiki.org/emacs/ChangingEncodings

It could be I'm biased because I use emacs from CVS, which is going to be emacs23, and is as stable as emacs has always been for me.
http://emacs.orebokech.com/
http://atomized.org/wp-content/cocoa-emacs-nightly/

From within emacs, to get a ton of information about char under point, try C-x = (one line version) or M-x describe-char (full version):

< Char: < (60, #o74, #x3c) point=1312 of 4162 (31%) <301-4163> column=66
  character: < (60, #o74, #x3c)
  preferred charset: ascii (ASCII (ISO646 IRV))
  code point: 0x3C
  syntax: . which means: punctuation
  category: .:Base, a:ASCII, l:Latin, r:Roman
  buffer code: #x3C
  file code: #x3C (encoded by coding system utf-8-emacs)
  display: by this font (glyph code)
    xft:-bitstream-Bitstream Vera Sans Mono-normal-normal-normal-*-16-*-*-*-m-0-iso10646-1 (#x1F)

But I guess we're off topic now.

HTH, regards,
--
dim