Re: SOLUTION: Insert a Euro symbol as UTF-8 from a latin1 charset. - Mailing list pgsql-hackers

From Ian Barwick
Subject Re: SOLUTION: Insert a Euro symbol as UTF-8 from a latin1 charset.
Date
Msg-id 200306131925.57964.barwick@gmx.net
In response to SOLUTION: Insert a Euro symbol as UTF-8 from a latin1 charset.  (Roland Glenn McIntosh <roland@steeltorch.com>)
List pgsql-hackers
On Friday 13 June 2003 17:28, Roland Glenn McIntosh wrote:
> This is my solution / bug report / RFC cross-posted from [GENERAL]
> regarding insertion of hexadecimal characters from the command line.
> -----------------------------------
>
> Okay.  I have NO IDEA why this works.  If someone could enlighten me as to
> the math involved I'd appreciate it.  First, a little background:
>
> The Euro symbol is unicode value 0x20AC.  UTF-8 encoding is a way of
> representing most unicode characters in two bytes, and most latin
> characters in one byte.
>
> The only way I have found to insert a euro symbol into the database from
> the command line psql client is this: INSERT INTO mytable
> VALUES('\342\202\254');
>
> I don't know why this works.  In hex, those octal values are:
>     E2 82 AC

My apologies, I forgot to mention converting to UTF-8 in my original
reply.
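
On the math you asked about: E2 82 AC is simply the UTF-8 encoding of
U+20AC. Code points from U+0800 to U+FFFF take three bytes in UTF-8, with
the 16 bits of the code point spread over the pattern
1110xxxx 10xxxxxx 10xxxxxx. Below is a minimal sketch of that arithmetic in
Python; the function name utf8_encode_3byte is my own invention for
illustration, and it deliberately handles only the three-byte range.

```python
def utf8_encode_3byte(cp):
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes.

    Hypothetical helper for illustration only: it ignores the one-, two-
    and four-byte ranges and does not reject surrogates (U+D800..U+DFFF).
    """
    if not 0x0800 <= cp <= 0xFFFF:
        raise ValueError("only the three-byte range is handled here")
    b1 = 0xE0 | (cp >> 12)           # 1110xxxx: top 4 bits
    b2 = 0x80 | ((cp >> 6) & 0x3F)   # 10xxxxxx: middle 6 bits
    b3 = 0x80 | (cp & 0x3F)          # 10xxxxxx: low 6 bits
    return bytes([b1, b2, b3])


euro = utf8_encode_3byte(0x20AC)
hex_form = " ".join(f"{b:02X}" for b in euro)
octal_escapes = "".join(f"\\{b:03o}" for b in euro)

print(hex_form)        # E2 82 AC
print(octal_escapes)   # \342\202\254
```

The octal form printed on the last line is exactly the string you used in
your INSERT.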

> Additionally, according to the psql online documentation and man page:
> "Anything contained in single quotes is furthermore subject to C-like
> substitutions for \n (new line), \t (tab), \digits, \0digits, and \0xdigits
> (the character with the given decimal, octal, or hexadecimal code)."
>
> Those digits *should* be interpreted as decimal digits, but they aren't. 
> The man page for psql is either incorrect, or the implementation is buggy.

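The docs are easy to misread if you scan them in a hurry. That section
describes substitutions in the arguments of psql's own meta-commands
(the backslash commands), not in SQL statements. In an SQL string literal
the escapes are handled by the server, which reads \digits as octal, and
that is why your INSERT works. A meta-command example: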

\echo '\0xe2\0x82\0xac'

will display the Euro sign (assuming your terminal can print it).


Ian Barwick
barwick@gmx.net



