Re: [rfc] unicode escapes for extended strings - Mailing list pgsql-hackers

From Marko Kreen
Subject Re: [rfc] unicode escapes for extended strings
Date
Msg-id e51f66da0904171455j11edf3bj46eef6da2279a3a7@mail.gmail.com
In response to Re: [rfc] unicode escapes for extended strings  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 4/18/09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>  > Andrew Dunstan <andrew@dunslane.net> wrote:
>  >> ISTM that one of the uses of this is to say "store the character
>  >> that corresponds to this Unicode code point in whatever the database
>  >> encoding is"
>
>  > I would think you're right.  As long as the given character is in the
>  > user's character set, we should allow it.  Presumably we've already
>  > confirmed that they have an encoding scheme which allows them to store
>  > everything in their character set.
>
>
> This is a good way to get your patch rejected altogether.  The lexer
>  is *not* allowed to invoke any database operations (such as
>  pg_conversion lookups) so it cannot perform arbitrary encoding
>  conversions.

OK.  I was just thinking that if such a conversion could be provided
easily, it should be done.  But if not, there is no need to make things
complex.

It seems the proper way to look at it is that Unicode escapes have a
straightforward meaning only in the UTF8 encoding.  So it should be
fine to limit them to ASCII in other encodings.
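
To make that rule concrete, here is a rough sketch in Python (not the
actual lexer code, which would be C).  The function name, the encoding
strings and the error messages are all made up for illustration, and
rejecting code point zero is my assumption, not something decided in
this thread.  It needs only arithmetic on the code point, so no
pg_conversion lookup is involved.

```python
def expand_unicode_escape(codepoint: int, server_encoding: str) -> bytes:
    """Return the bytes a \\uXXXX escape would contribute to a string literal.

    Hypothetical helper: illustrates the proposed rule only.
    """
    # Reject values that are not valid Unicode scalar values.
    # Rejecting zero is an assumption made for this sketch.
    if not 0 < codepoint <= 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
        raise ValueError("invalid Unicode code point: %#x" % codepoint)

    # ASCII is allowed under the rule regardless of the server encoding,
    # and is emitted as a single byte with the same value.
    if codepoint <= 0x7F:
        return bytes([codepoint])

    # Beyond ASCII, only UTF8 has a direct mapping that needs no lookup.
    if server_encoding.upper() in ("UTF8", "UTF-8"):
        return chr(codepoint).encode("utf-8")

    # Any other encoding would need a real conversion, which the lexer
    # must not perform, so the escape is refused.
    raise ValueError(
        "Unicode escapes above 007F require UTF8, not %s" % server_encoding
    )
```

So expand_unicode_escape(0x41, "LATIN1") gives b"A", while
expand_unicode_escape(0xE9, "LATIN1") raises an error even though LATIN1
can represent that character; that is the price of keeping the lexer
free of conversions.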

>  If this sort of facility is what you want, the previously suggested
>  approach via a decode-like runtime function is a better fit.

I'm a UTF8-only kind of guy, so people who actually have experience
using other encodings will have to comment on that one.

-- 
marko

