Re: JSON for PG 9.2 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: JSON for PG 9.2
Date
Msg-id CA+Tgmoa9GLx06S5KiG7YgF_T+3QkY+Dfq9RF2g79STM=LEn1_A@mail.gmail.com
In response to Re: JSON for PG 9.2  ("David E. Wheeler" <david@kineticode.com>)
Responses Re: JSON for PG 9.2
List pgsql-hackers
On Fri, Jan 20, 2012 at 12:14 PM, David E. Wheeler <david@kineticode.com> wrote:
> On Jan 20, 2012, at 8:58 AM, Robert Haas wrote:
>
>> If, however,
>> we're not using UTF-8, we have to first turn \uXXXX into a Unicode
>> code point, then convert that to a character in the database encoding,
>> and then test for equality with the other character after that.  I'm
>> not sure whether that's possible in general, how to do it, or how
>> efficient it is.  Can you or anyone shed any light on that topic?
>
> If it’s like the XML example, it should always represent a Unicode code point, and *not* be converted to the other character set, no?

Well, you can pick which way you want to do the conversion.  If the
database encoding is SJIS, and there's an SJIS character in a string
that gets passed to json_in(), and there's another string which also
gets passed to json_in() which contains \uXXXX, then any sort of
canonicalization or equality testing is going to need to convert the
SJIS character to a Unicode code point, or the Unicode code point to
an SJIS character, to see whether they match.
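To make the two directions concrete, here is a minimal Python sketch (not PostgreSQL code).  It assumes Python's built-in shift_jis codec as a stand-in for the database encoding and handles only a single \uXXXX escape, with no surrogate pairs.  The function names are made up for illustration.

```python
import string

def escape_to_code_point(escape: str) -> int:
    """Parse a single JSON escape of the form \\uXXXX into a code point."""
    digits = escape[2:]
    if (len(escape) != 6 or not escape.startswith("\\u")
            or any(c not in string.hexdigits for c in digits)):
        raise ValueError(f"not a \\uXXXX escape: {escape!r}")
    return int(digits, 16)

def sjis_to_code_point(raw: bytes) -> int:
    """Direction 1: Shift_JIS bytes for one character -> Unicode code point."""
    text = raw.decode("shift_jis")
    if len(text) != 1:
        raise ValueError("expected exactly one character")
    return ord(text)

def code_point_to_sjis(code_point: int) -> bytes:
    """Direction 2: Unicode code point -> Shift_JIS bytes."""
    return chr(code_point).encode("shift_jis")

# HIRAGANA LETTER A, once as Shift_JIS bytes and once as a JSON escape.
sjis_char = "あ".encode("shift_jis")
escape = "\\u3042"

# Either conversion direction gives the same answer for this pair.
same_via_unicode = sjis_to_code_point(sjis_char) == escape_to_code_point(escape)
same_via_sjis = code_point_to_sjis(escape_to_code_point(escape)) == sjis_char
```

Both comparisons come out true here.  Which direction is cheaper or safer in the backend is a separate question that this sketch does not answer.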

Err, actually, now that I think about it, that might be a problem:
what happens if we're trying to test two characters for equality and
the encoding conversion fails?  We really just want to return false -
the strings are clearly not equal if either contains even one
character that can't be converted to the other encoding - so it's not
good if an error gets thrown in there anywhere.
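As a rough illustration of the "return false instead of erroring" behavior, here is a small Python sketch.  It is only an analogy: Python's shift_jis codec stands in for the database encoding, the function name is hypothetical, and the escape is assumed to be a well-formed single \uXXXX with no surrogate pairs.

```python
def escape_matches_sjis(escape: str, sjis_char: bytes) -> bool:
    """Compare a \\uXXXX escape with one Shift_JIS-encoded character."""
    code_point = int(escape[2:], 16)  # assumes a well-formed \uXXXX escape
    try:
        converted = chr(code_point).encode("shift_jis")
    except UnicodeEncodeError:
        # The code point has no Shift_JIS representation, so it cannot be
        # equal to any Shift_JIS character: report "not equal", not an error.
        return False
    return converted == sjis_char

hiragana_a = "あ".encode("shift_jis")

matches = escape_matches_sjis("\\u3042", hiragana_a)       # same character
differs = escape_matches_sjis("\\u3044", hiragana_a)       # a different kana
unconvertible = escape_matches_sjis("\\u0e01", hiragana_a) # Thai, not in SJIS
```

The point is only that the conversion failure is caught at the comparison site and folded into the boolean result, rather than propagating out of the equality test.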

> At any rate, since the JSON standard requires UTF-8, such distinctions having to do with alternate encodings are not likely to be covered, so I suspect we can do whatever we want here. It’s outside the spec.

I agree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

