
From: Robert Haas
Subject: Re: JSON for PG 9.2
Date:
Msg-id: CA+TgmoYqhmMJtNp_w0MuFA7kL1J1S9EQDx7bZex8ZX+=SJXEQA@mail.gmail.com
In response to: Re: JSON for PG 9.2 (Andrew Dunstan <andrew@dunslane.net>)
Responses: Re: JSON for PG 9.2 (Andrew Dunstan <andrew@dunslane.net>)
List: pgsql-hackers
On Sat, Jan 14, 2012 at 3:06 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Second, what should we do when the database encoding isn't UTF8? I'm
> inclined to emit a \unnnn escape for any non-ASCII character (assuming it
> has a unicode code point - are there any code points in the non-unicode
> encodings that don't have unicode equivalents?). The alternative would be to
> fail on non-ASCII characters, which might be ugly. Of course, anyone wanting
> to deal with JSON should be using UTF8 anyway, but we still have to deal
> with these things. What about SQL_ASCII? If there's a non-ASCII sequence
> there we really have no way of telling what it should be. There at least I
> think we should probably error out.

I don't see any reason to escape anything more than the minimum
required by the spec, which only requires it for control characters.
If somebody's got a non-ASCII character in there, we can simply allow
it to be represented by itself.  That's almost certainly more compact
(and very possibly more readable) than emitting \uXXXX for each such
instance, and it also matches what the current EXPLAIN (FORMAT JSON)
output does.
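
For illustration, here is a minimal sketch of that policy (illustrative
only, not the actual backend escaping code): only the quote, the
backslash, and control characters below 0x20 get backslash escapes, and
every other byte, non-ASCII included, is copied through as-is.

#include <stdio.h>

/*
 * Illustrative sketch only: escape just what the JSON spec forces us to
 * escape, and pass every other byte -- including non-ASCII ones --
 * through verbatim.
 */
static void
escape_json_minimal(const char *in, FILE *out)
{
    const unsigned char *p;

    fputc('"', out);
    for (p = (const unsigned char *) in; *p; p++)
    {
        switch (*p)
        {
            case '"':
                fputs("\\\"", out);
                break;
            case '\\':
                fputs("\\\\", out);
                break;
            case '\b':
                fputs("\\b", out);
                break;
            case '\f':
                fputs("\\f", out);
                break;
            case '\n':
                fputs("\\n", out);
                break;
            case '\r':
                fputs("\\r", out);
                break;
            case '\t':
                fputs("\\t", out);
                break;
            default:
                if (*p < 0x20)
                    fprintf(out, "\\u%04x", *p);    /* other control chars */
                else
                    fputc(*p, out);                 /* ASCII and non-ASCII alike */
                break;
        }
    }
    fputc('"', out);
}

int
main(void)
{
    /* a non-ASCII character comes out as itself, not as \u00e9 */
    escape_json_minimal("caf\xc3\xa9 and \"quotes\"", stdout);
    fputc('\n', stdout);
    return 0;
}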

In other words, let's decree that when the database encoding isn't
UTF-8, *escaping* of non-ASCII characters doesn't work.  But
*unescaped* non-ASCII characters should still work just fine.
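
On the input side, one way to read that decree looks like the sketch
below (the helper name and error handling are made up for illustration,
not the real parser API): a \uXXXX escape for a code point above 0x7F
has no reliable mapping outside UTF-8, so it would simply be rejected,
while raw non-ASCII bytes flow through untouched.

/*
 * Illustrative sketch only -- hypothetical helper, not the real backend
 * API.  Decides what a \uXXXX escape turns into under the rule above.
 * Returns the number of bytes written to "out", or -1 to signal "reject
 * this escape" (the caller would raise an error).
 */
int
json_unescape_codepoint(unsigned int cp, int server_encoding_is_utf8,
                        char *out)
{
    if (cp <= 0x7F)
    {
        /* plain ASCII is safe in every server encoding */
        out[0] = (char) cp;
        return 1;
    }

    if (!server_encoding_is_utf8)
    {
        /*
         * There is no reliable way to map an arbitrary code point into a
         * non-UTF-8 server encoding, so refuse the escape; raw non-ASCII
         * bytes in the input are passed through instead.
         */
        return -1;
    }

    /* UTF-8 server encoding: emit 2 or 3 bytes (BMP only, for brevity) */
    if (cp <= 0x7FF)
    {
        out[0] = (char) (0xC0 | (cp >> 6));
        out[1] = (char) (0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char) (0xE0 | (cp >> 12));
    out[1] = (char) (0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char) (0x80 | (cp & 0x3F));
    return 3;
}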

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: controlling the location of server-side SSL files
Next
From: Alexander Korotkov
Date:
Subject: Re: WIP: index support for regexp search