Re: JSON and unicode surrogate pairs - Mailing list pgsql-hackers

From Tom Lane
Subject Re: JSON and unicode surrogate pairs
Date
Msg-id 21439.1370873888@sss.pgh.pa.us
In response to Re: JSON and unicode surrogate pairs  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: JSON and unicode surrogate pairs  (Andrew Dunstan <andrew@dunslane.net>)
Re: JSON and unicode surrogate pairs  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> After thinking about this some more I have come to the conclusion that 
> we should only do any de-escaping of \uxxxx sequences, whether or not 
> they are for BMP characters, when the server encoding is utf8. For any 
> other encoding, which is already a violation of the JSON standard 
> anyway, and should be avoided if you're dealing with JSON, we should 
> just pass them through even in text output. This will be a simple and 
> very localized fix.

Hmm.  I'm not sure that users will like this definition --- it will seem
pretty arbitrary to them that conversion of \u sequences happens in some
databases and not others.
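
To make the proposed rule concrete, here is a rough Python sketch of what
"de-escape only when the server encoding is utf8" could look like for the
body of a JSON string. It is an illustration only, not PostgreSQL's C code:
the function name, the encoding spellings it accepts, and the choice to
raise an error on an unpaired surrogate are all assumptions made for the
example.

```python
import re

# Alternatives are tried in order at each backslash:
#   1. a high surrogate escape immediately followed by a low surrogate escape
#   2. any single \uXXXX escape
#   3. any other backslash escape (e.g. \\ or \n), matched only so that an
#      escaped backslash followed by "u0041" is not misread as \u0041
_ESCAPE = re.compile(
    r'\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})'
    r'|\\u([0-9a-fA-F]{4})'
    r'|\\(.)',
    re.DOTALL,
)


def de_escape_unicode(body: str, server_encoding: str) -> str:
    """Hypothetical helper: expand \\uXXXX escapes only on a UTF8 server.

    `body` is the text between the quotes of a JSON string literal.
    For any other server encoding the escapes are passed through untouched.
    """
    if server_encoding.upper() not in ("UTF8", "UTF-8"):
        return body  # pass-through case from the proposal

    def replace(m: re.Match) -> str:
        high, low, single = m.group(1), m.group(2), m.group(3)
        if high is not None:
            # Combine the surrogate pair into one code point above the BMP.
            hi = int(high, 16) - 0xD800
            lo = int(low, 16) - 0xDC00
            return chr(0x10000 + (hi << 10) + lo)
        if single is not None:
            cp = int(single, 16)
            if 0xD800 <= cp <= 0xDFFF:
                # Assumption: treat an unpaired surrogate as an error.
                raise ValueError("unpaired surrogate: " + m.group(0))
            return chr(cp)
        return m.group(0)  # leave non-\u escapes exactly as written

    return _ESCAPE.sub(replace, body)
```

With this sketch, `de_escape_unicode(r'caf\u00e9', 'UTF8')` yields the
four-character string "café", while the same call with 'LATIN1' returns the
input unchanged. That difference between databases is exactly the behavior
being debated here.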

> We'll still have to deal with this issue when we get to binary storage 
> of JSON, but that's not something we need to confront today.

Well, if we have to break backwards compatibility when we try to do
binary storage, we're not going to be happy either.  So I think we'd
better have a plan in mind for what will happen then.
        regards, tom lane


