Re: JSON for PG 9.2 - Mailing list pgsql-hackers

From: Andrew Dunstan
Subject: Re: JSON for PG 9.2
Msg-id: 4F1C3CB6.6090104@dunslane.net
In response to: Re: JSON for PG 9.2 (Andrew Dunstan <andrew@dunslane.net>)
Responses: Re: JSON for PG 9.2
List: pgsql-hackers

On 01/22/2012 04:28 AM, Andrew Dunstan wrote:
>
>
> On 01/21/2012 11:40 PM, Jeff Janes wrote:
>> On Sun, Jan 15, 2012 at 8:08 AM, Andrew Dunstan <andrew@dunslane.net>
>> wrote:
>>>
>>> On 01/14/2012 03:06 PM, Andrew Dunstan wrote:
>>>>
>>>>
>>>>
>>>> OK, here's a patch that does both query_to_json and array_to_json, 
>>>> along
>>>> with docs and regression tests. It includes Robert's original patch, 
>>>> although
>>>> I can produce a differential patch if required. It can also be 
>>>> pulled from
>>>> <https://bitbucket.org/adunstan/pgdevel>
>>>>
>>>>
>>>
>>> Here's an update that adds row_to_json, plus a bit more cleanup.
>> This is bit-rotted such that initdb fails
>>
>> creating template1 database in
>> /tmp/bar/src/test/regress/./tmp_check/data/base/1 ... FATAL:  could
>> not create unique index "pg_proc_oid_index"
>> DETAIL:  Key (oid)=(3145) is duplicated.
>>
>> I bumped up those oids in the patch, and it passes make check once I
>> figured out how to get the test run under UTF-8.  Is it supposed to
>> pass under other encodings?  I can't tell from the rest of the thread
>> whether it's supposed to pass in other encodings or not.
>>
>
> Yeah, regression tests generally are supposed to run in all encodings. 
> Either we could knock out the offending test, or we could supply an 
> alternative result file. If we do the latter, maybe we should modify 
> the query slightly, so it reads
>
>    SELECT getdatabaseencoding() = 'UTF8' as is_utf8, '"\uaBcD"'::json;
>
>
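
(On the oid collision: the duplicate_oids and unused_oids scripts in 
src/include/catalog are the quick way to spot clashes like that and to 
find a free range for a patch.)

For the encoding question, pg_regress will accept any of 
expected/json.out, expected/json_1.out, etc. as a match for sql/json.sql, 
so the alternative result file route just means committing one expected 
file per behaviour. A sketch of what that test would look like, with the 
extra column making it obvious which variant applies - the output shown 
is my guess at how the json type would echo the escape, not something 
from this thread:

    -- needs two expected files, e.g. expected/json.out and expected/json_1.out
    SELECT getdatabaseencoding() = 'UTF8' as is_utf8, '"\uaBcD"'::json;
    -- UTF8 database:    is_utf8 = t, and the json value comes back as "\uaBcD"
    -- other encodings:  is_utf8 = f, and the \u conversion may error instead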

Actually, given recent discussion I think that test should just be 
removed from json.c. We don't actually have any check that the code point 
is valid (e.g. that it doesn't refer to an unallocated code point). We 
don't do that elsewhere either - the unicode_to_utf8() function the 
scanner uses to turn \unnnn escapes into UTF-8 doesn't look for 
unallocated code points. I'm not sure how much other validation we 
should do - for example, on the correct use of surrogate pairs. I'd 
rather get this as right as possible now: every time we tighten encoding 
rules to make sure incorrectly encoded data doesn't get into the 
database, it causes someone real pain.
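
To be concrete about the surrogate pair case, here's a minimal sketch 
(mine, not code from the patch) of the kind of check that's missing - 
combining the 16-bit values from one or two \unnnn escapes into a single 
code point and rejecting malformed pairs:

    #include <stdint.h>
    #include <stdbool.h>

    /*
     * Resolve the value(s) from \unnnn escapes into one code point.
     * "hi" is the first escape; "lo" is the following escape and is
     * only consumed when "hi" is a high surrogate.  Returns false on
     * a lone or mismatched surrogate.
     */
    static bool
    resolve_unicode_escape(uint32_t hi, uint32_t lo, uint32_t *cp)
    {
        if (hi >= 0xD800 && hi <= 0xDBFF)
        {
            /* high surrogate must be followed by a low surrogate */
            if (lo < 0xDC00 || lo > 0xDFFF)
                return false;
            *cp = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
            return true;
        }
        if (hi >= 0xDC00 && hi <= 0xDFFF)
            return false;       /* lone low surrogate */
        *cp = hi;               /* ordinary BMP code point */
        return true;
    }

That still says nothing about unallocated code points, which move with 
every Unicode release anyway.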

cheers

andrew



