On 01/22/2012 04:28 AM, Andrew Dunstan wrote:
>
>
> On 01/21/2012 11:40 PM, Jeff Janes wrote:
>> On Sun, Jan 15, 2012 at 8:08 AM, Andrew Dunstan<andrew@dunslane.net>
>> wrote:
>>>
>>> On 01/14/2012 03:06 PM, Andrew Dunstan wrote:
>>>>
>>>>
>>>>
>>>> OK, here's a patch that does both query_to_json and array_to_json,
>>>> along
>>>> with docs and regression tests. It includes Robert's original patch,
>>>> although
>>>> I can produce a differential patch if required. It can also be
>>>> pulled from
>>>> <https://bitbucket.org/adunstan/pgdevel>
>>>>
>>>>
>>>
>>> Here's an update that adds row_to_json, plus a bit more cleanup.
>> This is bit-rotted such that initdb fails
>>
>> creating template1 database in
>> /tmp/bar/src/test/regress/./tmp_check/data/base/1 ... FATAL: could
>> not create unique index "pg_proc_oid_index"
>> DETAIL: Key (oid)=(3145) is duplicated.
>>
>> I bumped up those oids in the patch, and it passes make check now that
>> I have figured out how to get the tests to run under UTF-8. Is it
>> supposed to pass under other encodings? I can't tell from the rest of
>> the thread whether it is supposed to pass in other encodings or not.
>>
>
> Yeah, regression tests generally are supposed to run in all encodings.
> Either we could knock out the offending test, or we could supply an
> alternative result file. If we do the latter, maybe we should modify
> the query slightly, so it reads
>
> SELECT getdatabaseencoding() = 'UTF8' as is_utf8, '"\uaBcD"'::json;
>
>
Actually, given recent discussion I think that test should just be
removed from json.c. We don't actually have any test that the code point
is valid (e.g. that it doesn't refer to an unallocated code point). We
don't do that elsewhere either - the unicode_to_utf8() function the
scanner uses to turn \unnnn escapes into utf8 doesn't look for
unallocated code points. I'm not sure how much other validation we
should do - for example on correct use of surrogate pairs. I'd rather
get this as right as possible now - every time we tighten encoding rules
to make sure incorrectly encoded data doesn't get into the database it
causes someone real pain.
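
To make the surrogate-pair question concrete, here is a rough sketch of the kind of checking I have in mind. It is only an illustration: the function names (is_high_surrogate, is_low_surrogate, combine_surrogates, encode_utf8) are made up for this mail and are not what json.c or unicode_to_utf8() actually look like. Like the existing code, it does not try to detect unallocated code points; it only rejects lone surrogates and values above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not PostgreSQL's API. */

/* True if u is the first (high) half of a UTF-16 surrogate pair. */
static bool is_high_surrogate(uint32_t u)
{
    return u >= 0xD800 && u <= 0xDBFF;
}

/* True if u is the second (low) half of a UTF-16 surrogate pair. */
static bool is_low_surrogate(uint32_t u)
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

/* Combine a valid high/low pair (as from "\uD83D\uDE00") into one code point. */
static uint32_t combine_surrogates(uint32_t hi, uint32_t lo)
{
    assert(is_high_surrogate(hi) && is_low_surrogate(lo));
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}

/*
 * Encode a code point as UTF-8 into out[] and return the byte count.
 * Returns 0 for a lone surrogate or a value above U+10FFFF.
 * No check is made for unallocated code points.
 */
static size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;
    if (cp < 0x80)
    {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (cp >> 18));
    out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}
```

A caller that sees a high surrogate would require the next escape to be a low surrogate, combine the two, and only then encode; anything else (a lone half, or halves in the wrong order) would be an error. Whether we want to be that strict is exactly the open question.
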
cheers
andrew