Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns
Date
Msg-id 53A8877F.7060208@dunslane.net
Whole thread Raw
In response to Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns
List pgsql-hackers
On 06/23/2014 11:43 AM, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> Digging more into that, I have found the issue and a fix for it. It happens
>> that populate_recordset_object_start, which is used to initialize the
>> process for the population of the record, is taken *each* time a json
>> object is found, re-creating every time the hash table for the parsing
>> process, hence removing from PopulateRecordsetState all the entries already
>> parsed and creating the problem reported by Matti. The fix I am proposing
>> to fix this issue is rather simple: simply bypass the creation of the hash
>> table if lex_level > 1 as we are in presence of a nested object and rely on
>> the existing hash table.
> Yes, this code is clearly not handling the nested-objects case correctly.
> I had written a fix more or less equivalent to yours last night.
>
> However, it seems to me that these functions (json[b]_to_record[set]) are
> handling the nested-json-objects case in a fairly brain-dead fashion to
> start with.  I would like to propose that we should think about getting
> rid of the use_json_as_text flag arguments altogether.  What purpose do
> they serve?  If we're going to the trouble of parsing the nested JSON
> objects anyway, why don't we just reconstruct from that data?
>
> (IOW, we probably actually should have nested hash tables in this case.
> I suspect that the current bug arose from incompletely-written logic
> to do it like that.)
>
> Since we've already decided to force an initdb for 9.4beta2, it's not
> quite too late to revisit this API, and I think it needs revisiting.
>
>         


Looks like we have some problems in this whole area, not just the new 
function, so we need to fix 9.3 also :-(

IIRC, originally, the intention was to disallow nested json objects, but 
the use_json_as_text was put in as a possibly less drastic possibility. 
If we get rid of it our only recourse is to error out if we encounter 
nested json. I was probably remiss in not considering the likelihood of 
a json target field.

I currently don't have lots of time to devote to this, sadly, but 
Michael's patch looks like a good minimal fix.

cheers

andrew



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: [COMMITTERS] pgsql: Do all-visible handling in lazy_vacuum_page() outside its critic
Next
From: Andreas Karlsson
Date:
Subject: Re: New functions in sslinfo module