Re: WIP Incremental JSON Parser - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: WIP Incremental JSON Parser
Date
Msg-id 3efe8333-3285-1d9d-5ad3-0f9784d5c1c4@dunslane.net
In response to Re: WIP Incremental JSON Parser  (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses Re: WIP Incremental JSON Parser
List pgsql-hackers
On 2024-02-21 We 15:26, Jacob Champion wrote:
> On Wed, Feb 21, 2024 at 6:50 AM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> On Tue, Feb 20, 2024 at 9:32 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>> *sigh* That's weird. I wonder why you can reproduce it and I can't. Can
>>> you give me details of the build? OS, compiler, path to source, build
>>> setup etc.? Anything that might be remotely relevant.
> This construction seems suspect, in json_lex_number():
>
>>        if (lex->incremental && !lex->inc_state->is_last_chunk &&
>>                len >= lex->input_length)
>>        {
>>                appendStringInfoString(&lex->inc_state->partial_token,
>>                                                           lex->token_start);
>>                return JSON_INCOMPLETE;
>>        }
> appendStringInfoString() isn't respecting the end of the chunk: if
> there's extra data after the chunk boundary (as
> AppendIncrementalManifestData() does) then all of that will be stuck
> onto the end of the partial_token.
>
> I'm about to context-switch off of this for the day, but I can work on
> a patch tomorrow if that'd be helpful. It looks like this is not the
> only call to appendStringInfoString().
>

Yeah, the issue seems to be with chunks of JSON that are not
null-terminated. We don't require them to be, so this code was buggy.
It wasn't picked up earlier because the tests that I wrote did put a
null byte at the end. Patch 5 in this series fixes those issues and
adjusts most of the tests to add some trailing junk to the pieces of
JSON, so we can be sure that this is done right.
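
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the code from patch 5. The Buf type and the buf_* helpers are hypothetical stand-ins for a growable string buffer, not PostgreSQL's StringInfo API. The only point is the contrast between an append that scans for a NUL byte and one that is bounded by the number of bytes left in the chunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, for illustration only (not StringInfo). */
typedef struct
{
	char	   *data;
	size_t		len;
	size_t		cap;
} Buf;

static void
buf_init(Buf *b)
{
	b->cap = 64;
	b->len = 0;
	b->data = malloc(b->cap);
	if (b->data == NULL)
		abort();
	b->data[0] = '\0';
}

/* Append exactly n bytes; never reads past src + n. */
static void
buf_append_bytes(Buf *b, const char *src, size_t n)
{
	assert(src != NULL);
	if (b->len + n + 1 > b->cap)
	{
		while (b->len + n + 1 > b->cap)
			b->cap *= 2;
		b->data = realloc(b->data, b->cap);
		if (b->data == NULL)
			abort();
	}
	memcpy(b->data + b->len, src, n);
	b->len += n;
	b->data[b->len] = '\0';		/* keep the buffer itself terminated */
}

/*
 * Append up to the next NUL byte. This is the risky pattern: if the chunk
 * is not NUL-terminated, strlen() runs past the chunk boundary and picks up
 * whatever bytes follow it.
 */
static void
buf_append_cstr(Buf *b, const char *src)
{
	buf_append_bytes(b, src, strlen(src));
}

/*
 * Bounded variant: copy only the bytes between token_start and the end of
 * the chunk, where the chunk is chunk_len bytes starting at chunk.
 */
static void
buf_append_rest_of_chunk(Buf *b, const char *chunk, size_t chunk_len,
						 const char *token_start)
{
	size_t		remaining = chunk_len - (size_t) (token_start - chunk);

	buf_append_bytes(b, token_start, remaining);
}
```

With the 8-byte chunk "[1, 1234" sitting in memory as "[1, 1234JUNK" and a token starting at offset 4, buf_append_cstr() would store "1234JUNK" as the partial token, while buf_append_rest_of_chunk() stores only "1234". In PostgreSQL itself, appendBinaryStringInfo() takes an explicit length and is the natural counterpart of the bounded helper; whether the patch uses it at every call site is not something this sketch claims.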

cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

