Re: json api WIP patch - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: json api WIP patch
Msg-id 50F4D4E4.70604@dunslane.net
In response to Re: json api WIP patch  (Andrew Dunstan <andrew@dunslane.net>)
On 01/14/2013 12:52 PM, Andrew Dunstan wrote:
>
> On 01/14/2013 11:32 AM, Robert Haas wrote:
>>
>> So, how much performance does this lose on json_in() on a large
>> cstring, as compared with master?
>
> That's a good question. I'll try to devise a test.
>
>>
>> I can't shake the feeling that this is adding a LOT of unnecessary
>> data copying.  For one thing, instead of copying every single lexeme
>> (including the single-character ones?) out of the original object, we
>> could just store a pointer to the offset where the object starts and a
>> length, instead of copying it.
>
> In the pure parse case (json_in, json_recv) nothing extra should be
> copied. On checking this after reading the above I found that wasn't
> quite the case, and some lexemes (scalars and field names, but not
> punctuation) were being copied when not needed. I have made a fix (see
> <https://bitbucket.org/adunstan/pgdevel/commits/139043dba7e6b15f1f9f7675732bd9dae1fb6497>)
> which I will include in the next version I publish.
>
> In the case of string lexemes, we are passing back a de-escaped
> version, so just handing back pointers to the beginning and end in the
> input string doesn't work.
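
To make the distinction in the quoted exchange concrete, here is a minimal sketch in C. It is not code from the patch: the names LexemeRef, lexeme_ref and deescape_string are hypothetical, it uses plain malloc rather than PostgreSQL's memory contexts, and it handles only the single-character escapes (no \uXXXX). It shows that a field name or other scalar can be described by a pointer and a length into the input, while a de-escaped string needs its own buffer because its bytes differ from the input's.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a lexeme described as a view into the input buffer. */
typedef struct LexemeRef
{
    const char *start;          /* first byte of the lexeme in the input */
    size_t      len;            /* number of bytes in the lexeme */
} LexemeRef;

/* Return a view of input[off .. off+len). Nothing is copied. */
static LexemeRef
lexeme_ref(const char *input, size_t off, size_t len)
{
    LexemeRef   r;

    assert(input != NULL);
    r.start = input + off;
    r.len = len;
    return r;
}

/*
 * De-escape the body of a JSON string (the bytes between the quotes).
 * Only the one-character escapes are handled in this sketch.
 * Returns a malloc'd, NUL-terminated buffer that the caller must free,
 * or NULL on allocation failure or an unsupported escape.
 */
static char *
deescape_string(LexemeRef body)
{
    char       *out = malloc(body.len + 1);   /* output is never longer */
    size_t      i;
    size_t      o = 0;

    if (out == NULL)
        return NULL;

    for (i = 0; i < body.len; i++)
    {
        char        c = body.start[i];

        if (c == '\\')
        {
            /* A backslash must be followed by an escape character. */
            if (++i >= body.len)
            {
                free(out);
                return NULL;
            }
            switch (body.start[i])
            {
                case 'n':  c = '\n'; break;
                case 't':  c = '\t'; break;
                case 'r':  c = '\r'; break;
                case 'b':  c = '\b'; break;
                case 'f':  c = '\f'; break;
                case '"':
                case '\\':
                case '/':
                    c = body.start[i];
                    break;
                default:
                    free(out);
                    return NULL;
            }
        }
        out[o++] = c;
    }
    out[o] = '\0';
    return out;
}
```

With an input such as {"k":"a\nb"}, the field name k can be returned as a LexemeRef with no copy, but the string value has to go through something like deescape_string, since the two input bytes backslash and n become a single newline byte in the result.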


After a couple of iterations, some performance enhancements to the json
parser and lexer have ended up with a net performance improvement over
git tip. On our test rig, the json parse test runs at just over 13s per
10000 parses on git tip and approx 12.55s per 10000 parses with the
attached patch.

Truth be told, I think the lexer changes have more than paid for the
small cost of the switch to an RD parser. But since the result is a net
performance win PLUS some enhanced functionality, I think we should be
all good.

cheers

andrew
