
From: Robert Haas
Subject: Re: WIP Incremental JSON Parser
Date:
Msg-id: CA+TgmoZYB2xDi76w5EnuZHYtpyEEoq_Kt5c0rXPZ2XZ-MYhtxw@mail.gmail.com
In response to: Re: WIP Incremental JSON Parser (Andrew Dunstan <andrew@dunslane.net>)
List: pgsql-hackers
On Wed, Jan 3, 2024 at 9:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> Say we have a document with an array of 1m objects, each with a field
> called "color". As it stands we'll allocate space for that field name 1m
> times. Using a hash table we'd allocate space for it once. And
> allocating the memory isn't free, although it might be cheaper than
> doing hash lookups.
>
> I guess we can benchmark it and see what the performance impact of using
> a hash table might be.
>
> Another possibility would be simply to have the callback free the field
> name after use. For the parse_manifest code that could be a one-line
> addition to the code at the bottom of json_object_manifest_field_start().

Yeah. So I'm arguing that allocating the memory each time and then
freeing it sounds cheaper than looking it up in the hash table every
time, discovering it's there, and thus skipping the allocate/free.
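
For concreteness, the kind of change being described might look roughly
like this (a sketch only: the callback name comes from your message, but
the signature and body are simplified stand-ins, not the actual
parse_manifest.c source):

/*
 * Object-field-start callback, simplified.  Assume the parser hands us
 * a freshly allocated copy of the field name; once we've matched it
 * against the fields we care about, nothing else needs it.
 */
static void
json_object_manifest_field_start(void *state, char *fname, bool isnull)
{
    /* ... existing code that matches fname and updates state ... */

    /* the proposed one-line addition: release the field name copy */
    pfree(fname);
}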

I might be wrong about that. It's just that allocating and freeing a
small chunk of memory should boil down to popping it off of a linked
list and then pushing it back on. And that sounds cheaper than hashing
the string and looking for it in a hash bucket.
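
To make that intuition concrete, here is a minimal freelist sketch in
plain C (illustrative only, not PostgreSQL's actual memory-context
internals): the fast path of both alloc and free is a couple of pointer
operations.

#include <stdlib.h>

#define CHUNK_SIZE 64           /* plenty for a short field name */

typedef struct FreeChunk
{
    struct FreeChunk *next;
} FreeChunk;

static FreeChunk *freelist = NULL;

/* pop a chunk off the freelist; fall back to malloc when empty */
static void *
chunk_alloc(void)
{
    FreeChunk *c = freelist;

    if (c != NULL)
        freelist = c->next;     /* pop: one load, one store */
    else
        c = malloc(CHUNK_SIZE); /* slow path, taken rarely */
    return c;
}

/* push the chunk back on; no hashing, no search */
static void
chunk_free(void *p)
{
    FreeChunk *c = p;

    c->next = freelist;
    freelist = c;
}

By contrast, looking "color" up in a hash table means hashing all six
bytes, indexing a bucket, and strcmp'ing the stored key, every time,
just to discover the name is already there.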

--
Robert Haas
EDB: http://www.enterprisedb.com


