
From: Robert Haas
Subject: Re: WIP Incremental JSON Parser
Date:
Msg-id: CA+TgmoZYB2xDi76w5EnuZHYtpyEEoq_Kt5c0rXPZ2XZ-MYhtxw@mail.gmail.com
In response to: Re: WIP Incremental JSON Parser (Andrew Dunstan <andrew@dunslane.net>)
List: pgsql-hackers
On Wed, Jan 3, 2024 at 9:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> Say we have a document with an array of 1m objects, each with a field
> called "color". As it stands we'll allocate space for that field name 1m
> times. Using a hash table we'd allocate space for it once. And
> allocating the memory isn't free, although it might be cheaper than
> doing hash lookups.
>
> I guess we can benchmark it and see what the performance impact of using
> a hash table might be.
>
> Another possibility would be simply to have the callback free the field
> name after use. For the parse_manifest code that could be a one-line
> addition to the code at the bottom of json_object_manifest_field_start().

Yeah. So I'm arguing that allocating the memory each time and then
freeing it sounds cheaper than looking it up in the hash table every
time, discovering it's there, and thus skipping the allocate/free.
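
For concreteness, the kind of change being described might look roughly
like this (a sketch only: the callback name comes from your message, but
the signature and body are simplified stand-ins, not the actual
parse_manifest.c source):

/*
 * Object-field-start callback, simplified.  Assume the parser hands us
 * a freshly allocated copy of the field name; once we've matched it
 * against the fields we care about, nothing else needs it.
 */
static void
json_object_manifest_field_start(void *state, char *fname, bool isnull)
{
    /* ... existing code that matches fname and updates state ... */

    /* the proposed one-line addition: release the field name copy */
    pfree(fname);
}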

I might be wrong about that. It's just that allocating and freeing a
small chunk of memory should boil down to popping it off of a linked
list and then pushing it back on. And that sounds cheaper than hashing
the string and looking for it in a hash bucket.
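
To make that intuition concrete, here is a minimal freelist sketch in
plain C (illustrative only, not PostgreSQL's actual memory-context
internals): the fast path of both alloc and free is a couple of pointer
operations.

#include <stdlib.h>

#define CHUNK_SIZE 64           /* plenty for a short field name */

typedef struct FreeChunk
{
    struct FreeChunk *next;
} FreeChunk;

static FreeChunk *freelist = NULL;

/* pop a chunk off the freelist; fall back to malloc when empty */
static void *
chunk_alloc(void)
{
    FreeChunk *c = freelist;

    if (c != NULL)
        freelist = c->next;     /* pop: one load, one store */
    else
        c = malloc(CHUNK_SIZE); /* slow path, taken rarely */
    return c;
}

/* push the chunk back on; no hashing, no search */
static void
chunk_free(void *p)
{
    FreeChunk *c = p;

    c->next = freelist;
    freelist = c;
}

By contrast, looking "color" up in a hash table means hashing all six
bytes, indexing a bucket, and strcmp'ing the stored key, every time,
just to discover the name is already there.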

--
Robert Haas
EDB: http://www.enterprisedb.com


