On 2024-01-03 We 10:12, Robert Haas wrote:
> On Wed, Jan 3, 2024 at 9:59 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>> Say we have a document with an array of 1m objects, each with a field
>> called "color". As it stands we'll allocate space for that field name 1m
>> times. Using a hash table we'd allocate space for it once. And
>> allocating the memory isn't free, although it might be cheaper than
>> doing hash lookups.
>>
>> I guess we can benchmark it and see what the performance impact of using
>> a hash table might be.
>>
>> Another possibility would be simply to have the callback free the field
>> name after use. For the parse_manifest code that could be a one-line
>> addition to the code at the bottom of json_object_manifest_field_start().
>
> Yeah. So I'm arguing that allocating the memory each time and then
> freeing it sounds cheaper than looking it up in the hash table every
> time, discovering it's there, and thus skipping the allocate/free.
>
> I might be wrong about that. It's just that allocating and freeing a
> small chunk of memory should boil down to popping it off of a linked
> list and then pushing it back on. And that sounds cheaper than hashing
> the string and looking for it in a hash bucket.

OK, cleaning up in the client code will be much simpler, so let's go
with that for now and revisit it later if necessary.
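
To make the "free it in the callback" idea concrete, here is a minimal
standalone sketch. It is not the parse_manifest code: DemoState,
demo_field_start(), demo_parse() and demo_count_color() are made-up names,
and it uses plain malloc/free where the backend would presumably use its
own allocator. The only point is the ownership pattern: the parser hands
each freshly allocated field name to the callback, and the callback
releases it once it is done with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-parse state; not the real parse_manifest struct. */
typedef struct DemoState
{
	int			fields_seen;
	int			color_fields;
} DemoState;

/* Hypothetical callback type: the parser hands over ownership of fname. */
typedef void (*field_start_cb) (void *state, char *fname);

/* Example callback: inspect the field name, then free it. */
void
demo_field_start(void *state, char *fname)
{
	DemoState  *st = (DemoState *) state;

	st->fields_seen++;
	if (strcmp(fname, "color") == 0)
		st->color_fields++;

	/* The one-line cleanup: release the name once we are done with it. */
	free(fname);
}

/* Stand-in for the parser: allocates a fresh copy of every field name. */
bool
demo_parse(const char *const *names, size_t n, field_start_cb cb, void *state)
{
	for (size_t i = 0; i < n; i++)
	{
		size_t		len = strlen(names[i]) + 1;
		char	   *copy = malloc(len);

		if (copy == NULL)
			return false;
		memcpy(copy, names[i], len);
		cb(state, copy);		/* the callback now owns and frees copy */
	}
	return true;
}

/* Small driver: counts the "color" fields in a fixed list of names. */
int
demo_count_color(void)
{
	static const char *const names[] = {"color", "size", "color"};
	DemoState	st = {0, 0};
	bool		ok = demo_parse(names, 3, demo_field_start, &st);

	assert(ok);
	(void) ok;					/* keep quiet when asserts are compiled out */
	return st.color_fields;
}
```

With this arrangement each name lives only for the duration of one
callback, so memory use stays flat no matter how many times "color"
shows up, at the cost of one allocate/free pair per field.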

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com