Re: WIP Incremental JSON Parser

From: Robert Haas
Subject: Re: WIP Incremental JSON Parser
Msg-id: CA+TgmoYhd=Tg07oMimZBa94+w4fOAZyXu6L+f5GsBbFRMtbrGg@mail.gmail.com
In response to: Re: WIP Incremental JSON Parser (Nico Williams <nico@cryptonector.com>)
List: pgsql-hackers
On Wed, Jan 3, 2024 at 6:36 PM Nico Williams <nico@cryptonector.com> wrote:
> On Tue, Jan 02, 2024 at 10:14:16AM -0500, Robert Haas wrote:
> > It seems like a pretty significant savings no matter what. Suppose the
> > backup_manifest file is 2GB, and instead of creating a 2GB buffer, you
> > create a 1MB buffer and feed the data to the parser in 1MB chunks.
> > Well, that saves 2GB less 1MB, full stop. Now if we address the issue
> > you raise here in some way, we can potentially save even more memory,
> > which is great, but even if we don't, we still saved a bunch of memory
> > that could not have been saved in any other way.
>
> You could also build a streaming incremental parser.  That is, one that
> outputs a path and a leaf value (where leaf values are scalar values,
> `null`, `true`, `false`, numbers, and strings).  Then if the caller is
> doing something JSONPath-like then the caller can probably immediately
> free almost all allocations and even terminate the parse early.

I think our current parser is event-based rather than the
path-and-leaf-value kind you describe, but it seems like that could
easily be built on top of it, if someone wanted to.
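
For instance, a layer along these lines could sit on top of the event
callbacks. The ev_* names and the PathTracker type below are invented
for illustration (they are not the actual jsonapi.h interface); the
technique is just to keep a stack of path components and, at each
scalar, hand the consumer a (path, leaf value) pair, letting the
consumer free its state immediately or stop the parse early.

    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_DEPTH 64

    typedef struct PathTracker
    {
        struct
        {
            bool        is_array;
            const char *field;  /* current key (objects) */
            int         index;  /* current element (arrays) */
        }           stack[MAX_DEPTH];
        int         depth;
        /* consumer callback: return false to stop the parse early */
        bool        (*emit) (const char *path, const char *value);
    } PathTracker;

    /* Render the current path, e.g. $.a.b[1], into buf. */
    static void
    render_path(const PathTracker *t, char *buf, size_t len)
    {
        size_t      off = snprintf(buf, len, "$");

        for (int i = 0; i < t->depth && off < len; i++)
        {
            if (t->stack[i].is_array)
                off += snprintf(buf + off, len - off, "[%d]",
                                t->stack[i].index);
            else
                off += snprintf(buf + off, len - off, ".%s",
                                t->stack[i].field);
        }
    }

    static void
    ev_struct_start(PathTracker *t, bool is_array)
    {
        t->stack[t->depth].is_array = is_array;
        t->stack[t->depth].field = NULL;
        t->stack[t->depth].index = -1;
        t->depth++;
    }

    static void
    ev_struct_end(PathTracker *t)
    {
        t->depth--;
    }

    static void
    ev_object_field(PathTracker *t, const char *fname)
    {
        t->stack[t->depth - 1].field = fname;   /* key for next value */
    }

    static void
    ev_array_element(PathTracker *t)
    {
        t->stack[t->depth - 1].index++;         /* next element begins */
    }

    static bool
    ev_scalar(PathTracker *t, const char *token)
    {
        char        path[1024];

        render_path(t, path, sizeof(path));
        return t->emit(path, token);    /* false = terminate early */
    }

    static bool
    print_pair(const char *path, const char *value)
    {
        printf("%s = %s\n", path, value);
        return true;
    }

    /* Hand-fired events for {"a": {"b": [1, "x"]}}; a real parser
     * would drive these callbacks from its own event loop. */
    int
    main(void)
    {
        PathTracker t = {.depth = 0, .emit = print_pair};

        ev_struct_start(&t, false);
        ev_object_field(&t, "a");
        ev_struct_start(&t, false);
        ev_object_field(&t, "b");
        ev_struct_start(&t, true);
        ev_array_element(&t);
        ev_scalar(&t, "1");             /* prints $.a.b[0] = 1 */
        ev_array_element(&t);
        ev_scalar(&t, "\"x\"");         /* prints $.a.b[1] = "x" */
        ev_struct_end(&t);
        ev_struct_end(&t);
        ev_struct_end(&t);
        return 0;
    }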

--
Robert Haas
EDB: http://www.enterprisedb.com


