RE: Best way to keep track of a sliced TOAST - Mailing list pgsql-hackers

From	Bruno Hass
Subject	RE: Best way to keep track of a sliced TOAST
Date	March 21, 2019 04:20:16
Msg-id	BL0PR07MB4065DE68DBDCAF316F0145BF91420@BL0PR07MB4065.namprd07.prod.outlook.com
In response to	Re: Best way to keep track of a sliced TOAST (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Best way to keep track of a sliced TOAST
List	pgsql-hackers

I would like to optimize the jsonb key access operations. I could not find the discussion you've mentioned, but I am giving some thought to the idea.

Instead of storing lengths, could we dedicate the first chunk of the TOASTed jsonb to store where each key is located? Would it be a good idea?

You've mentioned that the current jsonb format is byte-oriented. Does that imply that a single jsonb key value might be split between multiple chunks?
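To make the question concrete: if a value is cut into fixed-size chunks purely by byte position, nothing keeps a single key's value from straddling a chunk boundary. The sketch below is only a toy model, not the real jsonb on-disk layout or the TOAST code. The helper names, the back-to-back value layout, and the 8-byte chunk size are all assumptions made for illustration.

```python
def chunks_spanned(offset, length, chunk_size):
    """Return the range of chunk numbers touched by `length` bytes at `offset`.

    Assumes fixed-size chunks numbered from 0 (a toy model, not TOAST itself).
    """
    if offset < 0 or length <= 0 or chunk_size <= 0:
        raise ValueError("offset must be >= 0; length and chunk_size must be > 0")
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)


def layout(values, chunk_size):
    """Place non-empty values back to back and report which chunks each key touches.

    `values` maps a key to its serialized bytes (hypothetical layout).
    """
    spans = {}
    offset = 0
    for key, value in values.items():
        spans[key] = list(chunks_spanned(offset, len(value), chunk_size))
        offset += len(value)
    return spans


if __name__ == "__main__":
    doc = {"a": b"x" * 6, "b": b"y" * 6, "c": b"z" * 4}
    for key, chunk_numbers in layout(doc, chunk_size=8).items():
        print(key, chunk_numbers)
```

In this toy layout the value for "b" starts at byte 6 and ends at byte 11, so it lands in both chunk 0 and chunk 1. A directory in the first chunk would therefore have to record a chunk range per key, or the format would have to pad values so that they never cross a boundary.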

Bruno Hass

From: Robert Haas <robertmhaas@gmail.com>
Sent: Friday, March 15, 2019 12:22
To: Bruno Hass
Cc: pgsql-hackers
Subject: Re: Best way to keep track of a sliced TOAST

On Fri, Mar 15, 2019 at 7:37 AM Bruno Hass <bruno_hass@live.com> wrote:
> This idea is what I was hoping to achieve. Would we be able to make optimizations on deTOASTing just by storing the chunk lengths in chunk 0?

I don't know. I guess we could also NOT store the chunk lengths and
just say that if you don't know which chunk you want by chunk number,
your only other alternative is to read the chunks in order. The
problem with that is that you can no longer index by byte position
without fetching every chunk prior to that byte position, but maybe
that's not important enough to justify the overhead of a list of chunk
lengths. Or maybe it depends on what you want to do with it.
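The trade-off described above can be sketched roughly as follows. This is a toy model under stated assumptions, not PostgreSQL code: chunks are plain Python byte strings of varying length, `fetch_chunk` is a hypothetical callback standing in for reading one chunk, and all function names are invented for the example. With a stored list of chunk lengths, a byte position maps to a chunk by binary search over prefix sums; without it, every earlier chunk has to be fetched first.

```python
from bisect import bisect_right
from itertools import accumulate


def build_chunk_starts(chunk_lengths):
    """Prefix sums: starting byte offset of each chunk, plus the total length last."""
    starts = [0]
    starts.extend(accumulate(chunk_lengths))
    return starts


def locate_with_lengths(chunk_starts, byte_pos):
    """Map a byte position to (chunk_no, offset_in_chunk) without fetching any chunk."""
    if not 0 <= byte_pos < chunk_starts[-1]:
        raise IndexError("byte position out of range")
    chunk_no = bisect_right(chunk_starts, byte_pos) - 1
    return chunk_no, byte_pos - chunk_starts[chunk_no]


def locate_by_scanning(fetch_chunk, n_chunks, byte_pos):
    """Same lookup with no stored lengths: read chunks in order until byte_pos is reached.

    Returns (chunk_no, offset_in_chunk, chunks_fetched).
    """
    if byte_pos < 0:
        raise IndexError("byte position out of range")
    seen = 0
    for chunk_no in range(n_chunks):
        chunk = fetch_chunk(chunk_no)
        if byte_pos < seen + len(chunk):
            return chunk_no, byte_pos - seen, chunk_no + 1
        seen += len(chunk)
    raise IndexError("byte position out of range")


if __name__ == "__main__":
    chunks = [b"aaaa", b"bb", b"cccccc"]  # variable-length chunks (assumption)
    starts = build_chunk_starts(len(c) for c in chunks)
    print(locate_with_lengths(starts, 11))
    print(locate_by_scanning(lambda i: chunks[i], len(chunks), 11))
```

Both lookups give the same chunk number and offset. The difference is that the scanning version has to fetch every chunk up to and including the target, while the length list costs extra space in chunk 0 and has to be kept in sync with the chunks.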

Again, stuff like what you are suggesting here has been suggested
before. I think the problem is that if someone did the work to invent such
an infrastructure, that wouldn't actually do anything by itself. We'd
then need to find an application of it where it brought us some clear
advantage. As I said in my previous email, jsonb seems like a
promising candidate, but I don't think it's a slam dunk. What would
the design look like, exactly? Which operations would get faster, and
could we really make them work? The existing format is, I think,
designed with byte-oriented storage in mind, and a chunk-oriented
format might have different design constraints. It seems like an idea
with potential, but there's a lot of daylight between a directional
idea with potential and a specific idea accompanied by a high-quality
implementation thereof.

> Also, wouldn't it break existing functions by dedicating a whole chunk (possibly more) to such metadata?

Anybody writing such a patch would have to be prepared to fix any such
breakage that occurred, at least as regards core code. I would guess
that this could be done without breaking too much third-party code,
but I guess it depends on exactly what the author of this hypothetical
patch ends up changing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
