Re: jsonb format is pessimal for toast compression - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: jsonb format is pessimal for toast compression
Msg-id 54229FA2.90909@vmware.com
In response to Re: jsonb format is pessimal for toast compression  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: jsonb format is pessimal for toast compression  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 09/24/2014 08:16 AM, Tom Lane wrote:
> Jan Wieck <jan@wi3ck.info> writes:
>> On 09/15/2014 09:46 PM, Craig Ringer wrote:
>>> Anyway - this is looking like the change will go in, and with it a
>>> catversion bump. Introduction of a jsonb version/flags byte might be
>>> worthwhile at the same time. It seems likely that there'll be more room
>>> for improvement in jsonb, possibly even down to using different formats
>>> for different data.
>>>
>>> Is it worth paying a byte per value to save on possible upgrade pain?
>
>> If there indeed has to be a catversion bump in the process of this, then
>> I agree with Craig.
>
> FWIW, I don't really.  To begin with, it wouldn't be a byte per value,
> it'd be four bytes, because we need word-alignment of the jsonb contents
> so there's noplace to squeeze in an ID byte for free.  Secondly, as I
> wrote in <15378.1408548595@sss.pgh.pa.us>:
>
> : There remains the
> : question of whether to take this opportunity to add a version ID to the
> : binary format.  I'm not as excited about that idea as I originally was;
> : having now studied the code more carefully, I think that any expansion
> : would likely happen by adding more type codes and/or commandeering the
> : currently-unused high-order bit of JEntrys.  We don't need a version ID
> : in the header for that.  Moreover, if we did have such an ID, it would be
> : notationally painful to get it to most of the places that might need it.
>
> Heikki's patch would eat up the high-order JEntry bits, but the other
> points remain.

If we don't need to be backwards-compatible with the 9.4beta on-disk
format, we don't necessarily need to eat the high-order JEntry bit. You
can just assume that every nth element is stored as an offset, and
the rest as lengths. It would still be nice to have an explicit flag
for it, though.
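
To make the "every nth element is an offset" idea concrete, here is a minimal sketch in Python. It is not the jsonb code: the names (`STRIDE`, `encode`, `end_offset`, `start_offset`) and the stride of 4 are assumptions made up for illustration, and each entry is a plain integer rather than a JEntry. Entries whose index is a multiple of the stride hold the cumulative end offset; all others hold only the element's length, so no flag bit is needed to tell them apart.

```python
# Hypothetical stride, kept small so the example is easy to follow.
STRIDE = 4


def encode(lengths, stride=STRIDE):
    """Build the entry array: every stride-th entry stores the cumulative
    end offset, every other entry stores just that element's length."""
    entries = []
    end = 0
    for i, n in enumerate(lengths):
        end += n
        entries.append(end if i % stride == 0 else n)
    return entries


def end_offset(entries, i, stride=STRIDE):
    """End offset of element i: walk backwards summing lengths until an
    entry that stores an offset is reached (at most stride - 1 steps)."""
    total = 0
    while True:
        total += entries[i]
        if i % stride == 0:  # position alone says this is an offset
            return total
        i -= 1


def start_offset(entries, i, stride=STRIDE):
    """Start offset of element i is the end offset of element i - 1."""
    return 0 if i == 0 else end_offset(entries, i - 1, stride)
```

With this layout most entries are small, repetitive length values, which is what helps compression, while a lookup never has to scan more than one stride's worth of entries.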

There are also a few free bits in the JsonbContainer header that can be 
used as a version ID in the future. So I don't think we need to change 
the format to add an explicit version ID field.
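
As a rough illustration of reusing a spare header bit as a version ID, the sketch below packs a count, type flags, and a one-bit version into a 32-bit word. The bit layout and all names (`COUNT_MASK`, `FLAG_*`, `VERSION_BIT`, `make_header`, and so on) are assumptions for the example, not definitions taken from jsonb.h.

```python
# Illustrative 32-bit header layout (assumed, not copied from the source tree).
COUNT_MASK = 0x0FFFFFFF   # low 28 bits: number of elements or pairs
FLAG_SCALAR = 0x10000000
FLAG_OBJECT = 0x20000000
FLAG_ARRAY = 0x40000000
VERSION_BIT = 0x80000000  # hypothetical: a spare bit used as a format version


def make_header(count, flags, version=0):
    """Pack count, type flags and a 1-bit version into one 32-bit word."""
    if count & ~COUNT_MASK:
        raise ValueError("count does not fit in 28 bits")
    header = count | flags
    if version:
        header |= VERSION_BIT
    return header


def header_count(header):
    return header & COUNT_MASK


def header_version(header):
    """0 for data written in the old layout, 1 for the new one."""
    return 1 if header & VERSION_BIT else 0
```

Because existing data would have the spare bit clear, a reader can treat version 0 as the current format without any change to what is already on disk.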

- Heikki



