Re: jsonb format is pessimal for toast compression - Mailing list pgsql-hackers

From Tom Lane
Subject Re: jsonb format is pessimal for toast compression
Msg-id 3418.1408028694@sss.pgh.pa.us
In response to Re: jsonb format is pessimal for toast compression  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> For comparison, here's a patch that implements the scheme that Alexander 
> Korotkov suggested, where we store an offset every 8th element, and a 
> length in the others. It compresses Larry's example to 525 bytes. 
> Increasing the "stride" from 8 to 16 entries, it compresses to 461 bytes.

> A nice thing about this patch is that it's on-disk compatible with the 
> current format, hence initdb is not required.

TBH, I think that's about the only nice thing about it :-(.  It's
conceptually a mess.  And while I agree that this way avoids creating
a big-O performance issue for large arrays/objects, I think the micro
performance is probably going to be not so good.  The existing code is
based on the assumption that JBE_OFF() and JBE_LEN() are negligibly cheap;
but with a solution like this, it's guaranteed that one or the other is
going to be not-so-cheap.
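
To make that cost concrete, here is a minimal sketch in C of a stride-8 layout. It is not the patch under discussion and not the real JBE_OFF()/JBE_LEN() macros: the names SKETCH_STRIDE, sketch_offset() and sketch_length() are hypothetical, and the layout is an assumption (every 8th entry holds the cumulative end offset of its element, every other entry holds its element's length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stride: one stored end offset every 8 entries. */
#define SKETCH_STRIDE 8

/*
 * Assumed layout (illustration only):
 *   entries[i] = end offset of element i   when i % SKETCH_STRIDE == 0
 *   entries[i] = length of element i       otherwise
 *
 * Start offset of element i: walk backwards, summing lengths, until an
 * entry holding an end offset is reached.  At most SKETCH_STRIDE steps.
 */
static uint32_t
sketch_offset(const uint32_t *entries, size_t i)
{
    uint32_t off = 0;

    assert(entries != NULL);
    while (i > 0)
    {
        i--;
        off += entries[i];
        if (i % SKETCH_STRIDE == 0)
            break;              /* this entry was an absolute end offset */
    }
    return off;
}

/*
 * Length of element i: free for a length entry, but for an offset entry
 * it needs the start offset, and therefore the backward walk above.
 */
static uint32_t
sketch_length(const uint32_t *entries, size_t i)
{
    assert(entries != NULL);
    if (i % SKETCH_STRIDE != 0)
        return entries[i];
    return entries[i] - sketch_offset(entries, i);
}
```

Under these assumptions the offset lookup always loops, and the length lookup loops for every 8th element, so neither can be treated as a negligibly cheap macro any more.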

I think if we're going to do anything to the representation at all,
we need to refactor the calling code; at least fixing the JsonbIterator
logic so that it tracks the current data offset rather than expecting to
be able to compute it at no cost.
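
As a rough illustration of what "tracks the current data offset" could look like, here is a minimal C sketch. It is not the actual JsonbIterator code: SketchIter and its functions are hypothetical names, and it assumes the simplest possible layout, in which each entry stores only the length of its element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical iterator over an array of per-element lengths. */
typedef struct SketchIter
{
    const uint32_t *lens;       /* assumed: lens[i] = length of element i */
    size_t      nelems;         /* number of elements */
    size_t      idx;            /* index of the next element to return */
    uint32_t    cur_off;        /* running start offset of that element */
} SketchIter;

static void
sketch_iter_init(SketchIter *it, const uint32_t *lens, size_t nelems)
{
    assert(it != NULL && lens != NULL);
    it->lens = lens;
    it->nelems = nelems;
    it->idx = 0;
    it->cur_off = 0;
}

/*
 * Return the next element's offset and length in O(1).  The offset is
 * never recomputed from the entry array; it is carried forward from the
 * previous step.
 */
static bool
sketch_iter_next(SketchIter *it, uint32_t *off, uint32_t *len)
{
    assert(it != NULL && off != NULL && len != NULL);
    if (it->idx >= it->nelems)
        return false;

    *off = it->cur_off;
    *len = it->lens[it->idx];
    it->cur_off += *len;        /* advance the running offset */
    it->idx++;
    return true;
}
```

With the offset carried in the iterator state, a sequential scan costs the same per element whether the entries hold offsets, lengths, or a mix; only random access by index still has to pay for reconstructing an offset.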

The difficulty in arguing about this is that unless we have an agreed-on
performance benchmark test, it's going to be a matter of unsupported
opinions whether one solution is faster than another.  Have we got
anything that stresses key lookup and/or array indexing?
        regards, tom lane


