On Mon, Apr 15, 2019 at 11:57:49AM -0700, Ashwin Agrawal wrote:
> On Mon, Apr 15, 2019 at 11:18 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>
>> Maybe. I'm not going to pretend I fully understand the internals. Does
>> that mean the container contains ZSUncompressedBtreeItem as elements? Or
>> just the plain Datum values?
>
> First, your reading of code and all the comments/questions so far have
> been highly encouraging. Thanks a lot for the same.

;-)

> Container contains ZSUncompressedBtreeItem as elements. As for Item will
> have to store meta-data like size, undo and such info. We don't wish to
> restrict compressing only items from same insertion sessions only. Hence,
> yes doesn't just store Datum values. Wish to consider it more tuple level
> operations and have meta-data for it and able to work with tuple level
> granularity than block level.

OK, thanks for the clarification, that somewhat explains my confusion.
So if I understand it correctly, ZSCompressedBtreeItem is essentially a
sequence of ZSUncompressedBtreeItem(s) stored one after another, along
with some additional top-level metadata.
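
To make that concrete, here is a rough C sketch of the layout as I
understand it: items laid end to end in the container payload, each
carrying its own small header. The struct and function names (SketchItem,
sketch_append, sketch_count_items) and the header fields are made up for
illustration. They are not the actual zedstore definitions, and the
top-level container metadata is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-item header; the real ZSUncompressedBtreeItem differs. */
typedef struct SketchItem
{
    uint16_t size;      /* total item size in bytes, header included */
    uint64_t tid;       /* logical row identifier */
    uint64_t undo_ptr;  /* per-item undo information */
    /* payload bytes follow the header */
} SketchItem;

/* Append one item (header + payload) at offset off; returns the new offset. */
static size_t
sketch_append(char *buf, size_t off, uint64_t tid, uint64_t undo_ptr,
              const void *payload, uint16_t payload_len)
{
    SketchItem hdr;

    assert(sizeof(SketchItem) + payload_len <= UINT16_MAX);

    memset(&hdr, 0, sizeof(hdr));   /* zero the padding bytes too */
    hdr.size = (uint16_t) (sizeof(SketchItem) + payload_len);
    hdr.tid = tid;
    hdr.undo_ptr = undo_ptr;

    memcpy(buf + off, &hdr, sizeof(hdr));
    memcpy(buf + off + sizeof(hdr), payload, payload_len);
    return off + hdr.size;
}

/* Walk the items stored back to back in buf and count them. */
static int
sketch_count_items(const char *buf, size_t len)
{
    size_t off = 0;
    int    n = 0;

    while (off + sizeof(SketchItem) <= len)
    {
        SketchItem hdr;

        /* copy the header out, since items are not necessarily aligned */
        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.size < sizeof(SketchItem) || off + hdr.size > len)
            break;              /* truncated or corrupt item */
        off += hdr.size;
        n++;
    }
    return n;
}
```

Note that every item repeats the size, TID and undo fields inline, so
metadata and values are interleaved throughout the payload.
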

> Definitely many more tricks can be and need to be applied to optimize
> storage format, like for fixed width columns no need to store the size in
> every item. Keep it simple is theme have been trying to maintain.
> Compression ideally should compress duplicate data pretty easily and
> efficiently as well, but we will try to optimize as much we can without
> the same.

I think there's plenty of room for improvement. The main problem I see
is that it mixes different types of data, which is bad for compression
and vectorized execution. I think we'll end up with a very different
representation of the container, essentially decomposing the items into
arrays of values of the same type - array of TIDs, array of undo
pointers, buffer of serialized values, etc.
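
Just to illustrate the direction, a decomposed container might look
roughly like the C sketch below. This is only a sketch under my own
assumptions: the names (SketchColumns, sketch_columns_add, ...), the
fixed capacities and the offsets array are invented for the example and
do not correspond to anything in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_ITEMS 64
#define SKETCH_MAX_BYTES 1024

/* Hypothetical container: one array per kind of data, not one per item. */
typedef struct SketchColumns
{
    int      nitems;
    uint64_t tids[SKETCH_MAX_ITEMS];        /* array of TIDs */
    uint64_t undo_ptrs[SKETCH_MAX_ITEMS];   /* array of undo pointers */
    uint32_t offsets[SKETCH_MAX_ITEMS + 1]; /* value i spans offsets[i]..offsets[i+1] */
    char     values[SKETCH_MAX_BYTES];      /* buffer of serialized values */
} SketchColumns;

static void
sketch_columns_init(SketchColumns *c)
{
    c->nitems = 0;
    c->offsets[0] = 0;
}

/* Add one item; returns 1 on success, 0 if the container is full. */
static int
sketch_columns_add(SketchColumns *c, uint64_t tid, uint64_t undo_ptr,
                   const void *val, uint32_t len)
{
    uint32_t start;

    if (c->nitems >= SKETCH_MAX_ITEMS)
        return 0;
    start = c->offsets[c->nitems];
    if (len > SKETCH_MAX_BYTES - start)
        return 0;

    c->tids[c->nitems] = tid;
    c->undo_ptrs[c->nitems] = undo_ptr;
    memcpy(c->values + start, val, len);
    c->offsets[c->nitems + 1] = start + len;
    c->nitems++;
    return 1;
}

/* Length in bytes of the i-th serialized value. */
static uint32_t
sketch_columns_value_len(const SketchColumns *c, int i)
{
    assert(i >= 0 && i < c->nitems);
    return c->offsets[i + 1] - c->offsets[i];
}
```

With something like this, each array holds values of a single type, so
it can be compressed on its own, and code that only needs the TIDs can
scan one dense array without stepping over values and undo pointers.
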

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services