Re: Zedstore - compressed in-core columnar storage - Mailing list pgsql-hackers

From Ashwin Agrawal
Subject Re: Zedstore - compressed in-core columnar storage
Date
Msg-id CALfoeitV6Hj-_JHxQXoDERs=s0R=whAGYJz7Gv=g5t1z8_DKRw@mail.gmail.com
In response to Re: Zedstore - compressed in-core columnar storage  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Zedstore - compressed in-core columnar storage
List pgsql-hackers

On Sat, Apr 13, 2019 at 4:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, Apr 11, 2019 at 6:06 AM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> > Reading about it reminds me of this work -- TAG column storage( http://www09.sigmod.org/sigmod/record/issues/0703/03.article-graefe.pdf ).
> > Isn't this storage system inspired from there, with TID as the TAG?
> >
> > It is not referenced here so made me wonder.
>
> I don't think they're particularly similar, because that paper
> describes an architecture based on using purely logical row
> identifiers, which is not what a TID is. TID is a hybrid
> physical/logical identifier, sometimes called a "physiological"
> identifier, which will have significant overhead.

The storage system wasn't inspired by that paper, but yes, it also talks about laying out column data in B-trees, which is good to see. As Peter pointed out, though, the main thing the paper focuses on, saving space for the TAG, isn't something Zedstore plans to leverage, since it is more restrictive. As discussed below, we can use other alternatives to save space.
 
> Ashwin said that
> ZedStore TIDs are logical identifiers, but I don't see how that's
> compatible with a hybrid row/column design (unless you map heap TID to
> logical row identifier using a separate B-Tree).

I would like to know more specifics on this, Peter. We may have different contexts in mind for "hybrid row/column design". When we said the design supports hybrid row/column families, we did not mean within the same table. That is, a single table cannot hold some of its data row-wise and some column-wise; for a given table, the structure will be homogeneous. But the design can easily support storing all the columns together, a subset of columns together, or each column on its own, all connected together by TID.
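
To make the "connected together by TID" point concrete, here is a rough toy model in Python. It is only a sketch: ColumnTree, Table, and the sorted lists standing in for B-trees are all hypothetical names made up for illustration, not anything from the Zedstore patch. The grouping of columns is chosen once per table, and a row is reassembled by looking up the same TID in each tree.

```python
# Hypothetical sketch: every name here is illustrative, not Zedstore code.
from bisect import bisect_left


class ColumnTree:
    """Stand-in for one B-tree: sorted (tid, values) entries for a column group."""

    def __init__(self, columns):
        self.columns = tuple(columns)
        self.tids = []    # sorted logical TIDs
        self.values = []  # values[i] is a tuple matching self.columns

    def insert(self, tid, row):
        i = bisect_left(self.tids, tid)
        self.tids.insert(i, tid)
        self.values.insert(i, tuple(row[c] for c in self.columns))

    def lookup(self, tid):
        i = bisect_left(self.tids, tid)
        if i == len(self.tids) or self.tids[i] != tid:
            raise KeyError(tid)
        return dict(zip(self.columns, self.values[i]))


class Table:
    """A table whose layout is fixed at creation: one tree per column group."""

    def __init__(self, groups):
        self.trees = [ColumnTree(g) for g in groups]
        self.next_tid = 1

    def insert(self, row):
        tid = self.next_tid
        self.next_tid += 1
        for tree in self.trees:
            tree.insert(tid, row)
        return tid

    def fetch(self, tid, wanted):
        out = {}
        for tree in self.trees:
            # Only touch trees that hold at least one requested column.
            if set(tree.columns) & set(wanted):
                out.update(tree.lookup(tid))
        return {c: out[c] for c in wanted}


row_like = Table([("a", "b", "c")])         # all columns stored together
col_like = Table([("a",), ("b",), ("c",)])  # one tree per column
mixed = Table([("a", "b"), ("c",)])         # a subset grouped together
```

In this toy model a fetch of column "c" from the mixed table never reads the ("a", "b") tree, which is the usual motivation for a columnar layout.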


> The big idea with Graefe's TAG design is that there is practically no
> storage overhead for these logical identifiers, because each entry's
> identifier is calculated by adding its slot number to the page's
> tag/low key. The ZedStore design, in contrast, explicitly stores TID
> for every entry. ZedStore seems more flexible for that reason, but at
> the same time the per-datum overhead seems very high to me. Maybe
> prefix compression could help here, which a low key and high key can
> do rather well.

Yes, the plan is to optimize out the per-datum TID space, either by prefix compression, delta compression, or some other trick.
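
As a back-of-the-envelope illustration of the space argument, the Python sketch below compares three ways of identifying the entries on a page: an explicit fixed-width TID per entry, delta encoding with a variable-length integer, and the TAG-style scheme where the identifier is derived from the page's low key plus the slot number. The encodings are assumptions picked for illustration (a LEB128-style varint, and 6 bytes per explicit TID to match the size of a heap ItemPointerData); none of this is Zedstore's actual on-disk format.

```python
# Illustrative only: the encodings and sizes below are assumptions,
# not Zedstore's on-disk format.

TID_BYTES = 6  # size of a heap ItemPointerData; assumed for the explicit case


def encode_varint(n):
    """LEB128-style unsigned varint: 7 payload bits per byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def explicit_size(tids):
    """Bytes used when every entry stores a full fixed-width TID."""
    return TID_BYTES * len(tids)


def delta_encode(tids):
    """Store each sorted TID as the varint-encoded gap from the previous one."""
    out = bytearray()
    prev = 0
    for tid in tids:
        out += encode_varint(tid - prev)
        prev = tid
    return bytes(out)


def delta_decode(buf):
    tids, pos, prev = [], 0, 0
    while pos < len(buf):
        gap, pos = decode_varint(buf, pos)
        prev += gap
        tids.append(prev)
    return tids


def implicit_tids(low_key, nslots):
    """TAG-style: identifier = page low key + slot number, nothing stored per entry."""
    return [low_key + slot for slot in range(nslots)]


tids = list(range(1_000_000, 1_000_100))   # 100 consecutive logical TIDs
print(explicit_size(tids))                  # 600 bytes
print(len(delta_encode(tids)))              # 102 bytes: 3 for the first TID, 1 per gap
print(implicit_tids(1_000_000, 100) == tids)
```

The implicit scheme only works while the identifiers on a page are dense, which is the restriction mentioned above. Delta encoding still round-trips when there are gaps, at the cost of a byte or so per entry in this sketch.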
