Re: On columnar storage - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	Re: On columnar storage
Date	June 14, 2015 14:33:41
Msg-id	20150614143327.GC133018@postgresql.org
In response to	Re: On columnar storage (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Amit Kapila wrote:
> >> Will the column store obey snapshot model similar to current heap tuples,
> >> if so will it derive the transaction information from heap tuple?
> 
> > Yes, visibility will be tied to the heap tuple -- a value is accessed
> > only when its corresponding heap row has already been determined to be
> > visible.  One interesting point that arises from this is about vacuum:
> > when are we able to remove a value from the store?  I have some
> > not-completely-formed ideas about this.
> 
> Hm.  This seems not terribly ambitious --- mightn't a column store
> extension wish to store tables *entirely* in the column store, rather
> than tying them to a perhaps-vestigial heap table?

Well, yes, it might, but that opens a huge can of worms.  What heapam
offers is not just tuple storage, but a lot of functionality on top of
that -- in particular, tuple locking and visibility.  I am certainly not
considering re-implementing any of that.  We might eventually go there,
but we will *additionally* need different implementations of those
things, and I'm pretty sure that will be painful, so I'm trying to stay
away from that.

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> I can't help thinking that this could tie in with the storage level API
> >> that I was waving my arms about last year.  Or maybe not --- the goals
> >> are substantially different --- but I think we ought to reflect on that
> >> rather than just doing a narrow hack for column stores used in the
> >> particular way you're describing here.
> 
> > I can't seem to remember this proposal you mention.  Care to be more
> > specific?  Perhaps a link to archives is enough.
> 
> I never got to the point of having a concrete proposal, but there was a
> discussion about it at last year's PGCon unconference; were you there?

No, regrettably I wasn't there.

> Anyway the idea was to try to cut a clearer line between heap storage
> and the upper levels of the system, particularly the catalog/DDL code
> that we have so much of.  Based on Salesforce's experience so far,
> it's near impossible to get rid of HeapTuple as the lingua franca for
> representing rows in the upper system levels, so we've not really tried;
> but it would be nice if the DDL code weren't so much in bed with
> heap-specific knowledge, like the wired-into-many-places assumption that
> row insert and update actions require index updates but deletions don't.

Agreed on both counts.  As far as catalog code goes, removing the direct
mapping from HeapTuple to C structs would require a huge rewrite of tons
of code.  Unless we're considering rewriting small pieces of specific
catalog handling at a time, it seems unlikely that we will have columns
from system catalogs in column stores.  (It seems reasonable that as
soon as we have column stores, we can have particular catalog code to
work with columnar storage, but I don't think there's much need for that
currently.)

I agree with your second point also --- it might be good to have a layer
in between, and it seems not completely unreasonable.  It would require
touching lots of places but not hugely transforming things.  (I think
it's not in the scope of the things I'm currently after, though.)

> We're also not very happy with the general assumption that a TID is an
> adequate row identifier (as our storage engine does not have TIDs), so
> I'm a bit disappointed to see you doubling down on that restriction
> rather than trying to lift it.

Well, in the general design, there is room for different tuple
identifiers.  I'm just not implementing it for the first version.

> Now much of this pain only comes into play if one is trying to change
> the underlying storage format for system catalogs, which I gather is
> not considered in your proposal.  But if you want one format for catalogs
> and another for user tables then you have issues like how do you guarantee
> atomic commit and crash safety across multiple storage engines.  That way
> lies a mess, especially if you're trying to keep the engines at arm's
> length which is what a pluggable architecture implies.  MySQL is a
> cautionary example we should be keeping in mind while thinking about this.

Right.  I don't want a separate "storage engine" that needs to
reimplement transactions (as is the case in MySQL), or visibility rules.
I don't want to have a different format for tables or catalogs; both
would still be based on the current heapam API.  I simply want to extend
the API so that I can have some columns in a separate place.
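
As a rough illustration of that idea, here is a toy model in Python.  It
is not PostgreSQL code: every name in it (ToyTable, HeapRow, Snapshot,
fetch) is hypothetical, and the visibility test is grossly simplified
compared to what heapam really does.  It only shows the shape of the
design: the heap row carries all the transaction information, and the
column store is consulted only after the heap row has been found visible.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Toy stand-in for a heap tuple identifier: (block number, offset).
TID = Tuple[int, int]


@dataclass
class HeapRow:
    xmin: int                   # id of the inserting transaction
    xmax: Optional[int] = None  # id of the deleting transaction, if any
    cols: Dict[str, object] = field(default_factory=dict)  # columns kept in the heap


@dataclass
class Snapshot:
    committed: frozenset  # transaction ids this snapshot treats as committed

    def is_visible(self, row: HeapRow) -> bool:
        # Simplified rule: the inserter is committed and the deleter (if any) is not.
        if row.xmin not in self.committed:
            return False
        return row.xmax is None or row.xmax not in self.committed


class ToyTable:
    def __init__(self) -> None:
        self.heap: Dict[TID, HeapRow] = {}
        # column name -> TID -> value; note there is no xmin/xmax stored here.
        self.colstore: Dict[str, Dict[TID, object]] = {}

    def insert(self, tid: TID, xid: int, heap_cols: dict, store_cols: dict) -> None:
        self.heap[tid] = HeapRow(xmin=xid, cols=dict(heap_cols))
        for name, value in store_cols.items():
            self.colstore.setdefault(name, {})[tid] = value

    def delete(self, tid: TID, xid: int) -> None:
        # Only the heap row is marked; the stored column values are left alone.
        self.heap[tid].xmax = xid

    def fetch(self, tid: TID, column: str, snapshot: Snapshot):
        row = self.heap.get(tid)
        # The heap decides visibility first; the column store is never
        # consulted for a row this snapshot cannot see.
        if row is None or not snapshot.is_visible(row):
            return None
        if column in row.cols:
            return row.cols[column]
        return self.colstore[column][tid]


table = ToyTable()
table.insert((0, 1), xid=10, heap_cols={"id": 1}, store_cols={"payload": "big value"})
old_snapshot = Snapshot(frozenset({10}))
table.delete((0, 1), xid=11)
new_snapshot = Snapshot(frozenset({10, 11}))
seen_by_old = table.fetch((0, 1), "payload", old_snapshot)  # row still visible here
seen_by_new = table.fetch((0, 1), "payload", new_snapshot)  # row deleted for this snapshot
```

In this toy, the old snapshot can still read the stored value after the
delete, which is one way to see the vacuum question raised earlier in the
thread: a value cannot be removed from the store while some snapshot may
still consider its heap row visible.  Returning None for an invisible row
is a shortcut of the sketch and would be ambiguous with a real NULL.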

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
