On columnar storage (2) - Mailing list pgsql-hackers

From Alvaro Herrera
Subject On columnar storage (2)
Msg-id 20150831225328.GM2912@alvherre.pgsql
Responses Re: On columnar storage (2)  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
As discussed in
https://www.postgresql.org/message-id/20150611230316.GM133018@postgresql.org
we've been working on implementing columnar storage for Postgres.
Here's some initial code to show our general idea, and to gather
comments related to what we're building.  This is not a complete patch,
and we don't claim that it works!  This is in very early stages, and we
have a lot of work to do to get this in working shape.

This was proposed during the Developer's Unconference in Ottawa earlier
this year.  While some questions were raised about some elements of our
design, we don't think they were outright objections, so we have pressed
forward on the expectation that any limitations can be fixed before this
is final if they are critical, or in subsequent commits if not.

The commit messages for each patch should explain what we've done in
enough technical detail, and hopefully provide a high-level overview of
what we're developing.

The first few pieces are "ready for comment" -- feel free to speak up
about the catalog additions, the new COLUMN STORE bits we added to the
grammar, the way we handle column stores in the relcache, or the
mechanics to create column store catalog entries.

The latter half of the patch series is much less well cooked; for
example, the colstore_dummy module is just a simple experiment to let us
verify that the API is working.  The planner and executor code are
mostly stubs, and we are not yet sure which executor nodes we would
like to have: while we have discussed this topic internally a lot, we
haven't yet formed final opinions, and of course the stub
implementations are not doing the proper things, and in many cases
they are not doing anything at all.

Still, we believe this shows the general spirit of things, which is that
we would like these new objects to be first-class citizens in the Postgres
architecture:

a) so that the optimizer will be able to extract as much benefit as
possible from columnar storage: it won't be at arm's length through an
opaque interface, but rather directly wired into plans, and will
eventually have Path representation.

b) so that it is possible to implement things such as tables that live
completely in columnar storage, as mentioned by Tom regarding Salesforce's
extant columnar storage.


Please don't think that the commits attached below represent development
history.  We played with the early pieces for quite a while before
settling on what you see here.  The presented split is intended to ease
reading.  We continue to play with the planner and executor code,
getting ourselves familiar with it enough that we can write something
that actually works.

This patch is a joint effort of Tomáš Vondra and myself, with
contributions from Simon Riggs.  There's a lot of code attributed to me
in the commit messages that was actually authored by Tomáš.  (Git
decided to lay blame on me because I split the commits.)



  The research leading to these results has received funding from the
  European Union’s Seventh Framework Programme (FP7/2007-2015) under grant
  agreement n° 318633.

--
Álvaro Herrera                          Developer, http://www.PostgreSQL.org/
