On Mon, 2008-01-28 at 12:04 -0800, Seth Grimes wrote:
> Column stores are all the rage now in the data warehousing world as an
> alternative to traditional approaches and to MPP approaches that include
> Greenplum. There's one company, InfoBright, that's offering a
> column-store engine, not open source, for MySQL.
>
> Would a column store work within PostgreSQL from a technical point-of-view
> and is anyone pursuing this?
>
> (As an aside, I explored something along these lines myself in the
> mid-'90s using InterBase (now Firebird) and Illustra, building on top of
> blobs to store data arrays.)

I've looked into doing this to see how hard it would be.

The main thing to consider is what it would be used for. Column stores
don't have the same use case as row stores, as Stonebraker himself
points out. That's a rather different message from "it just goes
faster", which is the 0.1% summary of his research touted by the
marketing department.

The column approach is basically the same as indexing every column
but not actually storing the row in the heap. So it's smaller and more
efficient for many types of query, but not all. It's a fairly drastic
move to say we know enough about the types of queries people will run
that we can simply not store the entire row. Some databases might know
that, but generally data warehouses aim for business flexibility, not
just performance at any price.
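
As a rough illustration of that trade-off, here is a toy Python sketch.
It is not how Postgres or any column-store engine lays out data on disk;
the table, the column names, and the helper functions are all invented
for the example.

```python
# Hypothetical toy table; none of these names come from a real schema.
COLUMNS = ("id", "name", "region", "amount")
rows = [
    (1, "alice", "EU", 120.0),
    (2, "bob", "US", 75.5),
    (3, "carol", "EU", 42.0),
]

# Row store: every tuple is kept whole, like a heap of rows.
row_store = list(rows)

# Column store: one array per column; the list position is the implicit row id.
column_store = {name: [r[i] for r in rows] for i, name in enumerate(COLUMNS)}


def sum_amount_rowstore(store):
    """Aggregate one column; every whole row is read to get at it."""
    return sum(r[COLUMNS.index("amount")] for r in store)


def sum_amount_colstore(store):
    """Aggregate one column; only that column's array is read."""
    return sum(store["amount"])


def fetch_row_colstore(store, pos):
    """Rebuild a full row; every column array has to be visited."""
    return tuple(store[name][pos] for name in COLUMNS)


if __name__ == "__main__":
    print(sum_amount_rowstore(row_store))
    print(sum_amount_colstore(column_store))
    print(fetch_row_colstore(column_store, 1))
```

The single-column aggregate is where the column layout wins, since it
never touches id, name, or region. Fetching a whole row is where it
loses, since the row has to be stitched back together from every column.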

Any implementation for Postgres would benefit from blending row and
column approaches within the same database, as an option rather than as
a must-have.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com