Re: Moving Pivotal's Greenplum work upstream - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Moving Pivotal's Greenplum work upstream
Date
Msg-id CAMsr+YEkaAOG-x+SXjF-dGP7PzLmtAEJ3+2TD5pM1saZYpnjcQ@mail.gmail.com
In response to Moving Pivotal's Greenplum work upstream  (Ewan Higgs <ewan_higgs@yahoo.co.uk>)
Responses Re: Moving Pivotal's Greenplum work upstream  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers

On 13 March 2015 at 06:24, Ewan Higgs <ewan_higgs@yahoo.co.uk> wrote:
> Hi all,
> There has been some press regarding Pivotal's intent to release Greenplum source as part of an Open Development Platform (along with some of their Hadoop projects). Can anyone speak on whether any of Greenplum might find its way upstream? For example, if(!) the work is being released under an appropriate license, are people at Pivotal planning to push patches for the parallel architecture and associated query planner upstream?

From what's visible on the outside, Greenplum appears to make heavy modifications across a large part of the codebase, with little concern about removing support for existing features, breaking other use cases, etc. So it does what it's meant to do well, but you can't necessarily expect to drop it in place of PostgreSQL and have everything just work.

My understanding is that they've written pretty much a new planner/executor for plannable statements, retaining PostgreSQL's parser, protocol code, utility statement handling, etc. But I'm struggling to find much hard detail on the system's innards.

It's a valid approach, but it's one that means it's unlikely to be practical to just cherry-pick a few features. There's sure to be a lot of divergence between the codebases, and no doubt Greenplum will have implemented infrastructure that overlaps with or duplicates things that have since been added in newer PostgreSQL releases - dynamic shmem, bgworkers, etc. Even if it were feasible to pull in their features with the underlying infrastructure, it'd create a significant maintenance burden. So I expect there'd need to be work done to move things over to use PostgreSQL features where they exist.

Then there's the fact that Greenplum is based on a heavily modified PostgreSQL 8.2. So even if desirable features were simple standalone patches against 8.2 (which they won't be), there'd be a lot of work getting them forward-ported to 9.6.

I think it's more realistic that Greenplum's code would serve as an interesting example of how something can be done, and maybe, if the license permits, parts can be extracted and adapted where that takes less time than rewriting. I wouldn't pin my hopes on seeing a big influx of Greenplum code.

I'd love to hear from someone at Pivotal, though, as the above is somewhere between educated guesswork and complete hand-waving.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
