Re: Hadoop backend? - Mailing list pgsql-hackers

From Hans-Jürgen Schönig
Subject Re: Hadoop backend?
Date
Msg-id D62A3AB4-243C-4721-8F46-E9EA0F319CC4@cybertec.at
In response to Re: Hadoop backend?  ("Jonah H. Harris" <jonah.harris@gmail.com>)
List pgsql-hackers
why not just stream it in via set-returning functions and make sure that we can mark a set-returning function as "STREAMABLE" or so (to prevent joins and the like)?
it is the easiest way to get it right, and it helps in many other cases.
i think that the storage manager is definitely the wrong place to do this.

if you get the interface code right, it is then also easy to use more than one backend.
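
A minimal sketch of the streaming idea, in plain C rather than against the PostgreSQL function-manager API. Every name here (RowSource, counter_next, make_counter_source, sum_stream) is hypothetical and only illustrates the shape of the interface: the consumer pulls one row at a time in a single forward pass, which is roughly what a "STREAMABLE" marker would have to guarantee, and a second backend would only need to supply another next callback.

```c
#include <stddef.h>

/* Hypothetical pull-style row source: one callback, one opaque state. */
typedef struct RowSource RowSource;
struct RowSource {
    /* Returns 1 and fills *out when a row is available, 0 at end of stream. */
    int (*next)(RowSource *self, long *out);
    void *state;
};

/* Example backend: emits the integers start, start+1, ..., limit-1. */
typedef struct CounterState {
    long current;
    long limit;
} CounterState;

static int counter_next(RowSource *self, long *out)
{
    CounterState *st = self->state;

    if (st->current >= st->limit)
        return 0;               /* stream exhausted */
    *out = st->current++;
    return 1;
}

static RowSource make_counter_source(CounterState *st, long start, long limit)
{
    RowSource src;

    st->current = start;
    st->limit = limit;
    src.next = counter_next;
    src.state = st;
    return src;
}

/* Consumer: reads each row exactly once, never rewinds, never buffers. */
static long sum_stream(RowSource *src)
{
    long total = 0;
    long row;

    while (src->next(src, &row))
        total += row;
    return total;
}
```

Because the consumer only ever calls next, a source like this cannot be rescanned, which is why operations that need a second pass over the rows (such as some joins) would have to be refused or preceded by materialization.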

regards,

hans


On Feb 24, 2009, at 12:03 AM, Jonah H. Harris wrote:

On Sun, Feb 22, 2009 at 3:47 PM, Robert Haas <robertmhaas@gmail.com> wrote:
In theory, I think you could make postgres work on any type of
underlying storage you like by writing a second smgr implementation
that would exist alongside md.c.  The fly in the ointment is that
you'd need a more sophisticated implementation of this line of code,
from smgropen:

   reln->smgr_which = 0;   /* we only have md.c at present */

I believe there is more than that which would need to be done nowadays.  I seem to recall that the storage manager abstraction has slowly been dedicated/optimized for md over the past 6 years or so.  It may even be easier/preferred to write a hadoop specific access method depending on what you're looking for from hadoop.
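
For illustration, the dispatch that the quoted smgr_which line selects can be pictured as a table of function pointers, one entry per storage manager. The sketch below is a simplified, self-contained model under that assumption; the struct layouts, the "remote" manager, and the use_remote flag are all hypothetical and are not PostgreSQL's actual definitions. It only shows what replacing the hard-coded 0 with a per-relation choice might look like.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified storage-manager entry: a name plus one operation. */
typedef struct StorageManager {
    const char *name;
    int (*read_block)(int relid, int blockno, char *buf, size_t buflen);
} StorageManager;

/* Stand-in for the existing md.c manager (local files). */
static int md_read_block(int relid, int blockno, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "md:rel=%d:blk=%d", relid, blockno);
}

/* Stand-in for a second manager, e.g. one backed by remote storage. */
static int remote_read_block(int relid, int blockno, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "remote:rel=%d:blk=%d", relid, blockno);
}

/* Switch table: the index into this array plays the role of smgr_which. */
static const StorageManager smgr_table[] = {
    {"md", md_read_block},
    {"remote", remote_read_block},
};

#define NUM_SMGRS (sizeof(smgr_table) / sizeof(smgr_table[0]))

typedef struct Relation {
    int relid;
    int smgr_which;
} Relation;

/* Hypothetical policy: pick the manager per relation instead of always 0. */
static Relation open_relation(int relid, int use_remote)
{
    Relation r;

    r.relid = relid;
    r.smgr_which = use_remote ? 1 : 0;
    return r;
}

/* Dispatch a read through the table; returns -1 for an unknown manager. */
static int relation_read(const Relation *r, int blockno, char *buf, size_t buflen)
{
    if (r->smgr_which < 0 || (size_t) r->smgr_which >= NUM_SMGRS)
        return -1;
    return smgr_table[r->smgr_which].read_block(r->relid, blockno, buf, buflen);
}
```

The selection line is the easy part of such a design; as noted above, code elsewhere that has come to assume md's behaviour would also need attention.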

--
Jonah H. Harris, Senior DBA
myYearbook.com



--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
