Re: [JDBC] Regarding GSoc Application - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: [JDBC] Regarding GSoc Application
Date
Msg-id CAHyXU0zKvvviUPr=xn-XJHdWVOEEBVoUDEDagdh5kseefDE1iw@mail.gmail.com
In response to Re: [JDBC] Regarding GSoc Application  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [JDBC] Regarding GSoc Application
List pgsql-hackers
On Tue, Apr 10, 2012 at 10:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Atri Sharma <atri.jiit@gmail.com> writes:
>> On Tue, Apr 10, 2012 at 8:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Hm?  SPI doesn't know anything about Java either.
>
>> We plan to call SQL through SPI from the FDW, which in turn would call
>> the Pl/Java routine.
>
> If you're saying that every Java function that the FDW needs would have
> to be exposed as a SQL function, that seems like a pretty high-risk
> (not to mention low performance) approach.  Not only do you have to
> design a SQL representation for every datatype you need, but you have to
> be sure that you do not have any security holes arising from
> unscrupulous users calling those SQL functions manually with arguments
> of their choosing.

Hm, well, for data type representation, an 'all text' representation
would avoid that requirement (although we could certainly add typed
representations back in later for performance reasons).  That's not all
that different from what the other fdw projects are doing -- mostly
wrapping BuildTupleFromCStrings and such.  But I totally agree that for
top performance you'd need direct native transfer.  I'm in the 'perfect
is the enemy of the good' mindset here.
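
As a rough illustration of the 'all text' idea, here is a minimal
sketch. It is written in Python for brevity rather than the C an FDW
would actually use, and the names row_to_text, text_to_row and PARSERS
are made up for the example; none of them belong to any FDW or pl/java
API. The remote side renders every column as a string, and the local
side rebuilds typed values from the declared column types, which is
roughly the job the type input functions do when a tuple is built from
C strings.

```python
from datetime import date
from decimal import Decimal

# Hypothetical per-type parsers, standing in for the type input
# functions the local side would apply to each text value.
PARSERS = {
    "integer": int,
    "numeric": Decimal,
    "date": date.fromisoformat,
    "text": str,
}


def row_to_text(row):
    """Remote side: render every column as text; None stands for SQL NULL."""
    return [None if value is None else str(value) for value in row]


def text_to_row(text_row, column_types):
    """Local side: rebuild typed values from text using declared column types."""
    return [
        None if text is None else PARSERS[type_name](text)
        for text, type_name in zip(text_row, column_types)
    ]


if __name__ == "__main__":
    remote_row = [42, Decimal("19.90"), date(2012, 4, 10), None]
    wire = row_to_text(remote_row)
    print(wire)
    print(text_to_row(wire, ["integer", "numeric", "date", "text"]))
```

The cost is a format/parse round trip per value, which is the
performance trade-off being accepted here in exchange for not designing
a native representation for every datatype up front.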

I think the security argument is mostly bogus -- pl/java is already
well into the untrusted side of things and I was figuring being able
to bypass the fdw layer and invoke the functions dblink style was a
feature, not a bug.

But adding up all the comments, I see healthy skepticism that running
through SPI is the proper approach, and it is noted.  So the way
forward is a more direct hook to the jvm or to go back to the drawing
board, I suppose.  I agree that JNI isn't required -- we're going to
have to study the pl/java system a bit to determine the best way to
hook in.  This could end up getting us into 'biting off more than we
can chew' territory, admittedly, but Atri is enthusiastic and wants to
give it a go.

Additionally, Dave is skeptical that a pl/java dependency is a good
foundation for a generally useful library.  I'm not buying that --
pl/java is the 'best of class' for implementing java inside the
database that I'm aware of.  I see absolutely no reason why it
couldn't be packaged as an extension -- the project is a bit dusty and
needs some TLC but does what it does very well.  I also respectfully
disagree that the presence of high quality ETL engines eliminates the
usefulness of a direct database to database transfer mechanism.  I
personally never go the ETL route when I can just dblink the data
across and do the massaging in SQL.  Other developers may think
differently, of course.  That said, if there were a good way to
implement jdbc/fdw without using pl/java, that would be good to know.

merlin
