Thread: java stored procedures

java stored procedures

From

Laszlo Hornyak

Date:

03 December 2001, 13:46:51

Hi!

A few months ago I asked if anyone started working on PL/JAVA, the
ansver was no. Now I started to write a java stored procedure language
and environment for PostgreSQL. Some code is already working, and it is
geting interresting. So, I would like to ask you to write me your ideas,
suggestions, etc for this environment.
The source code will be available under GPL when it is worth for
distributing it (this will take for a while).
thanks.

Laszlo Hornyak

Re: java stored procedures

From

wsheldah@lexmark.com

Date:

03 December 2001, 14:26:40


Thanks for your work on it; although I'm not a java programmer, I'm sure a
number of people will find this useful. I do have a small licensing concern
though. Postgresql is currently distributed under a BSD-style license. If your
goal is to have this included with the Postgresql distribution, it would
probably keep things a lot simpler if the Java extension were kept under a
compatible license, or even the same one.  Please note, I am NOT trying start a
flame war over which license is better; I'm just proposing that major software
packages like Postgresql use  a consistent licensing scheme. Sort of like Perl
modules are typically licensed "under the same terms as Perl itself" so as to
guarantee they don't diverge from perl, and people aren't surprised by a
different license agreement when they add another perl module from CPAN.

Thanks again,

Wes Sheldahl



Laszlo Hornyak <hornyakl%freemail.hu@interlock.lexmark.com> on 11/30/2001
04:14:12 AM

Please respond to hornyakl%users.sourceforge.net@interlock.lexmark.com

To:   pgsql-general%postgresql.org@interlock.lexmark.com
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  [GENERAL] java stored procedures


Hi!

A few months ago I asked if anyone started working on PL/JAVA, the
ansver was no. Now I started to write a java stored procedure language
and environment for PostgreSQL. Some code is already working, and it is
geting interresting. So, I would like to ask you to write me your ideas,
suggestions, etc for this environment.
The source code will be available under GPL when it is worth for
distributing it (this will take for a while).
thanks.

Laszlo Hornyak


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Re: java stored procedures

From

Barry Lind

Date:

03 December 2001, 15:56:38

Laszlo,

In my mind it would be more useful if this code was under the same
license as the rest of postgresql.  That way it could become part of the
product as opposed to always being a separate component.  (Just like
plpgsql, pltcl and the other procedural languages).

thanks,
--Barry

Laszlo Hornyak wrote:

> Hi!
>
> A few months ago I asked if anyone started working on PL/JAVA, the
> ansver was no. Now I started to write a java stored procedure language
> and environment for PostgreSQL. Some code is already working, and it is
> geting interresting. So, I would like to ask you to write me your ideas,
> suggestions, etc for this environment.
> The source code will be available under GPL when it is worth for
> distributing it (this will take for a while).
> thanks.
>
> Laszlo Hornyak
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

Re: java stored procedures

From

Barry Lind

Date:

04 December 2001, 12:35:58

Laszlo,

I think it would help a lot if you could take a little time to write
down what your planned architecture for a pljava would be.  It then
becomes much easier for myself and probably others reading these lists
to make suggestions on ways to improve what you are planning (or
possible problems with your strategy).  Without knowing what exactly you
are thinking of doing it is difficult to comment.

But let me try throwing out a few thoughts about how I think this should
be done.

First question is how will the jvm be run?  Since postgres is a
multiprocess implementation (i.e. each connection has a separate process
on the server) and since java is a multithreaded implementation (i.e.
one process supporting multiple threads), what should the pljava
implementation look like?  I think there should be a single jvm process
for the entire db server that each postgresql process connects to
through sockets/rmi.  It will be too expensive to create a new jvm
process for each postgresql connection (expensive in both terms of
memory and cpu, since the startup time for the jvm is significant and it
requires a lot of memory).

Having one jvm that all the postgres backend processes communicate with
makes the whole feature much more complicated, but is necessary in my
opinion.

Then the question becomes how does the jvm process interact with the
database since they are two different processes.  You will need some
sort of interprocess communication between the two to execute sql
statements.  This could be accomplished by using the existing jdbc
driver.  But the bigest problem here is getting the transaction
semantics right.  How does a sql statement being run by a java stored
procedure get access to the same connection/transaction as the original
client?  What you don't want happening is that sql issued in a stored
java procedure executes in a different transaction as the caller, what
would rollback of the stored function call mean in that case?

I am very interested in hearing what your plans are for pl/java.  I
think this is a very difficult project, but one that would be very
useful and welcome.

thanks,
--Barry

Laszlo Hornyak wrote:

> Hi!
>
> I am such a lame in the licensing area. As much as I know, BSD license
> is more free than GPL. I think it is too early to think about licensing,
> but it`s ok, you won :), when it will be ready(or it will seem to get
> closer to a working thing, currently it looks more like a interresting
> test), I will ask you if you want to distribute it with Postgres, and if
> you say yes, the license will be the same as Postgresql`s license.
> Anyway is this neccessary when it is the part of the distribution?
> Is this ok for you?
>
> thanks,
> Laszlo Hornyak
>
> ps: still waiting for your ideas, suggestions, etc :) I am not memeber
> of the mailing list, please write me dirrectly!
>
> Barry Lind wrote:
>
>> Laszlo,
>>
>> In my mind it would be more useful if this code was under the same
>> license as the rest of postgresql.  That way it could become part of
>> the product as opposed to always being a separate component.  (Just
>> like plpgsql, pltcl and the other procedural languages).
>>
>> thanks,
>> --Barry
>>
>>
>
>

Re: java stored procedures

From

Doug McNaught

Date:

04 December 2001, 13:05:55

Barry Lind <barry@xythos.com> writes:

> Having one jvm that all the postgres backend processes communicate with makes
> the whole feature much more complicated, but is necessary in my opinion.

Agreed.  Also, the JVM is a multithreaded app, and running it inside a
non-threaded program (the backend) might cause problems.

> Then the question becomes how does the jvm process interact with the database
> since they are two different processes.  You will need some sort of
> interprocess communication between the two to execute sql statements.  This
> could be accomplished by using the existing jdbc driver.  But the bigest
> problem here is getting the transaction semantics right.  How does a sql
> statement being run by a java stored procedure get access to the same
> connection/transaction as the original client?  What you don't want happening
> is that sql issued in a stored java procedure executes in a different
> transaction as the caller, what would rollback of the stored function call
> mean in that case?

I think you would have to to expose the SPI layer to Java running in a
separate process, either using an RMI server written in C or a custom
protocol over a TCP socket (Java of course can't do Unix sockets).
This raises some thorny issues of authentication and security but I
don't think they're insurmountable.  You could, for example, create a
cryptographically strong "cookie" in the backend when a Java function
is called.  The cookie would be passed to the Java function when it
gets invoked, and then must be passed back to the SPI layer in order
for the latter to accept the call.  A bit clunky but should be safe as
far as I can see.

The cookie would be needed anyhow, I think, in order for the SPI layer
to be able to find the transaction that the Java function was
originally invoked in.

You could make the SPI layer stuff look like a normal JDBC driver to
user code--PL/Perl does this kind of thing with the Perl DBI
interface.

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863

Re: java stored procedures

From

Barry Lind

Date:

04 December 2001, 22:27:00

Laszlo,


> I am very far from features like this.
> PL/JAVA now:
> -there is a separate process running java (kaffe). this process creates
> a sys v message queue, that holds requests. almost forgot, a shared
> memory segment too. I didn`t find better way to tell postgres the
> informations about the java process.


Does the mechanism you are planning support running any JVM?  In my
opionion Kaffe isn't good enough to be widely useful.  I think you
should be able to plugin whatever jvm is best on your platform, which
will likely be either the Sun or IBM JVMs.

Also, can you explain this a little bit more.  How does the jvm process
get started? (I would hope that the postgresql server processes would
start it when needed, as opposed to requiring that it be started
separately.)  How does the jvm access these shared memory structures?
Since there aren't any methods in the java API to do such things that I
am aware of.

> -the java request_handler function on the server side attaches to the
> shared memory, reads the key of the message queue., attaches to it,
> sends the data of the function, and a signal for the pl/java. after, it
> is waiting for a signal from the java thread.


I don't understand how you do this in java?  I must not be understanding
  something correctly here.

> -when java thread receives the signal, it reads the message(s) from the
> queue, and starts some actions. When done it tells postgres with a
> signal that it is ready, and it can come for its results. This will be
> rewritten see below problems.


Are signals the best way to accomplish this?

> -And postgres is runing, while java is waiting for postgres to say
> something.


But in reality if the postgres process is executing a stored function it
needs to wait for the result of that function call before continuing
doesn't it?

>
> Threading on the java process side is not done yet, ok, it is not that
> hard, I will write it, if it will be realy neccessary.


Agreed, this is important.

>
> The problems, for now:
> I had a very simple system, that passed a very limited scale of argument
> types, with a very limited quantity of parameters (int, varchar, bool).
> Postgres has limits for the argument count too, but not for types. It
> had too much limits, so I am working (or to tell the truth now only
> thinking) on a new type handling that fits the felxibility of
> Postgresql`s type flexibility. For this I will have to learn a lot about
> Postgres`s type system. This will be my program this weekend. :)


Shouldn't this code use all or most of the logic found in the FE/BE
protocol?  Why invent and code another mechanism to transfer data when
one already exists.  (I will admit that the current FE/BE mechanism
isn't the ideal choice, but it seems easier to reuse what exists for now
and improve on it later).


>
> thanks,
> Laszlo Hornyak
>

You didn't mention how you plan to deal with the transaction symantics.
  So what happens when the pl/java function calls through jdbc back to
the server to insert some data?  That should happen in the same
transaction as the caller correct?

thanks,
--Barry

Re: java stored procedures

From

Gunnar Rønning

Date:

06 December 2001, 07:36:26

* Barry Lind <barry@xythos.com> wrote:
|
| possible problems with your strategy).  Without knowing what exactly
| you are thinking of doing it is difficult to comment.

Agreed.

| Having one jvm that all the postgres backend processes communicate
| with makes the whole feature much more complicated, but is necessary
| in my opinion.

I'm not quite sure if I agree here. Startup time is not an issue if you are
using connection pooling on the client and memory is cheap ;-) Having
a separate process would indeed introduce overhead where you don't want
it. In the critical path when executing queries.

| I am very interested in hearing what your plans are for pl/java.  I
| think this is a very difficult project, but one that would be very
| useful and welcome.

I would very much like to hear about the plans myself.


--
Gunnar Rønning - gunnar@polygnosis.com
Senior Consultant, Polygnosis AS, http://www.polygnosis.com/

Re: java stored procedures

From

Laszlo Hornyak

Date:

10 December 2001, 09:58:46

Hi!

Barry Lind wrote:

> Laszlo,
>
> I think it would help a lot if you could take a little time to write
> down what your planned architecture for a pljava would be.  It then
> becomes much easier for myself and probably others reading these lists
> to make suggestions on ways to improve what you are planning (or
> possible problems with your strategy).  Without knowing what exactly
> you are thinking of doing it is difficult to comment.

>
>
> But let me try throwing out a few thoughts about how I think this
> should be done.
>
> First question is how will the jvm be run?  Since postgres is a
> multiprocess implementation (i.e. each connection has a separate
> process on the server) and since java is a multithreaded
> implementation (i.e. one process supporting multiple threads), what
> should the pljava implementation look like?  I think there should be a
> single jvm process for the entire db server that each postgresql
> process connects to through sockets/rmi.  It will be too expensive to
> create a new jvm process for each postgresql connection (expensive in
> both terms of memory and cpu, since the startup time for the jvm is
> significant and it requires a lot of memory).

I absolutely agree. OK, it`s done.

So, a late-night-brainstorming here:
What I would like to see in PL/JAVA is the object oriented features,
that makes postgresql nice. Creating a new table creates a new class in
the java side too. Instantiating an object of the newly created class
inserts a row into the table. In postgresql tables can be inherited, and
this could be easyly done by pl/java too. I think this would look nice.
But this is not the main feature. Why I would like to see a nice java
procedural language inside postgres is java`s advanced communication
features (I mean CORBA, jdbc, other protocols). This is the sugar in the
caffe.

I am very far from features like this.
PL/JAVA now:
-there is a separate process running java (kaffe). this process creates
a sys v message queue, that holds requests. almost forgot, a shared
memory segment too. I didn`t find better way to tell postgres the
informations about the java process.
-the java request_handler function on the server side attaches to the
shared memory, reads the key of the message queue., attaches to it,
sends the data of the function, and a signal for the pl/java. after, it
is waiting for a signal from the java thread.
-when java thread receives the signal, it reads the message(s) from the
queue, and starts some actions. When done it tells postgres with a
signal that it is ready, and it can come for its results. This will be
rewritten see below problems.
-And postgres is runing, while java is waiting for postgres to say
something.

Threading on the java process side is not done yet, ok, it is not that
hard, I will write it, if it will be realy neccessary.

The problems, for now:
I had a very simple system, that passed a very limited scale of argument
types, with a very limited quantity of parameters (int, varchar, bool).
Postgres has limits for the argument count too, but not for types. It
had too much limits, so I am working (or to tell the truth now only
thinking) on a new type handling that fits the felxibility of
Postgresql`s type flexibility. For this I will have to learn a lot about
Postgres`s type system. This will be my program this weekend. :)

thanks,
Laszlo Hornyak

Re: java stored procedures

From

Laszlo Hornyak

Date:

10 December 2001, 10:05:18

Hi!

I am such a lame in the licensing area. As much as I know, BSD license
is more free than GPL. I think it is too early to think about licensing,
but it`s ok, you won :), when it will be ready(or it will seem to get
closer to a working thing, currently it looks more like a interresting
test), I will ask you if you want to distribute it with Postgres, and if
you say yes, the license will be the same as Postgresql`s license.
Anyway is this neccessary when it is the part of the distribution?
Is this ok for you?

thanks,
Laszlo Hornyak

ps: still waiting for your ideas, suggestions, etc :) I am not memeber
of the mailing list, please write me dirrectly!

Barry Lind wrote:

> Laszlo,
>
> In my mind it would be more useful if this code was under the same
> license as the rest of postgresql.  That way it could become part of
> the product as opposed to always being a separate component.  (Just
> like plpgsql, pltcl and the other procedural languages).
>
> thanks,
> --Barry
>
>

Re: java stored procedures

From

Laszlo Hornyak

Date:

10 December 2001, 10:06:10

Hi!

Barry Lind wrote:

> Does the mechanism you are planning support running any JVM?  In my
> opionion Kaffe isn't good enough to be widely useful.  I think you
> should be able to plugin whatever jvm is best on your platform, which
> will likely be either the Sun or IBM JVMs.

Ok, I also had problems with caffe, but it may work. I like it becouse
it is small (the source is about 6M). As much as I know Java VM`s has a
somewhat standard native interface called JNI. I use this to start the
VM, and communicate with it. If you think I should change I will do it,
but it may take a long time to get the new VM. For then I have to run kaffe.

> Also, can you explain this a little bit more.  How does the jvm
> process get started? (I would hope that the postgresql server
> processes would start it when needed, as opposed to requiring that it
> be started separately.)  How does the jvm access these shared memory
> structures? Since there aren't any methods in the java API to do such
> things that I am aware of.

JVM does not. 'the java process' does with simple posix calls. I use
debian potatoe, on any other posix system it should work, on any other
somewhat posix compatible system it may work, I am not sure...

>
> I don't understand how you do this in java?  I must not be
> understanding  something correctly here.

My failure.
The 'java request_handler' is not a java function, it is the C
call_handler in the Postgres side, that is started when a function of
language 'pljava' is called.
I made some failure in my previous mail. At home I named the pl/java
language pl/pizza (something that is not caffe, but well known enough
:). The application has two running binaries:
-pizza (which was called 'java process' last time) This is a small C
program that uses JNI to start VM and call java methods.
-plpizza.so the shared object that contains the call_handler function.

>
>
>> -when java thread receives the signal, it reads the message(s) from
>> the queue, and starts some actions. When done it tells postgres with
>> a signal that it is ready, and it can come for its results. This will
>> be rewritten see below problems.
>
>
>
> Are signals the best way to accomplish this?

I don`t know if it is the best, it is the only way I know :)
Do you know any other ways?

>
>
>> -And postgres is runing, while java is waiting for postgres to say
>> something.
>
> But in reality if the postgres process is executing a stored function
> it needs to wait for the result of that function call before
> continuing doesn't it?

Surely, this is done. How could Postgres tell the result anyway ? :)

>
>>
>> Threading on the java process side is not done yet, ok, it is not
>> that hard, I will write it, if it will be realy neccessary.
>
> Agreed, this is important.
>
> Shouldn't this code use all or most of the logic found in the FE/BE
> protocol?  Why invent and code another mechanism to transfer data when
> one already exists.  (I will admit that the current FE/BE mechanism
> isn't the ideal choice, but it seems easier to reuse what exists for
> now and improve on it later).

Well, I am relatively new to Postgres, and I don`t know these protocols.
In the weekend I will start to learn it, and in Sunday or Monday I maybe
I will understand it, if not, next weekend..

>
> You didn't mention how you plan to deal with the transaction
> symantics.  So what happens when the pl/java function calls through
> jdbc back to the server to insert some data?  That should happen in
> the same transaction as the caller correct?

I don`t think this will be a problem, I have ideas for this. Idea mean:
I know how I will start it, it may be good, or it may be fataly stupid
idea, it will turn out when I tried it. Simply: The same way plpizza
tells pizza the request, pizza can talk back to plpizza. This is planed
to work with similar mechanism I described last time (shm+signals).

Monday I will try to send a little pieces of code to make thing clear, ok?

thanks,
Laszlo Hornyak