Re: pl/Ruby, deprecating plPython and Core - Mailing list pgsql-hackers

From Dave Cramer
Subject Re: pl/Ruby, deprecating plPython and Core
Date
Msg-id B8708B2C-7071-45FB-8BCC-C3665C5C2E4B@fastcrypt.com
In response to Re: pl/Ruby, deprecating plPython and Core  (Thomas Hallgren <thhal@mailblocks.com>)
Responses Re: pl/Ruby, deprecating plPython and Core
Re: pl/Ruby, deprecating plPython and Core
List pgsql-hackers
On 17-Aug-05, at 12:40 PM, Thomas Hallgren wrote:

> Andrew Dunstan wrote:
>
>> Dave Cramer wrote:
>>
>>> As there are two Java procedural languages available for
>>> PostgreSQL, Josh asked for an explanation of their
>>> differences.
>>> They are quite similar in that both of them run the function in
>>> a Java VM, and both run pre-compiled code. Neither attempts to
>>> compile the code.
>>>
>>> The biggest difference is how they connect to the java VM.
>>>
>>> PL/Java uses Java Native Interfaces (JNI) and does a direct call  
>>> into  the java VM from the language handler.
>>>
>>> PL-J uses a network protocol to connect to a java VM.
>>>
>>>
>>> There are advantages and disadvantages to both approaches.
>>>
>>> + JNI is simpler, doesn't require a protocol, or an application   
>>> container to manage the User Defined Functions
>>> - JNI requires that the VM run on the server machine, and that a
>>> separate VM be instantiated for every connection that calls a
>>> function.
>>>     This is mitigated somewhat in Java 1.5 by sharing data;
>>> however, this may or may not be a Sun-only feature (does anyone
>>> know?);
>>>     either way a separate VM is required for each connection.
>>> - startup time for the vm on the first call for the connection.
>>> - Possible ( not as likely any more ) for the java VM to take  
>>> the  server down.
>>>
>>> Using a network protocol, as PL-J does, has the following
>>> (basically the opposite of the JNI (dis)advantages):
>>>
>>> + The java VM does not have to run on the server.
>>> + Only one vm per server
>>> - More complex; requires a micro-kernel application server to
>>> manage the UDFs (currently http://loom.codehaus.org/)
>>>
>>>
>>>
> I think Dave misses a couple of important points.
>
> 1. Speed. One major reason for moving code from the middle tier  
> down to the database is that you want to execute the code close to  
> the actual persistence mechanisms in order to minimize network  
> traffic and maximize throughput.
I think that until there are actual benchmarks, there are too many
variables here to suggest one is faster than the other. The overhead
of having multiple Java VMs is not easily estimated. Even with a
connection pool, consider the memory footprint of even 10 Java VMs.
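
To put a rough shape on that footprint question, here is a minimal sketch in Java. It is only a back-of-envelope aid, not a benchmark: the class and method names are hypothetical, and it assumes each per-connection VM commits about as much heap as the VM running the sketch, which a real backend VM loading user functions may not.

```java
/**
 * Back-of-envelope estimate of aggregate heap when every database
 * connection hosts its own Java VM. All names here are hypothetical.
 */
final class VmFootprintEstimate {

    /** Total bytes if vmCount VMs each commit perVmBytes; throws on overflow. */
    static long totalBytes(int vmCount, long perVmBytes) {
        if (vmCount < 0 || perVmBytes < 0) {
            throw new IllegalArgumentException("count and size must be non-negative");
        }
        return Math.multiplyExact((long) vmCount, perVmBytes);
    }

    public static void main(String[] args) {
        // Heap currently committed by *this* VM. This is an assumption used
        // as a stand-in for a backend VM; real figures will differ.
        long committedHeap = Runtime.getRuntime().totalMemory();
        int connections = 10; // the pool size used as the example above

        long total = totalBytes(connections, committedHeap);
        System.out.printf("one VM: %d MiB, %d VMs: %d MiB%n",
                committedHeap >> 20, connections, total >> 20);
    }
}
```

This counts committed heap only; native memory used by the VM itself is not included, so the real per-connection cost is likely higher than what it prints.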
>
> 2. A growing percentage of db-clients utilize some kind of
> connection pool (an overwhelming number of the Java clients certainly
> do), which minimizes the problem with startup times.
>
> 3. Transaction visibility. A function that in turn issues new SQL
> calls must do that within the scope of the caller's transaction. A
> remote process must hence call back into its caller. PL/Java has
> its own JDBC driver that interacts directly with SPI.
PL-J maintains transaction visibility; it has its own JDBC driver as
well. The protocol between the language handler and the Java portion
is based upon the FE/BE protocol, which made it easy to use pg's JDBC
driver with some modification.
>
> 4. Isolation. Using separate VMs, instabilities in the VM can only
> affect one single connection. One VM can be debugged or monitored
> without affecting the others. No data can be inadvertently moved
> between connections, etc.
Loom deals with data integrity; debugging would have to be done over a
remote debug connection, which can attach to any thread.
>
> I try to shed more light on the pros and cons here:
> http://gborg.postgresql.org/project/pljava/genpage.php?jni_rationale
>
>
>> That's a pretty good explanation and ought to be published more  
>> widely. It's almost a pity that we couldn't have one project with  
>> a server setting saying how we want it to run.
>>
> There are a couple of reasons that make me a bit reluctant to join  
> the projects:
>
> PL/Java has no dependencies at all besides a Java Runtime
> Environment (or GCJ). PL/J requires a fair amount of other modules
> just to compile.
PL-J requires one other module, which the build environment will  
fetch automatically to compile.
>
> PL/Java is at release 1.1 and has a community of users. To my
> knowledge, PL/J has not reached its first release yet.
>
> PL/Java and PL/J use completely different approaches and share
> almost no code. The code that we do share (public interfaces, mainly
> for trigger management) is published at the Maven repository at
> ibiblio.org.
>
> I think it's better to keep the two projects separate. But I also  
> think that it is extremely important that we ensure that the user  
> experience is similar for both projects so that there's nothing to  
> prevent a server setting that decides which one to use provided  
> both are present.
>
> Kind regards,
> Thomas Hallgren