Thread: Shared memory
Hi,
I'm currently investigating the feasibility of an alternative PL/Java implementation that would use shared memory to communicate between a JVM and the backend processes. I would very much like to make use of the routines provided in shmem.c but I'm a bit uncertain how to add a segment for my own use. The flow I have in mind is:

Initialization:
Initialization takes place when the first PL/Java function (or validator) of the first session since the postmaster was started is called. The initialization process will create a small segment that represents the JVM. It will also start the JVM, which in turn will attach to this segment. The JVM uses a small JNI library for this.

Session connect:
Connect takes place when the first PL/Java function (or validator) of a session is called (after initialization, of course, if it's the first session). The backend creates (or obtains, if I decide to pool them) a communication buffer of fixed size in shared memory. This buffer can only be used by this backend and the JVM. The backend notifies the JVM of its presence using the global segment created during initialization.

My questions are:

1. Do you see something right away that invalidates this approach?
2. Is using the shared memory functionality that the backend provides a good idea (I'm thinking shmem functions, critical sections, semaphores, etc.)? I'd rather depend on them than have conditional code for different operating systems.
3. Would it be better if the Postmaster allocated the global segment and started the JVM (based on some config parameter)?

All ideas and opinions are very welcome.

Kind Regards,
Thomas Hallgren
On Fri, Mar 24, 2006 at 11:51:30AM +0100, Thomas Hallgren wrote:
> Hi,
> I'm currently investigating the feasibility of an alternative PL/Java
> implementation that would use shared memory to communicate between a JVM
> and the backend processes. I would very much like to make use of the
> routines provided in shmem.c but I'm a bit uncertain how to add a segment
> for my own use.

I'm wondering if a better way to do it would be similar to the way X does it. The client connects to the X server via a pipe (tcp/ip or unix domain). This is handy because you can block on a pipe. The client then allocates a shared memory segment and sends a message to the server, who can then also connect to it.

The neat thing about this is that the client can put data in the shared memory segment, send one byte through the pipe and then block on a read. The JVM, which has a thread waiting on the other end, wakes up, processes the data, puts the result back, writes a byte to the pipe and waits. This wakes up the client, who can then read the result.

No locking, no semaphores; the standard UNIX semantics on pipes and sockets make sure everything works.

In practice you'd probably end up sending small responses exclusively via the pipe and only use the shared memory for larger blocks of data, but that's your choice. In X this is mostly used for image data and such.

> My questions are:
> 1. Do you see something right away that invalidates this approach?

Nothing direct, though a single segment just for finding the JVM seems a lot. A socket approach would work better, I think.

> 2. Is using the shared memory functionality that the backend provides a
> good idea (I'm thinking shmem functions, critical sections, semaphores,
> etc.)? I'd rather depend on them than have conditional code for different
> operating systems.

That I don't know. However, ISTM a lock-free approach is better wherever possible. If you can avoid the semaphores altogether...

> 3. Would it be better if the Postmaster allocated the global segment and
> started the JVM (based on some config parameter)?

I don't know about the segment, but the postmaster should start it, I think. I thought the tsearch guys had an approach using a co-process. I don't know how they start it up, but they connected via pipes.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
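[Editorial note: the X-style handshake described in this message can be sketched in Java. Everything below is an illustrative assumption, not PL/Java or backend code: a file-backed mapped buffer stands in for the SysV segment, a loopback socket stands in for the pipe, and both sides run as threads in one process so the sketch is self-contained. The protocol is the one described above: put data in shared memory, send one byte, block on a read.]

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ShmPipeDemo {
    public static void main(String[] args) throws Exception {
        // The "segment": a file-backed mapped buffer both sides can see.
        File f = File.createTempFile("shm", ".seg");
        f.deleteOnExit();
        RandomAccessFile raf = new RandomAccessFile(f, "rw");
        raf.setLength(4096);
        final MappedByteBuffer shm = raf.getChannel()
                .map(FileChannel.MapMode.READ_WRITE, 0, 4096);

        // The "pipe": a loopback socket used only for one-byte wakeups.
        final ServerSocketChannel srv = ServerSocketChannel.open();
        srv.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        int port = srv.socket().getLocalPort();

        Thread server = new Thread(new Runnable() {
            public void run() {
                try {
                    SocketChannel ch = srv.accept();
                    ByteBuffer one = ByteBuffer.allocate(1);
                    ch.read(one);                      // block until client signals
                    int n = shm.getInt(0);             // request is in shared memory
                    shm.putInt(0, n * 2);              // put the result back
                    one.flip();
                    ch.write(one);                     // wake the client
                    ch.close();
                } catch (Exception e) { throw new RuntimeException(e); }
            }
        });
        server.start();

        SocketChannel client = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", port));
        shm.putInt(0, 21);                             // data goes into shared memory
        ByteBuffer sig = ByteBuffer.wrap(new byte[]{1});
        client.write(sig);                             // one byte through the "pipe"
        sig.clear();
        client.read(sig);                              // block until server answers
        System.out.println("result=" + shm.getInt(0)); // -> result=42
        client.close();
        server.join();
        srv.close();
    }
}
```

As in the X case, no locks or semaphores are needed: the blocking read on the socket serializes access to the shared buffer.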
Martijn van Oosterhout wrote:
> On Fri, Mar 24, 2006 at 11:51:30AM +0100, Thomas Hallgren wrote:
>> I'm currently investigating the feasibility of an alternative PL/Java
>> implementation that would use shared memory to communicate between a JVM
>> and the backend processes. [...]
>
> I'm wondering if a better way to do it would be similar to the way X
> does it. [...] In practice you'd probably end up sending small responses
> exclusively via the pipe and only use the shared memory for larger blocks
> of data, but that's your choice. In X this is mostly used for image data
> and such.
>
Pipes could be used when the connection is initialized, that's for sure. Thanks for the suggestion. The only thing I need to solve is how to detect whether the JVM is present and start it up when it isn't. Either I require that it's there and generate an error when it isn't (analogous to what Apache would do if Tomcat is missing), or I treat the failure to obtain the pipe as an indication that it's not started yet.

>> My questions are:
>> 1. Do you see something right away that invalidates this approach?
>
> Nothing direct, though a single segment just for finding the JVM seems
> a lot. A socket approach would work better, I think.
>
For the initial setup, sure. But I think pipes might be too slow for the actual function calls. What I want is the absolute most efficient IPC mechanism that can be achieved. I'm thinking in terms of critical sections obtained using spinlocks and atomic exchange on memory, which perhaps migrate to a real semaphore when the spin goes on for too long. I will do some tests using pipes too. If the gain from other types of concurrency control turns out to be negligible, then I would agree that pipes are simpler and more elegant.

>> 2. Is using the shared memory functionality that the backend provides a
>> good idea (I'm thinking shmem functions, critical sections, semaphores,
>> etc.)? I'd rather depend on them than have conditional code for different
>> operating systems.
>
> That I don't know. However, ISTM a lock-free approach is better
> wherever possible. If you can avoid the semaphores altogether...
>
Lock free? I'm not sure I understand what you mean. I'll have to wait on something. Or are you referring to the pipe approach?

>> 3. Would it be better if the Postmaster allocated the global segment and
>> started the JVM (based on some config parameter)?
>
> I don't know about the segment, but the postmaster should start it. I
> thought the tsearch guys had an approach using a co-process. I don't
> know how they start it up, but they connected via pipes.
>
I'll check that out. Thanks for the tip.

> Hope this helps,
>
Your insights often do. Thanks a lot.

Regards,
Thomas Hallgren
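[Editorial note: the "spin first, migrate to a real semaphore when the spin goes on for too long" idea above can be sketched as follows. This is an assumption-laden illustration: the class name, the spin limit, and the use of a Java monitor as the blocking fallback are all invented here; a real shared-memory version would spin on a word in the segment and fall back to a SysV semaphore.]

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class HybridLock {
    private static final int SPIN_LIMIT = 1000; // arbitrary; tune for the platform
    private final AtomicBoolean held = new AtomicBoolean(false);

    public void lock() {
        // Fast path: atomic exchange in a bounded spin.
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (held.compareAndSet(false, true)) return;
        }
        // Slow path: block. The timed wait means a missed notify can
        // never hang us; we simply re-test the atomic flag.
        synchronized (this) {
            while (!held.compareAndSet(false, true)) {
                try {
                    wait(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    public void unlock() {
        held.set(false);
        synchronized (this) {
            notifyAll(); // wake any thread parked on the slow path
        }
    }

    // Tiny demo: four threads bump a counter 10,000 times each.
    private static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        final HybridLock lock = new HybridLock();
        Thread[] ts = new Thread[4];
        for (int t = 0; t < ts.length; t++) {
            ts[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 10000; i++) {
                        lock.lock();
                        try {
                            counter++;
                        } finally {
                            lock.unlock();
                        }
                    }
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) t.join();
        System.out.println("counter=" + counter); // -> counter=40000
    }
}
```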
Martijn,

I tried a Socket approach. Using the new IO stuff that arrived with Java 1.4 (SocketChannel etc.), the performance is really good. Especially on Linux, where an SMP machine shows a 1 to 1.5 ratio between one process doing ping-pong between two threads and two processes doing ping-pong using a socket. That's acceptable overhead indeed, and I don't think I'll be able to trim it much using a shared memory approach (the thread scenario uses Java monitor locks; that's the most efficient lightweight locking implementation I've come across).

One downside is that on a Windows box, the ratio between the threads and the processes scenario seems to be 1 to 5, which is a bit worse. I've heard that Solaris too is less efficient than Linux in this respect.

The real downside is that a call from SQL to PL/Java using the current in-process approach is really fast. It takes about 5 microseconds on my 2.8GHz i386 box. The overhead of an IPC call on that box is about 18 microseconds on Linux and 64 microseconds on Windows. That's an overhead of between 440% and 1300% due to context switching alone. Yet, for some applications, perhaps that overhead is acceptable? It should be weighed against the high memory consumption that the in-process approach undoubtedly results in (which in turn might lead to less optimal use of CPU caches and, if memory is insufficient, more time spent swapping).

Given those numbers, it would be interesting to hear what the community as a whole thinks about this.

Kind Regards,
Thomas Hallgren
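[Editorial note: a rough version of the ping-pong measurement described above, in the style of Java 1.4's SocketChannel API. The class name, round count, and output label are invented for illustration; the original test also compared against two threads synchronizing on a Java monitor, which is omitted here. Absolute numbers will vary wildly by OS and hardware, as the message notes.]

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class PingPongBench {
    public static void main(String[] args) throws Exception {
        final int ROUNDS = 10000;
        final ServerSocketChannel srv = ServerSocketChannel.open();
        srv.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        int port = srv.socket().getLocalPort();

        // Echo peer: reads one byte, writes it back, ROUNDS times.
        Thread echo = new Thread(new Runnable() {
            public void run() {
                try {
                    SocketChannel ch = srv.accept();
                    ch.socket().setTcpNoDelay(true);
                    ByteBuffer b = ByteBuffer.allocate(1);
                    for (int i = 0; i < ROUNDS; i++) {
                        b.clear();
                        while (b.hasRemaining()) ch.read(b);
                        b.flip();
                        while (b.hasRemaining()) ch.write(b);
                    }
                    ch.close();
                } catch (Exception e) { throw new RuntimeException(e); }
            }
        });
        echo.start();

        SocketChannel ch = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", port));
        ch.socket().setTcpNoDelay(true); // don't let Nagle batch the pings
        ByteBuffer b = ByteBuffer.allocate(1);
        long start = System.nanoTime();
        for (int i = 0; i < ROUNDS; i++) {
            b.clear(); b.put((byte) 1); b.flip();
            while (b.hasRemaining()) ch.write(b);
            b.clear();
            while (b.hasRemaining()) ch.read(b);
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("avg-roundtrip-us=" + (elapsed / 1000.0 / ROUNDS));
        ch.close();
        echo.join();
        srv.close();
    }
}
```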
On Mon, Mar 27, 2006 at 10:57:21AM +0200, Thomas Hallgren wrote:
> Martijn,
>
> I tried a Socket approach. Using the new IO stuff that arrived with Java
> 1.4 (SocketChannel etc.), the performance is really good. Especially on
> Linux, where an SMP machine shows a 1 to 1.5 ratio between one process
> doing ping-pong between two threads and two processes doing ping-pong
> using a socket. That's acceptable overhead indeed, and I don't think I'll
> be able to trim it much using a shared memory approach (the thread
> scenario uses Java monitor locks; that's the most efficient lightweight
> locking implementation I've come across).

Yeah, it's fairly well known that the distinction between processes and threads on Linux is much smaller than on other OSes. Windows is pretty bad, which is why threading is much more popular there.

> The real downside is that a call from SQL to PL/Java using the current
> in-process approach is really fast. It takes about 5 microseconds on my
> 2.8GHz i386 box. The overhead of an IPC call on that box is about 18
> microseconds on Linux and 64 microseconds on Windows. That's an overhead
> of between 440% and 1300% due to context switching alone. Yet, for some
> applications,

<snip>

This might take some more measurements, but AIUI the main difference between in-process and out-of-process is that one has a JVM per connection, the other one JVM shared. In that case my thoughts are as follows:

- Overhead of starting the JVM. If you can start the JVM in the postmaster you might be able to avoid this. However, if you have to restart the JVM for each process, that's a cost.

- JIT overhead. For often-used classes JIT compiling can help a lot with speed. But if every class needs to be reinterpreted each time, maybe that costs more than your IPC.

- Memory overhead. You mentioned this already.

- Are you optimising for many short-lived connections or a few long-lived connections?

My gut feeling is that if someone creates a huge number of server-side Java functions, performance will be better with one always-running JVM with highly JIT-optimised code than with each JVM doing it from scratch. But this will obviously need to be tested.

One other thing is that separate processes give you the ability to parallelize. For example, if a Java function does an SPI query, it can receive and process results in parallel with the backend generating them. This may not be easy to achieve with an in-process JVM.

Incidentally, there are compilers these days that can compile Java to native code. Is this Java stuff set up in such a way that you can compile your classes to native and load them directly, for the real speed-freaks? In that case, maybe you should concentrate on reliability and flexibility and still have a way out for functions that *must* be high-performance.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Martijn van Oosterhout wrote:
> Yeah, it's fairly well known that the distinction between processes
> and threads on Linux is much smaller than on other OSes. Windows is
> pretty bad, which is why threading is much more popular there.
>
> [...]
>
> - Are you optimising for many short-lived connections or a few
> long-lived connections?
>
> My gut feeling is that if someone creates a huge number of server-side
> Java functions, performance will be better with one always-running JVM
> with highly JIT-optimised code than with each JVM doing it from scratch.
> But this will obviously need to be tested.
>
The use case with a huge number of short-lived connections is not feasible at all with PL/Java as it stands today. This is partly the reason for my current research. Another reason is that it's sometimes desirable to share resources between your connections. Dangerous perhaps, but an API that encourages separation and allows sharing in a controlled way might prove very beneficial.

The ideal use case for PL/Java is a client that utilizes a connection pool, and most servlet containers and EJB servers do. Scenarios where you have just a few, fairly long-lived clients are OK too.

> One other thing is that separate processes give you the ability to
> parallelize. For example, if a Java function does an SPI query, it can
> receive and process results in parallel with the backend generating
> them. This may not be easy to achieve with an in-process JVM.
>
It is fairly easy to achieve using threads. Only one thread at a time may of course execute an SPI query, but that's true when multiple processes are in place too, since the backend is single-threaded and since the logical thread in PL/Java must utilize the same backend as where the call originated (to maintain the transaction boundaries). Any result must also sooner or later be delivered using that same backend, which further limits the ability to parallelize.

> Incidentally, there are compilers these days that can compile Java to
> native code. Is this Java stuff set up in such a way that you can compile
> your classes to native and load them directly, for the real speed-freaks?
>
PL/Java can be used with GCJ, although I don't think the GCJ compiler outranks the JIT compiler in a modern JVM. It can only do static optimizations, whereas the JIT has runtime heuristics to base its optimizations on. In the test results I've seen so far, the GCJ compiler only gets the upper hand in very simple tests. The JIT-generated code is faster when things are more complicated. GCJ is great if you're using short-lived connections (less startup time and everything is optimized from the very start), but the native code that it produces still needs a JVM of some sort. No interpreter of course, but classes must be initialized, a garbage collector must be running, etc. The shared native code results in some gain in memory consumption, but it's not as significant as one might think.

> In that case, maybe you should concentrate on reliability and flexibility
> and still have a way out for functions that *must* be high-performance.
>
Given time and enough resources, I'd like to provide the best of two worlds and give the user a choice of whether or not the JVM should be external. Ideally, this should be controlled using configuration parameters so that it's easy to test which scenario works best. It's a lot of work though.

It very much comes down to your point "Are you optimising for many short-lived connections or a few long-lived connections?" If the use cases for the former are fairly few, then I'm not sure it's worth the effort. In my experience, that is the case. People tend to use connection pools nowadays. But that's me and my opinion. It would be great if more people were involved in this discussion.

Regards,
Thomas Hallgren
Thomas Hallgren <thomas@tada.se> writes:
> The real downside is that a call from SQL to PL/Java using the current
> in-process approach is really fast. It takes about 5 microseconds on my
> 2.8GHz i386 box. The overhead of an IPC call on that box is about 18
> microseconds on Linux and 64 microseconds on Windows. That's an overhead
> of between 440% and 1300% due to context switching alone. Yet, for
> some applications, perhaps that overhead is acceptable?

It's only that much difference? Given all the other advantages of separating the JVM from the backends, I'd say you should gladly pay that price.

regards, tom lane
Tom Lane wrote:
> It's only that much difference? Given all the other advantages of
> separating the JVM from the backends, I'd say you should gladly pay
> that price.
>
If I'm right, and the most common scenario is clients using connection pools, then it's very likely that you don't get any advantages at all. Paying for nothing with a 440% increase in calling time (at best) seems expensive :-)

Regards,
Thomas Hallgren
Thomas Hallgren <thomas@tada.se> writes:
> If I'm right, and the most common scenario is clients using connection
> pools, then it's very likely that you don't get any advantages at all.
> Paying for nothing with a 440% increase in calling time (at best) seems
> expensive :-)

You are focused too narrowly on a few performance numbers. In my mind the primary advantage is that it will *work*. I do not actually believe that you'll ever get the embedded-JVM approach to production-grade reliability, because of the fundamental problems with threading, error processing, etc.

regards, tom lane
Tom Lane wrote:
> You are focused too narrowly on a few performance numbers. In my mind
> the primary advantage is that it will *work*. I do not actually believe
> that you'll ever get the embedded-JVM approach to production-grade
> reliability, because of the fundamental problems with threading, error
> processing, etc.
>
My focus with PL/Java over the last year has been to make it a production-grade product, and I think I've succeeded pretty well. The current list of open bugs is practically empty. What fundamental problems are you thinking of that haven't been solved already?

Regards,
Thomas Hallgren
On Mon, 2006-03-27 at 18:27 +0200, Thomas Hallgren wrote:
> Tom Lane wrote:
>> It's only that much difference? Given all the other advantages of
>> separating the JVM from the backends, I'd say you should gladly pay
>> that price.
>
> If I'm right, and the most common scenario is clients using connection
> pools, then it's very likely that you don't get any advantages at all.
> Paying for nothing with a 440% increase in calling time (at best) seems
> expensive :-)

Just some thoughts from afar: DB2 supports in-process and out-of-process external function calls (UDFs) that it refers to as UNFENCED and FENCED procedures. For Java only, IBM have moved to supporting *only* FENCED procedures for Java functions, i.e. having a single JVM for all connections. Each connection's Java function runs as a thread on a single dedicated JVM-only process.

That approach definitely does increase the invocation time, but it significantly reduces the resources associated with the JVM, as well as allowing memory management to be more controllable (bliss...). So the overall picture could be more CPU and memory resources for each connection in the connection pool.

If you have a few small Java functions, centralisation would not be good, but if you have a whole application architecture with many connections executing reasonable chunks of code then this can be a win. In that environment we used Java for major database functions, with SQL functions for small extensions.

Also, the Java invocation time we should be celebrating is that by having Java in the database the Java<->DB time is much less than it would be if we had a Java stack sitting on another server.

Best Regards, Simon Riggs
Hi Simon,
Thanks for your input. All good points. I actually did some work using Java stored procedures on DB2 a while back, but I had managed to forget (or repress :-) ) all about the FENCED/NOT FENCED stuff. The current discussion definitely puts it in a different perspective. I think PL/Java has a pretty good 'NOT FENCED' implementation, as do many other PLs, but no PL has yet come up with a FENCED solution.

This FENCED/NOT FENCED terminology would be a good way to differentiate between the two approaches. Any chance of that syntax making it into the PostgreSQL grammar, should the need arise?

Some more comments inline:

Simon Riggs wrote:
> Just some thoughts from afar: DB2 supports in-process and out-of-process
> external function calls (UDFs) that it refers to as UNFENCED and FENCED
> procedures. For Java only, IBM have moved to supporting *only* FENCED
> procedures for Java functions, i.e. having a single JVM for all
> connections.
>
Are you sure about this? As I recall it, a FENCED stored procedure executed in a remote JVM of its own. A parameter could be used that either caused a new JVM to be instantiated for each stored procedure call or to be kept for the duration of the session. The former would yield really horrible performance but keep memory utilization at a minimum. The latter would get more acceptable performance but waste more memory (on par with PL/Java today).

> Each connection's Java function runs as a thread on a
> single dedicated JVM-only process.
>
If that were true, then different threads could share dirty session data. I wanted to do that using DB2 but found it impossible. That was a while back though.

> That approach definitely does increase the invocation time, but it
> significantly reduces the resources associated with the JVM, as well as
> allowing memory management to be more controllable (bliss...). So the
> overall picture could be more CPU and memory resources for each
> connection in the connection pool.
>
My very crude measurements indicate that the overhead of using a separate JVM is between 6-15MB of real memory per connection. Today, you get about 10MB/$ and servers configured with 4GB RAM or more are not uncommon.

I'm not saying that the overhead doesn't matter. Of course it does. But the time when you needed to be extremely conservative with memory usage has passed. It might be far less expensive to buy some extra memory than to invest in SMP architectures to minimize IPC overhead.

My point is, even fairly large app-servers (using connection pools with up to 200 simultaneous connections) can run using relatively inexpensive boxes, such as an AMD64-based server with 4GB RAM, and show very good throughput with the current implementation.

> If you have a few small Java functions, centralisation would not be good,
> but if you have a whole application architecture with many connections
> executing reasonable chunks of code then this can be a win.
>
One thing to remember is that a 'chunk of code' that executes in a remote JVM and uses JDBC will be hit by the IPC overhead on each interaction over the JDBC connection. I.e. the overhead is not just limited to the actual call of the UDF; it's also imposed on all database accesses that the UDF makes in turn.

> In that environment we used Java for major database functions, with SQL
> functions for small extensions.
>
My guess is that those major database functions did a fair amount of JDBC. Am I right?

> Also, the Java invocation time we should be celebrating is that by having
> Java in the database the Java<->DB time is much less than it would be if
> we had a Java stack sitting on another server.
>
I think the cases when you have a Tomcat or JBoss sitting on the same physical server as the actual database are very common, one major reason being that you don't want network overhead between the middle tier and the backend. Moving logic into the database instead of keeping it in the middle tier is often done to get rid of the last hurdle, the overhead of IPC.

Regards,
Thomas Hallgren
Thomas Hallgren <thomas@tada.se> writes:
> This FENCED/NOT FENCED terminology would be a good way to
> differentiate between the two approaches. Any chance of that syntax
> making it into the PostgreSQL grammar, should the need arise?

Of what value would it be to have it in the grammar? The behavior would be entirely internal to any particular PL in any case.

regards, tom lane
Tom Lane wrote:
> Thomas Hallgren <thomas@tada.se> writes:
>> This FENCED/NOT FENCED terminology would be a good way to
>> differentiate between the two approaches. Any chance of that syntax
>> making it into the PostgreSQL grammar, should the need arise?
>
> Of what value would it be to have it in the grammar? The behavior would
> be entirely internal to any particular PL in any case.
>
Not necessarily, but perhaps the term FENCED is incorrect for the concept that I have in mind. All languages that are implemented using a VM could benefit from the same remote UDF protocol: Java, C#, perhaps even Perl or Ruby. The flag that I'd like to have would control 'in-process' versus 'remote'. I'm not too keen on the term FENCED, since in the PL/Java case it will lead to poorer isolation. Multiple threads running in the same JVM will be able to share data, and a JVM crash will affect all connected sessions.

Then again, perhaps it's a bad idea to have this in the function declaration in the first place. A custom GUC parameter might be a better choice. It will not be possible to have some functions use the in-process approach and others execute remotely, but I doubt that will matter that much.

I'm still eager to hear what it is in the current PL/Java that you consider fundamentally unresolvable problems.

Regards,
Thomas Hallgren
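[Editorial note: a custom GUC of the kind proposed above might look something like the following in postgresql.conf. The class and parameter names are hypothetical, not existing PL/Java settings; note also that in the 8.x era a custom variable class had to be declared before its parameters could be set.]

```ini
# Hypothetical PL/Java settings (names invented for illustration)
custom_variable_classes = 'pljava'
pljava.vm = 'remote'             # 'in_process' or 'remote'
pljava.vm_startup = 'postmaster' # which process starts the external JVM
```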
On Tue, 2006-03-28 at 17:48 +0200, Thomas Hallgren wrote: > Simon Riggs wrote: > > Just some thoughts from afar: DB2 supports in-process and out-of-process > > external function calls (UDFs) that it refers to as UNFENCED and FENCED > > procedures. For Java only, IBM have moved to supporting *only* FENCED > > procedures for Java functions, i.e. having a single JVM for all > > connections. > > > Are you sure about this? Yes. > As I recall it a FENCED stored procedure executed in a remote JVM > of it's own. A parameter could be used that either caused a new JVM to be instantiated for > each stored procedure call or to be kept for the duration of the session. The former would > yield really horrible performance but keep memory utilization at a minimum. The latter would > get a more acceptable performance but waste more memory (in par with PL/Java today). In the previous release, yes. > > That approach definitely does increase the invocation time, but it > > significantly reduces the resources associated with the JVM, as well as > > allowing memory management to be more controllable (bliss...). So the > > overall picture could be more CPU and memory resources for each > > connection in the connection pool. > > > My very crude measurements indicate that the overhead of using a separate JVM is between > 6-15MB of real memory per connection. Today, you get about 10MB/$ and servers configured > with 4GB RAM or more are not uncommon. > > I'm not saying that the overhead doesn't matter. Of course it does. But the time when you > needed to be extremely conservative with memory usage has passed. It might be far less > expensive to buy some extra memory then to invest in SMP architectures to minimize IPC overhead. > > My point is, even fairly large app-servers (using connection pools with up to 200 > simultaneous connections) can run using relatively inexpensive boxes such as an AMD64 based > server with 4GB RAM and show very good throughput with the current implementation. 
Memory is cheap, memory bandwidth is not. All CPUs have limited cache resources, so the more memory you waste, the less efficient your CPUs will be. That affects the way you do things, sure. 1GB lookup table: no problem. 10MB wasted memory retrieval: lots of dead CPU time. > > If you have a few small Java functions centralisation would not be good, > > but if you have a whole application architecture with many connections > > executing reasonable chunks of code then this can be a win. > > > One thing to remember is that a 'chunk of code' that executes in a remote JVM and uses > JDBC will be hit by the IPC overhead on each interaction over the JDBC connection. I.e. the > overhead is not just limited to the actual call of the UDF, it's also imposed on all > database accesses that the UDF makes in turn. > > > > In that environment we used Java for major database functions, with SQL > > functions for small extensions. > > > My guess is that those major database functions did a fair amount of JDBC. Am I right? Not once I'd reviewed them... > > Also the Java invocation time we should be celebrating is that by having > > Java in the database the Java<->DB time is much less than it would be if > > we had a Java stack sitting on another server. > > > > I think the cases when you have a Tomcat or JBoss sitting on the same physical server as the > actual database are very common. One major reason being that you don't want network overhead > between the middle tier and the backend. Moving logic into the database instead of keeping > it in the middle tier is often done to get rid of the last hurdle, the overhead of IPC. I can see the performance argument for both, but supporting both, especially in a mix-and-match architecture, is much harder. Anyway, just trying to add some additional perspective. Best Regards, Simon Riggs
On 28-Mar-06, at 10:48 AM, Thomas Hallgren wrote: > Hi Simon, > Thanks for your input. All good points. I actually did some work > using Java stored procedures on DB2 a while back but I had managed > to forget (or repress :-) ) all about the FENCED/NOT FENCED stuff. > The current discussion definitely puts it in a different > perspective. I think PL/Java has a pretty good 'NOT FENCED' > implementation, as do many other PLs, but no PL has yet come up > with a FENCED solution. What exactly is a FENCED solution? If it is simply a remote connection to a single JVM, then pl-j already does that. > > This FENCED/NOT FENCED terminology would be a good way to > differentiate between the two approaches. Any chance of that syntax > making it into the PostgreSQL grammar, should the need arise? > > Some more comments inline: > > Simon Riggs wrote: >> Just some thoughts from afar: DB2 supports in-process and out-of- >> process >> external function calls (UDFs) that it refers to as UNFENCED and >> FENCED >> procedures. For Java only, IBM have moved to supporting *only* FENCED >> procedures for Java functions, i.e. having a single JVM for all >> connections. > > > Are you sure about this? As I recall it, a FENCED stored procedure > executed in a remote JVM of its own. A parameter could be used > that either caused a new JVM to be instantiated for each stored > procedure call or to be kept for the duration of the session. The > former would yield really horrible performance but keep memory > utilization at a minimum. The latter would get a more acceptable > performance but waste more memory (on par with PL/Java today). > > >> Each connection's Java function runs as a thread on a >> single dedicated JVM-only process. > If that were true, then different threads could share dirty session > data. I wanted to do that using DB2 but found it impossible. That > was a while back though. 
> >> That approach definitely does increase the invocation time, but it >> significantly reduces the resources associated with the JVM, as >> well as >> allowing memory management to be more controllable (bliss...). So the >> overall picture could be more CPU and memory resources for each >> connection in the connection pool. > My very crude measurements indicate that the overhead of using a > separate JVM is between 6-15MB of real memory per connection. > Today, you get about 10MB/$ and servers configured with 4GB RAM or > more are not uncommon. > > I'm not saying that the overhead doesn't matter. Of course it does. > But the time when you needed to be extremely conservative with > memory usage has passed. It might be far less expensive to buy some > extra memory than to invest in SMP architectures to minimize IPC > overhead. > > My point is, even fairly large app-servers (using connection pools > with up to 200 simultaneous connections) can run using relatively > inexpensive boxes such as an AMD64 based server with 4GB RAM and > show very good throughput with the current implementation. > > >> If you have a few small Java functions centralisation would not be >> good, >> but if you have a whole application architecture with many >> connections >> executing reasonable chunks of code then this can be a win. > One thing to remember is that a 'chunk of code' that executes in > a remote JVM and uses JDBC will be hit by the IPC overhead on each > interaction over the JDBC connection. I.e. the overhead is not just > limited to the actual call of the UDF, it's also imposed on all > database accesses that the UDF makes in turn. > > >> In that environment we used Java for major database functions, >> with SQL >> functions for small extensions. > My guess is that those major database functions did a fair amount > of JDBC. Am I right? 
> > >> Also the Java invocation time we should be celebrating is that by >> having >> Java in the database the Java<->DB time is much less than it would >> be if >> we had a Java stack sitting on another server. > > I think the cases when you have a Tomcat or JBoss sitting on the > same physical server as the actual database are very common. One > major reason being that you don't want network overhead between the > middle tier and the backend. Moving logic into the database instead > of keeping it in the middle tier is often done to get rid of the > last hurdle, the overhead of IPC. > > > Regards, > Thomas Hallgren > > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq >
On 28-Mar-06, at 12:11 PM, Thomas Hallgren wrote: > Tom Lane wrote: >> Thomas Hallgren <thomas@tada.se> writes: >> >>> This FENCED/NOT FENCED terminology would be a good way to >>> differentiate between the two approaches. Any chance of that syntax >>> making it into the PostgreSQL grammar, should the need arise? >>> >> >> Of what value would it be to have it in the grammar? The behavior >> would >> be entirely internal to any particular PL in any case. >> >> > Not necessarily but perhaps the term FENCED is incorrect for the > concept that I have in mind. > > All languages that are implemented using a VM could benefit from > the same remote UDF protocol. Java, C#, perhaps even Perl or Ruby. > The flag that I'd like to have would control 'in-process' versus > 'remote'. > > I'm not too keen on the term FENCED, since it, in the PL/Java case > will lead to poorer isolation. Multiple threads running in the same > JVM will be able to share data and a JVM crash will affect all > connected sessions. When was the last time you saw a JVM crash? These are very rare now. In any case, if it does fail, it's a JVM bug and can happen to any code running and take the server down if it is in process. > > Then again, perhaps it's a bad idea to have this in the function > declaration in the first place. A custom GUC parameter might be a > better choice. It will not be possible to have some functions use > the in-process approach and others to execute remotely but I doubt > that will matter that much. > > I'm still eager to hear what it is in the current PL/Java that you > consider fundamental unresolvable problems. > > Regards, > Thomas Hallgren
Dave Cramer wrote: > > What exactly is a FENCED solution ? If it is simply a remote > connection to a single JVM then pl-j already does that. The last time I tried to use pl-j (in order to build a mutual test platform), I didn't manage to make it compile due to missing artifacts, and it wasn't ported to Windows. Laszlo filed a JIRA bug on that but since then (August last year) I've seen no activity in the project. Is it still alive? Is anyone using it? Regards, Thomas Hallgren
The last time I talked to him, Laszlo said he was working on it again. Dave On 28-Mar-06, at 2:21 PM, Thomas Hallgren wrote: > Dave Cramer wrote: >> >> What exactly is a FENCED solution ? If it is simply a remote >> connection to a single JVM then pl-j already does that. > Last time I tried to use pl-j (in order to build a mutual test > platform), I didn't manage to make it compile due to missing > artifacts and it wasn't ported to Windows. Lazslo filed a JIRA bug > on that but since then (August last year) I've seen no activity in > the project. Is it still alive? Is anyone using it? > > Regards, > Thomas Hallgren > >
Dave Cramer wrote: > >> >> I'm not too keen on the term FENCED, since it, in the PL/Java case >> will lead to poorer isolation. Multiple threads running in the same >> JVM will be able to share data and a JVM crash will affect all >> connected sessions. > When was the last time you saw a JVM crash ? These are very rare now. I think that's somewhat dependent on what JVM you're using. For the commercial ones, BEA, IBM, and Sun, I fully agree. > In any case if it does fail, it's a JVM bug and can happen to any code > running and take the server down if it is in process. Crash is perhaps not the right word. My point concerned the level of isolation. Code that is badly written may have a serious impact on other threads in the same JVM. Let's say you cause an OutOfMemoryError or an endless loop. The former will render the JVM completely useless and the latter will cause low scheduling priority. If the same thing happens using an in-process JVM, the problem is isolated to that one session. Regards, Thomas Hallgren
Thomas Hallgren wrote: > Martijn, > > I tried a Socket approach. Using the new IO stuff that arrived with Java > 1.4 (SocketChannel etc.), the performance is really good. Especially on > Linux, where an SMP machine shows a 1 to 1.5 ratio between one process > doing ping-pong between two threads and two processes doing ping-pong > using a socket. That's acceptable overhead indeed and I don't think I'll > be able to trim it much using a shared memory approach (the thread > scenario uses Java monitor locks. That's the most efficient lightweight > locking implementation I've come across). > > One downside is that on a Windows box, the ratio between the threads and > the processes scenario seems to be 1 to 5, which is a bit worse. I've > heard that Solaris too is less efficient than Linux in this respect. > > The real downside is that a call from SQL to PL/Java using the current > in-process approach is really fast. It takes about 5 microseconds on my > 2.8GHz i386 box. The overhead of an IPC call on that box is about 18 > microseconds on Linux and 64 microseconds on Windows. That's an overhead of > between 440% and 1300% due to context switching alone. Yet, for some > applications, perhaps that overhead is acceptable? It should be compared > to the high memory consumption that the in-process approach undoubtedly > results in (which in turn might lead to less optimal use of CPU caches > and, if memory is insufficient, more time spent doing swapping). > > Given those numbers, it would be interesting to hear what the community > as a whole thinks about this. Assuming by "community" you mean developers not normally involved in hackers, then: 1) As a developer, the required debugging time increases greatly when one session can affect (or crash) all the other sessions. This in turn drives up the cost of development. 
Unless some guarantees could be had against this sort of intermittent runtime bugginess, I would be less likely to opt for PL/Java and expose myself to the potential cost overruns. 2) As a speed freak, I'm going to code things in C, not Java. So the appeal of Java must come from something other than speed, such as stability and faster development cycles. My opinion is that it all depends on whether you can hammer down a reliable solution that has the necessary stability guarantees. Splitting the middle, trying to get performance benefits at the cost of stability, would seem to make PL/Java a sort of lukewarm solution on the speed side, and a lukewarm solution on the stability side. I doubt I could get excited about it. mark
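[Editor's note: the SocketChannel ping-pong measurement Thomas describes earlier in the thread can be sketched roughly as below. This is a minimal sketch, not PL/Java code: both endpoints here are threads inside one process for brevity, so it will understate the cross-process context-switch cost he measured (his two-process numbers were ~18 µs on Linux, ~64 µs on Windows); the `PingPong` class and `measure` method names are my own.]

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class PingPong {

    // Bounce a single byte back and forth 'rounds' times over a loopback
    // socket and return the average round-trip time in nanoseconds.
    static long measure(int rounds) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port

        // Echo peer: stands in for the JVM-side thread serving one backend.
        Thread echo = new Thread(() -> {
            try (SocketChannel peer = server.accept()) {
                ByteBuffer buf = ByteBuffer.allocateDirect(1);
                for (int i = 0; i < rounds; i++) {
                    buf.clear();
                    if (peer.read(buf) < 0) return; // blocking read: the "ping"
                    buf.flip();
                    peer.write(buf);                // echo it back: the "pong"
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        echo.start();

        long perRound;
        try (SocketChannel client =
                 SocketChannel.open(new InetSocketAddress("127.0.0.1",
                     ((InetSocketAddress) server.getLocalAddress()).getPort()))) {
            ByteBuffer buf = ByteBuffer.allocateDirect(1);
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                buf.clear();
                buf.put((byte) 1).flip();
                client.write(buf);   // send the ping
                buf.clear();
                client.read(buf);    // block until the pong arrives
            }
            perRound = (System.nanoTime() - start) / rounds;
        }
        echo.join();
        server.close();
        return perRound;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("avg round trip: " + measure(10_000) + " ns");
    }
}
```

One byte per direction is enough to carry a "work is ready" signal; in the design discussed upthread, larger payloads would travel through shared memory while the socket provides the blocking rendezvous, exactly as in the X-server scheme Martijn described.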