Thread: stack depth limit exceeded problem.

stack depth limit exceeded problem.

From

Thomas Hallgren

Date:

23 September 2005, 08:19:14

Hi,
I have a problem with PL/Java that, if it's going to have a good 
solution, requires your help.

PL/Java runs a JVM. Since a JVM is multithreaded, PL/Java goes to 
fairly extreme measures to ensure that only one thread at a time can 
access the backend. So far, this has worked well but there is one small 
problem. Here's a use case:

Someone loads a library that contains a method that spawns a new thread. 
That thread is the first to access some class. The class loader will now 
make an attempt to load it. PL/Java uses SPI to load classes so a call 
is made to SPI. This call is not made from the main thread that 
originally called the PL/Java function. That thread is suspended at this 
point.

Now, check_stack_depth() in postgres.c is called. The new thread has 
a stack of its own, of course, so the depth measured against the main 
thread's stack base is meaningless and the check fails.
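
A minimal sketch of why the check trips, assuming the depth test simply measures the distance between a base address recorded on the main thread and the address of a local variable in the caller. The names stack_base, max_depth_bytes, depth_exceeded and run_checks are made up for illustration, and the 2 MB limit is an assumption; this is not the code in postgres.c.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the backend's stack_base_ptr and max_stack_depth. */
static char *stack_base = NULL;
static const uintptr_t max_depth_bytes = 2u * 1024u * 1024u;   /* assumed limit: 2 MB */

/* Returns 1 when the current stack position lies further from the recorded
 * base than the limit allows, 0 otherwise. */
static int depth_exceeded(void)
{
    char here;                                  /* lives on the caller's stack */
    uintptr_t base = (uintptr_t) stack_base;
    uintptr_t now = (uintptr_t) &here;
    uintptr_t distance = base > now ? base - now : now - base;

    return distance > max_depth_bytes;
}

/* Thread body: runs the same check from a thread that has its own stack. */
static void *check_from_other_thread(void *result)
{
    *(int *) result = depth_exceeded();
    return NULL;
}

/* Records the base on the calling thread, then runs the check on the calling
 * thread and on a freshly spawned one. Returns 0 on success, -1 if the
 * thread could not be created. */
int run_checks(int *on_main, int *on_other)
{
    char base_marker;                           /* approximates the main stack base */
    pthread_t other;

    stack_base = &base_marker;
    *on_main = depth_exceeded();
    if (pthread_create(&other, NULL, check_from_other_thread, on_other) != 0)
    {
        stack_base = NULL;
        return -1;
    }
    pthread_join(other, NULL);                  /* main is suspended meanwhile */
    stack_base = NULL;
    return 0;
}
```

Calling run_checks(&on_main, &on_other) from main() should leave on_main at 0. On a typical Linux system on_other comes back as 1, because the two stacks are mapped much more than 2 MB apart, even though the second thread has used almost none of its own stack.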

I know that multithreading is very controversial and I'm in no way 
asking that the backend should support it. What I would like is a 
workaround for my problem. The easiest way would be if I could change 
stack_base_ptr temporarily when this happens, inside a try/catch that 
kicks in when I detect a call from a thread other than main. The only 
other solution is to set max_stack_depth to a ridiculously high value 
and effectively turn stack checking off. I don't want to do that.

Any opinions on this?

Kind regards,
Thomas Hallgren

Re: stack depth limit exceeded problem.

From

Tom Lane

Date:

23 September 2005, 11:22:05

Thomas Hallgren <thhal@mailblocks.com> writes:
> Someone loads a library that contains a method that spawns a new thread. 

They already broke the backend when they did that.  max_stack_depth is
just the tip of the iceberg.
        regards, tom lane

Re: stack depth limit exceeded problem.

From

Thomas Hallgren

Date:

23 September 2005, 11:29:41

Tom Lane wrote:

>Thomas Hallgren <thhal@mailblocks.com> writes:
>
>>Someone loads a library that contains a method that spawns a new thread. 
>
>They already broke the backend when they did that.  max_stack_depth is
>just the tip of the iceberg.
>
I knew I'd get a response like that from you :-)

Why is the backend broken? There's no concurrency issue. Only one thread 
is executing.

Regards,
Thomas Hallgren

Re: stack depth limit exceeded problem.

From

Oliver Jowett

Date:

23 September 2005, 20:30:55

Thomas Hallgren wrote:

> PL/Java runs a JVM. Since a JVM is multithreaded, PL/Java goes to
> fairly extreme measures to ensure that only one thread at a time can
> access the backend. So far, this has worked well but there is one small
> problem. [...]

I assume this means you have a single lock serializing requests to the
backend?

If you can't solve the depth checking problem (Tom doesn't seem to like
the idea of multiple threads calling into the backend..), what about
turning the original thread (i.e. the "main" backend thread) into a
"backend interface thread" that does nothing but feed callbacks into the
backend on request? Then run all the user code in a separate thread that
passes backend requests to the interface thread rather than directly
executing them. If it starts extra threads that make DB requests, the
mechanism stays the same..
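
Roughly what I have in mind, as a minimal sketch in plain C with pthreads rather than actual PL/Java code. The mailbox structure and the names backend_call, request_backend, serve_until_done and run_function_call are all hypothetical, and backend_call just doubles its argument in place of a real SPI call.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* One-slot mailbox shared by the user-code threads and the interface thread. */
typedef struct
{
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool has_request;   /* a request is waiting to be serviced */
    bool has_reply;     /* the reply to that request is ready */
    bool done;          /* the user code has finished */
    int  argument;
    int  reply;
} Mailbox;

static Mailbox box = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER,
                       false, false, false, 0, 0 };

/* Stand-in for a real backend call such as an SPI query. */
static int backend_call(int argument)
{
    return argument * 2;
}

/* Called from any user thread: hand the request over and block for the reply. */
static int request_backend(int argument)
{
    int reply;

    pthread_mutex_lock(&box.lock);
    while (box.has_request || box.has_reply)    /* wait until the slot is free */
        pthread_cond_wait(&box.changed, &box.lock);
    box.argument = argument;
    box.has_request = true;
    pthread_cond_broadcast(&box.changed);
    while (!box.has_reply)
        pthread_cond_wait(&box.changed, &box.lock);
    reply = box.reply;
    box.has_reply = false;
    pthread_cond_broadcast(&box.changed);
    pthread_mutex_unlock(&box.lock);
    return reply;
}

/* Runs on the original backend thread: the only place backend_call is made,
 * so the backend always sees the stack it started on. */
static void serve_until_done(void)
{
    pthread_mutex_lock(&box.lock);
    for (;;)
    {
        while (!box.has_request && !box.done)
            pthread_cond_wait(&box.changed, &box.lock);
        if (!box.has_request)
            break;                              /* done, nothing pending */
        box.reply = backend_call(box.argument);
        box.has_request = false;
        box.has_reply = true;
        pthread_cond_broadcast(&box.changed);
    }
    pthread_mutex_unlock(&box.lock);
}

/* Stand-in for the user's Java code: makes three backend requests. */
static void *user_code(void *sum_out)
{
    int sum = 0;

    for (int i = 1; i <= 3; i++)
        sum += request_backend(i);
    *(int *) sum_out = sum;

    pthread_mutex_lock(&box.lock);
    box.done = true;
    pthread_cond_broadcast(&box.changed);
    pthread_mutex_unlock(&box.lock);
    return NULL;
}

/* One function invocation: user code in its own thread, backend work here. */
int run_function_call(void)
{
    pthread_t user;
    int sum = 0;

    box.has_request = box.has_reply = box.done = false;
    if (pthread_create(&user, NULL, user_code, &sum) != 0)
        return -1;
    serve_until_done();
    pthread_join(user, NULL);
    return sum;
}
```

Every request_backend call costs a handoff to the interface thread and another one back, whichever thread issued it.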

-O

Re: stack depth limit exceeded problem.

From

Thomas Hallgren

Date:

24 September 2005, 05:34:51

Oliver Jowett wrote:

>Thomas Hallgren wrote:
>
>>PL/Java runs a JVM. Since a JVM is multithreaded, PL/Java goes to
>>fairly extreme measures to ensure that only one thread at a time can
>>access the backend. So far, this has worked well but there is one small
>>problem. [...]
>
>I assume this means you have a single lock serializing requests to the
>backend?
>
Yes, of course. I also make sure that the main thread cannot return 
until another thread that is servicing a backend request has completed. 
There's absolutely no way two threads can execute backend code 
simultaneously.

>If you can't solve the depth checking problem (Tom doesn't seem to like
>the idea of multiple threads calling into the backend..), what about
>turning the original thread (i.e. the "main" backend thread) into a
>"backend interface thread" that does nothing but feed callbacks into the
>backend on request? Then run all the user code in a separate thread that
>passes backend requests to the interface thread rather than directly
>executing them. If it starts extra threads that make DB requests, the
>mechanism stays the same..
>
I thought about that. The drawback is that each and every call must spawn 
a new thread, no matter how trivial that call might be. If you do a 
select from a table with 10,000 records and execute a function for each 
record, you get 20,000 context switches. Avoiding that kind of overhead 
is one of the motivating factors for keeping the VM in-process.

I don't rule out such a solution but I'd like to have a discussion with 
Tom and iron out what the problems are when one thread at a time is 
allowed to execute. Perhaps I can solve them.

Regards,
Thomas Hallgren

Re: stack depth limit exceeded problem.

From

Martijn van Oosterhout

Date:

24 September 2005, 07:09:25

On Sat, Sep 24, 2005 at 10:34:42AM +0200, Thomas Hallgren wrote:
> Oliver Jowett wrote:
> >I assume this means you have a single lock serializing requests to the
> >backend?
> >
> Yes, of course. I also make sure that the main thread cannot return
> until another thread that is servicing a backend request has completed.
> There's absolutely no way two threads can execute backend code
> simultaneously.

Ok, I have a question. PostgreSQL uses sigsetjmp/siglongjmp to handle
errors in the backend. If you're changing the stack, how do you avoid
the siglongjmp jumping back to a different stack? Or do you somehow
avoid this problem altogether?

> I thought about that. The drawback is that each and every call must spawn
> a new thread, no matter how trivial that call might be. If you do a
> select from a table with 10,000 records and execute a function for each
> record, you get 20,000 context switches. Avoiding that kind of overhead
> is one of the motivating factors for keeping the VM in-process.

Well, on Linux at least context switches are quite cheap. However, how
does Java handle the possibility that functions never return? Do you
wrap each call in a PG_TRY/PG_CATCH to propagate errors?

Tricky issues...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: stack depth limit exceeded problem.

From

Thomas Hallgren

Date:

24 September 2005, 07:27:06

Martijn van Oosterhout wrote:

>On Sat, Sep 24, 2005 at 10:34:42AM +0200, Thomas Hallgren wrote:
>
>>Oliver Jowett wrote:
>>
>>>I assume this means you have a single lock serializing requests to the
>>>backend?
>>>
>>Yes, of course. I also make sure that the main thread cannot return 
>>until another thread that is servicing a backend request has completed. 
>>There's absolutely no way two threads can execute backend code 
>>simultaneously.
>
>Ok, I have a question. PostgreSQL uses sigsetjmp/siglongjmp to handle
>errors in the backend. If you're changing the stack, how do you avoid
>the siglongjmp jumping back to a different stack? Or do you somehow
>avoid this problem altogether?
>
All calls use a PG_TRY/PG_CATCH. So yes, I think I avoid that problem 
altogether.

>>I thought about that. The drawback is that each and every call must spawn 
>>a new thread, no matter how trivial that call might be. If you do a 
>>select from a table with 10,000 records and execute a function for each 
>>record, you get 20,000 context switches. Avoiding that kind of overhead 
>>is one of the motivating factors for keeping the VM in-process.
>
>Well, on Linux at least context switches are quite cheap.
>
I know. And as I said, I don't rule out such a solution. But however 
cheap, there's still a performance penalty and added complexity. I'd 
rather avoid both if I can. At least until I know what the real problem 
is with the solution that I propose.

>However, how
>does Java handle the possibility that functions never return? Do you
>wrap each call in a PG_TRY/PG_CATCH to propagate errors?
>
Yes. All backend exceptions are caught in a PG_CATCH and then propagated 
to Java as a ServerException. If there's no catch in the Java code, they 
are "rethrown" by the java_call_handler, this time with the jump buffer 
that was set up by the backend when it invoked the call_handler.
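
In outline, and only as a standalone sketch using plain sigsetjmp/siglongjmp instead of the real PG_TRY/PG_CATCH macros, the catch-and-rethrow works like this. The names current_handler, raise_error, guarded_backend_call, call_handler and invoke_from_backend are hypothetical stand-ins, and the error code 42 is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend's innermost error-handler jump buffer. */
static sigjmp_buf *current_handler = NULL;
static int pending_error = 0;

/* Stand-in for elog(ERROR): jump to the innermost installed handler. */
static void raise_error(int code)
{
    assert(current_handler != NULL);
    pending_error = code;
    siglongjmp(*current_handler, 1);
}

/* Two stand-in backend routines: one succeeds, one raises an error. */
int successful_backend_call(void)
{
    return 7;
}

static int failing_backend_call(void)
{
    raise_error(42);
    return 0;                       /* not reached */
}

/* Wrapper around every call into the backend. Returns 0 on success, or the
 * error code, which the caller would turn into a Java exception. */
int guarded_backend_call(int (*fn)(void), int *result)
{
    sigjmp_buf local;
    sigjmp_buf *saved = current_handler;
    int error = 0;

    if (sigsetjmp(local, 0) == 0)
    {
        current_handler = &local;   /* errors now land in this function */
        *result = fn();
    }
    else
        error = pending_error;      /* arrived here via siglongjmp */

    current_handler = saved;        /* the outer jump buffer is active again */
    return error;
}

/* Stand-in for the call handler when the Java code does not catch the
 * exception: the error is raised again, this time to the outer buffer. */
static void call_handler(void)
{
    int result = 0;
    int error = guarded_backend_call(failing_backend_call, &result);

    if (error != 0)
        raise_error(error);
}

/* Stand-in for the backend invoking the call handler. Returns the error
 * code that reached the outer level, or 0 if none did. */
int invoke_from_backend(void)
{
    sigjmp_buf outer;
    int seen = 0;

    if (sigsetjmp(outer, 0) == 0)
    {
        current_handler = &outer;
        call_handler();
    }
    else
        seen = pending_error;

    current_handler = NULL;
    return seen;
}
```

The point is that a siglongjmp never crosses the wrapper: it either stops inside guarded_backend_call or is re-raised after the outer buffer has been reinstated.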

There's also a barrier that will prevent any further calls from the Java 
code once an exception has been thrown by the backend unless that call 
was wrapped in a savepoint construct. A savepoint rollback will "unlock" 
the barrier (this is not related to the thread issue of course).

Regards,
Thomas Hallgren

Re: stack depth limit exceeded problem.

From

Martijn van Oosterhout

Date:

24 September 2005, 08:13:08

On Sat, Sep 24, 2005 at 12:26:58PM +0200, Thomas Hallgren wrote:
> Yes. All backend exceptions are caught in a PG_CATCH and then propagated
> to Java as a ServerException. If there's no catch in the Java code, they
> are "rethrown" by the java_call_handler, this time with the jump buffer
> that was set up by the backend when it invoked the call_handler.
>
> There's also a barrier that will prevent any further calls from the Java
> code once an exception has been thrown by the backend unless that call
> was wrapped in a savepoint construct. A savepoint rollback will "unlock"
> the barrier (this is not related to the thread issue of course).

Well, you seem to have dealt with the obvious issues I can see. I
imagine you also need to worry about things like signal handling. Is
there no way to reserve a stack just for PostgreSQL and switch to that
stack, rather than switch threads (although, the stack is really the
only thing that differentiates threads anyway...)?

Linux has sigaltstack so you can catch the stack overflow signal (and
other signals obviously, but that's its main use), but it's not terribly
portable. What you really need to do is set the stack_base_ptr every
time you execute postgres with a new stack; that preserves existing
semantics.

Signals are the only way the kernel can pass control unexpectedly so if
you handle those, postgres would never know it's threaded. I do wonder
if there are any other assumptions made...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/

Re: stack depth limit exceeded problem.

From

Thomas Hallgren

Date:

24 September 2005, 09:38:43

Martijn van Oosterhout wrote:

> Linux has sigaltstack so you can catch the stack overflow signal (and
> other signals obviously, but that's its main use), but it's not terribly
> portable.

I rely on the signal handler that the JVM uses for page-faults (which a 
stack overflow generally amounts to) and FPE exceptions so I know that 
they will generate Java exceptions in a controlled way (which I in turn 
translate to elog(ERROR) on the main thread).

> What you really need to do is set the stack_base_ptr every
> time you execute postgres with a new stack; that preserves existing
> semantics.

Exactly! What I'd really like to do in threads other than main is:

void* currentBase = switchStackBase(stackBaseOfMyThread);
PG_TRY();
{
    /* service the call here */
    switchStackBase(currentBase);
}
PG_CATCH();
{
    switchStackBase(currentBase);
    /* generate Java exception as usual */
}
PG_END_TRY();
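
To show the effect in isolation, here is a standalone sketch with plain pthreads and no backend code. It assumes the depth check only compares a recorded base address with the address of a local variable. switch_stack_base plays the role of the switchStackBase I am asking for, and every other name, as well as the 2 MB limit, is invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for stack_base_ptr and max_stack_depth. */
static char *stack_base = NULL;
static const uintptr_t max_depth_bytes = 2u * 1024u * 1024u;   /* assumed limit: 2 MB */

/* Returns 1 when the caller's stack position is further from the recorded
 * base than the limit allows, 0 otherwise. */
static int depth_exceeded(void)
{
    char here;
    uintptr_t base = (uintptr_t) stack_base;
    uintptr_t now = (uintptr_t) &here;
    uintptr_t distance = base > now ? base - now : now - base;

    return distance > max_depth_bytes;
}

/* Installs a new base and returns the previous one so it can be restored. */
static char *switch_stack_base(char *new_base)
{
    char *previous = stack_base;

    assert(new_base != NULL);
    stack_base = new_base;
    return previous;
}

/* A non-main thread servicing a call: rebase, do the work, restore. */
static void *service_call(void *result)
{
    char thread_marker;                     /* approximates this thread's stack base */
    char *saved = switch_stack_base(&thread_marker);

    *(int *) result = depth_exceeded();     /* the check now sees a sane depth */
    switch_stack_base(saved);
    return NULL;
}

/* Returns the check result seen by the helper thread (0 means it passed),
 * or -1 on failure. *base_restored is set to 1 if the main thread's base
 * was back in place afterwards. */
int check_with_switched_base(int *base_restored)
{
    char main_marker;
    pthread_t helper;
    int exceeded = -1;

    stack_base = &main_marker;
    if (pthread_create(&helper, NULL, service_call, &exceeded) != 0)
    {
        stack_base = NULL;
        return -1;
    }
    pthread_join(helper, NULL);             /* main stays suspended meanwhile */
    *base_restored = (stack_base == &main_marker);
    stack_base = NULL;
    return exceeded;
}
```

With the base switched for the duration of the call, the helper thread passes the check, and the main thread finds its own base untouched when it resumes.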
> Signals are the only way the kernel can pass control unexpectedly so if
> you handle those, postgres would never know it's threaded. I do wonder
> if there are any other assumptions made...
>
> Have a nice day,


You too. And thanks for all your input.

Regards,
Thomas Hallgren

Re: stack depth limit exceeded problem.

From

Martijn van Oosterhout

Date:

24 September 2005, 10:54:21

On Sat, Sep 24, 2005 at 02:38:35PM +0200, Thomas Hallgren wrote:
> Martijn van Oosterhout wrote:
> > Linux has sigaltstack so you can catch the stack overflow signal (and
> > other signals obviously, but that's its main use), but it's not terribly
> > portable.
> >
> I rely on the signal handler that the JVM uses for page-faults (which a
> stack overflow generally amounts to) and FPE exceptions so I know that
> they will generate Java exceptions in a controlled way (which I in turn
> translate to elog(ERROR) on the main thread).

Well, actually, what I was thinking is if someone sends a -INT or -TERM
to the backend, which thread will catch it? You have to block it in
every thread except the one you want to catch it in if you want to
control it. This means that for any signal handler that PostgreSQL
installs, you need to intercept it with a wrapper function to make sure
it runs in the right stack.

Actually, while running backend code, you're probably fine since the
elog stuff will handle it. But if a signal is received while the JVM is
running, the signal handler will get the stack of the JVM. Now,
PostgreSQL's signal handlers tend not to do much so you may be safe.
They tend not to throw errors, but who knows...

Still, this is all solvable I think...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/

Re: stack depth limit exceeded problem.

From

Thomas Hallgren

Date:

25 September 2005, 08:14:19

Martijn van Oosterhout wrote:

>>I rely on the signal handler that the JVM uses for page-faults (which a 
>>stack overflow generally amounts to) and FPE exceptions so I know that 
>>they will generate Java exceptions in a controlled way (which I in turn 
>>translate to elog(ERROR) on the main thread).
>
>Well, actually, what I was thinking is if someone sends a -INT or -TERM
>to the backend, which thread will catch it? You have to block it in
>every thread except the one you want to catch it in if you want to
>control it. This means that for any signal handler that PostgreSQL
>installs, you need to intercept it with a wrapper function to make sure
>it runs in the right stack.
>
>Actually, while running backend code, you're probably fine since the
>elog stuff will handle it. But if a signal is received while the JVM is
>running, the signal handler will get the stack of the JVM. Now,
>PostgreSQL's signal handlers tend not to do much so you may be safe.
>They tend not to throw errors, but who knows...
>
>Still, this is all solvable I think...
>
Yes, the signal handling in PL/Java needs a bit more work. Interrupts 
don't work well when using PL/Java at present. This is what I plan to 
do (I think this is what you mean too, right?).

Many threads are spawned by the JVM and will never enter the backend. I 
can't control this and I can't add thread initialization code. Hence, I 
have no way of blocking signals on a per-thread basis the normal way. 
Instead, all PostgreSQL handlers that might break when called from 
another thread must be replaced by a wrapper that checks the interrupted 
thread. Since an arbitrary thread will receive the signal, the wrapper 
must sometimes dispatch the signal to another thread. The final receiver 
of the signal must be either the thread that currently executes a 
backend request, or if no such thread exists, the main thread.
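
A minimal sketch of such a wrapper, assuming POSIX threads and a platform where pthread_kill may be called from a signal handler. This is not PL/Java code: backend_thread, original_handler, forwarding_wrapper and demo_signal_forwarding are hypothetical names, and SIGUSR1 merely stands in for whichever signals PostgreSQL handles.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* The thread that must end up running the real handler: the one currently
 * servicing a backend request, or the main thread if there is none. */
static pthread_t backend_thread;

static volatile sig_atomic_t forwarded = 0;            /* wrapper passed the signal on */
static volatile sig_atomic_t handled_on_backend = 0;   /* real handler ran on the right thread */

/* Stand-in for a handler that PostgreSQL installed. */
static void original_handler(int signo)
{
    (void) signo;
    handled_on_backend = 1;
}

/* Wrapper installed in place of the original handler. */
static void forwarding_wrapper(int signo)
{
    if (pthread_equal(pthread_self(), backend_thread))
        original_handler(signo);
    else
    {
        forwarded = 1;
        pthread_kill(backend_thread, signo);    /* redirect to the designated thread */
    }
}

/* Plays a JVM-spawned thread that happens to receive the signal. */
static void *jvm_thread(void *unused)
{
    (void) unused;
    raise(SIGUSR1);                             /* delivered to this thread */
    return NULL;
}

/* Returns 1 if the signal first arrived on the wrong thread and still ended
 * up in the original handler on the calling thread, 0 if not, -1 on a
 * setup failure. */
int demo_signal_forwarding(void)
{
    struct sigaction wrapper, previous;
    struct timespec pause = { 0, 1000000 };     /* 1 ms */
    pthread_t helper;
    int i;

    forwarded = 0;
    handled_on_backend = 0;
    backend_thread = pthread_self();

    memset(&wrapper, 0, sizeof wrapper);
    wrapper.sa_handler = forwarding_wrapper;
    sigemptyset(&wrapper.sa_mask);
    if (sigaction(SIGUSR1, &wrapper, &previous) != 0)
        return -1;

    if (pthread_create(&helper, NULL, jvm_thread, NULL) != 0)
    {
        sigaction(SIGUSR1, &previous, NULL);
        return -1;
    }
    pthread_join(helper, NULL);

    /* Delivery to this thread is asynchronous; allow it a bounded moment. */
    for (i = 0; i < 1000 && !handled_on_backend; i++)
        nanosleep(&pause, NULL);

    sigaction(SIGUSR1, &previous, NULL);        /* put the old disposition back */
    return forwarded && handled_on_backend;
}
```

In the real thing, backend_thread would have to be updated under the same lock that serializes backend access, which the sketch leaves out.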

PL/Java will be limited to platforms that support dispatching signals 
to specific threads. I don't consider that a limitation. I think many 
JVMs have the same restriction.

Regards,
Thomas Hallgren