Re: Reasoning behind process instead of thread based - Mailing list pgsql-general

From Thomas Hallgren
Subject Re: Reasoning behind process instead of thread based
Msg-id 418025D3.5090205@mailblocks.com
In response to Re: Reasoning behind process instead of thread based  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Reasoning behind process instead of thread based  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom Lane wrote:
> Right.  Depending on your OS you may be able to catch a signal that
> would kill a thread and keep it from killing the whole process, but
> this still leaves you with a process memory space that may or may not
> be corrupted.  Continuing in that situation is not cool, at least not
> according to the Postgres project's notions of reliable software design.
>
There can't be any "may or may not" involved. You must of course know
what went wrong.

It is very common that you either get a null pointer exception (attempt
to access address zero), that your stack hits a write-protected page
(stack overflow), or that you get some sort of arithmetic exception.
These conditions can be trapped and gracefully handled. The signal
handler must be able to check the cause of the exception. This usually
involves stack unwinding and investigating the state of the CPU at the
point where the signal was generated. The process must be terminated if
the reason is not a recognized one.
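
The decision such a handler has to make can be sketched as a small
whitelist lookup. This is only a model of the logic in Python, not a
real signal handler: the Fault record, the RECOVERABLE table, the cause
labels and classify_fault are all hypothetical names, and the cause
label stands in for whatever the handler would derive from the signal
information and the CPU state.

```python
import signal
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fault:
    """Hypothetical summary of what the handler learned about a signal."""
    signo: int                     # signal number that was delivered
    cause: str                     # assumed label derived from signal info
    address: Optional[int] = None  # faulting address, if there is one


# Assumed whitelist: the three recognized conditions named above.
RECOVERABLE = {
    (signal.SIGSEGV, "null_dereference"),
    (signal.SIGSEGV, "stack_guard_page"),
    (signal.SIGFPE, "arithmetic"),
}


def classify_fault(fault: Fault) -> str:
    """Return 'abort_thread' for a recognized cause, else 'terminate_process'."""
    if (fault.signo, fault.cause) not in RECOVERABLE:
        # Unrecognized reason: the whole process must go down.
        return "terminate_process"
    if fault.cause == "null_dereference" and fault.address != 0:
        # The label claims address zero but the evidence disagrees.
        return "terminate_process"
    return "abort_thread"
```

Anything that is not on the whitelist, or that contradicts its own
evidence, falls through to terminating the process.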

Out of memory can be managed using thread-local allocation areas
(similar to MemoryContext) and killing a thread based on some criterion
when no more memory is available. The victim could be the thread that
encountered the problem, the thread that consumes the most memory, the
thread that was least recently active, or something else.
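
As a rough sketch of the selection step, the three criteria mentioned
above could look like this. ThreadInfo, pick_victim and the policy
names are hypothetical; the per-thread byte counts are assumed to come
from the thread-local allocation areas.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThreadInfo:
    """Hypothetical bookkeeping kept for each worker thread."""
    name: str
    bytes_allocated: int   # total held in the thread's allocation area
    last_active: float     # timestamp of the thread's last activity


def pick_victim(threads: List[ThreadInfo], requester: ThreadInfo,
                policy: str = "requester") -> ThreadInfo:
    """Choose which thread to kill when no more memory is available."""
    if policy == "requester":
        # The thread that encountered the problem.
        return requester
    if policy == "largest":
        # The thread that consumes the most memory.
        return max(threads, key=lambda t: t.bytes_allocated)
    if policy == "least_recently_active":
        # The thread with the oldest activity timestamp.
        return min(threads, key=lambda t: t.last_active)
    raise ValueError(f"unknown policy: {policy}")
```

Releasing the victim's whole allocation area at once is what makes
this workable, since nothing it allocated has to be tracked down
piece by piece.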

> It should be pointed out that when we get a hard backend crash, Postgres
> will forcibly terminate all the backends and reinitialize; which means
> that in terms of letting concurrent sessions keep going, we are not any
> more forgiving than a single-address-space multithreaded server.  The
> real bottom line here is that we have good prospects of confining the
> damage done by the failed process: it's unlikely that anything bad will
> happen to already-committed data on disk or that any other sessions will
> return wrong answers to their clients before we are able to kill them.
> It'd be a lot harder to say that with any assurance for a multithreaded
> server.
>
I'm not sure I follow. You will be able to bring all threads of one
process to a halt much faster than you can kill a number of external
processes. Killing the multithreaded process is more like pulling the plug.

Regards,
Thomas Hallgren
