Re: What happens if I create new threads from within a postgresql function? - Mailing list pgsql-general

From Merlin Moncure
Subject Re: What happens if I create new threads from within a postgresql function?
Msg-id CAHyXU0yXEVESLCR0H9ti_Md+JGj7ZmNnnMhnZUzxgV+sHcTQ5A@mail.gmail.com
In response to What happens if I create new threads from within a postgresql function?  (Seref Arikan <serefarikan@kurumsalteknoloji.com>)
Responses Re: What happens if I create new threads from within a postgresql function?  (Bruce Momjian <bruce@momjian.us>)
Re: What happens if I create new threads from within a postgresql function?  (Seref Arikan <serefarikan@kurumsalteknoloji.com>)
List pgsql-general
On Mon, Feb 18, 2013 at 5:10 AM, Seref Arikan
<serefarikan@kurumsalteknoloji.com> wrote:
> Greetings,
> What would happen if I create multiple threads from within a postgresql
> function written in C?
> I have the opportunity to do parallel processing on binary data, and I need
> to create multiple threads to do that.
> If I can ensure that all my threads complete their work before I exit my
> function, would this cause any trouble ?
> I am aware of postgresql's single threaded nature when executing queries,
> but is this a limitation for custom multi threaded code use in C based
> functions?
> I can't see any problems other than my custom spawn threads living beyond my
> function's execution and memory/resource allocation issues, but if I can
> handle them, should not I be safe?
>
> I believe I've seen someone applying a similar principle to use GPUs with
> postgresql, and I'm quite interested in giving this a try, unless I'm
> missing something.

Some things immediately jump to mind:
*) backend library routines are not multi-thread safe.  Notably, the
SPI interface and the memory allocator, but potentially anything.  So
your spawned threads should avoid calling the backend API.  I don't
even know if it's safe to call malloc.

*) postgres exception handling can burn you, so I'd be stricter than
"before I exit my function"...really, you need to make sure threads
terminate before any potentially exception throwing backend routine
fires, which is basically all of them including palloc memory
allocation and interrupt checking.  So, we must understand that:

While your threads are executing, your query can't be cancelled; the
only way to stop it is a hard kill, which will take the database down.
If you're OK with that risk, then go for it.  If you're not, then I'd
think about sending the bytea through a protocol to a threaded
processing server running outside of the database.  More work and
slower (protocol overhead), but much more robust.
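
For the in-process route, the discipline described above could look
roughly like the sketch below.  It is only an illustration under
assumptions: the names (slice_job, sum_slice, parallel_byte_sum) are
made up, byte summing stands in for the real work, and the input is
assumed to be a plain buffer obtained before any thread starts.  The
workers touch no backend API and do not allocate; everything lives on
the caller's stack, and every started thread is joined before the
function returns, so the caller makes no backend calls while threads
are alive.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical work item: one disjoint slice of a plain byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t         len;
    uint64_t       sum;   /* written only by the thread that owns the job */
} slice_job;

/* Worker: pure computation, no backend calls, no allocation. */
static void *sum_slice(void *arg)
{
    slice_job *job = arg;
    uint64_t acc = 0;

    for (size_t i = 0; i < job->len; i++)
        acc += job->data[i];
    job->sum = acc;
    return NULL;
}

/*
 * Sum the bytes of data[0..len) using nthreads workers.
 * Returns 0 on success, -1 on bad arguments or thread creation failure.
 * All started threads are joined before this function returns.
 */
int parallel_byte_sum(const uint8_t *data, size_t len,
                      int nthreads, uint64_t *out)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    slice_job jobs[MAX_THREADS];
    int started = 0;
    int rc = 0;

    if (out == NULL || nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    size_t chunk = len / (size_t) nthreads;

    for (int i = 0; i < nthreads; i++) {
        size_t off = (size_t) i * chunk;

        jobs[i].data = data + off;
        /* the last thread also takes the remainder */
        jobs[i].len = (i == nthreads - 1) ? len - off : chunk;
        jobs[i].sum = 0;
        if (pthread_create(&tids[i], NULL, sum_slice, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Join everything that was started, even on the failure path. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    if (rc == 0) {
        uint64_t total = 0;

        for (int i = 0; i < nthreads; i++)
            total += jobs[i].sum;
        *out = total;
    }
    return rc;
}
```

In a backend function you would call something like this between
fetching the input and building the result, and only call palloc or
anything else that can throw after it has returned.  Note this does
nothing about the cancellation problem; it only keeps the threads away
from the backend.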

merlin
