Thread: Multithread Query Planner

Multithread Query Planner

From: Frederico
Hi folks.

Is there any restriction on creating and starting threads inside Postgres?

I'm trying to develop a multithreaded planner, and sometimes a memory access exception is raised.

I'm debugging the code to see whether there is a bug in the planner, but so far I haven't found one. I tried using the
same memory context as the root process, and also creating a new context for each new thread, but neither worked.

Any tips?

Regards,

Fred

Sent from my iPad

Re: Multithread Query Planner

From: Christopher Browne
On Fri, Jan 13, 2012 at 3:14 PM, Frederico <zepfred@gmail.com> wrote:
> Hi folks.
>
> Is there any restriction on creating and starting threads inside Postgres?
>
> I'm trying to develop a multithreaded planner, and sometimes a memory access exception is raised.
>
> I'm debugging the code to see whether there is a bug in the planner, but so far I haven't found one. I tried using
> the same memory context as the root process, and also creating a new context for each new thread, but neither worked.
>
> Any tips?

Yes, don't try to use threads.


<http://wiki.postgresql.org/wiki/Developer_FAQ#Why_don.27t_you_use_threads.2C_raw_devices.2C_async-I.2FO.2C_.3Cinsert_your_favorite_wizz-bang_feature_here.3E.3F>

... threads are not currently used instead of multiple processes for
backends because:

- Historically, threads were poorly supported and buggy.
- An error in one backend can corrupt other backends if they're threads
  within a single process.
- Speed improvements using threads are small compared to the remaining
  backend startup time.
- The backend code would be more complex.
- Terminating backend processes allows the OS to cleanly and quickly
  free all resources, protecting against memory and file descriptor
  leaks and making backend shutdown cheaper and faster.
- Debugging threaded programs is much harder than debugging worker
  processes, and core dumps are much less useful.
- Sharing of read-only executable mappings and the use of
  shared_buffers means processes, like threads, are very memory
  efficient.
- Regular creation and destruction of processes helps protect against
  memory fragmentation, which can be hard to manage in long-running
  processes.

There's a pretty large burden of reasons *not* to use threads, and
while some of them have diminished in importance, most have not.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: Multithread Query Planner

From: Dimitri Fontaine
Christopher Browne <cbbrowne@gmail.com> writes:
> Yes, don't try to use threads.
>
>
<http://wiki.postgresql.org/wiki/Developer_FAQ#Why_don.27t_you_use_threads.2C_raw_devices.2C_async-I.2FO.2C_.3Cinsert_your_favorite_wizz-bang_feature_here.3E.3F>
>
> ... threads are not currently used instead of multiple processes for
> backends because:

I would only add that the backend code is really written from a
process-based perspective, with a giant number of private variables
that are in fact global variables.

Trying to “clean” that out in order to get to threads… wow.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Multithread Query Planner

From: Frederico
Does this mean it's possible to use threads?

Regards,

Fred

Sent from my iPad

On Jan 13, 2012, at 20:47, Dimitri Fontaine <dimitri@2ndQuadrant.fr> wrote:

> Christopher Browne <cbbrowne@gmail.com> writes:
>> Yes, don't try to use threads.
>>
>>
<http://wiki.postgresql.org/wiki/Developer_FAQ#Why_don.27t_you_use_threads.2C_raw_devices.2C_async-I.2FO.2C_.3Cinsert_your_favorite_wizz-bang_feature_here.3E.3F>
>>
>> ... threads are not currently used instead of multiple processes for
>> backends because:
>
> I would only add that the backend code is really written from a
> process-based perspective, with a giant number of private variables
> that are in fact global variables.
>
> Trying to “clean” that out in order to get to threads… wow.
>
> Regards,
> --
> Dimitri Fontaine
> http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Multithread Query Planner

From: Dimitri Fontaine
Frederico <zepfred@gmail.com> writes:
> Does this mean it's possible to use threads?

The short answer is “no”.
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Multithread Query Planner

From: Thomas Munro
On 13 January 2012 20:14, Frederico <zepfred@gmail.com> wrote:
> I'm trying to develop a multithreaded planner, and sometimes a memory access exception is raised.

I was a bit confused about what you are trying to do -- somehow
use concurrency during the planning phase, or during
execution (maybe having produced concurrency-aware plans)?

Here is my naive thought: Since threads are not really an option
as explained by others, you could use helper processes to
implement executor concurrency, by replacing nodes with proxies
that talk to helper processes (perhaps obtained from a
per-cluster pool).  The proxy nodes would send their child
subplans and the information needed to get the appropriate
snapshot, and receive tuples via some kind of IPC (perhaps
shmem-backed queues or pipes or whatever).
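
To make that a bit more concrete, here is a minimal standalone sketch
of the proxy/helper pattern, using nothing but fork() and pipes -- not
backend code; the "subplan" is just a string and the "tuples" are
plain ints:

/* Illustrative only: a "proxy" hands a made-up subplan string to a helper
 * process over one pipe and reads integer "tuples" back over another. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

static void helper_main(int request_fd, int result_fd)
{
    char    subplan[256];
    ssize_t n = read(request_fd, subplan, sizeof(subplan) - 1);

    if (n < 0)
        exit(1);
    subplan[n] = '\0';

    /* Pretend to execute the subplan: emit a few integer "tuples". */
    for (int tuple = 0; tuple < 5; tuple++)
        if (write(result_fd, &tuple, sizeof(tuple)) != sizeof(tuple))
            exit(1);
    exit(0);
}

int main(void)
{
    int     request_pipe[2];
    int     result_pipe[2];
    pid_t   helper;

    if (pipe(request_pipe) < 0 || pipe(result_pipe) < 0)
        return 1;

    helper = fork();
    if (helper < 0)
        return 1;
    if (helper == 0)
    {
        /* Child: the helper reads one request and streams results back. */
        close(request_pipe[1]);
        close(result_pipe[0]);
        helper_main(request_pipe[0], result_pipe[1]);
    }

    /* Parent: the "proxy node" ships the subplan and pulls tuples. */
    close(request_pipe[0]);
    close(result_pipe[1]);

    const char *subplan = "hypothetical serialized subplan";

    if (write(request_pipe[1], subplan, strlen(subplan)) < 0)
        return 1;
    close(request_pipe[1]);

    int     tuple;

    while (read(result_pipe[0], &tuple, sizeof(tuple)) == sizeof(tuple))
        printf("proxy received tuple %d\n", tuple);

    close(result_pipe[0]);
    waitpid(helper, NULL, 0);
    return 0;
}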

A common use case in other RDBMSs is running queries over
multiple partitions using parallelism.  In the above scheme that
could be done if the children of Append nodes were candidates for
emigration to helper processes.  OTOH there are some plans
produced by UNION and certain kinds of OR that could probably
benefit.

There may be some relevant stuff in PostgreSQL-XC?


Re: Multithread Query Planner

From: Yeb Havinga
On 2012-01-13 21:14, Frederico wrote:
> Hi folks.
>
> Is there any restriction on creating and starting threads inside Postgres?
>
> I'm trying to develop a multithreaded planner, and sometimes a memory access exception is raised.
>
> I'm debugging the code to see whether there is a bug in the planner, but so far I haven't found one. I tried using
> the same memory context as the root process, and also creating a new context for each new thread, but neither worked.
>
> Any tips?

Not sure if it is of any use to you, but the VLDB paper 'Parallelizing 
Query Optimization' (http://www.vldb.org/pvldb) describes an experimental 
implementation in PostgreSQL.

regards,
Yeb



Re: Multithread Query Planner

From: Merlin Moncure
On Fri, Jan 13, 2012 at 2:29 PM, Christopher Browne <cbbrowne@gmail.com> wrote:
> On Fri, Jan 13, 2012 at 3:14 PM, Frederico <zepfred@gmail.com> wrote:
>> Hi folks.
>>
>> Is there any restriction on creating and starting threads inside Postgres?
>>
>> I'm trying to develop a multithreaded planner, and sometimes a memory access exception is raised.
>>
>> I'm debugging the code to see whether there is a bug in the planner, but so far I haven't found one. I tried using
>> the same memory context as the root process, and also creating a new context for each new thread, but neither worked.
>>
>> Any tips?
>
> Yes, don't try to use threads.
>
>
<http://wiki.postgresql.org/wiki/Developer_FAQ#Why_don.27t_you_use_threads.2C_raw_devices.2C_async-I.2FO.2C_.3Cinsert_your_favorite_wizz-bang_feature_here.3E.3F>
>
> ... threads are not currently used instead of multiple processes for
> backends because:

Yes, but the OP is proposing to use multiple threads inside the forked
execution process.  That's a completely different beast.  Many other
databases support parallel execution of a single query and it might
very well be better/easier to do that with threads.

merlin


Re: Multithread Query Planner

From: Robert Haas
On Mon, Jan 23, 2012 at 2:45 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Yes, but OP is proposing to use multiple threads inside the forked
> execution process.  That's a completely different beast.  Many other
> databases support parallel execution of a single query and it might
> very well be better/easier to do that with threads.

I doubt it.  Almost nothing in the backend is thread-safe.  You can't
acquire a heavyweight lock, a lightweight lock, or a spinlock. You
can't do anything that might elog() or ereport().  None of those
things are reentrant.  Consequently, you can't do anything that
involves reading or pinning a buffer, making a syscache lookup, or
writing WAL.  You can't even do something like parallelize the
qsort() of a chunk of data that's already been read into a private
buffer... because you'd have to call the comparison functions for the
data type, and they might elog() or ereport().  Of course, in certain
special cases (like int4) you could make it safe, but it's hard for me
to imagine anyone wanting to go to that amount of effort for such a
small payoff.
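
(To illustrate how narrow that int4 special case is, here is a
standalone sketch -- plain pthreads and qsort(), nothing
Postgres-specific.  It works only because the comparator touches
nothing but the two ints: no palloc, no elog()/ereport(), no catalog
lookups.  The moment the comparator has to call a real datatype
comparison function, the approach falls apart.)

/* Illustrative only: sort the two halves of an int array in parallel and
 * merge them.  Safe solely because the comparator is trivial. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static int
compare_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

struct chunk
{
    int    *data;
    size_t  n;
};

static void *
sort_chunk(void *arg)
{
    struct chunk *c = arg;

    qsort(c->data, c->n, sizeof(int), compare_int);
    return NULL;
}

int main(void)
{
    int         data[] = {9, 3, 7, 1, 8, 2, 6, 4, 5, 0};
    size_t      n = sizeof(data) / sizeof(data[0]);
    size_t      half = n / 2;
    struct chunk left = {data, half};
    struct chunk right = {data + half, n - half};
    pthread_t   t1, t2;
    int         result[10];
    size_t      i = 0, j = half, k = 0;

    pthread_create(&t1, NULL, sort_chunk, &left);
    pthread_create(&t2, NULL, sort_chunk, &right);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Merge the two sorted runs. */
    while (i < half && j < n)
        result[k++] = (data[i] <= data[j]) ? data[i++] : data[j++];
    while (i < half)
        result[k++] = data[i++];
    while (j < n)
        result[k++] = data[j++];

    for (k = 0; k < n; k++)
        printf("%d ", result[k]);
    printf("\n");
    return 0;
}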

If we're going to do parallel query in PG, and I think we are going to
need to do that eventually, we're going to need a system where large
chunks of work can be handed off, as in the oft-repeated example of
parallelizing an append node by executing multiple branches
concurrently.  That's where the big wins are.  And that means either
overhauling the entire backend to make it thread-safe, or using
multiple backends.  The latter will be hard, but it'll still be a lot
easier than the former.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Multithread Query Planner

From: Tom Lane
Robert Haas <robertmhaas@gmail.com> writes:
> I doubt it.  Almost nothing in the backend is thread-safe.  You can't
> acquire a heavyweight lock, a lightweight lock, or a spinlock. You
> can't do anything that might elog() or ereport().  None of those
> things are reentrant.

Not to mention palloc, another extremely fundamental and non-reentrant
subsystem.

Possibly we could work on making all that stuff re-entrant, but it would
be a huge amount of work for a distant and uncertain payoff.
        regards, tom lane


Re: Multithread Query Planner

From: Robert Haas
On Tue, Jan 24, 2012 at 11:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I doubt it.  Almost nothing in the backend is thread-safe.  You can't
>> acquire a heavyweight lock, a lightweight lock, or a spinlock. You
>> can't do anything that might elog() or ereport().  None of those
>> things are reentrant.
>
> Not to mention palloc, another extremely fundamental and non-reentrant
> subsystem.
>
> Possibly we could work on making all that stuff re-entrant, but it would
> be a huge amount of work for a distant and uncertain payoff.

Right.  I think it makes more sense to try to get parallelism working
first with the infrastructure we have.  Converting to use threading,
if we ever do it at all, should be something we view as a later
performance optimization.  But I suspect we won't want to do it
anyway; I think there will be easier ways to get where we want to be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Multithread Query Planner

From: "Pierre C"
>> Not to mention palloc, another extremely fundamental and non-reentrant
>> subsystem.
>>
>> Possibly we could work on making all that stuff re-entrant, but it would
>> be a huge amount of work for a distant and uncertain payoff.

> Right.  I think it makes more sense to try to get parallelism working
> first with the infrastructure we have.  Converting to use threading,
> if we ever do it at all, should be something we view as a later
> performance optimization.  But I suspect we won't want to do it
> anyway; I think there will be easier ways to get where we want to be.

Multithreading got fashionable with the arrival of the dual-core CPU a few
years ago. However, multithreading as it is used currently has a huge
problem: usually, threads share all of their memory. This opens the door
to an infinite number of hard-to-find bugs and, more importantly, defeats
the purpose.

"Re-entrant palloc()" is nonsense. Suppose you can make a reentrant
palloc() which scales OK at 2 threads thanks to a cleverly placed atomic
instruction. How is it going to scale on 64 cores ? On HP's new 1000-core
ARM server with non-uniform memory access ? Probably it would suck very
very badly... not to mention the horror of multithreaded exception-safe
deallocation when 1 thread among many blows up on an error...

For the ultimate in parallelism, ask an FPGA guy. Is he using shared memory
to wire together his 12000 DSP blocks? Nope, he's using isolated
Processes which share nothing and communicate through FIFOs and hardware
message passing. Like shell pipes, basically. Or Erlang.

Good parallelism = reduce shared state and communicate through
data/message channels.

Shared-everything multithreading is going to be in a lot of trouble on
future many-core machines. Incidentally, Postgres, with its Processes,
sharing only what is needed, has a good head start...

With more and more cores coming, you guys are going to have to fight to
reduce the quantity of shared state between processes, not augment it by
using shared memory threads !...

Say you want to parallelize sorting.
Sorting is a black-box with one input data pipe and one output data pipe.
Data pipes are good for parallelism, just like FIFOs. FPGA guys love black
boxes with FIFOs between them.

Say you manage to send tuples through a FIFO like zeromq. Now you can even
run the sort on another machine and allow it to use all the RAM if you
like. Now split the black box into two black boxes (qsort and merge),
instantiate as many qsort boxes as necessary, and connect them together
with pipes. Run some boxes on some of this machine's cores, some other
boxes on another machine, etc. That would be very flexible (and scalable).
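
A rough standalone sketch of that pipeline -- two "qsort boxes" as
separate processes, each streaming its sorted chunk through an ordinary
pipe to a "merge box", no shared memory anywhere (plain POSIX pipes
here rather than zeromq, and plain ints standing in for tuples):

/* Illustrative only: two sort processes feed a merging parent. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

static int
compare_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* The "qsort box": sort a chunk and write it, one int at a time, to fd. */
static void
sort_worker(int fd, int *chunk, size_t n)
{
    qsort(chunk, n, sizeof(int), compare_int);
    for (size_t i = 0; i < n; i++)
        if (write(fd, &chunk[i], sizeof(int)) != sizeof(int))
            exit(1);
    exit(0);
}

/* Pull the next value from a stream; returns 0 at end of stream. */
static int
next_val(int fd, int *out)
{
    return read(fd, out, sizeof(int)) == sizeof(int);
}

int main(void)
{
    int     left[] = {9, 3, 7, 1, 8};
    int     right[] = {2, 6, 4, 5, 0};
    int     pipes[2][2];
    pid_t   pids[2];

    for (int w = 0; w < 2; w++)
    {
        if (pipe(pipes[w]) < 0 || (pids[w] = fork()) < 0)
            return 1;
        if (pids[w] == 0)
        {
            close(pipes[w][0]);
            sort_worker(pipes[w][1], w == 0 ? left : right, 5);
        }
        close(pipes[w][1]);     /* parent keeps only the read ends */
    }

    /* The "merge box": take from whichever stream has the smaller head. */
    int     a, b;
    int     have_a = next_val(pipes[0][0], &a);
    int     have_b = next_val(pipes[1][0], &b);

    while (have_a || have_b)
    {
        if (have_a && (!have_b || a <= b))
        {
            printf("%d ", a);
            have_a = next_val(pipes[0][0], &a);
        }
        else
        {
            printf("%d ", b);
            have_b = next_val(pipes[1][0], &b);
        }
    }
    printf("\n");

    waitpid(pids[0], NULL, 0);
    waitpid(pids[1], NULL, 0);
    return 0;
}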

Of course the black box has a small backdoor: some comparison functions
can access shared state, which is basically *the* issue (not reentrant
stuff, which you do not need).


Re: Multithread Query Planner

From: "Fred&Dani&Pandora"
Ok, thanks.

Regards,

Fred

2012/1/24 Robert Haas <robertmhaas@gmail.com>
On Tue, Jan 24, 2012 at 11:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I doubt it.  Almost nothing in the backend is thread-safe.  You can't
>> acquire a heavyweight lock, a lightweight lock, or a spinlock. You
>> can't do anything that might elog() or ereport().  None of those
>> things are reentrant.
>
> Not to mention palloc, another extremely fundamental and non-reentrant
> subsystem.
>
> Possibly we could work on making all that stuff re-entrant, but it would
> be a huge amount of work for a distant and uncertain payoff.

Right.  I think it makes more sense to try to get parallelism working
first with the infrastructure we have.  Converting to use threading,
if we ever do it at all, should be something we view as a later
performance optimization.  But I suspect we won't want to do it
anyway; I think there will be easier ways to get where we want to be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Multithread Query Planner

From: Robert Haas
On Fri, Jan 27, 2012 at 2:56 PM, Pierre C <lists@peufeu.com> wrote:
>> Right.  I think it makes more sense to try to get parallelism working
>> first with the infrastructure we have.  Converting to use threading,
>> if we ever do it at all, should be something we view as a later
>> performance optimization.  But I suspect we won't want to do it
>> anyway; I think there will be easier ways to get where we want to be.
>
> Multithreading got fashionable with the arrival of the dual-core CPU a few
> years ago. However, multithreading as it is used currently has a huge
> problem: usually, threads share all of their memory. This opens the door to
> an infinite number of hard-to-find bugs and, more importantly, defeats the
> purpose.
>
> "Re-entrant palloc()" is nonsense. Suppose you can make a reentrant palloc()
> which scales OK at 2 threads thanks to a cleverly placed atomic instruction.
> How is it going to scale on 64 cores ? On HP's new 1000-core ARM server with
> non-uniform memory access ? Probably it would suck very very badly... not to
> mention the horror of multithreaded exception-safe deallocation when 1
> thread among many blows up on an error...

There are academic papers out there on how to build a thread-safe,
highly concurrent memory allocator.  You seem to be assuming that
everyone doing allocations would need to compete for access to a
single freelist, or something like that, which is simply not true.  A
lot of effort and study has been put into figuring out how to get past
bottlenecks in this area, because there is a lot of multi-threaded
code out there that needs to surmount these problems.  I don't believe
that the problem is that it can't be done, but rather that we haven't
done it.
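
As a minimal sketch of the per-thread-cache principle (real concurrent
allocators such as tcmalloc or jemalloc are far more sophisticated than
this): each thread carves small allocations out of its own thread-local
arena, so the common allocation path never takes a lock and never
touches another thread's freelist.

/* Illustrative only: per-thread arenas, no shared freelist. */
#include <pthread.h>
#include <stdio.h>

#define ARENA_SIZE (64 * 1024)

static _Thread_local char   arena[ARENA_SIZE];
static _Thread_local size_t arena_used = 0;

static void *
arena_alloc(size_t size)
{
    size = (size + 7) & ~(size_t) 7;    /* keep 8-byte alignment */
    if (arena_used + size > ARENA_SIZE)
        return NULL;    /* a real allocator would chain in a new chunk */

    void *p = arena + arena_used;

    arena_used += size;
    return p;
}

static void *
worker(void *arg)
{
    long    id = (long) arg;
    long    total = 0;

    /* Thousands of allocations, zero lock acquisitions. */
    for (int i = 0; i < 1000; i++)
    {
        int *p = arena_alloc(sizeof(int));

        if (p == NULL)
            break;
        *p = i;
        total += *p;
    }
    printf("thread %ld: allocated only from its own arena, sum = %ld\n",
           id, total);
    return NULL;
}

int main(void)
{
    pthread_t   threads[4];

    for (long i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, worker, (void *) i);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);
    return 0;
}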

> For the ultimate in parallelism, ask an FPGA guy. Is he using shared memory
> to wire together his 12000 DSP blocks? Nope, he's using isolated Processes
> which share nothing and communicate through FIFOs and hardware message
> passing. Like shell pipes, basically. Or Erlang.

I'm not sure we can conclude much from this example.  The programming
style of people using FPGAs is probably governed by the nature of the
interface and the type of computation they are doing rather than
anything else.

> Good parallelism = reduce shared state and communicate through data/message
> channels.
>
> Shared-everything multithreading is going to be in a lot of trouble on
> future many-core machines. Incidentally, Postgres, with its Processes,
> sharing only what is needed, has a good head start...
>
> With more and more cores coming, you guys are going to have to fight to
> reduce the quantity of shared state between processes, not augment it by
> using shared memory threads !...

I do agree that it's important to reduce shared state.  We've seen
some optimizations this release cycle that work precisely because they
cut down on the rate at which cache lines must be passed between
cores, and it's pretty clear that we need to go farther in that
direction.  On the other hand, I think it's a mistake to confuse the
programming model with the amount of shared state.  In a
multi-threaded programming model there is likely to be a lot more
memory that is technically "shared" in the sense that any thread could
technically access it.  But if the application is coded in such a way
that actual sharing is minimal, then it's not necessarily any worse
than a process model as far as concurrency is concerned.  Threading
provides a couple of key advantages which, with our process model, we
can't get: it avoids the cost of a copy-on-write operation every time
a child is forked, and it allows arbitrary amounts of memory rather
than being limited to a single shared memory segment that must be
sized in advance.  The major disadvantage is really with robustness,
not performance, I think: in a threaded environment, with a shared
address space, the consequences of a random memory stomp will be less
predictable.

> Say you want to parallelize sorting.
> Sorting is a black-box with one input data pipe and one output data pipe.
> Data pipes are good for parallelism, just like FIFOs. FPGA guys love black
> boxes with FIFOs between them.
>
> Say you manage to send tuples through a FIFO like zeromq. Now you can even
> run the sort on another machine and allow it to use all the RAM if you like.
> Now split the black box into two black boxes (qsort and merge), instantiate as
> many qsort boxes as necessary, and connect them together with pipes. Run
> some boxes on some of this machine's cores, some other boxes on another
> machine, etc. That would be very flexible (and scalable).
>
> Of course the black box has a small backdoor: some comparison functions can
> access shared state, which is basically *the* issue (not reentrant stuff,
> which you do not need).

Well, you do need reentrant stuff, if the comparator does anything
non-trivial.  It's easy to imagine that comparing strings or dates or
whatever is a trivial operation that's done without allocating any
memory or throwing any errors, but it's not really true.  I think the
challenge of using GPU acceleration or JIT or threading or other
things that are used in really high-performance computing is going to
be that a lot of our apparently-trivial operations are actually, well,
not so trivial, because we have error checks, overflow checks,
nontrivial encoding/decoding from the on-disk format, etc.  There's a
tendency to wave that stuff away as peripheral, but I think that's a
mistake.  Someone who knows how to do it can probably write a
multi-threaded, just-in-time-compiled, and/or GPU-accelerated program
in an afternoon that solves pretty complex problems much more quickly
than PostgreSQL, but doing it without throwing away all the error checks,
and on numeric as well as int4, and in a way that's portable to every
architecture we support - ah, well, there's the hard part.
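
Even the int4 case illustrates the point: the "obvious" comparator,
return a - b, is already wrong because the subtraction can overflow.
A small standalone example of the overflow-free version:

/* Illustrative only: even an int comparator needs a little care. */
#include <limits.h>
#include <stdio.h>

static int
compare_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);   /* no subtraction, so no overflow */
}

int main(void)
{
    int lo = INT_MIN;
    int hi = 1;

    printf("compare(INT_MIN, 1) = %d\n", compare_int(&lo, &hi));
    return 0;
}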

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company