Thread: Threads
Hi all,

I am sure many of you would like to delete this message before reading it. Hold on. :-)

There is much talk about threading on this list, and the idea is always deferred for want of robust thread models across all supported platforms and because of doubts about the gains versus the effort required. I think threads are useful in different situations, namely parallelising blocking conditions and using multiple CPUs.

Attached is a framework that I ported to C from a C++ server I have written. It has thread pool and thread implementations based on pthreads. This code expects a minimal pthreads implementation and does not assume anything on the threads' part (e.g. kernel threads or not). I request hackers on this list to take a look at it. It should be easily pluggable into any source code and is released without any strings attached, for any use.

This framework allows the worker function and its argument to be plugged in on the fly. The threads created are sleeping by default and can be woken up as and when required.

I propose to use it incrementally in PostgreSQL. Let's start with I/O. When a block of data is being read, rather than blocking on the read, we can set up a producer-consumer link between two threads. That way we can utilize that I/O time in an overlapped fashion.

Further, threads can be useful when the server has more CPUs. It can spread CPU-intensive work, such as index creation or sorting, across different threads. This way we can utilise idle CPUs, which we cannot do as of now.

There are many advantages that I can see.

1) Threads can be optionally turned on/off depending upon the configuration. So we can keep the existing functionality entirely and convert it piece by piece to a threaded application.

2) For each piece of functionality we can have two code branches: one that does not use threads, i.e. the current code base, and one that can use threads. Agreed, the binary will be a bit bloated, but that would give enormous flexibility. If we find a thread implementation buggy, we simply switch it off, either at compilation or in configuration.

3) Not much effort should be required to plug code into this model.

The idea of using threads is to assign exclusive work to each thread, so that should not require much locking.

In the case of using multiple CPUs, separate functions need to be written that can handle things in a thread-safe fashion. A merger function would also be required to merge the results of the worker threads. That would be entirely additional code.

I would say two threads per CPU per backend should be a reasonable default, as that would cover I/O blocking well. Unless, of course, threading is turned off in the build or in configuration.

Please note that I have tested the code in C++ and my C is rusty. Quite likely there are bugs in the code. I will stress test the code on Monday, but I would like to seek an opinion on this as soon as possible. (Hey, but it compiles clean..)

If required I can post example usage of this code, but I don't think that should be necessary. :-)

Bye
 Shridhar
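A minimal sketch of the "sleeping worker with a pluggable function" idea described above, assuming nothing beyond basic POSIX threads. It is not the framework attached to the post; every name here (worker, worker_submit, and so on) is hypothetical and chosen only for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names throughout; not the framework attached to the post. */
typedef void (*work_fn)(void *);

typedef struct {
    pthread_t       tid;
    pthread_mutex_t lock;
    pthread_cond_t  wake;   /* signalled when work is assigned or on shutdown */
    pthread_cond_t  idle;   /* broadcast when the worker finishes a job */
    work_fn         fn;     /* NULL while the worker has nothing to do */
    void           *arg;
    int             quit;
} worker;

static void *worker_main(void *p)
{
    worker *w = p;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (w->fn == NULL && !w->quit)
            pthread_cond_wait(&w->wake, &w->lock);  /* asleep by default */
        if (w->fn == NULL)
            break;                                  /* quit and nothing pending */

        work_fn job = w->fn;
        void *arg = w->arg;
        pthread_mutex_unlock(&w->lock);
        job(arg);                                   /* run outside the lock */
        pthread_mutex_lock(&w->lock);

        w->fn = NULL;                               /* mark the worker idle */
        pthread_cond_broadcast(&w->idle);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

int worker_start(worker *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    pthread_cond_init(&w->idle, NULL);
    w->fn = NULL;
    w->arg = NULL;
    w->quit = 0;
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Plug in a function and argument on the fly; blocks while the worker is busy. */
void worker_submit(worker *w, work_fn fn, void *arg)
{
    pthread_mutex_lock(&w->lock);
    while (w->fn != NULL)
        pthread_cond_wait(&w->idle, &w->lock);
    w->fn = fn;
    w->arg = arg;
    pthread_cond_signal(&w->wake);                  /* wake the sleeper */
    pthread_mutex_unlock(&w->lock);
}

/* Block until the worker has finished whatever it was given. */
void worker_wait(worker *w)
{
    pthread_mutex_lock(&w->lock);
    while (w->fn != NULL)
        pthread_cond_wait(&w->idle, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

void worker_stop(worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->quit = 1;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->tid, NULL);
    pthread_cond_destroy(&w->idle);
    pthread_cond_destroy(&w->wake);
    pthread_mutex_destroy(&w->lock);
}

void add_one(void *p) { ++*(int *)p; }

/* Submits two jobs to one worker and returns the resulting counter. */
int worker_demo(void)
{
    worker w;
    int counter = 0;

    if (worker_start(&w) != 0)
        return -1;
    worker_submit(&w, add_one, &counter);
    worker_submit(&w, add_one, &counter);
    worker_wait(&w);
    worker_stop(&w);
    return counter;
}
```

A pool would simply be an array of such workers. For the I/O case, one worker could be handed a function that reads the next block while the submitting thread keeps processing the current one; whether that pays off in a real backend is exactly the open question of this thread.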
Please no threading threads!!!

Has anyone calculated the interval and period of "PostgreSQL needs threads" posts?

The *ONLY* advantage threading has over multiple processes is the time and resources used in creating new processes.

That being said, I admit that creating a threaded program is easier than one with multiple processes, but PostgreSQL is already there and working.

Drawbacks to a threaded model:

(1) One thread screws up, the whole process dies. In a multiple process application this is not too much of an issue.

(2) Heap fragmentation. In a long uptime application, such as a database, heap fragmentation is an important consideration. With multiple processes, each process manages its own heap and whatever fragmentation exists goes away when the connection is closed. A threaded server is far more vulnerable because the heap has to manage many threads and the heap has to stay active and unfragmented in perpetuity. This is why Windows applications usually end up using 2G of memory after 3 months of use. (Well, this AND memory leaks)

(3) Stack space. In a threaded application there are more limits to stack usage. I'm not sure, but I bet PostgreSQL would have a problem with a fixed size stack; I know the old ODBC driver did.

(4) Lock contention. The various single points of access in a process have to be serialized for multiple threads. Heap allocation, deallocation, etc. all have to be managed. In a multiple process model, these resources would be separated by process contexts.

(5) Lastly, why bother? Seriously? Process creation time is an issue, true, but it's an issue with threads as well, just not as bad. Anyone who is looking for performance should be using a connection pooling mechanism as is done in things like PHP.

I have done both threaded and process servers. The threaded servers are easier to write. The process based servers are more robust. From an operational point of view, a "select foo from bar where x > y" will take the same amount of time.
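Drawback (1) can be illustrated with a small sketch, assuming a POSIX system: a job run in a forked child can abort without taking its parent down, and the parent learns which signal ended it. The helper name run_isolated and the two sample jobs are hypothetical, not PostgreSQL code.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs job() in a child process (hypothetical helper).
 * Returns the number of the signal that killed the child,
 * 0 if the child exited on its own, or -1 if fork/waitpid failed.
 */
int run_isolated(void (*job)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        job();          /* any crash here stays inside the child */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

void misbehave(void) { abort(); }   /* terminates the child with SIGABRT */
void behave(void)    { }
```

The parent survives run_isolated(misbehave) and can carry on. Had misbehave been called on a thread inside the parent, the abort would have ended the whole process, which is the point being made in (1).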
> -----Original Message----- > From: mlw [mailto:pgsql@mohawksoft.com] > Sent: Friday, January 03, 2003 12:47 PM > To: Shridhar Daithankar > Cc: PGHackers > Subject: Re: [HACKERS] Threads > > > Please no threading threads!!! > > Has anyone calculated the interval and period of "PostgreSQL needs > threads" posts? > > The *ONLY* advantage threading has over multiple processes is > the time > and resources used in creating new processes. Threading is absurdly easier to do portably than fork(). Will you fork() successfully on MVS, VMS, OS/2, Win32? On some operating systems, thread creation is absurdly faster than process creation (many orders of magnitude). > That being said, I admit that creating a threaded program is > easier than > one with multiple processes, but PostgreSQL is already there > and working. > > Drawbacks to a threaded model: > > (1) One thread screws up, the whole process dies. In a > multiple process > application this is not too much of an issue. If you use C++ you can try/catch and nothing bad happens to anything but the naughty thread. > (2) Heap fragmentation. In a long uptime application, such as a > database, heap fragmentation is an important consideration. With > multiple processes, each process manages its own heap and what ever > fragmentation that exists goes away when the connection is closed. A > threaded server is far more vulnerable because the heap has to manage > many threads and the heap has to stay active and unfragmented in > perpetuity. This is why Windows applications usually end up > using 2G of > memory after 3 months of use. (Well, this AND memory leaks) Poorly written applications leak memory. Fragmentation is a legitimate concern. > (3) Stack space. In a threaded application they are more > limits to stack > usage. I'm not sure, but I bet PostgreSQL would have a problem with a > fixed size stack, I know the old ODBC driver did. 
A single server with 20 threads will consume less total free store memory and automatic memory than 20 servers. You have to decide how much stack to give a thread, that's true. > (4) Lock Contention. The various single points of access in a process > have to be serialized for multiple threads. heap allocation, > deallocation, etc all have to be managed. In a multple process model, > these resources would be separated by process contexts. Semaphores are more complicated than critical sections. If anything, a shared memory approach is more problematic and fragile, especially when porting to multiple operating systems. > (5) Lastly, why bother? Seriously? Process creation time is an issue > true, but its an issue with threads as well, just not as bad. > Anyone who > is looking for performance should be using a connection pooling > mechanism as is done in things like PHP. > > I have done both threaded and process servers. The threaded > servers are > easier to write. The process based severs are more robust. From an > operational point of view, a "select foo from bar where x > > y" will take > he same amount of time. Probably true. I think a better solution is a server that can start threads or processes or both. But that's neither here nor there and I'm certainly not volunteering to write it. Here is a solution to the dilemma. Make the one who suggests the feature be the first volunteer on the team that writes it. Is it a FAQ? If not, it ought to be.
On Fri, 2003-01-03 at 14:47, mlw wrote: > Please no threading threads!!! > Ya, I'm very pro threads but I've long since been sold on no threads for PostgreSQL. AIO on the other hand... ;) Your summary so accurately addresses the issue it should be a whole FAQ entry on threads and PostgreSQL. :) > Drawbacks to a threaded model: > > (1) One thread screws up, the whole process dies. In a multiple process > application this is not too much of an issue. > > (2) Heap fragmentation. In a long uptime application, such as a > database, heap fragmentation is an important consideration. With > multiple processes, each process manages its own heap and what ever > fragmentation that exists goes away when the connection is closed. A > threaded server is far more vulnerable because the heap has to manage > many threads and the heap has to stay active and unfragmented in > perpetuity. This is why Windows applications usually end up using 2G of > memory after 3 months of use. (Well, this AND memory leaks) These are things that can't be stressed enough. IMO, these are some of the many reasons why applications running on MS platforms tend to have much lower application and system up times (that and resources leaks which are inherent to the platform). BTW, if you do much in the way of threaded coding, there is libHorde which is a heap library for heavily threaded, memory hungry applications. It excels in performance, reduces heap lock contention (maintains multiple heaps in a very thread smart manner), and goes a long way toward reducing heap fragmentation which is common for heavily memory based, threaded applications. > (3) Stack space. In a threaded application they are more limits to stack > usage. I'm not sure, but I bet PostgreSQL would have a problem with a > fixed size stack, I know the old ODBC driver did. > Most modern thread implementations use a page guard on the stack to determine if it needs to grow or not. 
Generally speaking, for most modern platforms which support threading, stack considerations rarely become an issue. > (5) Lastly, why bother? Seriously? Process creation time is an issue > true, but its an issue with threads as well, just not as bad. Anyone who > is looking for performance should be using a connection pooling > mechanism as is done in things like PHP. > > I have done both threaded and process servers. The threaded servers are > easier to write. The process based severs are more robust. From an > operational point of view, a "select foo from bar where x > y" will take > he same amount of time. > I agree with this, however, using threads does open the door for things like splitting queries and sorts across multiple CPUs. Something the current process model, which was previously agreed on, would not be able to address because of cost. Example: "select foo from bar where x > y order by foo ;", could be run on multiple CPUs if the sort were large enough to justify. After it's all said and done, I do agree that threading just doesn't seem like a good fit for PostgreSQL. -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
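The "sort across multiple CPUs" idea above, together with the merger function mentioned in the original proposal, can be sketched as follows. This is an illustration with plain pthreads and qsort on an int array, not anything resembling PostgreSQL's sort code; parallel_sort and its helpers are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int *base; size_t n; } slice;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static void *sort_slice(void *p)
{
    slice *s = p;
    qsort(s->base, s->n, sizeof(int), cmp_int);
    return NULL;
}

/*
 * Sorts a[0..n) by sorting each half on its own thread and then merging.
 * Returns 0 on success, -1 if memory or thread creation fails.
 */
int parallel_sort(int *a, size_t n)
{
    size_t half = n / 2;
    size_t i = 0, j = half, k = 0;
    slice lo = { a, half };
    slice hi = { a + half, n - half };
    pthread_t t;
    int *tmp;

    if (n < 2)
        return 0;
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    if (pthread_create(&t, NULL, sort_slice, &hi) != 0) {
        free(tmp);
        return -1;
    }
    sort_slice(&lo);            /* the calling thread sorts the lower half */
    pthread_join(t, NULL);

    /* Merge step: combine the two sorted halves into tmp. */
    while (i < half && j < n)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < half)
        tmp[k++] = a[i++];
    while (j < n)
        tmp[k++] = a[j++];

    memcpy(a, tmp, n * sizeof *a);
    free(tmp);
    return 0;
}
```

The two halves share no data, so no locking is needed until the merge, which runs after the join. As noted above, the thread start-up and merge overhead only pays off when the sort is large enough to justify it.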
On Fri, 2003-01-03 at 14:52, Dann Corbit wrote:
> > -----Original Message-----
> > (1) One thread screws up, the whole process dies. In a
> > multiple process application this is not too much of an issue.
>
> If you use C++ you can try/catch and nothing bad happens to anything but
> the naughty thread.

That doesn't protect against the type of issues he's talking about. An invalid pointer reference is a very common snafu which really hoses threaded applications. Not to mention resource leaks AND LOCKED resources, which are inherently an issue on Win32.

Besides, it's doubtful that PostgreSQL is going to be rewritten in C++, so bringing up try/catch is pretty much an invalid argument.

> > (2) Heap fragmentation. In a long uptime application, such as a
> > database, heap fragmentation is an important consideration. With
> > multiple processes, each process manages its own heap and whatever
> > fragmentation exists goes away when the connection is closed. A
> > threaded server is far more vulnerable because the heap has to manage
> > many threads and the heap has to stay active and unfragmented in
> > perpetuity. This is why Windows applications usually end up
> > using 2G of memory after 3 months of use. (Well, this AND memory leaks)
>
> Poorly written applications leak memory. Fragmentation is a legitimate
> concern.

And well written applications which attempt to safely handle segfaults, etc., often leak memory and lock resources like crazy. On Win32, depending on the nature of the resources, once this happens, even process termination will not free/unlock the resources.

> > (4) Lock contention. The various single points of access in a process
> > have to be serialized for multiple threads. Heap allocation,
> > deallocation, etc. all have to be managed. In a multiple process model,
> > these resources would be separated by process contexts.
>
> Semaphores are more complicated than critical sections. If anything, a
> shared memory approach is more problematic and fragile, especially when
> porting to multiple operating systems.

And critical sections lead to low performance on SMP systems for Win32 platforms. No task can switch on ANY CPU for the duration of the critical section. They are highly recommended by MS, as the majority of Win32 applications expect uniprocessor systems, and there they are VERY fast. As soon as multiple processors come into the mix, critical sections become a HORRIBLE idea if any sort of scalability is desired.

> Is it a FAQ? If not, it ought to be.

I agree. I think mlw's list of reasons should be added to a FAQ. It's terse yet says it all!

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
> > >I am sure, many of you would like to delete this message before reading, hold >on. :-) > I'm afraid most posters did not read the message. Those who replied "Why bother?" did not address your challenge: >I think threads are useful in difference situations namely parallelising >blocking conditions and using multiple CPUs. > > This is indeed one of the few good reasons for threads. Indeed, large/robust systems use a mix. The consensus of the group is that those who do the work are not ready for threads. Which is fine. Looking into my crystal ball, I see that it will happen, though it appears so far away. bbaker
Greg Copeland wrote: >On Fri, 2003-01-03 at 14:47, mlw wrote: > > >>Please no threading threads!!! >> >> >> > >Ya, I'm very pro threads but I've long since been sold on no threads for >PostgreSQL. AIO on the other hand... ;) > >Your summary so accurately addresses the issue it should be a whole FAQ >entry on threads and PostgreSQL. :) > Thanks! I do like threads myself. Love them! Loving them, however does not mean that one should ignore their weaknesses. I have a PHP session handler (msession) which is threaded, but I am very careful with memory allocation, locks, and so on. I also do a lot of padding in memory allocations. I know it is wasteful in the short term, but it keeps the little gnats from hosing up the heap. >>Drawbacks to a threaded model: >> >>(1) One thread screws up, the whole process dies. In a multiple process >>application this is not too much of an issue. >> >>(2) Heap fragmentation. In a long uptime application, such as a >>database, heap fragmentation is an important consideration. With >>multiple processes, each process manages its own heap and what ever >>fragmentation that exists goes away when the connection is closed. A >>threaded server is far more vulnerable because the heap has to manage >>many threads and the heap has to stay active and unfragmented in >>perpetuity. This is why Windows applications usually end up using 2G of >>memory after 3 months of use. (Well, this AND memory leaks) >> >> > > >These are things that can't be stressed enough. IMO, these are some of >the many reasons why applications running on MS platforms tend to have >much lower application and system up times (that and resources leaks >which are inherent to the platform). > >BTW, if you do much in the way of threaded coding, there is libHorde >which is a heap library for heavily threaded, memory hungry >applications. 
It excels in performance, reduces heap lock contention >(maintains multiple heaps in a very thread smart manner), and goes a >long way toward reducing heap fragmentation which is common for heavily >memory based, threaded applications. > Thanks, I'll take a look. > > >>(3) Stack space. In a threaded application they are more limits to stack >>usage. I'm not sure, but I bet PostgreSQL would have a problem with a >>fixed size stack, I know the old ODBC driver did. >> >> >> > >Most modern thread implementations use a page guard on the stack to >determine if it needs to grow or not. Generally speaking, for most >modern platforms which support threading, stack considerations rarely >become an issue. > For one of my projects, msession, I wrote a SQL (PG and ODBC) plugin. The main system thread didn't crash, but the server threads went down quickly. I had to bump the thread stack up to 250K for it to work. That doesn't sound like much, but if you have 200 connections to your server, that's a lot of memory that has to fit into the process space. >>(5) Lastly, why bother? Seriously? Process creation time is an issue >>true, but its an issue with threads as well, just not as bad. Anyone who >>is looking for performance should be using a connection pooling >>mechanism as is done in things like PHP. >> >>I have done both threaded and process servers. The threaded servers are >>easier to write. The process based severs are more robust. From an >>operational point of view, a "select foo from bar where x > y" will take >>he same amount of time. >> >> >> > >I agree with this, however, using threads does open the door for things >like splitting queries and sorts across multiple CPUs. Something the >current process model, which was previously agreed on, would not be able >to address because of cost. > >Example: "select foo from bar where x > y order by foo ;", could be run >on multiple CPUs if the sort were large enough to justify.
> >After it's all said and done, I do agree that threading just doesn't >seem like a good fit for PostgreSQL. > Yes, absolutely, if PostgreSQL ever grew threads, I think that should be the focus, forget the threaded connection crap, threaded queries!! How about this: select T1.foo, X1.bar from (select * from T) as T1, (select * from X) as X1 where T1.id = X1.id The two sub queries could execute in parallel. That would rock! > > >
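On the stack-size point raised earlier in this message (bumping the thread stack to 250K): with pthreads the stack size is set per thread through a thread attribute object. A minimal sketch follows, with hypothetical helper names; the 256 KiB used in the usage note is only an example value, not a recommendation.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * Starts fn(arg) on a new thread whose stack is stack_bytes large
 * (hypothetical helper). Returns 0 on success or an error number,
 * e.g. EINVAL if stack_bytes is below the platform minimum.
 */
int spawn_with_stack(pthread_t *tid, size_t stack_bytes,
                     void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);    /* the thread keeps its own copy */
    return rc;
}

/* Touches 128 KiB of automatic storage, then reports success via arg. */
void *touch_stack(void *arg)
{
    volatile char buf[128 * 1024];
    size_t i;

    for (i = 0; i < sizeof buf; i += 4096)
        buf[i] = 1;
    *(int *)arg = 1;
    return NULL;
}
```

Calling spawn_with_stack(&t, 256 * 1024, touch_stack, &ok) and then joining t leaves ok set to 1. The memory arithmetic from the message is easy to check: 200 connections at 250K of stack each is 50,000K, or roughly 49 MB of address space reserved for stacks alone.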
----- Original Message ----- From: "Greg Copeland" <greg@CopelandConsulting.Net> Sent: January 03, 2003 4:45 PM > > > (1) One thread screws up, the whole process dies. In a > > > multiple process > > > application this is not too much of an issue. > > > > If you use C++ you can try/catch and nothing bad happens to anything but > > the naughty thread. > > That doesn't protect against the type of issues he's talking about. > Invalid pointer reference is a very common snafu which really hoses > threaded applications. Not to mention resource leaks AND LOCKED > resources which are inherently an issue on Win32. (1) is an issue only for user-level threads. And besides... ----- Original Message ----- From: "Dann Corbit" <DCorbit@connx.com> Sent: January 03, 2003 3:52 PM > Here is a solution to the dilemma. Make the one who suggests the > feature be the first volunteer on the team that writes it. .. and that's exactly what Shridhar did - he's sent in the code _already_ in his post, allowing framework to plug in the model into PG, as he says allowing turning on and off threads where appropriate and keeping the current model as well. But noone bothered to go over it if it makes sense. ----- Original Message ----- From: "Greg Copeland" <greg@CopelandConsulting.Net> Sent: January 03, 2003 4:45 PM > > Is it a FAQ? If not, it ought to be. > > I agree. I think mlw's list of reasons should be added to a faq. It > terse yet says it all! <http://developer.postgresql.org/readtext.php?src/FAQ/FAQ_DEV.html+Developers-FAQ#1.9> But it's not as complete as this threaded thread. -s
"Serguei Mokhov" <mokhov@cs.concordia.ca> writes: >>> (1) One thread screws up, the whole process dies. In a >>> multiple process application this is not too much of an issue. > (1) is an issue only for user-level threads. Uh, what other kind of thread have you got in mind here? I suppose the lack-of-cross-thread-protection issue would go away if our objective was only to use threads for internal parallelism in each backend instance (ie, you still have one process per connection, but internally it would use multiple threads to process subqueries in parallel). Of course that gives up the hope of faster connection startup that has always been touted as a major reason to want Postgres to be threaded... regards, tom lane
On Fri, 2003-01-03 at 19:34, Tom Lane wrote: > "Serguei Mokhov" <mokhov@cs.concordia.ca> writes: > >>> (1) One thread screws up, the whole process dies. In a > >>> multiple process application this is not too much of an issue. > > > (1) is an issue only for user-level threads. > Umm. No. User or system level threads, the statement is true. If a thread kills over, the process goes with it. Furthermore, on Win32 platforms, it opens a whole can of worms no matter how you care to address it. > Uh, what other kind of thread have you got in mind here? > > I suppose the lack-of-cross-thread-protection issue would go away if > our objective was only to use threads for internal parallelism in each > backend instance (ie, you still have one process per connection, but > internally it would use multiple threads to process subqueries in > parallel). > Several have previously spoken about a hybrid approach (ala Apache). IIRC, it was never ruled out but it was simply stated that no one had the energy to put into such a concept. > Of course that gives up the hope of faster connection startup that has > always been touted as a major reason to want Postgres to be threaded... > > regards, tom lane Faster startup, should never be the primary reason as there are many ways to address that issue already. Connection pooling and caching are by far, the most common way to address this issue. Not only that, but by definition, it's almost an oxymoron. If you really need high performance, you shouldn't be using transient connections, no matter how fast they are. This, in turn, brings you back to persistent connections or connection pools/caches. -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
Greg Copeland wrote:

>> Of course that gives up the hope of faster connection startup that has
>> always been touted as a major reason to want Postgres to be threaded...
>>
>> regards, tom lane
>
> Faster startup, should never be the primary reason as there are many
> ways to address that issue already. Connection pooling and caching are
> by far, the most common way to address this issue. Not only that, but by
> definition, it's almost an oxymoron. If you really need high
> performance, you shouldn't be using transient connections, no matter how
> fast they are. This, in turn, brings you back to persistent connections
> or connection pools/caches.

Connection time should *never* be in the critical path. There, I've said it!! People who complain about connection time are barking up the wrong tree. Regardless of the methodology, EVERY OS has issues with thread creation, process creation, the memory allocation, and system manipulation required to manage it. Under load this is ALWAYS slower.

I think that if there is ever a choice, "do I make startup time faster?" or "Do I make PostgreSQL not need a dump/restore for upgrade" the upgrade problem has a much higher impact to real PostgreSQL sites.
On Fri, 2003-01-03 at 21:39, mlw wrote: > Connection time should *never* be in the critical path. There, I've > said it!! People who complain about connection time are barking up the > wrong tree. Regardless of the methodology, EVERY OS has issues with > thread creation, process creation, the memory allocation, and system > manipulation required to manage it. Under load this is ALWAYS slower. > > I think that if there is ever a choice, "do I make startup time > faster?" or "Do I make PostgreSQL not need a dump/restore for upgrade" > the upgrade problem has a much higher impact to real PostgreSQL sites. Exactly. Trying to speed up something that shouldn't be in the critical path is exactly what I'm talking about. I completely agree with you! -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
Also remember that even in well-developed OSes like FreeBSD, all of a process's threads will execute on only one CPU. This might change in FreeBSD 5.0, but for now a threaded app (such as MySQL) cannot use multiple CPUs on a FreeBSD system.

Chris

On Fri, 3 Jan 2003, mlw wrote:

> Please no threading threads!!!
> Umm. No. User or system level threads, the statement is true. If a > thread kills over, the process goes with it. Furthermore, on Win32 Hm. This is a database system. If one of the backend processes dies unexpectedly, I'm not sure I would trust the consistency and state of the others. Or maybe I'm just being chicken. -- Kaare Rasmussen --Linux, spil,-- Tlf: 3816 2582 Kaki Data tshirts, merchandize Fax: 3816 2501 Howitzvej 75 Åben 12.00-18.00 Email: kar@kakidata.dk 2000 Frederiksberg Lørdag 12.00-16.00 Web: www.suse.dk
On Sat, 2003-01-04 at 06:59, Kaare Rasmussen wrote: > > Umm. No. User or system level threads, the statement is true. If a > > thread kills over, the process goes with it. Furthermore, on Win32 > > Hm. This is a database system. If one of the backend processes dies > unexpectedly, I'm not sure I would trust the consistency and state of the > others. > > Or maybe I'm just being chicken. I'd call that being wise. That's the problem with using threads. Should a thread do something naughty, the state of the entire process is in question. This is true regardless if it is a user mode, kernel mode, or hybrid thread implementation. That's the power of using the process model that is currently in use. Should it do something naughty, we bitch and complain politely, throw our hands in the air and exit. We no longer have to worry about the state and validity of that backend. This creates a huge systemic reliability surplus. This is also why the concept of a hybrid thread/process implementation keeps coming to the surface on the list. If you maintain the process model and only use threads for things that ONLY relate to the single process (single session/connection), should a thread cause a problem, you can still throw you hands in the air and exit just as is done now without causing problems for, or questioning the validity of, other backends. The cool thing about such a concept is that it still opens the door for things like parallel sorts and queries as it relates to a single backend. -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
On Saturday 04 January 2003 03:20 am, you wrote:
> >I am sure, many of you would like to delete this message before reading,
> > hold on. :-)
>
> I'm afraid most posters did not read the message. Those who replied
> "Why bother?" did not address your challenge:

Our challenges may be..;-)

Anyway, you are absolutely right. Looks like everybody thought of it as an attempt to convert PostgreSQL to a thread-per-connection model.

> >I think threads are useful in difference situations namely parallelising
> >blocking conditions and using multiple CPUs.
>
> This is indeed one of the few good reasons for threads. Indeed,
> large/robust systems use a mix.
>
> The consensus of the group is that those who do the work are not ready
> for threads. Which is fine. Looking into my crystal ball, I see that
> it will happen, though it appears so far away.

I hope it happens, and I will contribute to it if I can.

Shridhar
>>>>> "Shridhar" == Shridhar Daithankar <shridhar_daithankar@persistent.co.in> writes: Shridhar> On Saturday 04 January 2003 03:20 am, you wrote: >> >I am sure, many of you would like to delete this message >> before reading, > hold on. :-) >> >> I'm afraid most posters did not read the message. Those who >> replied >> >> "Why bother?" did not address your challenge: Shridhar> Our challenges may be..;-) Not having threading does reduce some of the freedom we've been having in our work. But then we have ripped the process model a fair bit and we have the freedom of an entirely new process to deal with data streams entering the system and we're experimenting with threading for asynchronous I/O there. However, in general I agree with the spirit of the previous messages in this thread that threading isn't the main issue for PG. One thing that I missed so far in the threading thread. Context switches are (IMHO) far cheaper between threads, because you save TLB flushes. Whether this makes a real difference in a data intensive application, I don't know. I wonder how easy it is to measure the x86 counters to see TLB flushes/misses. In a database system, even if one process dies, I'd be very chary of trusting it. So I am not too swayed by the fact that a process-per-connection gets you better isolation. BTW, many commercial database systems also use per-process models on Unix. However they are very aggressive with connection sharing and reuse - even to the point of reusing the same process for multiple active connections .. maybe at transaction boundaries. Good when a connection is maintained for a long duaration with short-lived transactions separated by fair amouns of time. Moreover, in db2 for instance, the same code base is used for both per-thread and per-process models - in other words, the entire code is MT-safe, and the scheduling mechanism is treated as a policy (Win32 is MT, and some Unices MP). 
AFAICT, though, postgres code such as, perhaps, the memory contexts is not MT-safe (of course the bufferpool/shmem accesses are safe).

--
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh
Hello all,

it's very interesting to see the discussion of "threads" again.

I've ported PostgreSQL to a "thread-per-connection" model based on pthreads and it is functional. Most of the work was finding all the static globals in the source files, swapping them between threads, and freeing memory when a thread terminates. (PostgreSQL isn't written very cleanly with respect to memory handling.)

My version of the thread-based PostgreSQL is not very efficient at the moment because I haven't optimised the code to better support threads, and I'm using just a simple semaphore to control switching of data, but it could be a starting point for others who want to see this code. If this direction is taken seriously I'm very willing to help.

If someone is interested in the code I can send a zip file to everyone who wants it.

Ulrich
On 6 Jan 2003 at 12:22, Ulrich Neumann wrote:
> Hello all,
> If someone is interested in the code I can send a zip file to everyone
> who wants.

I suggest you preserve your work. My reasons for suggesting threads are mainly twofold:

1) Get I/O time used fruitfully
2) Use multiple CPUs better.

It will not require as much code cleaning as your effort did. However, your work will be very useful if somebody decides to use threads in any fashion in core PostgreSQL.

I was hoping for a bit more optimistic response, given that what I suggested was totally optional at any point of time but very important from a performance point of view. Besides, the change would have been gradual, as required..

Anyway..

Bye
 Shridhar

--
Robot, n.: University administrator.
On Mon, 2003-01-06 at 05:36, Shridhar Daithankar wrote:
> On 6 Jan 2003 at 12:22, Ulrich Neumann wrote:
>
> > Hello all,
> > If someone is interested in the code I can send a zip file to everyone
> > who wants.
>
> I suggest you preserve your work. The reason I suggested thread are mainly two
> folds.
>
> 1) Get I/O time used fruitfully

AIO may address this without the need for integrated threading. Arguably, from the long thread that last appeared on the topic of AIO, some hold that AIO doesn't even offer anything beyond the current implementation. As such, it's highly doubtful that integrated threading is going to offer anything beyond what a sound AIO implementation can achieve.

> 2) Use multiple CPU better.
>

Multiple processes tend to universally support multiple CPUs better than does threading. On some platforms, the level of threading support is currently only user-mode implementations, which means no additional CPU use. Furthermore, on some platforms where user-mode threads are the de facto standard, they don't even allow for scheduling bias, resulting in less work being accomplished within the same time interval (the work slice must be divided between n threads within the process, all of which run on a single CPU).

> It will not require as much code cleaning as your efforts might had. However
> your work will be very useful if somebody decides to use thread in any fashion
> in core postgresql.
>
> I was hoping for bit more optimistic response given that what I suggested was
> totally optional at any point of time but very important from performance
> point. Besides the change would have been gradual as required..
>

Speaking for myself, I probably would have been more excited if the offered framework had addressed several issues. The short list is:

o Code needs to be more robust. It shouldn't be calling exit directly as, I believe, it should be allowing for PostgreSQL to clean up some. Correct me as needed.
I would have also expected the code to have adopted PostgreSQL's semantics and mechanisms as needed (error reporting, etc). I do understand it was an initial attempt to simply get something in front of some eyes and have something to talk about. Just the same, I was expecting something that we could actually pull the trigger with.

o Code isn't very portable. Looked fairly okay for pthread platforms, however, there is new emphasis on the Win32 platform. I think it would be a mistake to introduce something as significant as threading without addressing Win32 from the get-go.

o I would desire a more highly abstracted/portable interface which allows for different threading and synchronization primitives to be used. Current implementation is tightly coupled to pthreads. Furthermore, on platforms such as Solaris, I would hope it would easily allow for plugging in its native threading primitives, which are touted to be much more efficient than pthreads on said platform.

o Code is not commented. I would hope that new code for something as important as threading would be commented.

o Code is fairly trivial and does not address other primitives (semaphores, mutexes, conditions, TSS, etc) portably, which would be required for anything but the most trivial of threaded work. This is especially true in such an application where data IS the application. As such, you must reasonably assume that threads need some form of portable serialization primitives, not to mention mechanisms for non-trivial communication.

o Does not address issues such as thread signaling or status reporting.

o Pool interface is rather simplistic. Does not currently support concepts such as wake pool, stop pool, pool status, assigning a pool to work, etc. In fact, it's not altogether obvious what the intended capabilities of the current pool implementation are.

o Doesn't seem to address any form of thread communication facilities (mailboxes, queues, etc).
There are probably other things that I can find if I spend more than just a couple of minutes looking at the code. Honestly, I love threads but I can see that the current code offering is not much more than a token in its current form. No offense meant. After it's all said and done, I'd have to see a lot more meat before I'd be convinced that threading is ready for PostgreSQL; from both a social and technological perspective. Regards, -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
On 6 Jan 2003 at 6:48, Greg Copeland wrote: > > 1) Get I/O time used fuitfully > AIO may address this without the need for integrated threading. > Arguably, from the long thread that last appeared on the topic of AIO, > some hold that AIO doesn't even offer anything beyond the current > implementation. As such, it's highly doubtful that integrated threading > is going to offer anything beyond what a sound AIO implementation can > achieve. Either way, a complete aio or threading implementation is not available on major platforms that postgresql runs. Linux definitely does not have one, last I checked. If postgresql is not using aio or threading, we should start using one of them, is what I feel. What do you say? > > 2) Use multiple CPU better. > Multiple processes tend to universally support multiple CPUs better than > does threading. On some platforms, the level of threading support is > currently only user mode implementations which means no additional CPU > use. Furthermore, some platforms where user-mode threads are defacto, > they don't even allow for scheduling bias resulting is less work being > accomplished within the same time interval (work slice must be divided > between n-threads within the process, all of which run on a single CPU). The frame-work I have posted, threading is optional at build and should be a configuration option if it gets integrated. So for the platforms that can not spread threads across multiple CPUs, it can simply be turned off.. > Speaking for my self, I probably would of been more excited if the > offered framework had addressed several issues. The short list is: > > o Code needs to be more robust. It shouldn't be calling exit directly > as, I believe, it should be allowing for PostgreSQL to clean up some. > Correct me as needed. I would of also expected the code of adopted > PostgreSQL's semantics and mechanisms as needed (error reporting, etc). 
> I do understand it was an initial attempt to simply get something in > front of some eyes and have something to talk about. Just the same, I > was expecting something that we could actually pull the trigger with. That could be done. > > o Code isn't very portable. Looked fairly okay for pthread platforms, > however, there is new emphasis on the Win32 platform. I think it would > be a mistake to introduce something as significant as threading without > addressing Win32 from the get-go. If you search for "pthread" in thread.c, there are not many instances. Same goes for thread.h. From what I understand windows threading, it would be less than 10 minutes job to #ifdef the pthread related part on either file. It is just that I have not played with windows threading and nor I am inclined to...;-) > > o I would desire a more highly abstracted/portable interface which > allows for different threading and synchronization primitives to be > used. Current implementation is tightly coupled to pthreads. > Furthermore, on platforms such as Solaris, I would hope it would easily > allow for plugging in its native threading primitives which are touted > to be much more efficient than pthreads on said platform. Same as above. If there can be two cases separated with #ifdef, there can be more.. But what is important is to have a thread that can be woken up as and when required with any function desired. That is the basic idea. > o Code is not commented. I would hope that adding new code for > something as important as threading would be commented. Agreed. > o Code is fairly trivial and does not address other primitives > (semaphores, mutexs, conditions, TSS, etc) portably which would be > required for anything but the most trivial of threaded work. This is > especially true in such an application where data IS the application. 
> As such, you must reasonably assume that threads need some form of > portable serialization primitives, not to mention mechanisms for > non-trivial communication. I don't get this. Probably I should post a working example. It is not threads responsibility to make a function thread safe which is changed on the fly. The function has to make sure that it is thread safe. That is altogether different effort.. > o Does not address issues such as thread signaling or status reporting. From what I learnt from pthreads on linux, I would not mix threads and signals. One can easily add code in runner function that disables any signals for thread while the thread starts running. This would leave original signal handling mechanism in place. As far as status reporting is concerned, the thread sould be initiated while back-end starts and terminated with backend termination. What is about status reporting? > o Pool interface is rather simplistic. Does not currently support > concepts such as wake pool, stop pool, pool status, assigning a pool to > work, etc. In fact, it's not altogether obvious what the capabilities > intent is of the current pool implementation. Could you please elaborate? I am using same interface in c++ for a server application and never faced a problem like that..;-) > o Doesn't seem to address any form of thread communication facilities > (mailboxes, queues, etc). Not part of this abstraction of threading mechanism. Intentionally left out to keep things clean. > There are probably other things that I can find if I spend more than > just a couple of minutes looking at the code. Honestly, I love threads > but I can see that the current code offering is not much more than a > token in its current form. No offense meant. None taken. Point is it is useful and that is enough for me. If you could elaborate examples for any problems you see, I can probably modify it. 
(Code documentation is what I will do now)

> After it's all said and done, I'd have to see a lot more meat before I'd
> be convinced that threading is ready for PostgreSQL; from both a social
> and technological perspective.

Tell me about it..

Bye
 Shridhar

--
What's this script do?
    unzip ; touch ; finger ; mount ; gasp ; yes ; umount ; sleep
Hint for the answer: not everything is computer-oriented. Sometimes you're
in a sleeping bag, camping out.
(Contributed by Frans van der Zande.)
On Tue, 2003-01-07 at 02:00, Shridhar Daithankar wrote: > On 6 Jan 2003 at 6:48, Greg Copeland wrote: > > > 1) Get I/O time used fuitfully > > AIO may address this without the need for integrated threading. > > Arguably, from the long thread that last appeared on the topic of AIO, > > some hold that AIO doesn't even offer anything beyond the current > > implementation. As such, it's highly doubtful that integrated threading > > is going to offer anything beyond what a sound AIO implementation can > > achieve. > > Either way, a complete aio or threading implementation is not available on > major platforms that postgresql runs. Linux definitely does not have one, last > I checked. > There are two or three significant AIO implementation efforts currently underway for Linux. One such implementation is available from the Red Hat Server Edition (IIRC) and has been available for some time now. I believe Oracle is using it. SGI also has an effort and I forget where the other one comes from. Nonetheless, I believe it's going to be a hard fought battle to get AIO implemented simply because I don't think anyone, yet, can truly argue a case on the gain vs effort. > If postgresql is not using aio or threading, we should start using one of them, > is what I feel. What do you say? > I did originally say that I'd like to see an AIO implementation. Then again, I don't current have a position to stand other than simply saying it *might* perform better. ;) Not exactly a position that's going to win the masses over. > > was expecting something that we could actually pull the trigger with. > > That could be done. > I'm sure it can, but that's probably the easiest item to address. > > > > o Code isn't very portable. Looked fairly okay for pthread platforms, > > however, there is new emphasis on the Win32 platform. I think it would > > be a mistake to introduce something as significant as threading without > > addressing Win32 from the get-go. 
> > If you search for "pthread" in thread.c, there are not many instances. Same > goes for thread.h. From what I understand windows threading, it would be less > than 10 minutes job to #ifdef the pthread related part on either file. > > It is just that I have not played with windows threading and nor I am inclined > to...;-) > Well, the method above is going to create a semi-ugly mess. I've written thread abstraction layers which cover OS/2, NT, and pthreads. Each have subtle distinction. What really needs to be done is the creation of another abstraction layer which your current code would sit on top of. That way, everything contained within is clear and easy to read. The big bonus is that as additional threading implementations need to be added, only the "low-level" abstraction stuff needs to modified. Done properly, each thread implementation would be it's own module requiring little #if clutter. As you can see, that's a fair amount of work and far from where the code currently is. > > > > o I would desire a more highly abstracted/portable interface which > > allows for different threading and synchronization primitives to be > > used. Current implementation is tightly coupled to pthreads. > > Furthermore, on platforms such as Solaris, I would hope it would easily > > allow for plugging in its native threading primitives which are touted > > to be much more efficient than pthreads on said platform. > > Same as above. If there can be two cases separated with #ifdef, there can be > more.. But what is important is to have a thread that can be woken up as and > when required with any function desired. That is the basic idea. > Again, there's a lot of work in creating a well formed abstraction layer for all of the mechanics that are required. Furthermore, different thread implementations have slightly different semantics which further complicates things. Worse, some types of primitives are simply not available with some thread implementations. 
That means those platforms require it to be written from the primitives that are available on the platform. Yet more work. > > o Code is fairly trivial and does not address other primitives > > (semaphores, mutexs, conditions, TSS, etc) portably which would be > > required for anything but the most trivial of threaded work. This is > > especially true in such an application where data IS the application. > > As such, you must reasonably assume that threads need some form of > > portable serialization primitives, not to mention mechanisms for > > non-trivial communication. > > I don't get this. Probably I should post a working example. It is not threads > responsibility to make a function thread safe which is changed on the fly. The > function has to make sure that it is thread safe. That is altogether different > effort.. You're right, it's not the thread's responsibility, however, it is the threading toolkit's. In this case, you're offering to be the toolkit which functions across two platforms, just for starters. Reasonably, you should expect a third to quickly follow. > > > o Does not address issues such as thread signaling or status reporting. > > >From what I learnt from pthreads on linux, I would not mix threads and signals. > One can easily add code in runner function that disables any signals for thread > while the thread starts running. This would leave original signal handling > mechanism in place. > > As far as status reporting is concerned, the thread sould be initiated while > back-end starts and terminated with backend termination. What is about status > reporting? > > > o Pool interface is rather simplistic. Does not currently support > > concepts such as wake pool, stop pool, pool status, assigning a pool to > > work, etc. In fact, it's not altogether obvious what the capabilities > > intent is of the current pool implementation. > > Could you please elaborate? 
I am using same interface in c++ for a server > application and never faced a problem like that..;-) > > > > o Doesn't seem to address any form of thread communication facilities > > (mailboxes, queues, etc). > > Not part of this abstraction of threading mechanism. Intentionally left out to > keep things clean. > > > There are probably other things that I can find if I spend more than > > just a couple of minutes looking at the code. Honestly, I love threads > > but I can see that the current code offering is not much more than a > > token in its current form. No offense meant. > > None taken. Point is it is useful and that is enough for me. If you could > elaborate examples for any problems you see, I can probably modify it. (Code > documentation is what I will do now) > > > After it's all said and done, I'd have to see a lot more meat before I'd > > be convinced that threading is ready for PostgreSQL; from both a social > > and technological perspective. > > Tell me about it.. > Long story short, if PostgreSQL is to use threads, it shouldn't be handicapped by having a very limited subset of functionality. With the code that has been currently submitted, I don't believe you could even effectively implement a parallel sort. To get an idea of the types of things that would be needed, check out the ACE Toolkit. There are a couple of other fairly popular toolkits as well. Nonetheless, it's a significant effort and the current code is a long ways off from being usable. -- Greg Copeland <greg@copelandconsulting.net> Copeland Computer Consulting
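Editorial illustration: one possible shape for the "low-level" abstraction layer described above, where each threading implementation lives in its own module behind a common interface instead of behind scattered #ifdefs. This is only a sketch under assumptions; the names (`xthread`, `xthread_ops`, `pthread_backend`, `square`) are invented for illustration and do not come from ACE, from PostgreSQL, or from the posted code. Only a pthreads backend is shown; a Win32 or Solaris-native backend would supply its own `struct xthread` and its own table of functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Portable interface: callers never see a pthread type. (Hypothetical.) */
typedef struct xthread xthread;            /* opaque per-backend handle */
typedef void *(*xthread_fn)(void *);

typedef struct {
    const char *name;
    xthread  *(*spawn)(xthread_fn fn, void *arg);  /* NULL on failure */
    int       (*join)(xthread *t, void **result);  /* 0 on success */
} xthread_ops;

/* ---- pthreads backend; would normally live in its own source file ---- */
struct xthread {
    pthread_t tid;
};

static xthread *pt_spawn(xthread_fn fn, void *arg)
{
    xthread *t = malloc(sizeof *t);

    if (t == NULL)
        return NULL;
    if (pthread_create(&t->tid, NULL, fn, arg) != 0) {
        free(t);
        return NULL;
    }
    return t;
}

static int pt_join(xthread *t, void **result)
{
    int rc = pthread_join(t->tid, result);

    free(t);
    return rc == 0 ? 0 : -1;
}

static const xthread_ops pthread_backend = { "pthreads", pt_spawn, pt_join };

/* Example job, used only for illustration: squares the int it is given. */
static void *square(void *p)
{
    int *v = p;

    *v = *v * *v;
    return p;
}
```

Higher-level code (a pool, a queue, a parallel sort) would be written against `xthread_ops` only, so adding a platform means adding one backend table. A real layer would also need mutexes, condition variables and thread-specific storage behind the same kind of interface, which is where most of the work mentioned above lies.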
Greg Copeland <greg@CopelandConsulting.Net> writes: > That's the power of using the process model that is currently in use. Should > it do something naughty, we bitch and complain politely, throw our hands in > the air and exit. We no longer have to worry about the state and validity of > that backend. You missed the point of his post. If one process in your database does something nasty you damn well should worry about the state of and validity of the entire database, not just that one backend. Are you really sure you caught the problem before it screwed up the data in shared memory? On disk? This whole topic is in need of some serious FUD-dispelling and careful analysis. Here's a more calm explanation of the situation on this particular point. Perhaps I'll follow up with something on IO concurrency later. The point in consideration here is really memory isolation. Threads by default have zero isolation between threads. They can all access each other's memory even including their stack. Most of that memory is in fact only needed by a single thread. Processes by default have complete memory isolation. However postgres actually weakens that by doing a lot of work in a shared memory pool. That memory gets exactly the same protection as it would get in a threaded model, which is to say none. So the reality is that if you have a bug most likely you've only corrupted the local data which can be easily cleaned up either way. In the thread model there's also the unlikely but scary risk that you've damaged other threads' memory. And in either case there's the possibility that you've damaged the shared pool which is unrecoverable. In theory minimising the one case of corrupting other threads' local data shouldn't make a big difference to the risk in the case of an assertion failure. I'm not sure in practice if that's true though. Processes probably reduce the temptation to do work in the shared area too. -- greg
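Editorial illustration: the distinction drawn above between process-local memory (isolated) and the shared pool (not isolated) can be seen in a few lines of POSIX C. This is a sketch under assumptions: it uses an anonymous `MAP_SHARED` mapping as a stand-in for a shared memory segment, which is not necessarily how PostgreSQL sets up its own shared memory, and the function name `isolation_demo` is hypothetical.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that scribbles on one private and one shared variable,
 * then report what the parent sees afterwards.
 * Returns 0 on success, -1 if setup failed.
 */
static int isolation_demo(int *private_seen, int *shared_seen)
{
    int   private_val = 1;
    int  *shared_val;
    int   status;
    pid_t pid;

    shared_val = mmap(NULL, sizeof *shared_val, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared_val == MAP_FAILED)
        return -1;
    *shared_val = 1;

    pid = fork();
    if (pid < 0) {
        munmap(shared_val, sizeof *shared_val);
        return -1;
    }
    if (pid == 0) {
        private_val = 99;   /* changes the child's copy only */
        *shared_val = 99;   /* changes memory the parent also maps */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid) {
        munmap(shared_val, sizeof *shared_val);
        return -1;
    }

    *private_seen = private_val;   /* still 1: process isolation held */
    *shared_seen  = *shared_val;   /* now 99: no isolation for the pool */
    munmap(shared_val, sizeof *shared_val);
    return 0;
}
```

With threads in place of `fork()`, both writes would be visible to every other thread, which is the "zero isolation" case described in the message.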
On Tue, 2003-01-07 at 12:21, Greg Stark wrote:
> Greg Copeland <greg@CopelandConsulting.Net> writes:
>
> > That's the power of using the process model that is currently in use. Should
> > it do something naughty, we bitch and complain politely, throw our hands in
> > the air and exit. We no longer have to worry about the state and validity of
> > that backend.
>
> You missed the point of his post. If one process in your database does
> something nasty you damn well should worry about the state of and validity of
> the entire database, not just that one backend.
>

I can assure you I did not miss the point. No idea why you're continuing to spell it out. In this case, it appears the quotation is being taken out of context or it was originally stated in an improper context.

> Are you really sure you caught the problem before it screwed up the data in
> shared memory? On disk?
>
>
> This whole topic is in need of some serious FUD-dispelling and careful
> analysis. Here's a more calm explanation of the situation on this particular
> point. Perhaps I'll follow up with something on IO concurrency later.
>

Hmmm. Not sure what needs to be dispelled since I've not seen any FUD.

> The point in consideration here is really memory isolation. Threads by default
> have zero isolation between threads. They can all access each other's memory
> even including their stack. Most of that memory is in fact only needed by a
> single thread.
>

Again, this has been covered already.

> Processes by default have complete memory isolation. However postgres actually
> weakens that by doing a lot of work in a shared memory pool. That memory gets
> exactly the same protection as it would get in a threaded model, which is to
> say none.
>

Again, this has all been covered, more or less. Your comments seem to imply that you did not fully read what has been said on the topic thus far or that you misunderstood something that was said.
Of course, it's also possible that I may have said something out of its proper context, which may be confusing you. I think it's safe to say I don't have any further comment unless something new is being brought to the table. Should there be something new to cover, I'm happy to talk about it. At this point, however, it appears that it's been beaten to death already.

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
Greg Stark <gsstark@mit.edu> writes: > You missed the point of his post. If one process in your database does > something nasty you damn well should worry about the state of and validity of > the entire database, not just that one backend. Right. And in fact we do blow away all the processes when any one of them crashes or panics. Nonetheless, memory isolation between processes is a Good Thing, because it reduces the chances that a process gone wrong will cause damage via other processes before they can be shut down. Here is a simple example of a scenario where that isolation buys us something: suppose that we have a bug that tromps on memory starting at some point X until it falls off the sbrk boundary and dumps core. (There are plenty of ways to make that happen, such as miscalculating the length of a memcpy or memset operation as -1.) Such a bug causes no serious damage in isolation, because the process suffering the failure will be in a tight data-copying or data-zeroing loop until it gets the SIGSEGV exception. It won't do anything bad based on all the data structures it has clobbered during its march to the end of memory. However, put that same bug in a multithreading context, and it becomes entirely possible that some other thread will be dispatched and will try to make use of already-clobbered data structures before the ultimate SIGSEGV exception happens. Now you have the potential for unlimited trouble. In general, isolation buys you some safety anytime there is a delay between the occurrence of a failure and its detection. > Processes by default have complete memory isolation. However postgres > actually weakens that by doing a lot of work in a shared memory > pool. That memory gets exactly the same protection as it would get in > a threaded model, which is to say none. Yes. We try to minimize the risk by keeping the shared memory pool relatively small and not doing more than we have to in it. 
(For example, this was one of the arguments against creating a shared plan cache.) It's also very helpful that in most platforms, shared memory is not address-wise contiguous to normal memory; thus for example a process caught in a memset death march will hit a SIGSEGV before it gets to the shared memory block. It's interesting to note that this can be made into an argument for not making shared_buffers very large: the larger the fraction of your address space that the shared buffers occupy, the larger the chance that a wild store will overwrite something you'd wish it didn't. I can't recall anyone having made that point during our many discussions of appropriate shared_buffer sizing. > So the reality is that if you have a bug most likely you've only corrupted the > local data which can be easily cleaned up either way. In the thread model > there's also the unlikely but scary risk that you've damaged other threads' > memory. And in either case there's the possibility that you've damaged the > shared pool which is unrecoverable. In a thread model, *most* of the accessible memory space would be shared with other threads, at least potentially. So I think you're wrong to categorize the second case as unlikely. regards, tom lane
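Editorial illustration: the policy described above (blow away all the processes when any one of them crashes) relies on the parent being able to tell that a child died from a signal rather than exiting cleanly. The sketch below shows only that detection step with plain `fork`/`waitpid`; it is an assumption-laden toy, not PostgreSQL's postmaster code. The function name `run_crashing_child` is hypothetical, and `raise(SIGSEGV)` stands in for the "memset death march" so that the example does not depend on undefined behaviour.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that dies from SIGSEGV and report what the parent observes.
 * Returns the terminating signal number, 0 if the child exited normally,
 * or -1 if fork/waitpid failed.
 */
static int run_crashing_child(void)
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);  /* make sure the default action applies */
        raise(SIGSEGV);            /* stand-in for the wild memset */
        _exit(0);                  /* not reached */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* The parent's own memory is untouched; it only learns of the crash
     * through the wait status and can then decide to reset everything. */
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

In a threaded server there is no equivalent boundary: the fault arrives in the same address space as every other thread, and, as the message notes, other threads may already have run on clobbered data by then.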
On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote:

> Also remember that in even well developed OS's like FreeBSD, all a
> process's threads will execute only on one CPU.

I would say that that's not terribly well developed. Solaris will split a single process's threads over multiple CPUs, and I expect most other major vendors' Unixes will as well. In the world of free software, the next release of NetBSD will do the same. (The scheduler activations system, which supports mapping m userland threads to n kernel threads, was recently merged from its branch into NetBSD-current.)

From my experience, threaded sorts would be a big win. I managed to shave index generation time for a large table from about 12 hours to about 8 hours by generating two indices in parallel after I'd added a primary key to the table. It would have been much more of a win to be able to generate the primary key followed by the other indexes with parallel sorts, rather than having to generate the primary key on one CPU (while the other remains idle), wait while that completes, generate two more indices, and then generate the last one.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC
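Editorial illustration: a "threaded sort" in its simplest form splits the input in two, sorts each half on its own thread, and merges the results. The sketch below does this for an array of `int` with pthreads and `qsort`. It is a toy under stated assumptions (in-memory data, two threads, hypothetical names such as `parallel_sort`) and says nothing about how an index build inside PostgreSQL would actually be parallelised.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int   *base;
    size_t n;
} slice;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Thread entry point: sort one slice in place. */
static void *sort_slice(void *p)
{
    slice *s = p;

    qsort(s->base, s->n, sizeof *s->base, cmp_int);
    return NULL;
}

/* Sort a[0..n) using two threads. Returns 0 on success, -1 on failure. */
static int parallel_sort(int *a, size_t n)
{
    size_t    half = n / 2;
    slice     lo = { a, half };
    slice     hi = { a + half, n - half };
    size_t    i = 0, j = half, k = 0;
    pthread_t t;
    int      *tmp;

    if (n < 2)
        return 0;
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    if (pthread_create(&t, NULL, sort_slice, &lo) != 0) {
        free(tmp);
        return -1;
    }
    sort_slice(&hi);          /* the calling thread sorts the upper half */
    pthread_join(t, NULL);    /* wait for the lower half */

    /* Merge the two sorted halves into tmp, then copy back. */
    while (i < half && j < n)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < half)
        tmp[k++] = a[i++];
    while (j < n)
        tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
    free(tmp);
    return 0;
}
```

Whether the two halves really run on different CPUs depends on the platform's thread implementation, which is the point under dispute in the surrounding messages.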
On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote:
>
> Also remember that in even well developed OS's like FreeBSD, all a
> process's threads will execute only on one CPU.

I doubt that - it certainly isn't the case on Linux and Solaris. A thread may *start* execution on the same CPU as its parent, but native threads are not likely to be constrained to a specific CPU with an SMP OS.

--
Steve Wampler <swampler@noao.edu>
National Solar Observatory
On Thursday 23 January 2003 08:42 pm, you wrote:
> On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote:
> > Also remember that in even well developed OS's like FreeBSD, all a
> > process's threads will execute only on one CPU.
>
> I doubt that - it certainly isn't the case on Linux and Solaris.
> A thread may *start* execution on the same CPU as its parent, but
> native threads are not likely to be constrained to a specific CPU
> with an SMP OS.

I am told that the linuxthreads port available on FreeBSD uses rfork and is capable of using multiple CPUs within a single process. Native FreeBSD threads cannot do that. I need to check that with FreeBSD 5.0.

 Shridhar
On Thu, 2003-01-23 at 09:12, Steve Wampler wrote:
> On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote:
> >
> > Also remember that in even well developed OS's like FreeBSD, all a
> > process's threads will execute only on one CPU.
>
> I doubt that - it certainly isn't the case on Linux and Solaris.
> A thread may *start* execution on the same CPU as its parent, but
> native threads are not likely to be constrained to a specific CPU
> with an SMP OS.

You are correct. When spawning additional threads, should an idle CPU be available, it's very doubtful that the new thread will show any bias toward the original thread's CPU.

Most modern OS's do run the threads within a process spread across n CPUs. Those that don't are probably attempting to modernize as we speak.

--
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting
Greg Copeland wrote:
> On Thu, 2003-01-23 at 09:12, Steve Wampler wrote:
> > On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote:
> > > Also remember that in even well developed OS's like FreeBSD, all a
> > > process's threads will execute only on one CPU.
> >
> > I doubt that - it certainly isn't the case on Linux and Solaris.
> > A thread may *start* execution on the same CPU as its parent, but
> > native threads are not likely to be constrained to a specific CPU
> > with an SMP OS.
>
> You are correct. When spawning additional threads, should an idle CPU
> be available, it's very doubtful that the new thread will show any bias
> toward the original thread's CPU.
>
> Most modern OS's do run each thread within a process spread across
> n-CPUs. Those that don't are probably attempting to modernize as we
> speak

AFAIK, FreeBSD is one of the OSes that are trying to modernize. Last I looked it did not have kernel threads.