Thread: Reasoning behind process instead of thread based arch?
Hello! I have a couple of final (I hope, for your sake) questions regarding PostgreSQL. I understand PostgreSQL uses processes rather than threads. I found this statement in the archives: "The developers agree that multiple processes provide more benefits (mostly in stability and robustness) than costs (more connection startup costs). The startup costs are easily overcome by using connection pooling." Please explain why it is more stable and robust? More from the above statement: "Also, each query can only use one processor; a single query can't be executed in parallel across many CPUs. However, several queries running concurrently will be spread across the available CPUs." And it is because of the PostgreSQL process architecture that a query can't be executed by many CPUs, right? Although I wonder if this is the case in MySQL. It only says in their manual that each connection is a thread. Also, MySQL has a library for embedded applications; they say: "We also provide MySQL Server as an embedded multi-threaded library that you can link into your application to get a smaller, faster, easier-to-manage product." Does PostgreSQL offer anything similar? Thank you for your time. Tim
nd02tsk@student.hig.se writes: > "The developers agree that multiple processes provide > more benefits (mostly in stability and robustness) than costs (more > connection startup costs). The startup costs are easily overcome by > using connection pooling. > " > > Please explain why it is more stable and robust? Because threads share the same memory space, a runaway thread can corrupt the entire system by writing to the wrong part of memory. With separate processes, the only data that is shared is that which is meant to be shared, which reduces the potential for such damage. > "Also, each query can only use one processor; a single query can't be > executed in parallel across many CPUs. However, several queries running > concurrently will be spread across the available CPUs." > > And it is because of the PostgreSQL process architecture that a query > can't be executed by many CPUs, right? There's no theoretical reason that a query couldn't be split across multiple helper processes, but no one's implemented that feature--it would be a pretty major job. > Also, MySQL has a library for embedded applications; they say: > > "We also provide MySQL Server as an embedded multi-threaded library that > you can link into your application to get a smaller, faster, > easier-to-manage product." > > Does PostgreSQL offer anything similar? No. See the archives for extensive discussion of why PG doesn't do this. -Doug
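The isolation Doug describes is easy to demonstrate outside of PostgreSQL. The following toy C program (not PostgreSQL code, just a sketch of the principle) forks a child that dereferences a NULL pointer: the child dies on SIGSEGV, while the parent merely observes the exit status and keeps running, because the two processes do not share their ordinary memory.

    /* Toy illustration of process isolation (not PostgreSQL source):
     * a crashing child process does not take the parent down with it. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = fork();

        if (pid == 0)
        {
            int *p = NULL;      /* child: simulate a runaway pointer */
            *p = 42;            /* SIGSEGV kills only this process */
            _exit(0);           /* never reached */
        }
        else if (pid > 0)
        {
            int status;

            waitpid(pid, &status, 0);
            if (WIFSIGNALED(status))
                printf("child died on signal %d; parent is unaffected\n",
                       WTERMSIG(status));
            return 0;
        }
        perror("fork");
        return 1;
    }

A thread doing the same thing would take every other thread in its process with it, which is the contrast drawn further down in the thread.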
On Wed, 2004-10-27 at 09:56, nd02tsk@student.hig.se wrote: > Hello! > > I have a couple of final (I hope, for your sake) questions regarding > PostgreSQL. > > I understand PostgreSQL uses processes rather than threads. I found this > statement in the archives: > > "The developers agree that multiple processes provide > more benefits (mostly in stability and robustness) than costs (more > connection startup costs). The startup costs are easily overcome by > using connection pooling. > " > > Please explain why it is more stable and robust? More from the above > statement: This question shows up every 6 months or so. You might wanna search the archives (I use google to do that, but YMMV with the postgresql site's search engine). Basically, there are a few issues with threading that rear their ugly heads. One: Not all OSes' thread libraries are created equal. There was a nasty bug in one of the BSDs a couple of years ago that caused MySQL to crash and drove them nuts. So programming a threaded implementation means you have the vagaries of different levels of quality and robustness of thread libraries to deal with. Two: If a single process in a multi-process application crashes, that process alone dies. The buffer is flushed, and all the other child processes continue happily along. In a multi-threaded environment, when one thread dies, they all die. Three: Multi-threaded applications can be prone to race conditions that are VERY hard to troubleshoot, especially if they occur once every million or so times the triggering event happens. On some operating systems, like Windows and Solaris, processes are expensive, while threads are cheap, so to speak. This is not the case in Linux or BSD, where the differences are much smaller, and the multi-process design suffers no great disadvantage. > "Also, each query can only use one processor; a single query can't be > executed in parallel across many CPUs. However, several queries running > concurrently will be spread across the available CPUs." > > And it is because of the PostgreSQL process architecture that a query > can't be executed by many CPUs, right? Although I wonder if this is the > case in MySQL. It only says in their manual that each connection is a > thread. Actually, if it were converted to multi-threaded tomorrow, it would still be true, because the PostgreSQL engine isn't designed to split off queries into constituent parts to be executed by separate threads or processes. That said, if one wished to implement it, one could likely patch PostgreSQL to hand off parts of queries to different child processes of the current child process (grandchild processes, so to speak), which would allow a query to hit multiple CPUs. > Also, MySQL has a library for embedded applications; they say: > > "We also provide MySQL Server as an embedded multi-threaded library that > you can link into your application to get a smaller, faster, > easier-to-manage product." > > Does PostgreSQL offer anything similar? No, because in that design, if your application crashes, so, by extension, does your database. Now, I'd argue that if I had to choose which database to have crash in the middle of a transaction, I'd pick PostgreSQL; it's generally considered a bad thing to have a database crash mid-transaction. PostgreSQL is more robust about crash recovery, but still... That's another subject that shows up every x months: an embedded version of PostgreSQL.
Basically, the suggestion is to use something like SQLite, which is built to be embedded and therefore has a much lower footprint than PostgreSQL could ever hope to achieve. No one wants their embedded library using up gobs of RAM and disk space when it's just handling one thread / process doing one thing. It's like delivering pizzas with a Ferrari: you could do it, it just wouldn't make a lot of sense.
> > On some operating systems, like Windows and Solaris, processes are > expensive, while threads are cheap, so to speak. This is not the case > in Linux or BSD, where the differences are much smaller, and the > multi-process design suffers no great disadvantage. Even on Windows or Solaris you can use techniques like persistent connections or connection pooling to eliminate the process overhead. > Actually, if it were converted to multi-threaded tomorrow, it would > still be true, because the PostgreSQL engine isn't designed to split off > queries into constituent parts to be executed by separate threads or > processes. That said, if one wished to implement it, one could likely > patch PostgreSQL to hand off parts of queries to different child > processes of the current child process (grandchild processes, so to > speak), which would allow a query to hit multiple CPUs. > I would be curious as to what this would actually gain. Of course there are corner cases, but I rarely find that it is the CPU that is doing all the work, so splitting the query may not do you any good. In theory I guess being able to break it up and execute it on different CPUs could make the results come back faster, but I wonder if it would be a large enough benefit to even notice. >>"We also provide MySQL Server as an embedded multi-threaded library that >>you can link into your application to get a smaller, faster, >>easier-to-manage product." >> >>Does PostgreSQL offer anything similar? No, it isn't really designed to do that, just as Oracle is not a database you would embed. > pick PostgreSQL; it's generally considered a bad thing to have a > database crash mid-transaction. PostgreSQL is more robust about crash > recovery, but still... > > That's another subject that shows up every x months: an embedded version > of PostgreSQL. Basically, the suggestion is to use something like > SQLite, which is built to be embedded and therefore has a much lower > footprint than PostgreSQL could ever hope to achieve. No one wants > their embedded library using up gobs of RAM and disk space when it's > just handling one thread / process doing one thing. It's like > delivering pizzas with a Ferrari: you could do it, it just wouldn't make > a lot of sense. -- Command Prompt, Inc., home of PostgreSQL Replication, and plPHP. PostgreSQL support, programming, shared hosting and dedicated hosting. +1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
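The persistent-connection point is easy to illustrate with libpq. In the sketch below (the connection string and query are placeholders, and error handling is minimal) the backend startup cost is paid once, after which the same connection serves any number of queries; a connection pooler does essentially this on behalf of many short-lived clients.

    /* Sketch of the "persistent connection" idea using libpq.
     * The connection string and query are placeholders; compile with -lpq. */
    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn   *conn;
        PGresult *res;
        int       i;

        /* Pay the backend startup cost once... */
        conn = PQconnectdb("dbname=test");
        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* ...then reuse the same backend for many queries. */
        for (i = 0; i < 1000; i++)
        {
            res = PQexec(conn, "SELECT 1");
            if (PQresultStatus(res) != PGRES_TUPLES_OK)
                fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
            PQclear(res);
        }

        PQfinish(conn);
        return 0;
    }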
>Two: If a > single process in a multi-process application crashes, that process > alone dies. The buffer is flushed, and all the other child processes > continue happily along. In a multi-threaded environment, when one > thread dies, they all die. So this means that if a single connection thread dies in MySQL, all connections die? Seems rather serious. I am doubtful that is how they have implemented it.
On Wed, Oct 27, 2004 at 07:47:03PM +0200, nd02tsk@student.hig.se wrote: > >Two: If a > > single process in a multi-process application crashes, that process > > alone dies. The buffer is flushed, and all the other child processes > > continue happily along. In a multi-threaded environment, when one > > thread dies, they all die. > > > So this means that if a single connection thread dies in MySQL, all > connections die? > > Seems rather serious. I am doubtful that is how they have implemented it. It's part of the design of threads. If a thread does an invalid memory access, it's the *process* (i.e. all threads) that receives the signal, and it's the *process* that dies. Just like a SIGSTOP stops all threads and a SIGTERM terminates them all. Signals are shared between threads. Now, you could of course catch these signals, but you only have one address space shared between all the threads, so if you want to exit to get a new process image (because something is corrupted), you have to close all connections. And indeed, the one MySQL server I can see runs as four threads. Nasty. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
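For illustration, here is roughly what that looks like in code (a toy pthreads program, not anything from MySQL or PostgreSQL): one worker thread faults, and because the default action for SIGSEGV applies to the whole process, the perfectly healthy thread dies along with it. Compile with -lpthread; after about three seconds the "healthy" messages stop, because the process is gone.

    /* Toy pthreads illustration: a fault in one thread takes down every
     * thread, because the default action for SIGSEGV terminates the
     * entire process. */
    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>

    static void *healthy_worker(void *arg)
    {
        (void) arg;
        for (;;)
        {
            printf("healthy thread still running\n");
            sleep(1);
        }
        return NULL;
    }

    static void *buggy_worker(void *arg)
    {
        int *p = NULL;

        (void) arg;
        sleep(3);
        *p = 42;    /* SIGSEGV: the whole process dies, not just this thread */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        pthread_create(&t1, NULL, healthy_worker, NULL);
        pthread_create(&t2, NULL, buggy_worker, NULL);

        /* Never returns: after ~3 seconds the process is killed by the
         * fault in the other thread. */
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }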
nd02tsk@student.hig.se wrote: >>Two: If a >>single process in a multi-process application crashes, that process >>alone dies. The buffer is flushed, and all the other child processes >>continue happily along. In a multi-threaded environment, when one >>thread dies, they all die. > > > > So this means that if a single connection thread dies in MySQL, all > connections die? > > Seems rather serious. I am doubtful that is how they have implemented it. > That all depends on how you define "crash". If a thread causes an unhandled signal to be raised, such as an illegal memory access or a floating-point exception, the process will die, hence killing all threads. But a more advanced multi-threaded environment will install handlers for such signals that handle the error gracefully. It might not even be necessary to kill the offending thread. Some conditions are harder to handle than others, such as stack overflow and out of memory, but it can be done. So to state that multi-threaded environments in general kill all threads when one thread crashes is not true. Having said that, I have no clue as to how advanced MySQL is in this respect. Regards, Thomas Hallgren
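The trap-and-recover mechanism Thomas is referring to usually looks something like the sketch below: a SIGSEGV handler unwinds with siglongjmp to a known recovery point instead of letting the default action kill the process. This is only a sketch of the mechanism; whether it is actually safe to continue depends entirely on knowing what faulted and what state it may have left behind, which is the objection Tom raises later in the thread.

    /* Sketch of trapping a fault and unwinding to a recovery point rather
     * than letting the default action terminate the process.  Illustrative
     * only: whether continuing is safe depends on what actually faulted. */
    #include <stdio.h>
    #include <signal.h>
    #include <setjmp.h>

    static sigjmp_buf recovery_point;

    static void segv_handler(int sig)
    {
        (void) sig;
        /* Abandon the faulting work and unwind to the recovery point. */
        siglongjmp(recovery_point, 1);
    }

    int main(void)
    {
        struct sigaction sa;

        sa.sa_handler = segv_handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGSEGV, &sa, NULL);

        /* Saving the signal mask (second argument = 1) lets siglongjmp
         * restore it, so SIGSEGV is not left blocked after recovery. */
        if (sigsetjmp(recovery_point, 1) == 0)
        {
            int *p = NULL;
            *p = 42;                        /* fault: handler unwinds us */
            printf("never reached\n");
        }
        else
        {
            printf("fault trapped; offending work abandoned, still alive\n");
        }
        return 0;
    }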
On Wed, Oct 27, 2004 at 05:56:16PM +0200, nd02tsk@student.hig.se wrote: > > I understand PostgreSQL uses processes rather than threads. I found this > statement in the archives: > > "The developers agree that multiple processes provide > more benefits (mostly in stability and robustness) than costs (more > connection startup costs). The startup costs are easily overcome by > using connection pooling." > > Please explain why it is more stable and robust? I can't speak for the developers, but here are my thoughts: A critical problem in a thread could terminate the entire process or corrupt its data. If the database server were threaded, such problems would affect the entire server. With each connection handled by a separate process, a critical error is more likely to affect only the connection that had the problem; the rest of the server survives unscathed. > "Also, each query can only use one processor; a single query can't be > executed in parallel across many CPUs. However, several queries running > concurrently will be spread across the available CPUs." > > And it is because of the PostgreSQL process architecture that a query > can't be executed by many CPUs, right? Although I wonder if this is the > case in MySQL. It only says in their manual that each connection is a > thread. I don't know if MySQL can use multiple threads for a single query; it might simply be using one thread per connection instead of one process per connection. If that's the case, then queries executed by a particular connection are still single-threaded, the same as in PostgreSQL. A database that uses a separate process for each connection could still employ multiple threads within each process if somebody could figure out a way to distribute a query amongst the threads. I don't know what the PostgreSQL developers' thoughts on that are. A disadvantage of threads is that some systems (e.g., FreeBSD 4) implement threads in userland, so threads don't take advantage of multiple CPUs. On such systems, using multiple processes better employs additional CPUs. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Martijn van Oosterhout <kleptog@svana.org> writes: > ... Signals are shared between threads. Now, you could of course catch > these signals, but you only have one address space shared between all > the threads, so if you want to exit to get a new process image (because > something is corrupted), you have to close all connections. Right. Depending on your OS you may be able to catch a signal that would kill a thread and keep it from killing the whole process, but this still leaves you with a process memory space that may or may not be corrupted. Continuing in that situation is not cool, at least not according to the Postgres project's notions of reliable software design. It should be pointed out that when we get a hard backend crash, Postgres will forcibly terminate all the backends and reinitialize; which means that in terms of letting concurrent sessions keep going, we are not any more forgiving than a single-address-space multithreaded server. The real bottom line here is that we have good prospects of confining the damage done by the failed process: it's unlikely that anything bad will happen to already-committed data on disk or that any other sessions will return wrong answers to their clients before we are able to kill them. It'd be a lot harder to say that with any assurance for a multithreaded server. regards, tom lane
Tom Lane wrote: > Right. Depending on your OS you may be able to catch a signal that > would kill a thread and keep it from killing the whole process, but > this still leaves you with a process memory space that may or may not > be corrupted. Continuing in that situation is not cool, at least not > according to the Postgres project's notions of reliable software design. > There can't be any "may or may not" involved. You must of course know what went wrong. It is very common that you either get a null pointer exception (attempt to access address zero), that your stack will hit a write-protected page (stack overflow), or that you get some sort of arithmetic exception. These conditions can be trapped and gracefully handled. The signal handler must be able to check the cause of the exception. This usually involves stack unwinding and investigating the state of the CPU at the point where the signal was generated. The process must be terminated if the reason is not a recognized one. Out of memory can be managed using thread-local allocation areas (similar to MemoryContext) and killing a thread based on some criterion when no more memory is available. The criterion could be the thread that encountered the problem, the thread that consumes the most memory, the thread that was least recently active, or something else. > It should be pointed out that when we get a hard backend crash, Postgres > will forcibly terminate all the backends and reinitialize; which means > that in terms of letting concurrent sessions keep going, we are not any > more forgiving than a single-address-space multithreaded server. The > real bottom line here is that we have good prospects of confining the > damage done by the failed process: it's unlikely that anything bad will > happen to already-committed data on disk or that any other sessions will > return wrong answers to their clients before we are able to kill them. > It'd be a lot harder to say that with any assurance for a multithreaded > server. > I'm not sure I follow. You will be able to bring all threads of one process to a halt much faster than you can kill a number of external processes. Killing the multithreaded process is more like pulling the plug. Regards, Thomas Hallgren
Thomas Hallgren <thhal@mailblocks.com> writes: > Tom Lane wrote: >> Right. Depending on your OS you may be able to catch a signal that >> would kill a thread and keep it from killing the whole process, but >> this still leaves you with a process memory space that may or may not >> be corrupted. > It is very common that you either get a null pointer exception (attempt > to access address zero), that your stack will hit a write-protected page > (stack overflow), or that you get some sort of arithmetic exception. > These conditions can be trapped and gracefully handled. That argument has zilch to do with the question at hand. If you use a coding style in which these things should be considered recoverable errors, then setting up a signal handler to recover from them works about the same whether the process is multi-threaded or not. The point I was trying to make is that when an unrecognized trap occurs, you have to assume not only that the current thread of execution is a lost cause, but that it may have clobbered any memory it can get its hands on. > I'm not sure I follow. You will be able to bring all threads of one > process to a halt much faster than you can kill a number of external > processes. Speed is not even a factor in this discussion; or do you habitually spend time optimizing cases that aren't supposed to happen? The point here is circumscribing how much can go wrong before you realize you're in trouble. regards, tom lane
Tom Lane wrote: > That argument has zilch to do with the question at hand. If you use a > coding style in which these things should be considered recoverable > errors, then setting up a signal handler to recover from them works > about the same whether the process is multi-threaded or not. The point > I was trying to make is that when an unrecognized trap occurs, you have > to assume not only that the current thread of execution is a lost cause, > but that it may have clobbered any memory it can get its hands on. > I'm just arguing that far from all signals are caused by unrecoverable errors and that threads causing them can be killed individually and gracefully. I can go further and say that in some multi-threaded environments you as a developer don't even have the opportunity to corrupt memory. In such environments the recognized traps are the only ones you encounter unless the environment is corrupt in itself. In addition, there are a number of techniques that can be used to make it impossible for the threads to unintentionally interfere with each others memory. I'm not at all contesting the fact that a single-threaded server architecture is more bug-tolerant and in some ways easier to manage. What I'm trying to say is that it is very possible to write even better, yet very reliable servers using a multi-threaded architecture and high quality code. > ... The point here is circumscribing how much can go wrong before you > realize you're in trouble. > Ok now I do follow. With respect to my last comment about speed, I guess it's long overdue to kill this thread now. Let's hope the forum stays intact :-) Regards, Thomas Hallgren
nd02tsk@student.hig.se wrote: >So Thomas, you say you like the PostgreSQL process-based model better >than the threaded one used by MySQL. But you sound like the opposite. I'd >like to know why you like processes more. > > Ok, let me try to explain why I can be perceived as a scatterbrain :-). PostgreSQL is a very stable and well-functioning product. It is one of the few databases out there that has a well-documented way of adding plugins written in C, and quite a few plugins exist today. You have all the server-side languages (PL/pgSQL, PL/Perl, PL/Tcl, PL/Java, etc.) and a plethora of custom functions and other utilities. Most of this is beyond the control of the PostgreSQL core team since it's not part of the core product. It would be extremely hard to convert everything into a multi-threaded environment, and it would be even harder to maintain the very high quality that would be required. I think PostgreSQL, in its current shape, is ideal for a distributed, Open Source based conglomerate of products. The high-quality core firmly controlled by the core team, in conjunction with all surrounding features, brings you DBMS functionality that is otherwise unheard of in the free software market. I believe that this advantage is very much due to the simplicity and bug-resilient single-threaded design of PostgreSQL. My only regret is that PL/Java, of which I'm the father, is confined to one connection only. But that too has some advantages in terms of simplicity and reliability. At present, I'm part of a team that develops a very reliable multi-threaded system (a Java VM). In this role, I've learned a lot about how high-performance thread-based systems can be made. If people on this list want to dismiss multi-threaded systems, I feel they should do it based on facts. It's more than possible to build a great multi-threaded server. It is my belief that as PostgreSQL gets more representation in the high-end market, where the advantages of multi-threaded solutions get more and more apparent, it will find that the competition from a performance standpoint is sometimes overwhelming. I can't say anything about MySQL's robustness because I haven't used it much. Perhaps the code quality is indeed below what is required for a multi-threaded system, perhaps not. I choose PostgreSQL over MySQL because MySQL lacks some of the features that I feel are essential, because it does some things dead wrong, and because it is dual licensed. Hope that cleared up some of the confusion. Regards, Thomas Hallgren
[processes vs threads stuff deleted] In any modern and reasonable Unix-like OS, there's very little difference between the multi-process and the multi-thread model. _Default_ behaviour is different, e.g. memory is shared by default for threads, but processes can share memory as well. There are very few features threads have that processes don't, and vice versa. And if the OS is good enough, there are hardly any performance issues. I think that it would be interesting to discuss the multi-(process/thread) model vs. the mono-(process/thread) model. Mono as in _one_ single process/thread per CPU, not one per session. That is, moving all the "scheduling" between sessions entirely to userspace. The server gains almost complete control over the data structures allocated per session, and the resources allocated _to_ sessions. I bet this is very theoretical since it'd require a complete redesign of some core stuff. And I have strong concerns about portability. Still, it could be interesting. .TM. -- ____/ ____/ / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _____/ _____/ _/ Colombo@ESI.it
Marco Colombo wrote: > [processes vs threads stuff deleted] > > In any modern and reasonable Unix-like OS, there's very little difference > between the multi-process and the multi-thread model. _Default_ behaviour > is different, e.g. memory is shared by default for threads, but processes > can share memory as well. There are very few features threads have > that processes don't, and vice versa. And if the OS is good enough, > there are hardly any performance issues. > Most servers have a desire to run on Windows NT, and I would consider Solaris a "modern and reasonable Unix-like OS". On both, you will find a significant performance difference. I think that's true for Irix as well. Your statement is very true for Linux-based OSes, though. > I think that it would be interesting to discuss the multi-(process/thread) > model vs. the mono-(process/thread) model. Mono as in _one_ single process/thread > per CPU, not one per session. That is, moving all the "scheduling" > between sessions entirely to userspace. The server gains almost complete > control over the data structures allocated per session, and the resources > allocated _to_ sessions. > I think what you mean is user-space threads, known in the Java community as "green" threads; Windows calls them "fibers". That approach has been more or less abandoned by Sun, BEA, and other Java VM manufacturers, since a user-space scheduler is confined to one CPU and one process, and is unable to balance its scheduling with other processes and their threads. A kernel scheduler might be slightly heavier, but it does a much better job. Regards, Thomas Hallgren
On Thu, Oct 28, 2004 at 02:44:55PM +0200, Marco Colombo wrote: > I think that it would be interesting to discuss multi(processes/threades) > model vs mono (process/thread). Mono as in _one_ single process/thread > per CPU, not one per session. That is, moving all the "scheduling" > between sessions entirely to userspace. The server gains almost complete > control over the data structures allocated per session, and the resources > allocated _to_ sessions. This is how DB2 and Oracle work. Having scheduling control is very interesting, but I'm not sure it needs to be accomplished this way. There are other advantages too; in both products you have a single pool of sort memory; you can allocate as much memory to sorting as you want without the risk of exceeding it. PostgreSQL can't do this and it makes writing code that wants a lot of sort memory a real pain. Of course this could probably be solved without going to a 'mono process' model. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
nd02tsk@student.hig.se writes: >>Two: If a >> single process in a multi-process application crashes, that process >> alone dies. The buffer is flushed, and all the other child processes >> continue happily along. In a multi-threaded environment, when one >> thread dies, they all die. > > So this means that if a single connection thread dies in MySQL, all > connections die? Yes, that's right. > Seems rather serious. I am doubtful that is how they have > implemented it. If it's a multithreaded application, then there is nothing to doubt about the matter. If any thread dies, the whole process croaks, and there's no choice in the matter. If a thread has been corrupted to the point of crashing, then the entire process has been corrupted. -- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://www.ntlug.org/~cbbrowne/linuxxian.html A VAX is virtually a computer, but not quite.
On Thu, 28 Oct 2004, Thomas Hallgren wrote: > Marco Colombo wrote: >> [processes vs threads stuff deleted] >> >> In any modern and reasonable Unix-like OS, there's very little difference >> between the multi-process and the multi-thread model. _Default_ behaviour >> is different, e.g. memory is shared by default for threads, but processes >> can share memory as well. There are very few features threads have >> that processes don't, and vice versa. And if the OS is good enough, >> there are hardly any performance issues. >> > Most servers have a desire to run on Windows NT, and I would consider Solaris > a "modern and reasonable Unix-like OS". On both, you will find a significant > performance difference. I think that's true for Irix as well. Your statement > is very true for Linux-based OSes, though. See the "if the OS is good enough" part... :-) AFAIK, many techniques developed under Linux have been included in recent releases of other OSes. I haven't seen the source, of course. If recent Solaris still has processes which are actually "heavy", well, I call that "an old legacy (mis-)feature on a modern and reasonable OS"... Back in '93, Mr. Gates used to state: "NT is Unix". If it's not the case yet, well, it's not _my_ fault. >> I think that it would be interesting to discuss the multi-(process/thread) >> model vs. the mono-(process/thread) model. Mono as in _one_ single process/thread >> per CPU, not one per session. That is, moving all the "scheduling" >> between sessions entirely to userspace. The server gains almost complete >> control over the data structures allocated per session, and the resources >> allocated _to_ sessions. >> > I think what you mean is user-space threads, known in the Java community as > "green" threads; Windows calls them "fibers". That approach has been more or > less abandoned by Sun, BEA, and other Java VM manufacturers, since a user-space > scheduler is confined to one CPU and one process, and is unable to balance > its scheduling with other processes and their threads. A kernel scheduler > might be slightly heavier, but it does a much better job. > > Regards, > Thomas Hallgren No. I just meant "scheduling" between PG sessions. I'm not interested in userspace threads. Those are general-purpose solutions, with the drawbacks you pointed out. I mean an entirely event driven server. The trickiest part is to handle N-way. On 1-way, it's quite a clear and well-defined model. I'm not going to say it's easy. I'd like to move the discussion away from the sterile processes vs threads issue. Most differences there are platform specific anyway. The model is the same: one thread of execution per session. I'm proposing a new model entirely (well, I'm proposing a _discussion_ on a model vs. model basis and not implementation vs. implementation of the same model). If you read this thread, you'll notice most people miss the point: either processes or threads, the model is the same: many, many actors that share a big part of their memory. The problems are the same, too. Should we buy the claim that processes are safer? Of course it's not the case when they share such a big memory segment. The chance of a runaway pointer trashing some important shared data is almost the same for both processes and threads. If one backend crashes with a SIGSEGV, I'd bet nothing on the shared mem not being corrupted somehow. My point being: how about [discussing] a completely different model instead? .TM. -- ____/ ____/ / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _____/ _____/ _/ Colombo@ESI.it
Marco, > I mean an entirely event driven server. The trickiest part is to handle > N-way. On 1-way, it's quite a clear and well-defined model. You need to clarify this a bit. You say that the scheduler is in user-space, yet there's only one thread per process and one process per CPU. You state that instead of threads, you want it to be completely event driven. In essence that would mean serving one event per CPU from start to end at any given time. What is an event in this case? Where did it come from? How will this system serve concurrent users? Regards, Thomas Hallgren
On Thu, 28 Oct 2004, Thomas Hallgren wrote: > Marco, > >> I mean an entirely event driven server. The trickiest part is to handle >> N-way. On 1-way, it's quite a clear and well-defined model. > > You need to clarify this a bit. > > You say that the scheduler is in user-space, yet there's only one thread per > process and one process per CPU. You state that instead of threads, you want > it to be completely event driven. In essence that would mean serving one > event per CPU from start to end at any given time. What is an event in this > case? Where did it come from? How will this system serve concurrent users? Let's take a look at the bigger picture. We need to serve many clients, that is many sessions, that is many requests (queries) at the same time. Since there may be more than one active request, we need to schedule them in some way. That's what I meant by "session scheduler". The traditional accept&fork model doesn't handle that directly: by creating one process per session, it relies on the process scheduler in the kernel. I'd say this is suboptimal, both for the extra resources allocated to each session, and for the kernel policies not being perfectly tailored to the job of scheduling PG sessions (*). Not to mention the postmaster has almost no control over these policies. Now, threads help a bit in reducing the per-session overhead. But that's more an implementation detail, and it's _very_ platform specific. Switching to threads has a great impact on many _details_ of the server, the benefits depend a lot on the platform, but the model is just the same, with the same essential problems. Many big changes for little gain. Let's explore, at least in theory, the advantages of a completely different model (that implies a lot of changes too, of course - but for something). You ask what an event is? An event can be: - input from a connection (usually a new query); - notification that I/O needed by a pending query has completed; - if we don't want a single query to starve the server, an alarm of some kind (I think this is a corner case, but still possible); - something else I haven't thought about. At any given moment, there are many pending queries. Most of them will be waiting for I/O to complete. That's how the server handles concurrent users. > > Regards, > Thomas Hallgren (*) They're oriented to general-purpose processes. Think of how CPU usage affects relative priorities. In a DB context, there may be other criteria of greater significance. Roughly speaking, the larger the part of the data a single session holds locked, the sooner it should be completed. The kernel has no knowledge of this. To the kernel, "big" processes are those that are using a lot of CPU. And the policy is to slow them down. To a DB, "big" queries are those that force the most serialization ("lock a lot"), and they should be completed as soon as possible. .TM. -- ____/ ____/ / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _____/ _____/ _/ Colombo@ESI.it
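For concreteness, the skeleton below shows roughly what such an event-driven, single-process session scheduler could look like in C. Everything here is invented for illustration (the Session structure, handle_session_input(), the fixed session table); this is not how PostgreSQL is structured. One process multiplexes all client connections with poll(), and each readable descriptor becomes an "input from a connection" event dispatched to the session that owns it; I/O-completion and alarm events would feed the same loop.

    /* Skeleton of a single-process, event-driven session scheduler.
     * All names (Session, handle_session_input, ...) are invented for
     * illustration; this is not PostgreSQL code. */
    #include <poll.h>

    #define MAX_SESSIONS 256

    typedef struct Session
    {
        int fd;             /* client connection */
        /* per-session state: current query, pending I/O, locks held, ... */
    } Session;

    static Session sessions[MAX_SESSIONS];
    static int     nsessions;

    /* Advance one session's state machine by one step; a real server would
     * parse and execute part of the query here (stub for illustration). */
    static void handle_session_input(Session *s)
    {
        (void) s;
    }

    void event_loop(void)
    {
        struct pollfd fds[MAX_SESSIONS];
        int i;

        for (;;)
        {
            for (i = 0; i < nsessions; i++)
            {
                fds[i].fd = sessions[i].fd;
                fds[i].events = POLLIN;
            }

            /* Block until some session has work to do: this wait, plus the
             * dispatch order below, is the userspace "session scheduler". */
            poll(fds, nsessions, -1);

            for (i = 0; i < nsessions; i++)
                if (fds[i].revents & POLLIN)
                    handle_session_input(&sessions[i]);
        }
    }

The dispatch order is where DB-specific policies (such as favouring sessions that hold many locks) could be applied, which is the kind of control Marco says the kernel scheduler cannot provide.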
Marco, > You ask what an event is? An event can be: > - input from a connection (usually a new query); > - notification that I/O needed by a pending query has completed; > - if we don't want a single query to starve the server, an alarm of some kind > (I think this is a corner case, but still possible); > - something else I haven't thought about. Sounds very much like a description of the preemption points that a user-space thread scheduler would use. > At any given moment, there are many pending queries. Most of them > will be waiting for I/O to complete. That's how the server handles > concurrent users. In order to determine where an event originates, say an I/O-completion event, you need to associate some structure with the I/O operation. That structure defines the logical flow of all events for one particular session or query, and as such it's not far from a lightweight thread. The only difference is that your "thread" resumes execution in a logical sense (from the event loop) rather than at a physical program counter position. The resource consumption/performance would stay more or less the same. > (*) They're oriented to general-purpose processes. Think of how CPU > usage affects relative priorities. In a DB context, there may be > other criteria of greater significance. Roughly speaking, the larger > the part of the data a single session holds locked, the sooner it should > be completed. The kernel has no knowledge of this. To the kernel, > "big" processes are those that are using a lot of CPU. And the policy is > to slow them down. To a DB, "big" queries are those that force the most > serialization ("lock a lot"), and they should be completed as soon as > possible. Criteria-based prioritisation is very interesting, but I think your model has some flaws: - Since the kernel has no idea your process serves a lot of sessions, _it_ will be considered a "big" process. - If a process/thread does lots of I/O waits (likely for a "big" query), it's unlikely that the kernel will consider it a CPU hog. - Most big queries are read-only and hence do not lock a lot of things. - PostgreSQL uses MVCC, which brings the concurrent lock problem down to a minimum, even for queries that are not read-only. - Giving big queries a lot of resources is not the desired behavior in many cases. - Your scheduler is confined to one CPU and cannot react to the system as a whole. I think it is more important that the scheduler can balance _all_ sessions among _all_ available resources on the machine. Regards, Thomas Hallgren