Re: Reasoning behind process instead of thread based - Mailing list pgsql-general
From | Thomas Hallgren |
---|---|
Subject | Re: Reasoning behind process instead of thread based |
Date | |
Msg-id | thhal-0MHJgAgnc3kAQCzTIv4IV+kAB1OUrG2@mailblocks.com |
In response to | Re: Reasoning behind process instead of thread based (Martijn van Oosterhout <kleptog@svana.org>) |
List | pgsql-general |
Martijn,

> I honestly don't think you could really do a much better job of
> scheduling than the kernel. The kernel has a much better idea of what
> processes are waiting on, and more importantly, what other work is
> happening on the same machine that also needs CPU time.

I agree 100% with Martijn. Below is a reply that I sent to Marco some days ago, although for some reason it was never received by the mailing list.

----

Marco,

> You ask what an event is? An event can be:
> - input from a connection (usually a new query);
> - notification that I/O needed by a pending query has completed;
> - if we don't want a single query starve the server, an alarm of kind
>   (I think this is a corner case, but still possible;)
> - something else I haven't thought about.

Sounds very much like a description of the preemption points that a user-space thread scheduler would use.

> At any given moment, there are many pending queries. Most of them
> will be waiting for I/O to complete. That's how the server handles
> concurrent users.

In order to determine where an event originates, say an I/O complete event, you need to associate some structure with the I/O operation. That structure defines the logical flow of all events for one particular session or query, and as such it's not far from a lightweight thread. The only difference is that your "thread" resumes execution in a logical sense (from the event loop) rather than at a physical program counter position. The resource consumption and performance would stay more or less the same.

> (*) They're oriented to general purpose processes. Think of how CPU
> usage affects relative priorities. In a DB context, there may be
> other criteria of greater significance. Roughly speaking, the larger
> the part of the data a single session holds locked, the sooner it should
> be completed. The kernel has no knowledge of this. To the kernel,
> "big" processes are those that are using a lot of CPU. And the policy is
> to slow them down. To a DB, a "big" queries are those that force the most
> serialization ("lock a lot"), and they should be completed as soon as
> possible.

Criteria-based prioritisation is very interesting, but I think your model has some flaws:

- Since the kernel has no idea that your process serves a lot of sessions, _it_ will be considered a "big" process.
- If a process/thread does lots of I/O waits (likely for a "big" query), it's unlikely that the kernel will consider it a CPU hog.
- Most big queries are read-only and hence do not lock a lot of things.
- PostgreSQL uses MVCC, which brings the concurrent lock problem down to a minimum, even for queries that are not read-only.
- Giving big queries a lot of resources is not the desired behavior in many cases.
- Your scheduler is confined to one CPU and cannot react to the system as a whole.

I think it is more important that the scheduler can balance _all_ sessions among _all_ available resources on the machine.

Regards,
Thomas Hallgren
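
The point made above about attaching a structure to each I/O operation can be illustrated with a small sketch. This is not PostgreSQL code or anything proposed in the thread; `SessionContext`, `handle_event`, and `run_event_loop` are hypothetical names, and I/O completion is simulated by a simple round-robin queue. It only shows that the per-session structure ends up holding the same thing a lightweight thread would: the place to resume from.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """Hypothetical per-session structure tied to each pending I/O operation."""
    session_id: int
    pages_needed: int            # I/O completions the query still waits for
    state: str = "waiting_io"    # logical resume point (stands in for a program counter)
    log: list = field(default_factory=list)


def handle_event(ctx: SessionContext) -> None:
    """Resume one session 'logically' when its I/O-complete event arrives."""
    if ctx.state == "waiting_io":
        ctx.pages_needed -= 1
        ctx.log.append("io_complete")
        if ctx.pages_needed == 0:
            ctx.state = "done"
            ctx.log.append("result_sent")


def run_event_loop(sessions):
    """Deliver simulated I/O-complete events round-robin until every session is done."""
    pending = deque(sessions)
    order = []
    while pending:
        ctx = pending.popleft()
        handle_event(ctx)
        order.append(ctx.session_id)
        if ctx.state != "done":
            pending.append(ctx)  # still waiting on I/O: back into the queue
    return order


sessions = [SessionContext(1, pages_needed=2), SessionContext(2, pages_needed=1)]
order = run_event_loop(sessions)
print(order)
```

Each `SessionContext` carries its own resume state between events, so the event loop is effectively doing the bookkeeping of a user-space thread scheduler, which is the similarity the message describes.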