Re: System load consideration before spawning parallel workers - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | Re: System load consideration before spawning parallel workers |
Date | |
Msg-id | aca21e2c-746d-36aa-103a-275ce24bb395@2ndquadrant.com |
In response to | Re: System load consideration before spawning parallel workers (Haribabu Kommi <kommi.haribabu@gmail.com>) |
Responses | Re: System load consideration before spawning parallel workers |
List | pgsql-hackers |
On 8/16/16 3:39 AM, Haribabu Kommi wrote:
> Yes, we need to consider many parameters as a system load, not just only
> the CPU. Here I attached a POC patch that implements the CPU load
> calculation and decide the number of workers based on the available CPU
> load. The load calculation code is not an optimized one, there are many ways
> that can used to calculate the system load. This is just for an example.

I see a number of discussion points here:

We don't yet have enough field experience with the parallel query facilities to know what kind of use patterns there are and what systems for load management we need. So I think building a highly specific system like this seems premature. We have settings to limit process numbers, which seems OK as a start, and those knobs have worked reasonably well in other areas (e.g., max connections, autovacuum). We might well want to enhance this area, but we'll need more experience and information.

If we think that checking the CPU load is a useful way to manage process resources, why not apply this to more kinds of processes? I could imagine that limiting connections by load could be useful. Parallel workers is only one specific niche of this problem.

As I just wrote in another message in this thread, I don't trust system load metrics very much as a gatekeeper. They are reasonable for long-term charting to discover trends, but there are numerous potential problems for using them for this kind of resource control thing.

All of this seems very platform specific, too. You have Windows-specific code, but the rest seems very Linux-specific. The dstat tool I had never heard of before. There is stuff with cgroups, which I don't know how portable they are across different Linux installations. Something about Solaris was mentioned. What about the rest? How can we maintain this in the long term? How do we know that these facilities actually work correctly and not cause mysterious problems?

There is a bunch of math in there that is not documented much. I can't tell without reverse engineering the code what any of this is supposed to do.

My suggestion is that we focus on refining the process control numbers that we already have. We had extensive discussions about that during 9.6 beta. We have related patches in the commit fest right now. Many ideas have been posted. System admins are generally able to count their CPUs and match that to the number of sessions and jobs they need to run. Everything beyond that could be great but seems premature before we have the basics figured out.

Maybe a couple of hooks could be useful to allow people to experiment with this. But the hooks should be more general, as described above. But I think a few GUC settings that can be adjusted at run time could be sufficient as well.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services