Thread: System load consideration before spawning parallel workers
We observed that spawning the specified number of parallel workers for every query that qualifies for parallelism sometimes leads to a performance drop, rather than an improvement, when the system is already under peak load from other processes. Adding more processes to the system causes more context switches, which reduces the performance of other SQL operations.

To avoid this problem, how about taking some measure of system load into account before spawning the parallel workers? This may not be a problem for some users, so instead of adding the load-calculation code and so on into core, how about providing an additional hook? A user who wants to take system load into account could register a function, and the hook would provide the number of parallel workers that can be started.

Comments?

Regards,
Hari Babu
Fujitsu Australia
On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> we observed that spawning the specified number of parallel workers for
> every query that satisfies for parallelism is sometimes leading to
> performance drop compared to improvement during the peak system load
> with other processes. Adding more processes to the system is leading
> to more context switches thus it reducing the performance of other SQL
> operations.

Have you considered tuning max_worker_processes? Even if you have kept a moderate value for max_parallel_workers_per_gather, the number of processes might increase if the total number allowed is much bigger.

Is the total number of parallel workers more than the number of CPUs/cores in the system? If yes, I think that might be one reason for seeing performance degradation.

> In order to avoid this problem, how about adding some kind of system
> load consideration into account before spawning the parallel workers?

A hook could be a possibility, but I am not sure how users are going to decide the number of parallel workers; there might be other backends as well which can consume resources. I think we might need some form of throttling w.r.t. assignment of parallel workers to avoid system overload.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jul 29, 2016 at 8:48 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>> we observed that spawning the specified number of parallel workers for
>> every query that satisfies for parallelism is sometimes leading to
>> performance drop compared to improvement during the peak system load
>> with other processes. Adding more processes to the system is leading
>> to more context switches thus it reducing the performance of other SQL
>> operations.
>
> Have you consider to tune using max_worker_processes, basically I
> think even if you have kept the moderate value for
> max_parallel_workers_per_gather, the number of processes might
> increase if total number allowed is much bigger.
>
> Are the total number of parallel workers more than number of
> CPU's/cores in the system? If yes, I think that might be one reason
> for seeing performance degradation.

Tuning max_worker_processes may work, but the problem is that during the peak load test we observed that enabling parallelism led to a drop in performance.

The main point here is that even if the user sets all the configurations properly so that parallel query uses only the free resources, a sudden load increase can still cause performance problems.

>> In order to avoid this problem, how about adding some kind of system
>> load consideration into account before spawning the parallel workers?
>
> Hook could be a possibility, but not sure how users are going to
> decide the number of parallel workers, there might be other backends
> as well which can consume resources. I think we might need some form
> of throttling w.r.t assignment of parallel workers to avoid system
> overload.

There are some utilities and functions available to calculate the current system load; based on the available resources and the load, the module could decide the number of parallel workers allowed to start. In my observation, adding this calculation adds some overhead for simple queries. For this reason, I feel this can be a hook function, loaded only by the users who want it.

Regards,
Hari Babu
Fujitsu Australia
On 01/08/16 18:08, Haribabu Kommi wrote:
> On Fri, Jul 29, 2016 at 8:48 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>>> we observed that spawning the specified number of parallel workers for
>>> every query that satisfies for parallelism is sometimes leading to
>>> performance drop compared to improvement during the peak system load
>>> with other processes. Adding more processes to the system is leading
>>> to more context switches thus it reducing the performance of other SQL
>>> operations.
>>
>> Have you consider to tune using max_worker_processes, basically I
>> think even if you have kept the moderate value for
>> max_parallel_workers_per_gather, the number of processes might
>> increase if total number allowed is much bigger.
>>
>> Are the total number of parallel workers more than number of
>> CPU's/cores in the system? If yes, I think that might be one reason
>> for seeing performance degradation.
> Tuning max_worker_processes may work. But the problem here is, During the
> peak load test, it is observed that setting parallel is leading to
> drop in performance.
>
> The main point here is, even if user set all the configurations properly to use
> only the free resources as part of parallel query, in case if a sudden
> load increase can cause some performance problems.
>
>>> In order to avoid this problem, how about adding some kind of system
>>> load consideration into account before spawning the parallel workers?
>>
>> Hook could be a possibility, but not sure how users are going to
>> decide the number of parallel workers, there might be other backends
>> as well which can consume resources. I think we might need some form
>> of throttling w.r.t assignment of parallel workers to avoid system
>> overload.
> There are some utilities and functions that are available to calculate the
> current system load, based on the available resources and system load,
> the module can allow the number of parallel workers that can start. In my
> observation, adding this calculation will add some overhead for simple
> queries. Because of this reason, i feel this can be hook function, only for
> the users who want it, can be loaded.
>
> Regards,
> Hari Babu
> Fujitsu Australia

Possibly look at how make does it with the '-l' flag?

'-l 8' doesn't start more processes when the load is 8 or greater; works on Linux at least...

Cheers,
Gavin
On 8/1/16 1:08 AM, Haribabu Kommi wrote:
> There are some utilities and functions that are available to calculate the
> current system load, based on the available resources and system load,
> the module can allow the number of parallel workers that can start. In my
> observation, adding this calculation will add some overhead for simple
> queries. Because of this reason, i feel this can be hook function, only for
> the users who want it, can be loaded.

I think we need to provide more tools to allow users to control system behavior on a more dynamic basis. How many workers to launch is a good example. There are more reasons than just CPU that parallel workers can help with (IO being an obvious one, but possibly other things like GPUs). Another example is allowing users to alter the selection process used by autovac workers.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
On Fri, Aug 5, 2016 at 9:46 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 8/1/16 1:08 AM, Haribabu Kommi wrote:
>> There are some utilities and functions that are available to calculate the
>> current system load, based on the available resources and system load,
>> the module can allow the number of parallel workers that can start. In my
>> observation, adding this calculation will add some overhead for simple
>> queries. Because of this reason, i feel this can be hook function, only
>> for the users who want it, can be loaded.
>
> I think we need to provide more tools to allow users to control system
> behavior on a more dynamic basis. How many workers to launch is a good
> example. There's more reasons than just CPU that parallel workers can help
> (IO being an obvious one, but possible other things like GPU). Another
> example is allowing users to alter the selection process used by autovac
> workers.

Yes, we need to consider many parameters as part of the system load, not just the CPU. Here I have attached a POC patch that implements the CPU load calculation and decides the number of workers based on the available CPU. The load calculation code is not optimized; there are many ways the system load can be calculated. This is just an example.

Regards,
Hari Babu
Fujitsu Australia
On 8/1/16 2:17 AM, Gavin Flower wrote:
> Possibly look how make does it with the '-l' flag?
>
> '-l 8' don't start more process when load is 8 or greater, works on
> Linux at least...

The problem with that approach is that it takes about a minute for the load average figures to be updated, by which time you have already thrashed your system.

You can try this out by building PostgreSQL this way. Please save your work first, because you might have to hard-reboot your system.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 8/16/16 3:39 AM, Haribabu Kommi wrote:
> Yes, we need to consider many parameters as a system load, not just only
> the CPU. Here I attached a POC patch that implements the CPU load
> calculation and decide the number of workers based on the available CPU
> load. The load calculation code is not an optimized one, there are many ways
> that can used to calculate the system load. This is just for an example.

I see a number of discussion points here:

We don't yet have enough field experience with the parallel query facilities to know what kinds of use patterns there are and what systems for load management we need. So I think building a highly specific system like this seems premature. We have settings to limit process numbers, which seems OK as a start, and those knobs have worked reasonably well in other areas (e.g., max connections, autovacuum). We might well want to enhance this area, but we'll need more experience and information.

If we think that checking the CPU load is a useful way to manage process resources, why not apply this to more kinds of processes? I could imagine that limiting connections by load could be useful. Parallel workers are only one specific niche of this problem.

As I just wrote in another message in this thread, I don't trust system load metrics very much as a gatekeeper. They are reasonable for long-term charting to discover trends, but there are numerous potential problems with using them for this kind of resource control.

All of this seems very platform-specific, too. You have Windows-specific code, but the rest seems very Linux-specific. The dstat tool I had never heard of before. There is stuff with cgroups, which I don't know how portable they are across different Linux installations. Something about Solaris was mentioned. What about the rest? How can we maintain this in the long term? How do we know that these facilities actually work correctly and do not cause mysterious problems?
There is a bunch of math in there that is not documented much. I can't tell without reverse-engineering the code what any of it is supposed to do.

My suggestion is that we focus on refining the process control numbers that we already have. We had extensive discussions about that during 9.6 beta, we have related patches in the commit fest right now, and many ideas have been posted. System admins are generally able to count their CPUs and match that to the number of sessions and jobs they need to run. Everything beyond that could be great but seems premature before we have the basics figured out.

Maybe a couple of hooks could be useful to allow people to experiment with this. But the hooks should be more general, as described above. I think a few GUC settings that can be adjusted at run time could be sufficient as well.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper. They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.

As a note in support of that, sendmail has a "feature" to suppress service if system load gets above X, which I have never found to do anything except result in self-DOSing. The load spike might not have anything to do with the service that is trying to un-spike things.

Even if it does, Peter is correct to note that the response delay is much too long to form part of a useful feedback loop. It could be all right for scheduling activities whose length is comparable to the load average measurement interval, but not for short-term decisions.

			regards, tom lane
On 02/09/16 04:44, Peter Eisentraut wrote:
> On 8/1/16 2:17 AM, Gavin Flower wrote:
>> Possibly look how make does it with the '-l' flag?
>>
>> '-l 8' don't start more process when load is 8 or greater, works on
>> Linux at least...
> The problem with that approach is that it takes about a minute for the
> load averages figures to be updated, by which time you have already
> thrashed your system.
>
> You can try this out by building PostgreSQL this way. Please save your
> work first, because you might have to hard-reboot your system.

Hmm... I've built several versions of pg this way, without any obvious problems! Looking at top suggests that the load averages never go much above 8, and are usually less.

This is the bash script I use:

#!/bin/bash
# postgresql-build.sh

VERSION='9.5.0'

TAR_FILE="postgresql-$VERSION.tar.bz2"
echo 'TAR_FILE['$TAR_FILE']'

tar xvf $TAR_FILE

PORT='--with-pgport=5433'   ############################ std is 5432

BASE_DIR="postgresql-$VERSION"
echo 'BASE_DIR['$BASE_DIR']'
cd $BASE_DIR

PREFIX="--prefix=/usr/local/lib/postgres-$VERSION"
echo 'PREFIX['$PREFIX']'

LANGUAGES='--with-python'
echo 'LANGUAGES['$LANGUAGES']'

SECURITY='--with-openssl --with-pam --with-ldap'
echo 'SECURITY['$SECURITY']'

XML='--with-libxml --with-libxslt'
echo 'XML['$XML']'

TZDATA='--with-system-tzdata=/usr/share/zoneinfo'
echo 'TZDATA['$TZDATA']'

##DEBUG='--enable-debug'
##echo 'DEBUG['$DEBUG']'

./configure $PREFIX $PORT $LANGUAGES $SECURITY $XML $TZDATA $DEBUG

time make -j7 -l8 && time make -j7 -l8 check

Cheers,
Gavin
On 02/09/16 05:01, Peter Eisentraut wrote:
> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
[...]
>> All of this seems very platform specific, too. You have
>> Windows-specific code, but the rest seems very Linux-specific. The
>> dstat tool I had never heard of before. There is stuff with cgroups,
>> which I don't know how portable they are across different Linux
>> installations. Something about Solaris was mentioned. What about the
>> rest? How can we maintain this in the long term? How do we know that
>> these facilities actually work correctly and not cause mysterious problems?
[...]

I think that we should not hobble pg on Linux because of the limitations of other O/S's, like those from Microsoft!

To be on the safe side, if a feature has insufficient evidence of working on a particular O/S, then it should not be enabled by default for that O/S. If a feature is useful on Linux but not elsewhere, then pg should still run on the other O/S's, but the documentation should reflect that.

Cheers,
Gavin
Gavin Flower <GavinFlower@archidevsys.co.nz> writes:
> On 02/09/16 04:44, Peter Eisentraut wrote:
>> You can try this out by building PostgreSQL this way. Please save your
>> work first, because you might have to hard-reboot your system.

> Hmm... I've built several versions of pg this way, without any obvious
> problems!

I'm a little skeptical of that too. However, I'd note that with a "make" you're not likely to care, or possibly even notice, if the thing does something like go completely to sleep for a little while, or if some sub-jobs proceed well while others do not. The fact that "-l 8" works okay for make doesn't necessarily translate to more-interactive use cases.

			regards, tom lane
On Thu, Sep 1, 2016 at 01:01:35PM -0400, Peter Eisentraut wrote:
> Maybe a couple of hooks could be useful to allow people to experiment
> with this. But the hooks should be more general, as described above.
> But I think a few GUC settings that can be adjusted at run time could be
> sufficient as well.

Couldn't SQL sessions call a PL/Perl function that could query the OS and set max_parallel_workers_per_gather appropriately?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                 Ancient Roman grave inscription     +
On 9/2/16 4:07 PM, Bruce Momjian wrote:
> Couldn't SQL sessions call a PL/Perl function that could query the OS
> and set max_parallel_workers_per_gather appropriately?

I'd certainly like to see a greater ability to utilize "hooks" without resorting to C. "Hooks" in quotes because while some hooks need to be in C to be of practical use, others (such as a parallelization limit or controlling autovacuum) might not.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
On Thu, Sep 1, 2016 at 10:01 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
>> Yes, we need to consider many parameters as a system load, not just only
>> the CPU. Here I attached a POC patch that implements the CPU load
>> calculation and decide the number of workers based on the available CPU
>> load. The load calculation code is not an optimized one, there are many ways
>> that can used to calculate the system load. This is just for an example.
>
> I see a number of discussion points here:
>
> We don't yet have enough field experience with the parallel query
> facilities to know what kind of use patterns there are and what systems
> for load management we need. So I think building a highly specific
> system like this seems premature. We have settings to limit process
> numbers, which seems OK as a start, and those knobs have worked
> reasonably well in other areas (e.g., max connections, autovacuum). We
> might well want to enhance this area, but we'll need more experience and
> information.
>
> If we think that checking the CPU load is a useful way to manage process
> resources, why not apply this to more kinds of processes? I could
> imagine that limiting connections by load could be useful. Parallel
> workers is only one specific niche of this problem.

+1 to all of this, particularly the point about parallel workers being one niche aspect of an overall problem.

What I'd like to see in this area first is our moving away from the work_mem model. I think it makes a lot of sense to manage memory currently capped by the catch-all work_mem setting as a shared resource, to be dynamically doled out among backends according to availability, priority, and possibly other considerations. I see the 9.6 work on external sort as a building piece for that, as it removed the one thing that was sensitive to work_mem in a surprising, unpredictable way.

--
Peter Geoghegan
On Fri, Sep 2, 2016 at 3:01 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
>> Yes, we need to consider many parameters as a system load, not just only
>> the CPU. Here I attached a POC patch that implements the CPU load
>> calculation and decide the number of workers based on the available CPU
>> load. The load calculation code is not an optimized one, there are many ways
>> that can used to calculate the system load. This is just for an example.
>
> I see a number of discussion points here:
>
> We don't yet have enough field experience with the parallel query
> facilities to know what kind of use patterns there are and what systems
> for load management we need. So I think building a highly specific
> system like this seems premature. We have settings to limit process
> numbers, which seems OK as a start, and those knobs have worked
> reasonably well in other areas (e.g., max connections, autovacuum). We
> might well want to enhance this area, but we'll need more experience and
> information.
Yes, I agree that parallel query is a new feature and we cannot judge its effects yet.
> If we think that checking the CPU load is a useful way to manage process
> resources, why not apply this to more kinds of processes? I could
> imagine that limiting connections by load could be useful. Parallel
> workers is only one specific niche of this problem.
Yes, I agree that parallel workers are only one part of the problem.

How about the postmaster calculating the CPU load (and other metrics) on the system and updating it in a shared location where every backend can access the details? Using that, we can decide which operations to control. The postmaster would update the system load at a GUC-specified interval, so this would not affect the performance of other backends.
> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper. They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.
>
> All of this seems very platform specific, too. You have
> Windows-specific code, but the rest seems very Linux-specific. The
> dstat tool I had never heard of before. There is stuff with cgroups,
> which I don't know how portable they are across different Linux
> installations. Something about Solaris was mentioned. What about the
> rest? How can we maintain this in the long term? How do we know that
> these facilities actually work correctly and not cause mysterious problems?
The CPU load calculation patch is a POC; I have not evaluated its behavior on all platforms.
> Maybe a couple of hooks could be useful to allow people to experiment
> with this. But the hooks should be more general, as described above.
> But I think a few GUC settings that can be adjusted at run time could be
> sufficient as well.
With the parallel GUC settings, it is possible to control the behavior: when there is very little load on the system, more parallel workers improve performance. If the system load increases, using more parallel workers can add overhead instead of an improvement over the current behavior, and in such cases the number of parallel workers needs to be reduced by changing the GUC settings. Instead of that, I just thought: how about doing the same automatically?
Regards,
Hari Babu
Fujitsu Australia
It seems clear that this patch design is not favored by the community, so I'm setting the patch as rejected in the CF app. I think there is interest in managing system resources better, but I don't know what that would look like. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services