Thread: System load consideration before spawning parallel workers

System load consideration before spawning parallel workers

From
Haribabu Kommi
Date:
We observed that spawning the specified number of parallel workers for
every query that qualifies for parallelism sometimes leads to a
performance drop, rather than an improvement, when the system is already
under peak load from other processes. Adding more processes to the
system causes more context switches, which reduces the performance of
other SQL operations.

To avoid this problem, how about taking the system load into account
before spawning the parallel workers?

This may not be a problem for all users, so instead of adding the
system-load calculation and related code into core, how about providing
an additional hook? A user who wants to take system load into account
could register a function there, and the hook would return the number of
parallel workers that may be started.

comments?

Regards,
Hari Babu
Fujitsu Australia



Re: System load consideration before spawning parallel workers

From
Amit Kapila
Date:
On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> We observed that spawning the specified number of parallel workers for
> every query that qualifies for parallelism sometimes leads to a
> performance drop, rather than an improvement, when the system is already
> under peak load from other processes. Adding more processes to the
> system causes more context switches, which reduces the performance of
> other SQL operations.
>

Have you considered tuning max_worker_processes?  Basically, I think
that even if you have kept a moderate value for
max_parallel_workers_per_gather, the number of processes might still
increase if the total number allowed is much bigger.

Is the total number of parallel workers greater than the number of
CPUs/cores in the system?  If so, I think that might be one reason for
seeing performance degradation.

> To avoid this problem, how about taking the system load into account
> before spawning the parallel workers?
>

A hook could be a possibility, but I am not sure how users are going to
decide the number of parallel workers; there might be other backends as
well that consume resources.  I think we might need some form of
throttling w.r.t. the assignment of parallel workers to avoid system
overload.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: System load consideration before spawning parallel workers

From
Haribabu Kommi
Date:
On Fri, Jul 29, 2016 at 8:48 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>> We observed that spawning the specified number of parallel workers for
>> every query that qualifies for parallelism sometimes leads to a
>> performance drop, rather than an improvement, when the system is already
>> under peak load from other processes. Adding more processes to the
>> system causes more context switches, which reduces the performance of
>> other SQL operations.
>>
>
> Have you considered tuning max_worker_processes?  Basically, I think
> that even if you have kept a moderate value for
> max_parallel_workers_per_gather, the number of processes might still
> increase if the total number allowed is much bigger.
>
> Is the total number of parallel workers greater than the number of
> CPUs/cores in the system?  If so, I think that might be one reason for
> seeing performance degradation.

Tuning max_worker_processes may work. But the problem here is that
during the peak load test, it was observed that enabling parallelism led
to a drop in performance.

The main point is that even if the user sets all the configuration
parameters properly so that parallel query uses only the free resources,
a sudden increase in load can still cause performance problems.

>> To avoid this problem, how about taking the system load into account
>> before spawning the parallel workers?
>>
>
> A hook could be a possibility, but I am not sure how users are going to
> decide the number of parallel workers; there might be other backends as
> well that consume resources.  I think we might need some form of
> throttling w.r.t. the assignment of parallel workers to avoid system
> overload.

There are some utilities and functions available to calculate the
current system load; based on the available resources and the system
load, a module could decide the number of parallel workers that may
start. In my observation, adding this calculation will add some overhead
for simple queries. For that reason, I feel this should be a hook
function, loaded only by the users who want it.


Regards,
Hari Babu
Fujitsu Australia



Re: System load consideration before spawning parallel workers

From
Gavin Flower
Date:
On 01/08/16 18:08, Haribabu Kommi wrote:
> On Fri, Jul 29, 2016 at 8:48 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Fri, Jul 29, 2016 at 11:26 AM, Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>>> We observed that spawning the specified number of parallel workers for
>>> every query that qualifies for parallelism sometimes leads to a
>>> performance drop, rather than an improvement, when the system is already
>>> under peak load from other processes. Adding more processes to the
>>> system causes more context switches, which reduces the performance of
>>> other SQL operations.
>>>
>> Have you considered tuning max_worker_processes?  Basically, I think
>> that even if you have kept a moderate value for
>> max_parallel_workers_per_gather, the number of processes might still
>> increase if the total number allowed is much bigger.
>>
>> Is the total number of parallel workers greater than the number of
>> CPUs/cores in the system?  If so, I think that might be one reason for
>> seeing performance degradation.
> Tuning max_worker_processes may work. But the problem here is that
> during the peak load test, it was observed that enabling parallelism led
> to a drop in performance.
>
> The main point is that even if the user sets all the configuration
> parameters properly so that parallel query uses only the free resources,
> a sudden increase in load can still cause performance problems.
>
>>> To avoid this problem, how about taking the system load into account
>>> before spawning the parallel workers?
>>>
>> A hook could be a possibility, but I am not sure how users are going to
>> decide the number of parallel workers; there might be other backends as
>> well that consume resources.  I think we might need some form of
>> throttling w.r.t. the assignment of parallel workers to avoid system
>> overload.
> There are some utilities and functions available to calculate the
> current system load; based on the available resources and the system
> load, a module could decide the number of parallel workers that may
> start. In my observation, adding this calculation will add some overhead
> for simple queries. For that reason, I feel this should be a hook
> function, loaded only by the users who want it.
>
>
> Regards,
> Hari Babu
> Fujitsu Australia
>
>
Possibly look at how make does it with the '-l' flag?

'-l 8' means: don't start more processes when the load is 8 or greater.
It works on Linux, at least...


Cheers,
Gavin





Re: System load consideration before spawning parallel workers

From
Jim Nasby
Date:
On 8/1/16 1:08 AM, Haribabu Kommi wrote:
> There are some utilities and functions available to calculate the
> current system load; based on the available resources and the system
> load, a module could decide the number of parallel workers that may
> start. In my observation, adding this calculation will add some overhead
> for simple queries. For that reason, I feel this should be a hook
> function, loaded only by the users who want it.

I think we need to provide more tools to allow users to control system 
behavior on a more dynamic basis. How many workers to launch is a good 
example. There are more reasons than just CPU why parallel workers can 
help (IO being an obvious one, but possibly other things like GPU). 
Another example is allowing users to alter the selection process used by 
autovac workers.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: System load consideration before spawning parallel workers

From
Haribabu Kommi
Date:
On Fri, Aug 5, 2016 at 9:46 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 8/1/16 1:08 AM, Haribabu Kommi wrote:
>>
>> There are some utilities and functions available to calculate the
>> current system load; based on the available resources and the system
>> load, a module could decide the number of parallel workers that may
>> start. In my observation, adding this calculation will add some overhead
>> for simple queries. For that reason, I feel this should be a hook
>> function, loaded only by the users who want it.
>
>
> I think we need to provide more tools to allow users to control system
> behavior on a more dynamic basis. How many workers to launch is a good
> example. There are more reasons than just CPU why parallel workers can
> help (IO being an obvious one, but possibly other things like GPU).
> Another example is allowing users to alter the selection process used by
> autovac workers.

Yes, we need to consider many parameters as part of the system load, not
only the CPU. Here I have attached a POC patch that implements the CPU
load calculation and decides the number of workers based on the
available CPU load. The load calculation code is not optimized; there
are many ways that can be used to calculate the system load. This is
just an example.


Regards,
Hari Babu
Fujitsu Australia

Attachment

Re: System load consideration before spawning parallel workers

From
Peter Eisentraut
Date:
On 8/1/16 2:17 AM, Gavin Flower wrote:
> Possibly look at how make does it with the '-l' flag?
>
> '-l 8' means: don't start more processes when the load is 8 or greater.
> It works on Linux, at least...

The problem with that approach is that it takes about a minute for the
load averages figures to be updated, by which time you have already
thrashed your system.

You can try this out by building PostgreSQL this way.  Please save your
work first, because you might have to hard-reboot your system.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: System load consideration before spawning parallel workers

From
Peter Eisentraut
Date:
On 8/16/16 3:39 AM, Haribabu Kommi wrote:
> Yes, we need to consider many parameters as part of the system load, not
> only the CPU. Here I have attached a POC patch that implements the CPU
> load calculation and decides the number of workers based on the
> available CPU load. The load calculation code is not optimized; there
> are many ways that can be used to calculate the system load. This is
> just an example.

I see a number of discussion points here:

We don't yet have enough field experience with the parallel query
facilities to know what kind of use patterns there are and what systems
for load management we need.  So I think building a highly specific
system like this seems premature.  We have settings to limit process
numbers, which seems OK as a start, and those knobs have worked
reasonably well in other areas (e.g., max connections, autovacuum).  We
might well want to enhance this area, but we'll need more experience and
information.

If we think that checking the CPU load is a useful way to manage process
resources, why not apply this to more kinds of processes?  I could
imagine that limiting connections by load could be useful.  Parallel
workers is only one specific niche of this problem.

As I just wrote in another message in this thread, I don't trust system
load metrics very much as a gatekeeper.  They are reasonable for
long-term charting to discover trends, but there are numerous potential
problems for using them for this kind of resource control thing.

All of this seems very platform specific, too.  You have
Windows-specific code, but the rest seems very Linux-specific.  The
dstat tool I had never heard of before.  There is stuff with cgroups,
which I don't know how portable they are across different Linux
installations.  Something about Solaris was mentioned.  What about the
rest?  How can we maintain this in the long term?  How do we know that
these facilities actually work correctly and do not cause mysterious problems?

There is a bunch of math in there that is not documented much.  I can't
tell without reverse engineering the code what any of this is supposed
to do.

My suggestion is that we focus on refining the process control numbers
that we already have.  We had extensive discussions about that during
9.6 beta.  We have related patches in the commit fest right now.  Many
ideas have been posted.  System admins are generally able to count their
CPUs and match that to the number of sessions and jobs they need to run.
Everything beyond that could be great but seems premature before we
have the basics figured out.

Maybe a couple of hooks could be useful to allow people to experiment
with this.  But the hooks should be more general, as described above.
But I think a few GUC settings that can be adjusted at run time could be
sufficient as well.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: System load consideration before spawning parallel workers

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper.  They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.

As a note in support of that, sendmail has a "feature" to suppress service
if system load gets above X, which I have never found to do anything
except result in self-DOSing.  The load spike might not have anything to
do with the service that is trying to un-spike things.  Even if it does,
Peter is correct to note that the response delay is much too long to form
part of a useful feedback loop.  It could be all right for scheduling
activities whose length is comparable to the load average measurement
interval, but not for short-term decisions.
        regards, tom lane



Re: System load consideration before spawning parallel workers

From
Gavin Flower
Date:
On 02/09/16 04:44, Peter Eisentraut wrote:
> On 8/1/16 2:17 AM, Gavin Flower wrote:
>> Possibly look at how make does it with the '-l' flag?
>>
>> '-l 8' means: don't start more processes when the load is 8 or greater.
>> It works on Linux, at least...
> The problem with that approach is that it takes about a minute for the
> load averages figures to be updated, by which time you have already
> thrashed your system.
>
> You can try this out by building PostgreSQL this way.  Please save your
> work first, because you might have to hard-reboot your system.
>
Hmm...  I've built several versions of pg this way, without any obvious 
problems!

Looking at top suggests that the load averages never go much above 8, 
and are usually less.

This is the bash script I use:


#!/bin/bash
# postgresql-build.sh


VERSION='9.5.0'

TAR_FILE="postgresql-$VERSION.tar.bz2"
echo 'TAR_FILE['$TAR_FILE']'
tar xvf $TAR_FILE

PORT='--with-pgport=5433'  ############################ std is 5432

BASE_DIR="postgresql-$VERSION"
echo 'BASE_DIR['$BASE_DIR']'
cd $BASE_DIR

PREFIX="--prefix=/usr/local/lib/postgres-$VERSION"
echo 'PREFIX['$PREFIX']'

LANGUAGES='--with-python'
echo 'LANGUAGES['$LANGUAGES']'

SECURITY='--with-openssl --with-pam --with-ldap'
echo 'PREFIX['$PREFIX']'

XML='--with-libxml --with-libxslt'
echo 'SECURITY['$SECURITY']'

TZDATA='--with-system-tzdata=/usr/share/zoneinfo'
echo 'TZDATA['$TZDATA']'

##DEBUG='--enable-debug'
##echo 'DEBUG['$DEBUG']'


./configure $PREFIX $LANGUAGES $SECURITY $XML $TZDATA $DEBUG

time make -j7 -l8 && time make -j7 -l8 check


Cheers,
Gavin




Re: System load consideration before spawning parallel workers

From
Gavin Flower
Date:
On 02/09/16 05:01, Peter Eisentraut wrote:
> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
[...]

> All of this seems very platform specific, too.  You have
> Windows-specific code, but the rest seems very Linux-specific.  The
> dstat tool I had never heard of before.  There is stuff with cgroups,
> which I don't know how portable they are across different Linux
> installations.  Something about Solaris was mentioned.  What about the
> rest?  How can we maintain this in the long term?  How do we know that
> these facilities actually work correctly and do not cause mysterious problems?
[...]
I think that we should not hobble pg on Linux because of the limitations 
of other OSes, like those from Microsoft!

To be on the safe side, if a feature has insufficient evidence of 
working on a particular OS, then it should not be enabled by default on 
that OS.

If a feature is useful on Linux but not elsewhere, then pg should still 
run on the other OSes, but the documentation should reflect that.


Cheers,
Gavin



Re: System load consideration before spawning parallel workers

From
Tom Lane
Date:
Gavin Flower <GavinFlower@archidevsys.co.nz> writes:
> On 02/09/16 04:44, Peter Eisentraut wrote:
>> You can try this out by building PostgreSQL this way.  Please save your
>> work first, because you might have to hard-reboot your system.

> Hmm...  I've built several versions of pg this way, without any obvious 
> problems!

I'm a little skeptical of that too.  However, I'd note that with a "make"
you're not likely to care, or possibly even notice, if the thing does
something like go completely to sleep for a little while, or if some
sub-jobs proceed well while others do not.  The fact that "-l 8" works
okay for make doesn't necessarily translate to more-interactive use cases.
        regards, tom lane



Re: System load consideration before spawning parallel workers

From
Bruce Momjian
Date:
On Thu, Sep  1, 2016 at 01:01:35PM -0400, Peter Eisentraut wrote:
> Maybe a couple of hooks could be useful to allow people to experiment
> with this.  But the hooks should be more general, as described above.
> But I think a few GUC settings that can be adjusted at run time could be
> sufficient as well.

Couldn't SQL sessions call a PL/Perl function that could query the OS
and set max_parallel_workers_per_gather appropriately?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: System load consideration before spawning parallel workers

From
Jim Nasby
Date:
On 9/2/16 4:07 PM, Bruce Momjian wrote:
> Couldn't SQL sessions call a PL/Perl function that could query the OS
> and set max_parallel_workers_per_gather appropriately?

I'd certainly like to see a greater ability to utilize "hooks" without 
resorting to C. "hooks" in quotes because while some hooks need to be in 
C to be of practical use, others (such as a parallelization limit or 
controlling autovacuum) might not.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: System load consideration before spawning parallel workers

From
Peter Geoghegan
Date:
On Thu, Sep 1, 2016 at 10:01 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
>> Yes, we need to consider many parameters as part of the system load, not
>> only the CPU. Here I have attached a POC patch that implements the CPU
>> load calculation and decides the number of workers based on the
>> available CPU load. The load calculation code is not optimized; there
>> are many ways that can be used to calculate the system load. This is
>> just an example.
>
> I see a number of discussion points here:
>
> We don't yet have enough field experience with the parallel query
> facilities to know what kind of use patterns there are and what systems
> for load management we need.  So I think building a highly specific
> system like this seems premature.  We have settings to limit process
> numbers, which seems OK as a start, and those knobs have worked
> reasonably well in other areas (e.g., max connections, autovacuum).  We
> might well want to enhance this area, but we'll need more experience and
> information.
>
> If we think that checking the CPU load is a useful way to manage process
> resources, why not apply this to more kinds of processes?  I could
> imagine that limiting connections by load could be useful.  Parallel
> workers is only one specific niche of this problem.

+1 to all of this, particularly the point about parallel workers being
one niche aspect of an overall problem.

What I'd like to see in this area first is our moving away from the
work_mem model. I think it makes a lot of sense to manage memory
currently capped by the catch-all work_mem setting as a shared
resource, to be dynamically doled out among backends according to
availability, priority, and possibly other considerations. I see the
9.6 work on external sort as a building block for that, as it removed
the one thing that was sensitive to work_mem in a surprising,
unpredictable way.

-- 
Peter Geoghegan



Re: System load consideration before spawning parallel workers

From
Haribabu Kommi
Date:


On Fri, Sep 2, 2016 at 3:01 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
>> Yes, we need to consider many parameters as part of the system load, not
>> only the CPU. Here I have attached a POC patch that implements the CPU
>> load calculation and decides the number of workers based on the
>> available CPU load. The load calculation code is not optimized; there
>> are many ways that can be used to calculate the system load. This is
>> just an example.
>
> I see a number of discussion points here:
>
> We don't yet have enough field experience with the parallel query
> facilities to know what kind of use patterns there are and what systems
> for load management we need.  So I think building a highly specific
> system like this seems premature.  We have settings to limit process
> numbers, which seems OK as a start, and those knobs have worked
> reasonably well in other areas (e.g., max connections, autovacuum).  We
> might well want to enhance this area, but we'll need more experience and
> information.

Yes, I agree that parallel query is a new feature and we cannot judge
its effects just yet.

 
> If we think that checking the CPU load is a useful way to manage process
> resources, why not apply this to more kinds of processes?  I could
> imagine that limiting connections by load could be useful.  Parallel
> workers is only one specific niche of this problem.

Yes, I agree that parallelism is only one part of the problem.

How about the postmaster calculating the CPU load (and other loads) on
the system and updating it in a shared location where every backend can
access the details? Using that, we could decide which operations to
control. The postmaster would update the system load at a GUC-specified
interval, so this would not affect the performance of other backends.

 
> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper.  They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.
>
> All of this seems very platform specific, too.  You have
> Windows-specific code, but the rest seems very Linux-specific.  The
> dstat tool I had never heard of before.  There is stuff with cgroups,
> which I don't know how portable they are across different Linux
> installations.  Something about Solaris was mentioned.  What about the
> rest?  How can we maintain this in the long term?  How do we know that
> these facilities actually work correctly and do not cause mysterious problems?

The CPU load calculation patch is a POC patch; I didn't evaluate its
behavior on all platforms.

 
> Maybe a couple of hooks could be useful to allow people to experiment
> with this.  But the hooks should be more general, as described above.
> But I think a few GUC settings that can be adjusted at run time could be
> sufficient as well.

With the parallel GUC settings it is possible to control the behavior so
that performance improves through more parallel workers when there is
very little load on the system. But if the system load increases, the
use of more parallel workers can add overhead instead of an improvement
over the existing behavior.

In such cases, the number of parallel workers needs to be reduced by
changing the GUC settings. Instead of that, I just thought: how about
doing the same automatically?


Regards,
Hari Babu
Fujitsu Australia

Re: System load consideration before spawning parallel workers

From
Peter Eisentraut
Date:
It seems clear that this patch design is not favored by the community,
so I'm setting the patch as rejected in the CF app.

I think there is interest in managing system resources better, but I
don't know what that would look like.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services