Thread: refusing connections based on load ...

From

The Hermit Hacker

Date:

23 April 2001, 14:10:02

Anyone thought of implementing this, similar to how sendmail does it?  If
load > n, refuse connections?

Basically, it's great to set max clients to 256, but if load hits 50 as a
result, the database is near to useless ... if you set it to 256, and 254
idle connections are going, load won't rise much, so is safe, but if half
of those processes are active, it hurts ...

so, if it was set so that a .conf variable could be set so that max
connection == 256 *or* load > n to refuse connections, you'd have best of
both worlds ...

sendmail does it now, and, apparently relatively portable across OSs ...
okay, just looked at the code, and it's kinda painful, but it's in
src/conf.c, as a 'getla' function ...

If nobody is working on something like this, does anyone but me feel that
it has merit to make use of?  I'll play with it if so ...
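
A minimal sketch of the combined rule described above (refuse when the connection cap is reached or when the load is above n). It is only an illustration, not PostgreSQL code: the names `admit_limits` and `should_refuse`, and the convention that a `max_load` of zero or less switches the load check off, are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical settings, standing in for values read from a .conf file. */
struct admit_limits {
    int    max_connections; /* hard cap on backends, e.g. 256 */
    double max_load;        /* refuse above this load average; <= 0 disables */
};

/* Refuse when the connection cap is reached *or* the load is too high. */
bool should_refuse(const struct admit_limits *lim,
                   int current_connections, double load_avg)
{
    assert(lim != NULL);

    if (current_connections >= lim->max_connections)
        return true;
    if (lim->max_load > 0.0 && load_avg > lim->max_load)
        return true;
    return false;
}
```

With limits of 256 connections and a load of 8, 254 mostly idle connections at load 0.5 would still be accepted, while 128 connections at load 50 would be refused.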

Re: refusing connections based on load ...

From

ncm@zembu.com (Nathan Myers)

Date:

23 April 2001, 15:11:16

On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote:
> 
> Anyone thought of implementing this, similar to how sendmail does it?  If
> load > n, refuse connections?
> ... 
> If nobody is working on something like this, does anyone but me feel that
> it has merit to make use of?  I'll play with it if so ...

I agree that it would be useful.  Even more useful would be soft load 
shedding, where once some load average level is exceeded the postmaster 
delays a bit (proportionately) before accepting a connection.  
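
A rough sketch of what such a proportional delay could look like. The function name `accept_delay_ms`, the linear scaling, and the upper cap are all assumptions for illustration; the message above only says the delay should grow with the load.

```c
#include <assert.h>

/*
 * Hypothetical helper: how long (in milliseconds) to wait before accepting
 * a connection. Zero at or below the threshold, then growing linearly with
 * the excess load, capped at max_delay_ms.
 */
long accept_delay_ms(double load_avg, double threshold,
                     long ms_per_load_unit, long max_delay_ms)
{
    double delay;

    assert(ms_per_load_unit >= 0 && max_delay_ms >= 0);

    if (load_avg <= threshold)
        return 0;

    delay = (load_avg - threshold) * (double) ms_per_load_unit;
    if (delay > (double) max_delay_ms)
        return max_delay_ms;
    return (long) delay;
}
```

The cap is a design choice in this sketch: without it, a runaway load average would translate into an unbounded wait, which is indistinguishable from a refusal.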

Nathan Myers
ncm@zembu.com

Re: refusing connections based on load ...

From

Jan Wieck

Date:

23 April 2001, 15:48:31

Nathan Myers wrote:
> On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote:
> >
> > Anyone thought of implementing this, similar to how sendmail does it?  If
> > load > n, refuse connections?
> > ...
> > If nobody is working on something like this, does anyone but me feel that
> > it has merit to make use of?  I'll play with it if so ...
>
> I agree that it would be useful.  Even more useful would be soft load
> shedding, where once some load average level is exceeded the postmaster
> delays a bit (proportionately) before accepting a connection.
    Or have the load check on AtXactStart, and delay new transactions
    until load is back below x, where x is configurable per user/group
    plus some per database scaling factor.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: refusing connections based on load ...

From

Tom Lane

Date:

23 April 2001, 19:59:14

The Hermit Hacker <scrappy@hub.org> writes:
> sendmail does it now, and, apparently relatively portable across OSs ...

sendmail expects to be root.  It's unlikely (and very undesirable) that
postgres will be installed with adequate privileges to read /dev/kmem,
which is what it'd take to run the sendmail loadaverage code on most
platforms...
        regards, tom lane

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

23 April 2001, 22:35:37

On Mon, 23 Apr 2001, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > sendmail does it now, and, apparently relatively portable across OSs ...
>
> sendmail expects to be root.  It's unlikely (and very undesirable) that
> postgres will be installed with adequate privileges to read /dev/kmem,
> which is what it'd take to run the sendmail loadaverage code on most
> platforms...

Actually, not totally accurate ... sendmail has a 'RunAs' option for those
that don't wish to have it run as root, and still works for the loadavg
stuff, to the best of my knowledge (it's an option I haven't played with
yet) ...

Re: refusing connections based on load ...

From

Larry Rosenman

Date:

23 April 2001, 22:42:53

* The Hermit Hacker <scrappy@hub.org> [010423 21:38]:
> On Mon, 23 Apr 2001, Tom Lane wrote:
> 
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > sendmail does it now, and, apparently relatively portable across OSs ...
> >
> > sendmail expects to be root.  It's unlikely (and very undesirable) that
> > postgres will be installed with adequate privileges to read /dev/kmem,
> > which is what it'd take to run the sendmail loadaverage code on most
> > platforms...
> 
> Actually, not totally accurate ... sendmail has a 'RunAs' option for those
> that don't wish to have it run as root, and still works for the loadavg
> stuff, to the best of my knowledge (its an option I haven't played with
> yet) ...
And 8.12.x will have some other options as well....

Like the SUBMISSION prog only needs to be SGID, not SUID....

LER

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Re: refusing connections based on load ...

From

Tom Lane

Date:

23 April 2001, 22:50:52

The Hermit Hacker <scrappy@hub.org> writes:
> On Mon, 23 Apr 2001, Tom Lane wrote:
>> sendmail expects to be root.

> Actually, not totally accurate ... sendmail has a 'RunAs' option for those
> that don't wish to have it run as root,

True, it doesn't *have* to be root, but the loadavg code still requires
privileges beyond those of mere mortals (as does listening on port 25,
last I checked).

On my HPUX box:

$ ls -l /dev/kmem
crw-r-----   1 bin        sys          3 0x000001 Jun 10  1996 /dev/kmem

so postgres would have to run setuid bin or setgid sys to read the load
average.  Either one is equivalent to giving an attacker the keys to the
kingdom (overwrite a few key /usr/bin/ executables and wait for root to
run one...)

On Linux and BSD it seems to be more common to put /dev/kmem into a
specialized group "kmem", so running postgres as setgid kmem is not so
immediately dangerous.  Still, do you think it's a good idea to let an
attacker have open-ended rights to read your kernel memory?  It wouldn't
take too much effort to sniff passwords, for example.

Basically, if we do this then we are abandoning the notion that Postgres
runs as an unprivileged user.  I think that's a BAD idea, especially in
an environment that's open enough that you might feel the need to
load-throttle your users.  By definition you do not trust them, eh?

A less dangerous way of approaching it might be to have an option
whereby the postmaster invokes 'uptime' via system() every so often
(maybe once a minute?) and throttles on the basis of the results.
The reaction time would be poorer, but security would be a whole lot
better.
        regards, tom lane

Re: refusing connections based on load ...

From

Larry Rosenman

Date:

23 April 2001, 22:51:04

* Larry Rosenman <ler@lerctr.org> [010423 21:45]:
> * The Hermit Hacker <scrappy@hub.org> [010423 21:38]:
> > On Mon, 23 Apr 2001, Tom Lane wrote:
> > 
> > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > sendmail does it now, and, apparently relatively portable across OSs ...
> > >
> > > sendmail expects to be root.  It's unlikely (and very undesirable) that
> > > postgres will be installed with adequate privileges to read /dev/kmem,
> > > which is what it'd take to run the sendmail loadaverage code on most
> > > platforms...
> > 
> > Actually, not totally accurate ... sendmail has a 'RunAs' option for those
> > that don't wish to have it run as root, and still works for the loadavg
> > stuff, to the best of my knowledge (its an option I haven't played with
> > yet) ...
> And 8.12.x will have some other options as well....
> 
> Like the SUBMISSION prog only needs to be SGID, not SUID....
Actually, the sendmail DAEMON will still have ROOT privs, so it can
read /dev/kmem.

I suspect I don't have as much of an issue if we are sgid kmem...

LER

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Re: refusing connections based on load ...

From

Larry Rosenman

Date:

23 April 2001, 23:12:55

* Tom Lane <tgl@sss.pgh.pa.us> [010423 21:54]:
> The Hermit Hacker <scrappy@hub.org> writes:

> On my HPUX box:
> 
> $ ls -l /dev/kmem
> crw-r-----   1 bin        sys          3 0x000001 Jun 10  1996 /dev/kmem
> 
> so postgres would have to run setuid bin or setgid sys to read the load
> average.  Either one is equivalent to giving an attacker the keys to the
> kingdom (overwrite a few key /usr/bin/ executables and wait for root to
> run one...)
On my UnixWare box it's 0440 sys.sys....

> 
> On Linux and BSD it seems to be more common to put /dev/kmem into a
> specialized group "kmem", so running postgres as setgid kmem is not so
> immediately dangerous.  Still, do you think it's a good idea to let an
> attacker have open-ended rights to read your kernel memory?  It wouldn't
> take too much effort to sniff passwords, for example.
> 
> Basically, if we do this then we are abandoning the notion that Postgres
> runs as an unprivileged user.  I think that's a BAD idea, especially in
> an environment that's open enough that you might feel the need to
> load-throttle your users.  By definition you do not trust them, eh?
> 
> A less dangerous way of approaching it might be to have an option
> whereby the postmaster invokes 'uptime' via system() every so often
> (maybe once a minute?) and throttles on the basis of the results.
> The reaction time would be poorer, but security would be a whole lot
> better.
Then there are boxes like my UnixWare one where the load average is
not available AT ALL:

$ uptime
 10:05pm  up 2 days,  3:16,  3 users
$ 

It's a threaded kernel, and SCO/Novell/whoever has removed all traces
from userland of the load average.  avenrun[] is still a symbol in the
kernel, but...


-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Re: refusing connections based on load ...

From

Tom Lane

Date:

23 April 2001, 23:21:38

> Rather than do system('uptime') and incur the process start-up each time,
> you could do fp = popen('vmstat 60', 'r'), then just read the fp.

popen doesn't incur a process start?  Get real.  But you're right, popen()
is the right call not system(), because you need to read the stdout.
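
A hedged sketch of the popen() approach: run uptime once and pull the first load figure out of its output. The function names are made up for the example, and the parser assumes the output contains the text "load average" followed by a colon, which is common but not guaranteed on every platform.

```c
#define _POSIX_C_SOURCE 200809L /* for popen()/pclose() under strict modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the first (1-minute) figure from a line of uptime output.
 * Assumes the line contains "load average" followed by a colon; returns 0
 * on success and -1 if the line does not look like that.
 */
int parse_uptime_load(const char *line, double *load)
{
    const char *p;
    char *end;

    assert(line != NULL && load != NULL);

    p = strstr(line, "load average");
    if (p == NULL)
        return -1;
    p = strchr(p, ':');
    if (p == NULL)
        return -1;
    p++;
    *load = strtod(p, &end);
    return (end == p) ? -1 : 0;
}

/* Run uptime once through popen() and parse what it prints. */
int load_from_uptime(double *load)
{
    char line[256];
    int rc = -1;
    FILE *fp = popen("uptime", "r");

    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof line, fp) != NULL)
        rc = parse_uptime_load(line, load);
    if (pclose(fp) == -1)
        return -1;
    return rc;
}
```

A caller would have to treat -1 as "load unknown" and carry on without throttling, since some systems print no load average at all.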

> I believe vmstat is fairly standard.

Not more so than uptime --- and the latter's output format is definitely
less variable across platforms.  The HPUX man page goes so far as to say

WARNINGS
     Users of vmstat must not rely on the exact field widths and spacing of
     its output, as these will vary depending on the system, the release of
     HP-UX, and the data to be displayed.

and that's just for *one* platform.
        regards, tom lane

Re: refusing connections based on load ...

From

Ian Lance Taylor

Date:

24 April 2001, 00:03:33

Tom Lane <tgl@sss.pgh.pa.us> writes:

> On Linux and BSD it seems to be more common to put /dev/kmem into a
> specialized group "kmem", so running postgres as setgid kmem is not so
> immediately dangerous.  Still, do you think it's a good idea to let an
> attacker have open-ended rights to read your kernel memory?  It wouldn't
> take too much effort to sniff passwords, for example.

On Linux you can get the load average by doing `cat /proc/loadavg'.
On NetBSD you can get the load average via a sysctl.  On those systems
and others the uptime program is neither setuid nor setgid.
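
For the Linux case, reading /proc/loadavg directly might look like the sketch below. The function names are assumptions for the example; the file format relied on is the usual one, where the first three fields are the 1, 5 and 15 minute load averages.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse the three load averages from a line in /proc/loadavg format,
 * e.g. "0.20 0.18 0.12 1/80 11206". Returns 0 on success, -1 otherwise.
 */
int parse_loadavg_line(const char *line, double out[3])
{
    assert(line != NULL && out != NULL);

    if (sscanf(line, "%lf %lf %lf", &out[0], &out[1], &out[2]) != 3)
        return -1;
    return 0;
}

/* Linux only: read the kernel's figures without any special privileges. */
int read_proc_loadavg(double out[3])
{
    char line[128];
    int rc = -1;
    FILE *fp = fopen("/proc/loadavg", "r");

    if (fp == NULL)
        return -1; /* not Linux, or /proc is not mounted */
    if (fgets(line, sizeof line, fp) != NULL)
        rc = parse_loadavg_line(line, out);
    fclose(fp);
    return rc;
}
```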

> A less dangerous way of approaching it might be to have an option
> whereby the postmaster invokes 'uptime' via system() every so often
> (maybe once a minute?) and throttles on the basis of the results.
> The reaction time would be poorer, but security would be a whole lot
> better.

That is the way to do it on systems where obtaining the load average
requires special privileges.  But do you really need the load average
once a minute?  The load average printed by uptime is just as accurate
as the load average obtained by examining the kernel.

Ian

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

24 April 2001, 00:23:16

other than a potential buffer overrun, what would be the problem with:

open(kmem)
read values
close(kmem)

?

I would think it would be less taxing to the system than doing a system()
call, but still effectively as safe, no?

On Mon, 23 Apr 2001, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > On Mon, 23 Apr 2001, Tom Lane wrote:
> >> sendmail expects to be root.
>
> > Actually, not totally accurate ... sendmail has a 'RunAs' option for those
> > that don't wish to have it run as root,
>
> True, it doesn't *have* to be root, but the loadavg code still requires
> privileges beyond those of mere mortals (as does listening on port 25,
> last I checked).
>
> On my HPUX box:
>
> $ ls -l /dev/kmem
> crw-r-----   1 bin        sys          3 0x000001 Jun 10  1996 /dev/kmem
>
> so postgres would have to run setuid bin or setgid sys to read the load
> average.  Either one is equivalent to giving an attacker the keys to the
> kingdom (overwrite a few key /usr/bin/ executables and wait for root to
> run one...)
>
> On Linux and BSD it seems to be more common to put /dev/kmem into a
> specialized group "kmem", so running postgres as setgid kmem is not so
> immediately dangerous.  Still, do you think it's a good idea to let an
> attacker have open-ended rights to read your kernel memory?  It wouldn't
> take too much effort to sniff passwords, for example.
>
> Basically, if we do this then we are abandoning the notion that Postgres
> runs as an unprivileged user.  I think that's a BAD idea, especially in
> an environment that's open enough that you might feel the need to
> load-throttle your users.  By definition you do not trust them, eh?
>
> A less dangerous way of approaching it might be to have an option
> whereby the postmaster invokes 'uptime' via system() every so often
> (maybe once a minute?) and throttles on the basis of the results.
> The reaction time would be poorer, but security would be a whole lot
> better.
>
>             regards, tom lane
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

24 April 2001, 00:26:16

On 23 Apr 2001, Ian Lance Taylor wrote:

> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > On Linux and BSD it seems to be more common to put /dev/kmem into a
> > specialized group "kmem", so running postgres as setgid kmem is not so
> > immediately dangerous.  Still, do you think it's a good idea to let an
> > attacker have open-ended rights to read your kernel memory?  It wouldn't
> > take too much effort to sniff passwords, for example.
>
> On Linux you can get the load average by doing `cat /proc/loadavg'.
> On NetBSD you can get the load average via a sysctl.  On those systems
> and others the uptime program is neither setuid nor setgid.

Good call ... FreeBSD has it also, and needs no special privileges ...
just checked, and the sysctl command isn't setuid/setgid anything, so I'm
guessing that using sysctl() to pull these values shouldn't create any
security issues on those systems that support it ?

Re: refusing connections based on load ...

From

Lincoln Yeoh

Date:

24 April 2001, 00:38:48

At 03:09 PM 23-04-2001 -0300, you wrote:
>
>Anyone thought of implementing this, similar to how sendmail does it?  If
>load > n, refuse connections?
>
>Basically, it's great to set max clients to 256, but if load hits 50 as a
>result, the database is near to useless ... if you set it to 256, and 254
>idle connections are going, load won't rise much, so is safe, but if half
>of those processes are active, it hurts ...

Sorry, but I still don't understand the reasons why one would want to do
this. Could someone explain?

I'm thinking that if I allow 256 clients, and my hardware/OS bogs down when
60 users are doing lots of queries, I either accept that, or figure that my
hardware/OS actually can't cope with that many clients and reduce the max
clients or upgrade the hardware (or maybe do a little tweaking here and
there).

Why not be more deterministic about refusing connections and stick to
reducing max clients? If not it seems like a case where you're promised
something but when you need it, you can't have it. 

Cheerio,
Link.

Re: refusing connections based on load ...

From

Doug McNaught

Date:

24 April 2001, 00:54:15

Tom Lane <tgl@sss.pgh.pa.us> writes:

> > Rather than do system('uptime') and incur the process start-up each time,
> > you could do fp = popen('vmstat 60', 'r'), then just read the fp.
> 
> popen doesn't incur a process start?  Get real.  But you're right, popen()
> is the right call not system(), because you need to read the stdout.

Tom,

I think the point here is that the 'vmstat' process, once started,
will keep printing status output every 60 seconds (if invoked as
above) so you don't have to restart it every minute, just read the
pipe. 

> > I believe vmstat is fairly standard.
> 
> Not more so than uptime --- and the latter's output format is definitely
> less variable across platforms.  The HPUX man page goes so far as to say
> 
> WARNINGS
>      Users of vmstat must not rely on the exact field widths and spacing of
>      its output, as these will vary depending on the system, the release of
>      HP-UX, and the data to be displayed.
> 
> and that's just for *one* platform.

A very valid objection.  I'm also dubious as to the utility of the
whole concept.  What happens when Sendmail refuses a message based on
load?  It is requeued on the sending end to be tried later.  What
happens when PG refuses a new client connection based on load?  The
application stops working.  Is this really better than having slow
response time because the server is thrashing?

I guess my point is that Sendmail is a store-and-forward situation
where the mail system can "catch up" once the load returns to normal.
Whereas, I would think, the majority of PG installations want a
working database, and whether it's refusing connections due to load or 
simply bogged down isn't going to make a difference to users that
can't get their data.

-Doug
-- 
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time...          --Dylan

Re: refusing connections based on load ...

From

ncm@zembu.com (Nathan Myers)

Date:

24 April 2001, 01:01:06

On Mon, Apr 23, 2001 at 10:50:42PM -0400, Tom Lane wrote:
> Basically, if we do this then we are abandoning the notion that Postgres
> runs as an unprivileged user.  I think that's a BAD idea, especially in
> an environment that's open enough that you might feel the need to
> load-throttle your users.  By definition you do not trust them, eh?

No.  It's not a case of trust, but of providing an adaptive way
to keep performance reasonable.  The users may have no independent
way to cooperate to limit load, but the DB can provide that.

> A less dangerous way of approaching it might be to have an option
> whereby the postmaster invokes 'uptime' via system() every so often
> (maybe once a minute?) and throttles on the basis of the results.
> The reaction time would be poorer, but security would be a whole lot
> better.

Yes, this alternative looks much better to me.  On Linux you have
the much more efficient alternative, /proc/loadavg.  (I wouldn't
use system(), though.)

Nathan Myers
ncm@zembu.com

Re: Re: refusing connections based on load ...

From

ncm@zembu.com (Nathan Myers)

Date:

24 April 2001, 02:03:53

On Tue, Apr 24, 2001 at 12:39:29PM +0800, Lincoln Yeoh wrote:
> At 03:09 PM 23-04-2001 -0300, you wrote:
> >Basically, it's great to set max clients to 256, but if load hits 50 
> >as a result, the database is near to useless ... if you set it to 256, 
> >and 254 idle connections are going, load won't rise much, so is safe, 
> >but if half of those processes are active, it hurts ...
> 
> Sorry, but I still don't understand the reasons why one would want to do
> this. Could someone explain?
> 
> I'm thinking that if I allow 256 clients, and my hardware/OS bogs down
> when 60 users are doing lots of queries, I either accept that, or
> figure that my hardware/OS actually can't cope with that many clients
> and reduce the max clients or upgrade the hardware (or maybe do a
> little tweaking here and there).
>
> Why not be more deterministic about refusing connections and stick
> to reducing max clients? If not it seems like a case where you're
> promised something but when you need it, you can't have it.

The point is that "number of connections" is a very poor estimate of 
system load.  Sometimes a connection is busy, sometimes it's not.
Some connections are busy, some are not.  The goal is maximum 
throughput or some tradeoff of maximum throughput against latency.  
If system throughput varies nonlinearly with load (as it almost 
always does) then this happens at some particular load level.

Refusing a connection and letting the client try again later can be 
a way to maximize throughput by keeping the system at the optimum 
point.  (Waiting reduces delay.  Yes, this is counterintuitive, but 
why do we queue up at ticket windows?)

Delaying response, when under excessive load, to clients who already 
have a connection -- even if they just got one -- can have a similar 
effect, but with finer granularity and with less complexity in the 
clients.  

Nathan Myers
ncm@zembu.com

Re: refusing connections based on load ...

From

Peter Eisentraut

Date:

24 April 2001, 10:58:14

Tom Lane writes:

> The Hermit Hacker <scrappy@hub.org> writes:
> > sendmail does it now, and, apparently relatively portable across OSs ...
>
> sendmail expects to be root.  It's unlikely (and very undesirable) that
> postgres will be installed with adequate privileges to read /dev/kmem,
> which is what it'd take to run the sendmail loadaverage code on most
> platforms...

This program:

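#include <stdio.h>
#include <stdlib.h>    /* declares getloadavg() on Linux and the BSDs */

int main()
{
    double la[3];

    if (getloadavg(la, 3) == -1) {
        perror("getloadavg");
        return 1;    /* la[] is not filled in on failure */
    }
    printf("%f %f %f\n", la[0], la[1], la[2]);
    return 0;
}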

works unprivileged on Linux 2.2 and FreeBSD 4.3.  Rumour[*] also has it
that there is a way to do this on Solaris and HP-UX 9.  So I think that
covers enough users to be worthwhile.

[*] - Autoconf AC_FUNC_GETLOADAVG

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

24 April 2001, 13:56:05

Apparently so under Solaris ...

hestia:/> uname -a
SunOS hestia 5.7 Generic_106542-12 i86pc i386 i86pc

C Library Functions                                getloadavg(3C)

NAME
     getloadavg - get system load averages

SYNOPSIS
     #include <sys/loadavg.h>

     int getloadavg(double loadavg[], int nelem);

DESCRIPTION

How hard would it be to knock up code that, by default, ignores loadavg,
but if, say, set in postgresql.conf:

loadavg    = 4

it will just refuse connections?
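
A minimal sketch of such a check built on getloadavg(), assuming the setting is "off" whenever the configured value is zero or less. The function names are hypothetical, and the choice to keep accepting connections when the load average cannot be read is an assumption of this example, not something decided in the thread.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare getloadavg() in <stdlib.h> */
#include <assert.h>
#include <stdlib.h>     /* getloadavg() here; Solaris uses <sys/loadavg.h> */

/* Pure decision: a limit of zero or less means the feature is switched off. */
int load_over_limit(double load_avg, double limit)
{
    assert(load_avg >= 0.0); /* load averages are never negative */
    return limit > 0.0 && load_avg > limit;
}

/*
 * Sample the 1-minute load average and compare it with the limit.
 * If the load average cannot be obtained, fall back to accepting
 * connections rather than locking everyone out.
 */
int should_refuse_for_load(double limit)
{
    double la[1];

    if (limit <= 0.0)
        return 0;
    if (getloadavg(la, 1) < 1)
        return 0;
    return load_over_limit(la[0], limit);
}
```

With `loadavg = 4` in the configuration, the postmaster-side check would then amount to calling `should_refuse_for_load(4.0)` before accepting a connection.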


On Tue, 24 Apr 2001, Peter Eisentraut wrote:

> Tom Lane writes:
>
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > sendmail does it now, and, apparently relatively portable across OSs ...
> >
> > sendmail expects to be root.  It's unlikely (and very undesirable) that
> > postgres will be installed with adequate privileges to read /dev/kmem,
> > which is what it'd take to run the sendmail loadaverage code on most
> > platforms...
>
> This program:
>
> #include <stdio.h>
>
> int main()
> {
>     double la[3];
>
>     if (getloadavg(la, 3) == -1)
>         perror("getloadavg");
>
>     printf("%f %f %f\n", la[0], la[1], la[2]);
>
>     return 0;
> }
>
> works unprivileged on Linux 2.2 and FreeBSD 4.3.  Rumour[*] also has it
> that there is a way to do this on Solaris and HP-UX 9.  So I think that
> covers enough users to be worthwhile.
>
> [*] - Autoconf AC_FUNC_GETLOADAVG
>
> --
> Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
>
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org

Re: refusing connections based on load ...

From

Jan Wieck

Date:

24 April 2001, 14:16:18

Doug McNaught wrote:
> A very valid objection.  I'm also dubious as to the utility of the
> whole concept.  What happens when Sendmail refuses a message based on
> load?  It is requeued on the sending end to be tried later.  What
> happens when PG refuses a new client connection based on load?  The
> application stops working.  Is this really better than having slow
> response time because the server is thrashing?
    That's exactly the point why I suggested to delay transaction
    starts instead. The client app always gets the connection. Doing
    dialog steps inside of open transactions is always a bad design,
    leading to a couple of problems (coffee break with open locks), so
    we can assume that if an application starts a transaction, it'll
    keep this one backend as busy as possible until the transaction
    ends.

    Processing too many transactions in parallel is what gets the
    system into heavy swapping and exponential usage of resources. So
    if we delay starting transactions while the system load is above
    the limit, we probably speed up the overall per-transaction
    response time, increasing the throughput. And that's what this
    discussion is all about, no?
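
A sketch of the delayed transaction start idea, written against callbacks so it does not pretend to use any real backend API. Everything named here (`wait_for_load`, the callback types, the fake load source) is hypothetical, and the timeout is an added assumption so that a transaction is never held back forever.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types so the sketch does not depend on a real API. */
typedef double (*load_fn)(void *ctx);
typedef void (*sleep_fn)(void *ctx, long ms);

/*
 * Hold back a transaction start while the load is above `limit`, polling
 * every poll_ms, but give up after max_wait_ms so nothing waits forever.
 * Returns the number of milliseconds spent waiting.
 */
long wait_for_load(load_fn get_load, sleep_fn do_sleep, void *ctx,
                   double limit, long poll_ms, long max_wait_ms)
{
    long waited = 0;

    assert(get_load != NULL && do_sleep != NULL && poll_ms > 0);

    while (waited < max_wait_ms && get_load(ctx) > limit) {
        do_sleep(ctx, poll_ms);
        waited += poll_ms;
    }
    return waited;
}

/* Fake load source for demonstration: load drops by 1.0 per "sleep". */
struct fake_load { double load; };

double fake_get_load(void *ctx)
{
    return ((struct fake_load *) ctx)->load;
}

void fake_sleep(void *ctx, long ms)
{
    (void) ms; /* no real sleeping in the demonstration */
    ((struct fake_load *) ctx)->load -= 1.0;
}
```

The per user/group limit mentioned earlier in the thread would simply be a different `limit` value passed in for each session.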

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: refusing connections based on load ...

From

Peter Eisentraut

Date:

24 April 2001, 15:02:05

Doug McNaught writes:

> A very valid objection.  I'm also dubious as to the utility of the
> whole concept.  What happens when Sendmail refuses a message based on
> load?  It is requeued on the sending end to be tried later.  What
> happens when PG refuses a new client connection based on load?  The
> application stops working.  Is this really better than having slow
> response time because the server is thrashing?

The concept is just as dubious as the concept of rejecting clients based
on how many clients are already connected.  There are some technical
reasons for the latter, but it is still used as an administrative tool.
The rule is, if you don't like it, don't use it.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter

Re: Re: refusing connections based on load ...

From

Lincoln Yeoh

Date:

24 April 2001, 22:05:24

At 10:59 PM 23-04-2001 -0700, Nathan Myers wrote:
>On Tue, Apr 24, 2001 at 12:39:29PM +0800, Lincoln Yeoh wrote:
>> Why not be more deterministic about refusing connections and stick
>> to reducing max clients? If not it seems like a case where you're
>> promised something but when you need it, you can't have it.
>
>The point is that "number of connections" is a very poor estimate of 
>system load.  Sometimes a connection is busy, sometimes it's not.

Actually I use number of connections to estimate how much RAM I will need,
not for estimating system load.

Because once the system runs out of RAM, performance drops a lot. If I can
prevent the system running out of RAM, it can usually take whatever I throw
at it at near the max throughput. 

For my app say the max is X hits per second with a few concurrent
transactions. When I boost the number of concurrent transactions (e.g.
25 on a 128MB machine, load~13) it goes down to maybe 0.95X hits per
second[1]. This is acceptable to me.

But once the machine starts swapping, things bog down drastically and some
connections get Server Error.

>Refusing a connection and letting the client try again later can be 
>a way to maximize throughput by keeping the system at the optimum 
>point.  (Waiting reduces delay.  Yes, this is counterintuitive, but 
>why do we queue up at ticket windows?)
>
>Delaying response, when under excessive load, to clients who already 
>have a connection -- even if they just got one -- can have a similar 
>effect, but with finer granularity and with less complexity in the 
>clients.  

With my web apps, refusing connection based on load doesn't help at all,
they are fastcgi processes and are already holding database connections
open, before even getting a web request ( might as well open the db
connection before the client talks to you).

For other apps maybe refusing connection could help. But are these cases in
the majority? In say a bank teller environment, the database connections
are probably already open, and could remain open the whole day.

Delaying transactions based on load is easier to understand for me.

Cheerio,
Link.

[1] This is a guesstimate: the hits per second drop gradually during the
benchmark.
A low-concurrency test run AFTER the benchmark had fewer hits per second
than the benchmark figures.

This is probably because there was a lot of selecting and updating of the
same row, and Postgresql needs a vacuum before the speed goes back up.
Seems like the dead rows get in the way of the index or something - speed
doesn't slow down as much for lots of inserts and selects.

Re: Re: Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

24 April 2001, 22:28:33

On Wed, 25 Apr 2001, Lincoln Yeoh wrote:

> At 10:59 PM 23-04-2001 -0700, Nathan Myers wrote:
> >On Tue, Apr 24, 2001 at 12:39:29PM +0800, Lincoln Yeoh wrote:
> >> Why not be more deterministic about refusing connections and stick
> >> to reducing max clients? If not it seems like a case where you're
> >> promised something but when you need it, you can't have it.
> >
> >The point is that "number of connections" is a very poor estimate of
> >system load.  Sometimes a connection is busy, sometimes it's not.
>
> Actually I use number of connections to estimate how much RAM I will need,
> not for estimating system load.
>
> Because once the system runs out of RAM, performance drops a lot. If I can
> prevent the system running out of RAM, it can usually take whatever I throw
> at it at near the max throughput.

I have a Dual-866, 1gig of RAM and strip'd file systems ... this past
week, I've hit many times where CPU usage is 100%, RAM is 500Meg free and
disks are pretty much sitting idle ...

It turns out, in this case, that vacuum was in order (i vacuum 12x per day
now instead of 6), so that now it will run with 300 simultaneous
connections, but with a loadavg of 68 or so, 300 connections are just
building on each other to slow the rest down :(

Re: Re: Re: refusing connections based on load ...

From

Lincoln Yeoh

Date:

25 April 2001, 01:14:58

At 11:28 PM 24-04-2001 -0300, The Hermit Hacker wrote:
>
>I have a Dual-866, 1gig of RAM and strip'd file systems ... this past
>week, I've hit many times where CPU usage is 100%, RAM is 500Meg free and
>disks are pretty much sitting idle ...
>
>It turns out, in this case, that vacuum was in order (i vacuum 12x per day
>now instead of 6), so that now it will run with 300 simultaneous
>connections, but with a loadavg of 68 or so, 300 connections are just
>building on each other to slow the rest down :(
>

Hmm then maybe we should refuse connections based on "need to vacuum"... :).

Seriously though does the _total_ work throughput go down significantly
when you have high loads? 

I got a load 13 with 25 concurrent connections (not much), and yeah things
took longer but the hits per second wasn't very much different from the
peak possible with fewer connections. Basically in my case almost the same
amount of work is being done per second.

So maybe higher loads might be fine on your more powerful system?

Cheerio,
Link.

Re: refusing connections based on load ...

From

ncm@zembu.com (Nathan Myers)

Date:

25 April 2001, 01:36:27

On Tue, Apr 24, 2001 at 11:28:17PM -0300, The Hermit Hacker wrote:
> I have a Dual-866, 1gig of RAM and strip'd file systems ... this past
> week, I've hit many times where CPU usage is 100%, RAM is 500Meg free and
> disks are pretty much sitting idle ...

Assuming "strip'd" above means "striped", it strikes me that you
might be much better off operating the drives independently, with
the various tables, indexes, and logs scattered each entirely on one 
drive.  That way the heads can move around independently reading and 
writing N blocks, rather than all moving in concert reading or writing 
only one block at a time.  (Striping the WAL file on a couple of raw 
devices might be a good idea along with the above.  Can we do that?)

But of course speculation is much less useful than trying it.  Some 
measurements before and after would be really, really interesting
to many of us.

Nathan Myers
ncm@zembu.com

Re: Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

25 April 2001, 08:42:02

On Tue, 24 Apr 2001, Nathan Myers wrote:

> On Tue, Apr 24, 2001 at 11:28:17PM -0300, The Hermit Hacker wrote:
> > I have a Dual-866, 1gig of RAM and strip'd file systems ... this past
> > week, I've hit many times where CPU usage is 100%, RAM is 500Meg free and
> > disks are pretty much sitting idle ...
>
> Assuming "strip'd" above means "striped", it strikes me that you
> might be much better off operating the drives independently, with
> the various tables, indexes, and logs scattered each entirely on one
> drive.

Have you ever tried to maintain a database doing this?  PgSQL is
definitely not designed for this sort of setup, I had symlinks going
everywhere, and with the new numbering schema, this is even more difficult
to try and do :)

Re: refusing connections based on load ...

From

Christopher Masto

Date:

25 April 2001, 11:23:33

The whole argument over how to get load averages seems rather silly,
and it's moot if the idea of using the load information to alter
PG behavior is rejected.

I personally have no use for it, but I don't think it's a bad idea in
general.  Particularly given future redundancy/load sharing features.
On the other hand, I think almost all of this stuff can and should be
done outside of postmaster.

Here is the 0-change version, for rejecting connections, and for
operating systems that have built-in firewall capability, such as
FreeBSD: a standalone daemon that adds a reject rule for the Postgres
port when the load gets too high, and drops that rule when the load
goes back down.
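
A minimal sketch of the decision logic such a daemon might use, in C.
The two thresholds (high, low) and every name below are assumptions made
for illustration, not taken from any existing tool; the firewall command
itself is left to the caller, since it differs from system to system.

```c
/* Hypothetical load-shedding state for an external watchdog daemon. */
typedef enum { RULE_KEEP, RULE_ADD, RULE_DROP } rule_change;

typedef struct {
    double high;      /* install the reject rule above this load average */
    double low;       /* remove it once the load falls below this */
    int    blocking;  /* nonzero while the reject rule is installed */
} shed_state;

/* Feed in one load sample; returns what to do with the reject rule. */
rule_change shed_step(shed_state *s, double load)
{
    if (!s->blocking && load > s->high) {
        s->blocking = 1;
        return RULE_ADD;
    }
    if (s->blocking && load < s->low) {
        s->blocking = 0;
        return RULE_DROP;
    }
    return RULE_KEEP;   /* no change needed */
}
```

The daemon would sample the load every few seconds, pass it to
shed_step(), and run its firewall command only on RULE_ADD or RULE_DROP.
Using two thresholds rather than one keeps the rule from flapping when
the load hovers around a single value.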

Now here's the small-change version: add support to Postgres for a SET
command or similar way to say "stop accepting connections", or "set
accept/transaction delay to X".  Write a standalone daemon which
monitors the load and issues commands to Postgres as necessary.  That
daemon may need extra privileges, but it is small, auditable, and
doesn't talk to the outside world.  It's probably better to include
in the Postgres protocol support for accepting (TCP-wise) a connection,
then closing it with an error message, because this daemon needs to
be able to connect to tell it to let users in again.  It's probably as
simple as always letting the superuser in.

The latter is nicer in a number of ways.  Persistent connections were
already mentioned - rejecting new connections may not be a good enough
solution there.  With a fancier approach, you could even hang up on
some existing connections with an appropriate message, or just NOTICE
them that you're slowing them down or you'd like them to go away
voluntarily.

From a web-hosting standpoint, someday it would be nifty to have
per-user-per-connection limits, so I could put up a couple of big
PG servers and only allow user X one connection, which can't use
more than Y amount of RAM, and passes a scheduling hint to the OS
so it shares CPU time with other economy-class users, which can
be throttled down to 25% of what ultra-mega-hosting users get.
Simple load shedding is a baby step in the right direction.  If
nothing else, it will cast a spotlight on some of the problem
areas.
-- 
Christopher Masto         Senior Network Monkey      NetMonger Communications
chris@netmonger.net        info@netmonger.net        http://www.netmonger.net

Free yourself, free your machine, free the daemon -- http://www.freebsd.org/

Re: refusing connections based on load ...

From

Tom Lane

Date:

25 April 2001, 13:32:55

Jan Wieck and I talked about this for awhile yesterday, and we came to
the conclusion that load-average-based throttling is a Bad Idea.  Quite
aside from the portability and permissions issues that may arise in
getting the numbers, the available numbers are the wrong thing:

(1) On most Unix systems, the finest-grain load average that you can get
is a 1-minute average.  This will lose both on the ramp up (by the time
you realize you overdid it, you've let *way* too many xacts through the
starting gate) and on the ramp down (you'll hold off xacts for circa a
minute after the crunch is past).

(2) You can also get shorter-time-frame CPU usage numbers (at least,
most versions of top(1) seem to display such things) but CPU load is
really not very helpful for measuring how badly the system is thrashing.
Postgres tends to beat your disks into the ground long before it pegs
the CPU.  Too bad there's no "disk usage" numbers.

However, there is another possibility that would be simple to implement
and perfectly portable: allow the dbadmin to impose a limit on the
number of concurrent transactions.  (Setting this equal to
the max allowed number of backends would turn off the limit.)  That
way, you could have umpteen open connections, but you could limit how
many of them were actually *doing* something at any given instant.
If more than N try to start transactions at the same time, the later
ones have to wait for the earlier ones to finish before they can start.
This'd be trivial to do with a semaphore initialized to N --- P() it
in StartTransaction and V() it in Commit/AbortTransaction.
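
A rough sketch of that P()/V() pairing with a System V semaphore, in C.
The function names are made up for illustration and this is not
PostgreSQL's actual code; it assumes the semaphore is created once (by
the postmaster, say) and its id is inherited by every backend.

```c
/* Ask glibc to expose non-ISO declarations even under a strict -std= mode. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* POSIX leaves declaring this union to the application. Some systems
 * (the BSDs, for instance) already declare it in <sys/sem.h>; drop this
 * declaration there. */
union semun {
    int              val;
    struct semid_ds *buf;
    unsigned short  *array;
};

/* Create a one-element semaphore set holding max_xacts free slots.
 * Returns the semaphore id, or -1 on failure. */
int xact_limit_create(int max_xacts)
{
    union semun arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    arg.val = max_xacts;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* P(): take a slot at transaction start. With wait != 0 this blocks
 * until a slot is free; with wait == 0 it returns -1 at once if none is.
 * SEM_UNDO hands the slot back if the process dies while holding it. */
int xact_limit_enter(int semid, int wait)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = -1;
    op.sem_flg = SEM_UNDO | (wait ? 0 : IPC_NOWAIT);
    while (semop(semid, &op, 1) == -1) {
        if (errno != EINTR)
            return -1;      /* no free slot (IPC_NOWAIT) or a real error */
    }
    return 0;
}

/* V(): give the slot back at commit or abort. */
int xact_limit_leave(int semid)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = 1;
    op.sem_flg = SEM_UNDO;
    return semop(semid, &op, 1);
}

/* Remove the semaphore set at shutdown. */
int xact_limit_destroy(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

A backend would call xact_limit_enter(semid, 1) where the text says
StartTransaction and xact_limit_leave(semid) where it says
Commit/AbortTransaction.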

A concurrent-xacts limit isn't perfect of course, but I think it'd
be pretty good, and certainly better than anything based on the
available load-average numbers.
        regards, tom lane

Re: refusing connections based on load ...

From

Peter Eisentraut

Date:

25 April 2001, 14:17:25

Tom Lane writes:

> A concurrent-xacts limit isn't perfect of course, but I think it'd
> be pretty good, and certainly better than anything based on the
> available load-average numbers.

The concurrent transaction limit would allow you to control the absolute
load of the PostgreSQL server, but we can already do that and it's not
what we're after here.  The idea behind the load average based approach is
to make the postmaster respect the situation of the overall system.
Additionally, the concurrent transaction limit would only be useful on
setups that have a lot of idle transactions.  Those setups exist, but not
everywhere.

To me, both of these approaches are in the "if you don't like it, don't
use it" category.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

25 April 2001, 15:05:53

On Wed, 25 Apr 2001, Peter Eisentraut wrote:

> Tom Lane writes:
>
> > A concurrent-xacts limit isn't perfect of course, but I think it'd
> > be pretty good, and certainly better than anything based on the
> > available load-average numbers.
>
> The concurrent transaction limit would allow you to control the absolute
> load of the PostgreSQL server, but we can already do that and it's not
> what we're after here.  The idea behind the load average based approach is
> to make the postmaster respect the situation of the overall system.
> Additionally, the concurrent transaction limit would only be useful on
> setups that have a lot of idle transactions.  Those setups exist, but not
> everywhere.
>
> To me, both of these approaches are in the "if you don't like it, don't
> use it" category.

Agreed ... by default, the loadavg method could be set to zero, to ignore
... I don't care if I'm off by 1min before I catch the increase, the fact
is that I have caught it, and prevent any new ones coming in until it
drops off again ...

Make it two variables:

transla
rejectla

if transla is hit, restrict on transactions, letting others connect, but
putting them on hold while the la drops again ... if it goes above
rejectla, refuse new connections altogether ...

so now I can set something like:

transla = 8
rejectla = 16

but if loadavg goes above 16, I want to get rid of what is causing the
load to rise *before* adding new variables to the mix that will cause it
to rise higher ...

and your arg about permissions (Tom's, not Peter's) is moot in at least 3
of the major systems (Linux, *BSD and Solaris) as there is a getloadavg()
function in all three for doing this ...
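
A small sketch of the two-threshold check, in C. It assumes getloadavg()
is declared in <stdlib.h>, as on glibc and the BSDs (Solaris puts it in
<sys/loadavg.h>); the enum and function names are hypothetical, and a
threshold of zero disables that check, as suggested above.

```c
/* Ask glibc to expose getloadavg() even under a strict -std= mode. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stdlib.h>

typedef enum { LOAD_NORMAL, LOAD_HOLD_XACTS, LOAD_REJECT } load_action;

/* Map a load average onto an action. A threshold <= 0 is ignored. */
load_action classify_load(double la, double transla, double rejectla)
{
    if (rejectla > 0.0 && la > rejectla)
        return LOAD_REJECT;        /* refuse new connections altogether */
    if (transla > 0.0 && la > transla)
        return LOAD_HOLD_XACTS;    /* accept connections, hold new xacts */
    return LOAD_NORMAL;
}

/* Sample the 1-minute load average and classify it.
 * Returns 0 and fills *action, or -1 if no load average is available. */
int current_load_action(double transla, double rejectla, load_action *action)
{
    double la[1];

    if (getloadavg(la, 1) != 1)
        return -1;
    *action = classify_load(la[0], transla, rejectla);
    return 0;
}
```

With transla = 8 and rejectla = 16, a load of 10 yields LOAD_HOLD_XACTS
and a load of 20 yields LOAD_REJECT.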

Re: refusing connections based on load ...

From

Tom Lane

Date:

25 April 2001, 15:34:14

Peter Eisentraut <peter_e@gmx.net> writes:
> The idea behind the load average based approach is
> to make the postmaster respect the situation of the overall system.

That'd be great if we could do it, but as I pointed out, the available
stats do not allow us to do it very well.

I think this will create a lot of portability headaches for no real
gain.  If it were something we could just do and forget, I would not
object --- but the porting issues will create a LOT more work than
I think this can possibly be worth.  The fact that the work is
distributed and will mostly be incurred by people other than the ones
advocating the change doesn't improve matters.
        regards, tom lane

tables/indexes/logs on different volumes

From

ncm@zembu.com (Nathan Myers)

Date:

25 April 2001, 16:43:59

On Wed, Apr 25, 2001 at 09:41:57AM -0300, The Hermit Hacker wrote:
> On Tue, 24 Apr 2001, Nathan Myers wrote:
> 
> > On Tue, Apr 24, 2001 at 11:28:17PM -0300, The Hermit Hacker wrote:
> > > I have a Dual-866, 1gig of RAM and strip'd file systems ... this past
> > > week, I've hit many times where CPU usage is 100%, RAM is 500Meg free
> > > and disks are pretty much sitting idle ...
> >
> > Assuming "strip'd" above means "striped", it strikes me that you
> > might be much better off operating the drives independently, with
> > the various tables, indexes, and logs scattered each entirely on one
> > drive.
> 
> have you ever tried to maintain a database doing this?  PgSQL is
> definitely not designed for this sort of setup, I had symlinks going
> everywhere, and with the new numbering schema, this is even more 
> difficult to try and do :)

Clearly you need to build a tool to organize it.  It would help a lot if 
PG itself could provide some basic assistance, such as calling a stored
procedure to generate the pathname of the file.

Has there been any discussion of anything like that?

Nathan Myers
ncm@zembu.com

Re: refusing connections based on load ...

From

Jan Wieck

Date:

25 April 2001, 16:52:10

The Hermit Hacker wrote:
> Agreed ... by default, the loadavg method could be set to zero, to ignore
> ... I don't care if I'm off by 1min before I catch the increase, the fact
> is that I have caught it, and prevent any new ones coming in until it
> drops off again ...
>
> Make it two variables:
>
> transla
> rejectla
>
> if transla is hit, restrict on transactions, letting others connect, but
> putting them on hold while the la drops again ... if it goes above
> rejectla, refuse new connections altogether ...
>
> so now I can set something like:
>
> transla = 8
> rejectla = 16
>
> but if loadavg goes above 16, I want to get rid of what is causing the
> load to rise *before* adding new variables to the mix that will cause it
> to rise higher ...
>
> and your arg about permissions (Tom's, not Peter's) is moot in at least 3
> of the major systems (Linux, *BSD and Solaris) as there is a getloadavg()
> function in all three for doing this ...

    I've just recompiled my php4 module to get sysvsem support
    and limited the number of concurrent DB transactions on the
    application level. The (not yet finished) TPC-C
    implementation I'm working on scales about 3-4 times better
    now. That's an improvement!

    This proves that limiting the number of concurrently running
    transactions is sufficient to keep the system load down.
    Combined these two look as follows:

    -   We start with a fairly high setting in the semaphore.

    -   When the system load exceeds the high-watermark, we don't
        increment the semaphore back after transaction end (need
        to ensure that at least a small minimum of xacts is left,
        but that's easy).

    -   When the system goes back to normal load level, we slowly
        increase the semaphore again.

    This way we might have some peak pushing the system against
    the wall for a moment. If that doesn't go away quickly, we
    just delay users (who see some delay anyway actually).
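
The three steps above can be sketched as bookkeeping done at each
transaction end, in C. This is only an illustration under assumptions:
the struct, its field names, and the separate low-water mark used to
decide when load is "back to normal" are all hypothetical.

```c
/* Hypothetical throttle state shared by all backends. */
typedef struct {
    int    ceiling;      /* current cap on concurrent transactions */
    int    min_ceiling;  /* never shrink below this many slots */
    int    max_ceiling;  /* the "fairly high" starting value */
    double high_water;   /* shrink the cap while load is above this */
    double low_water;    /* grow the cap while load is below this */
} xact_throttle;

/* Called when a transaction ends. Returns how many times the caller
 * should V() the semaphore:
 *   0  swallow the slot, lowering the ceiling by one
 *   1  normal case, just hand the slot back
 *   2  hand the slot back plus one extra, raising the ceiling by one */
int slots_to_release(xact_throttle *t, double load)
{
    if (load > t->high_water && t->ceiling > t->min_ceiling) {
        t->ceiling--;
        return 0;
    }
    if (load < t->low_water && t->ceiling < t->max_ceiling) {
        t->ceiling++;
        return 2;
    }
    return 1;
}
```

Because the ceiling moves by at most one slot per finished transaction,
it shrinks and recovers gradually rather than following every swing of
the load average.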
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




Re: refusing connections based on load ...

From

Tom Lane

Date:

25 April 2001, 17:21:35

The Hermit Hacker <scrappy@hub.org> writes:
> Autoconf has a 'LOADAVG' check already, so what is so problematic about
> using that to enabled/disable that feature?

Because it's tied to a GNU getloadavg.c implementation, which we'd have
license problems with using.
        regards, tom lane

Re: refusing connections based on load ...

From

Tom Lane

Date:

25 April 2001, 17:31:35

Jan Wieck <JanWieck@yahoo.com> writes:
>     This  proves that limiting the number of concurrently running
>     transactions is sufficient to  keep  the  system  load  down.
>     Combined these two look as follows:

>     -   We start with a fairly high setting in the semaphore.

>     -   When the system load exceeds the high-watermark, we don't
>         increment the semaphore back after transaction end  (need
>         to ensure that at least a small minimum of xacts is left,
>         but that's easy).

>     -   When the system goes back to normal load level, we slowly
>         increase the semaphore again.

This is a nice way of dealing with the slow reaction time of the
load average --- you don't let it directly drive the decision about
when to start a new transaction, but instead let it tweak the ceiling
on number of concurrent xacts.  I like it.

You probably don't need to have any additional "slowness" in the loop
other than the inherent averaging in the kernel's load average.

I'm still concerned about portability issues, and about whether load
average is really the right number to be looking at, however.
        regards, tom lane

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

25 April 2001, 17:38:22

On Wed, 25 Apr 2001, Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > The idea behind the load average based approach is
> > to make the postmaster respect the situation of the overall system.
>
> That'd be great if we could do it, but as I pointed out, the available
> stats do not allow us to do it very well.
>
> I think this will create a lot of portability headaches for no real
> gain.  If it were something we could just do and forget, I would not
> object --- but the porting issues will create a LOT more work than
> I think this can possibly be worth.  The fact that the work is
> distributed and will mostly be incurred by people other than the ones
> advocating the change doesn't improve matters.

As I mentioned, getloadavg() appears to be supported on 3 of the primary
platforms we work with, so I'd say for most installations, portability
isn't an issue ...

Autoconf has a 'LOADAVG' check already, so what is so problematic about
using that to enable/disable that feature?

if ( loadavg available on OS  &&  enabled in postgresql.conf ) {
    operate on it
} else if ( loadavg not available on OS  &&  enabled ) {
    noop, with a WARN level error that it's not available
}
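
In C that fallback might look roughly like the sketch below. It assumes a
configure-time macro HAVE_GETLOADAVG (the name Autoconf's
AC_CHECK_FUNCS(getloadavg) would define); the enum, the function name,
and the wording of the warnings are made up for illustration.

```c
/* Ask glibc to expose getloadavg() even under a strict -std= mode. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stdio.h>
#include <stdlib.h>

typedef enum { LA_DISABLED, LA_UNAVAILABLE, LA_OK } la_status;

/* Fetch the 1-minute load average into *la when the feature is enabled.
 * If the platform has no getloadavg(), warn and behave as a no-op. */
la_status sample_loadavg(int enabled, double *la)
{
    if (!enabled)
        return LA_DISABLED;          /* switched off in the config file */
#ifdef HAVE_GETLOADAVG
    if (getloadavg(la, 1) == 1)
        return LA_OK;
    fprintf(stderr, "WARNING: getloadavg() failed, load check skipped\n");
    return LA_UNAVAILABLE;
#else
    (void) la;                       /* left untouched on this path */
    fprintf(stderr, "WARNING: load average not available on this platform\n");
    return LA_UNAVAILABLE;
#endif
}
```

A real implementation would presumably emit the warning once at startup
rather than on every sample.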

Re: refusing connections based on load ...

From

Vince Vielhaber

Date:

25 April 2001, 18:29:38

On Wed, 25 Apr 2001, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > Autoconf has a 'LOADAVG' check already, so what is so problematic about
> > using that to enabled/disable that feature?
>
> Because it's tied to a GNU getloadavg.c implementation, which we'd have
> license problems with using.

It's part of the standard C library in FreeBSD.  Any other platforms
have it built in?

Vince.
-- 
==========================================================================
Vince Vielhaber -- KA8CSH    email: vev@michvhf.com    http://www.pop4.net        56K Nationwide Dialup from $16.00/mo
atPop4 Networking       Online Campground Directory    http://www.camping-usa.com      Online Giftshop Superstore
http://www.cloudninegifts.com
==========================================================================

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

25 April 2001, 20:58:22

On Wed, 25 Apr 2001, Vince Vielhaber wrote:

> On Wed, 25 Apr 2001, Tom Lane wrote:
>
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > Autoconf has a 'LOADAVG' check already, so what is so problematic about
> > > using that to enabled/disable that feature?
> >
> > Because it's tied to a GNU getloadavg.c implementation, which we'd have
> > license problems with using.
>
> It's part of the standard C library in FreeBSD.  Any other platforms
> have it built in?

As has been mentioned, Solaris and Linux also have it ...

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

25 April 2001, 21:00:03

On Wed, 25 Apr 2001, Tom Lane wrote:

> I'm still concerned about portability issues, and about whether load
> average is really the right number to be looking at, however.

It's worked for Sendmail for how many years now, and the code is there to
use, with all "portability issues" resolved for every platform they use ...
and a growing number of platforms appear to have the mechanisms already
built into their C libraries ...

Re: refusing connections based on load ...

From

Vince Vielhaber

Date:

26 April 2001, 05:37:46

On Wed, 25 Apr 2001, The Hermit Hacker wrote:

> On Wed, 25 Apr 2001, Vince Vielhaber wrote:
>
> > On Wed, 25 Apr 2001, Tom Lane wrote:
> >
> > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > Autoconf has a 'LOADAVG' check already, so what is so problematic about
> > > > using that to enabled/disable that feature?
> > >
> > > Because it's tied to a GNU getloadavg.c implementation, which we'd have
> > > license problems with using.
> >
> > It's part of the standard C library in FreeBSD.  Any other platforms
> > have it built in?
>
> As has been mentioned, Solaris and Linux also have it ...

But what's in FreeBSD's standard library isn't GNU.

Vince.

Re: refusing connections based on load ...

From

The Hermit Hacker

Date:

26 April 2001, 07:45:31

On Thu, 26 Apr 2001, Vince Vielhaber wrote:

> On Wed, 25 Apr 2001, The Hermit Hacker wrote:
>
> > On Wed, 25 Apr 2001, Vince Vielhaber wrote:
> >
> > > On Wed, 25 Apr 2001, Tom Lane wrote:
> > >
> > > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > > Autoconf has a 'LOADAVG' check already, so what is so problematic about
> > > > > using that to enabled/disable that feature?
> > > >
> > > > Because it's tied to a GNU getloadavg.c implementation, which we'd have
> > > > license problems with using.
> > >
> > > It's part of the standard C library in FreeBSD.  Any other platforms
> > > have it built in?
> >
> > As has been mentioned, Solaris and Linux also have it ...
>
> But what's in FreeBSD's standard library isn't GNU.

Wouldn't matter if it was, it's part of the OS's standard library ... unless
you mean to pull it in and use it with the distribution, which I think
might be a bad idea ... if we pull anything in, sendmail's would be best
... FreeBSD's will have had anything required for non-FreeBSD systems
yanked out, if it was ever there, while sendmail's already has all the
'hooks' in it ...

Re: refusing connections based on load ...

From

Vince Vielhaber

Date:

26 April 2001, 07:56:01

On Thu, 26 Apr 2001, The Hermit Hacker wrote:

> On Thu, 26 Apr 2001, Vince Vielhaber wrote:
>
> > On Wed, 25 Apr 2001, The Hermit Hacker wrote:
> >
> > > On Wed, 25 Apr 2001, Vince Vielhaber wrote:
> > >
> > > > On Wed, 25 Apr 2001, Tom Lane wrote:
> > > >
> > > > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > > > Autoconf has a 'LOADAVG' check already, so what is so problematic about
> > > > > > using that to enabled/disable that feature?
> > > > >
> > > > > Because it's tied to a GNU getloadavg.c implementation, which we'd have
> > > > > license problems with using.
> > > >
> > > > It's part of the standard C library in FreeBSD.  Any other platforms
> > > > have it built in?
> > >
> > > As has been mentioned, Solaris and Linux also have it ...
> >
> > But what's in FreeBSD's standard library isn't GNU.
>
> Wouldn't matter if it was, its part of the OSs standard library ... unless
> you mean to pull it in and use it with the distribution, which I think
> might be a bad idea ... if we pull anything in, sendmail's would be best
> ... FreeBSD's will have had anything required for non-FreeBSD systems
> yanked out, if it was ever there, while sendmail's already has all the
> 'hooks' in it ...

That wasn't what I was saying at all.

Vince.

Re: refusing connections based on load ...

From

"Len Morgan"

Date:

26 April 2001, 09:10:51

>Nathan Myers wrote:
>> On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote:
>> >
>> > Anyone thought of implementing this, similar to how sendmail does it?
>> > If load > n, refuse connections?
>> > ...
>> > If nobody is working on something like this, does anyone but me feel
>> > that it has merit to make use of?  I'll play with it if so ...
>>
>> I agree that it would be useful.  Even more useful would be soft load
>> shedding, where once some load average level is exceeded the postmaster
>> delays a bit (proportionately) before accepting a connection.
>
>    Or  have  the  load  check  on  AtXactStart,  and  delay  new
>    transactions  until  load  is  back  below  x,  where  x   is
>    configurable  per  user/group  plus some per database scaling
>    factor.

How is this different than limiting the number of backends that can be
running at once?  It would seem to me that a user that has a "delayed"
startup is going to think there's something wrong with the server and keep
trying, whereas a message like "too many clients - try again later"
explains what's really going on.

len morgan

Re: refusing connections based on load ...

From

"August Zajonc"

Date:

26 April 2001, 09:41:21

The soft load shedding idea is great.

Along the lines of "lots of idle connections" is the issue with the simple
number of connections. I suspect in most real world apps you'll have
logic+web serving on a set of frontends talking to a single db backend
(until clustering is really nailed).

The issue we hit is that if all the frontends have 250 maxclients, the
number on the backend goes way up.

This falls in the connection pooling realm, and could be implemented with
the client lib presenting a server view, so apps would simply treat the
pooler as a local server which would allocate connections as needed from a
pool of persistent connections. This also has a benefit in cases (cgi) where
persistent connections cannot be maintained properly. I suspect we've got a
10% duty cycle on the persistent connections we set up... This problem is
predicated on the idea that holding a connection is not negligible (i.e.,
5,000 connections open is worse than 200) for the same loads. Not sure if
that's the case...

AZ

"Nathan Myers" <ncm@zembu.com> wrote in message
news:20010423121105.Y3797@store.zembu.com...
> On Mon, Apr 23, 2001 at 03:09:53PM -0300, The Hermit Hacker wrote:
> >
> > Anyone thought of implementing this, similar to how sendmail does it?
> > If load > n, refuse connections?
> > ...
> > If nobody is working on something like this, does anyone but me feel
> > that it has merit to make use of?  I'll play with it if so ...
>
> I agree that it would be useful.  Even more useful would be soft load
> shedding, where once some load average level is exceeded the postmaster
> delays a bit (proportionately) before accepting a connection.
>
> Nathan Myers
> ncm@zembu.com

Re: refusing connections based on load ...

From

Neal Norwitz

Date:

26 April 2001, 09:54:50

Tom Lane wrote:

> A less dangerous way of approaching it might be to have an option
> whereby the postmaster invokes 'uptime' via system() every so often
> (maybe once a minute?) and throttles on the basis of the results.
> The reaction time would be poorer, but security would be a whole lot
> better.

Rather than do system("uptime") and incur the process start-up each time,
you could do fp = popen("vmstat 60", "r"), then just read from fp.

I believe vmstat is fairly standard.  For those systems 
which don't support vmstat, it could be faked with a shell script.
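
A minimal sketch of the reading side, in C. It assumes only that the
command prints header lines followed by rows of numbers, and that the
first column is the one of interest; vmstat's column layout differs
between systems, so the helper name and that choice of column are
illustrative rather than a portable recipe.

```c
/* Ask glibc to expose popen() even under a strict -std= mode. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Read lines from fp until one starts with a digit (skipping header
 * lines), then store that line's first numeric field in *first.
 * Returns 0 on success, -1 on end of file or read error. */
int next_sample(FILE *fp, long *first)
{
    char line[512];

    while (fgets(line, sizeof line, fp) != NULL) {
        const char *p = line;

        while (*p == ' ' || *p == '\t')
            p++;                         /* skip leading blanks */
        if (isdigit((unsigned char) *p)) {
            *first = strtol(p, NULL, 10);
            return 0;
        }
    }
    return -1;
}
```

The caller would open the pipe once with popen("vmstat 60", "r"), call
next_sample() for each new reading, and pclose() the pipe at shutdown.
Note that the read blocks until the command prints its next line, so a
server would need to poll the pipe rather than wait on it.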

You could write the specific code to handle each arch, but it's
a royal pain, because it's so different for many archs.

Another possibility could be to read from /proc for those systems
that support /proc.  But I think this will be more variable than
the output from vmstat.  Vmstat also has the added benefit of
providing other information.

I agree with Tom about not wanting to open up /dev/kmem, 
due to potential security problems.

Neal

Re: refusing connections based on load ...

From

Tom Lane

Date:

26 April 2001, 10:27:50

Vince Vielhaber <vev@michvhf.com> writes:
> On Wed, 25 Apr 2001, The Hermit Hacker wrote:
>> On Wed, 25 Apr 2001, Vince Vielhaber wrote:
>>> On Wed, 25 Apr 2001, Tom Lane wrote:
>>>> Because it's tied to a GNU getloadavg.c implementation, which we'd have
>>>> license problems with using.
>>>
>>> It's part of the standard C library in FreeBSD.  Any other platforms
>>> have it built in?
>>
>> As has been mentioned, Solaris and Linux also have it ...

> But what's in FreeBSD's standard library isn't GNU.

Obviously I confused some people.  What Autoconf's LOADAVG macro
actually does is

(1) check to see if system has a getloadavg() library routine, and if
so, set up to use that.  Otherwise

(2) apply a bunch of ad-hoc checks to find out whether a GNU-specific
getloadavg module can be used.  That module isn't actually included
with autoconf; I imagine the one they have in mind is the one in GNU
make.

Therefore, Autoconf's macro is useless to us as a means of configuring
load average support, because we won't be using GNU make's getloadavg
module.

The Sendmail loadavg code should be more friendly from a licensing
standpoint, but IT HAS PRIVILEGE PROBLEMS.  Reading /dev/kmem isn't
something that we should expect to be able to do in Postgres.

In short, I haven't seen any evidence that we have a portable solution
available.  Please don't reply (yet again) "It works on $MYSYSTEM,
therefore there's no problem."  If you want to implement this feature
then you need to take responsibility for making it work everywhere.
        regards, tom lane