Thread: Admission Control

Admission Control

From
"Kevin Grittner"
Date:
Recent discussions involving the possible benefits of a connection
pool for certain users have reminded me of a brief discussion at The
Royal Oak last month, where I said I would post a reference to a
concept which might alleviate the need for external connection
pools.  For those interested, check out section 2.4 of this
document:
Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007.
Architecture of a Database System. Foundations and Trends(R) in
Databases Vol. 1, No. 2 (2007) 141-259.
http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
-Kevin


Re: Admission Control

From
Robert Haas
Date:
On Fri, Jun 25, 2010 at 1:33 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Recent discussions involving the possible benefits of a connection
> pool for certain users have reminded me of a brief discussion at The
> Royal Oak last month, where I said I would post a reference to a
> concept which might alleviate the need for external connection
> pools.  For those interested, check out section 2.4 of this
> document:
>
> Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007.
> Architecture of a Database System. Foundations and Trends(R) in
> Databases Vol. 1, No. 2 (2007) 141-259.
>
> http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf

I think a good admission control system for memory would be huge for
us.  There are innumerable threads on pgsql-performance where we tell
people to set work_mem to a tiny value (like 4MB or 16MB) because any
higher value risks driving the machine into swap in the event that
they get an unusually large number of connections or those connections
issue queries that require an unusual number of hashes or sorts.
There are also innumerable postings from people complaining that
external sorts are way slower than in-memory sorts, and of course a
hash join frequently mops the floor with a nested loop with inner
index-scan.

A really trivial admission control system might let you set a
system-wide limit on work_mem.  As we build a plan, we could estimate
the total amount of work_mem it will require by examining all the
sort, hash, and hash aggregate nodes it contains.  In shared memory,
we keep a total of this value across all back-ends.  Just before
beginning to execute a plan that uses >0 work_mem, we bump this value
up by the value for the current plan, unless that would make us exceed
the system-wide limit.  In that case, we sleep, and the next person
to finish executing (and decrease the value in shared memory) will
wake us up to try again.  (Ideally, we'd want to make
maintenance_work_mem part of this accounting process also; otherwise,
a system that was humming along nicely might suddenly start thrashing
when a maintenance operation kicks off.)  I suppose this would take a
good deal of performance testing to see how well it actually works.  A
refinement might be to try to consider an inferior plan that uses less
memory when the system is tight on memory, rather than waiting.  But
you'd have to be careful about that, because waiting might be better
(it's worth waiting 15 s if it means the execution time will decrease
by > 15 s).
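
To make the bookkeeping concrete, here is a rough sketch of what the
reservation step could look like; AdmissionState, admission_try_reserve
and admission_release are names invented for illustration, not anything
that exists in the tree:

#include "postgres.h"
#include "storage/spin.h"

/* Hypothetical shared-memory bookkeeping -- none of these names exist in
 * the backend today.  One struct lives in shared memory and every backend
 * points at it. */
typedef struct AdmissionState
{
    slock_t     mutex;          /* protects total_reserved */
    int64       total_reserved; /* estimated work_mem in use, in bytes */
} AdmissionState;

static AdmissionState *admissionState; /* set up at shared-memory init time */

/*
 * Try to reserve "request" bytes of estimated work_mem before starting to
 * execute a plan.  Returns false if that would push the cluster past
 * "limit"; the caller would then sleep and retry when woken.
 */
static bool
admission_try_reserve(int64 request, int64 limit)
{
    bool        ok;

    SpinLockAcquire(&admissionState->mutex);
    ok = (admissionState->total_reserved + request <= limit);
    if (ok)
        admissionState->total_reserved += request;
    SpinLockRelease(&admissionState->mutex);

    return ok;
}

/*
 * Give the reservation back at the end of execution.  Waking one sleeping
 * backend (latch, semaphore, whatever) would happen here as well.
 */
static void
admission_release(int64 request)
{
    SpinLockAcquire(&admissionState->mutex);
    admissionState->total_reserved -= request;
    SpinLockRelease(&admissionState->mutex);
}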

The idea of doling out queries to engine processes is an interesting
one, but seems very different from our current query execution model.
I can't even begin to speculate as to whether there's anything
interesting we could do in that area without reading some literature
on the topic - got any pointers?  But even if we can't or don't want
to do that, we already know that limiting the number of backends and
round-robining queries among them performs MUCH better than setting
max_connections to a large value, and not just because of memory
exhaustion.  Our answer to that problem is "use a connection pooler",
but that's not a very good solution even aside from the possible
administrative nuisance, because it only solves half the problem.  In
the general case, the question is not whether we can currently afford
to allocate 0 or 1 backends to a given query, but whether we can
afford to allocate 0, 1, or >1; furthermore, if we can afford to
allocate >1 backend, we'd ideally like to reuse an existing backend
rather than starting a new one.  I don't think anyone's going to be
too happy with a parallel query implementation with a dependency on an
external connection pooler.

One of the difficulties in reusing an existing backend for a new
query, or in maintaining a pool of backends that could be used as
workers for parallel queries, is that our current system does not
require us to have, nor do we have, a way of resetting a backend to a
completely pristine state.  DISCARD ALL is probably pretty close,
because AIUI that's what connection poolers are using today, and if it
didn't work reasonably well, we'd have gotten complaints.  But DISCARD
ALL will not let you rebind to a different database, for example,
which would be a pretty useful thing to do in a pooling environment,
so that you needn't maintain separate pools for each database, and it
doesn't let you restart the authentication cycle (with no backsies)
either. Of course, addressing these problems wouldn't by itself give
us a built-in connection pooler or parallel query execution, and
there's some danger of destabilizing the code, but possibly it would
be a good way to get off the ground.  Not sure, though.  Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> check out section 2.4 of this
> A really trivial admission control system might let you set a
> system-wide limit on work_mem.
Heck, I think an even *more* trivial admission control policy which
limits the number of active database transactions released to
execution might solve a lot of problems.  Of course, what you
propose is more useful, although I'd be inclined to think that we'd
want an admission control layer which could be configured to support
both of these and much more.  Done correctly, it could almost
completely eliminate the downward slope after you hit the "knee" in
many performance graphs.
> A refinement might be to try to consider an inferior plan that
> uses less memory when the system is tight on memory, rather than
> waiting.
I wouldn't try messing with that until we have the basics down.  ;-)
It is within the scope of what an execution admission controller is
intended to be able to do, though.
> The idea of doling out queries to engine processes is an
> interesting one, but seems very different from our current query
> execution model.
That wasn't in section 2.4 itself -- you must have read the whole
chapter.  I think any discussion of that should spin off a separate
thread -- the techniques are really orthogonal.  And frankly, that's
more ambitious a topic than *I'm* inclined to want to get into at
the moment.  An "execution admission controller" that starts simple
but leaves room for growth seems within the realm of possibility.
-Kevin


Re: Admission Control

From
Josh Berkus
Date:
On 6/25/10 12:15 PM, Robert Haas wrote:
> I think a good admission control system for memory would be huge for
> us.  There are innumerable threads on pgsql-performance where we tell
> people to set work_mem to a tiny value (like 4MB or 16MB) because any
> higher value risks driving the machine into swap in the event that
> they get an unusually large number of connections or those connections
> issue queries that require an unusual number of hashes or sorts.

Greenplum did this several years ago with the Bizgres project; it had a
resource control manager which was made available for PostgreSQL core.
However, it would have required a large and unpredictable amount of work
to make it compatible with OLTP workloads.

The problem with centralized resource control is the need for
centralized locking on requests for resources.  That forces transactions
to be serialized in order to make sure resources are not
double-allocated.  This isn't much of a problem in a DW application, but
in a web app with thousands of queries per second it's deadly.
Performance engineering for PostgreSQL over the last 7 years has been
partly about eliminating centralized locking; we don't want to add new
locking.

That means that a realistic admission control mechanism would need to
operate based on projections and estimates and "best effort" policies.
Not only is this mathematically more complex, it's an open question
whether it puts us ahead of where we are now vis-a-vis underallocation
of memory.  Realistically, a lot of tuning and testing would be required
before such a tool was actually an improvement.

Or, to put it another way: the "poor man's admission control" is a waste
of time because it doesn't actually help performance.  We're basically
facing doing the hard version, or not bothering.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: Admission Control

From
"Kevin Grittner"
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> Greenplum did this several years ago with the Bizgres project
> However, it [was not] compatible with OLTP workloads.
> the "poor man's admission control" is a waste of time because it
> doesn't actually help performance.  We're basically facing doing
> the hard version, or not bothering.
I think it's premature to assume that without any evidence.  I'm
sure it's possible to create a policy which does more harm than good
for any particular workload; there's no denying that could happen,
but things such as limiting open transactions (as just one example)
might be accomplished at very low cost.  Since I have seen dramatic
performance improvements from restricting this through a connection
pool, I'm inclined to believe there could be benefit from such a
simple policy as this.  The total work memory policy Robert proposed
sounds likely to more than pay for itself by allowing larger
work_mem settings without risking cache flushing or swapping.
One thing that seems clear to me is that the admission policy should
be configurable, so that it can be tuned based on workload.  That
would also be consistent with a "start simple and expand the
capabilities" approach.
C'mon, don't be such a buzz-kill.  Why should Greenplum have all the
fun?  ;-)
-Kevin


Re: Admission Control

From
Robert Haas
Date:
On Fri, Jun 25, 2010 at 4:10 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 6/25/10 12:15 PM, Robert Haas wrote:
>> I think a good admission control system for memory would be huge for
>> us.  There are innumerable threads on pgsql-performance where we tell
>> people to set work_mem to a tiny value (like 4MB or 16MB) because any
>> higher value risks driving the machine into swap in the event that
>> they get an unusually large number of connections or those connections
>> issue queries that require an unusual number of hashes or sorts.
>
> Greenplum did this several years ago with the Bizgres project; it had a
> resource control manager which was made available for PostgreSQL core.
> However, it would have required a large and unpredictable amount of work
> to make it compatible with OLTP workloads.
>
> The problem with centralized resource control is the need for
> centralized locking on requests for resources.  That forces transactions
> to be serialized in order to make sure resources are not
> double-allocated.  This isn't much of a problem in a DW application, but
> in a web app with thousands of queries per second it's deadly.
> Performance engineering for PostgreSQL over the last 7 years has been
> partly about eliminating centralized locking; we don't want to add new
> locking.

I haven't seen the Greenplum code - how did it actually work?  The
mechanism I just proposed would (except in the case of an overloaded
system) only require holding a lock for long enough to test and update
a single integer in shared memory, which doesn't seem like it would
cause a serious serialization problem.  I might be missing something,
or it might suck for lots of other reasons, but if we already know
that then let's try to be more specific about what the problems are.

> That means that a realistic admission control mechanism would need to
> operate based on projections and estimates and "best effort" policies.
> Not only is this mathematically more complex, it's an open question
> whether it puts us ahead of where we are now vis-a-vis underallocation
> of memory.  Realistically, a lot of tuning and testing would be required
> before such a tool was actually an improvement.

Before today, that's the only approach I'd ever considered, but this
article made me rethink that.  If you have a stream of queries that
can be run quickly with 1GB of memory and much more slowly with any
lesser amount, the only sensible thing to do is wait until there's a
GB of memory available for you to grab.  What projection or estimate
of "best effort" would arrive at even approximately the same result?

> Or, to put it another way: the "poor man's admission control" is a waste
> of time because it doesn't actually help performance.  We're basically
> facing doing the hard version, or not bothering.

I think it's an oversimplification to group all approaches as "easy"
and "hard", and even more of an oversimplification to say that all of
the easy ones suck.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
Robert Haas
Date:
On Fri, Jun 25, 2010 at 3:52 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Heck, I think an even *more* trivial admission control policy which
> limits the number of active database transactions released to
> execution might solve a lot of problems.

That wouldn't have any benefit over what you can already do with a
connection pooler, though, I think.  In fact, it would probably be
strictly worse, since enlarging the number of backends slows the
system down even if they aren't actually doing anything much.

> Of course, what you
> propose is more useful, although I'd be inclined to think that we'd
> want an admission control layer which could be configured to support
> both of these and much more.  Done correctly, it could almost
> completely eliminate the downward slope after you hit the "knee" in
> many performance graphs.

And world peace!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> Heck, I think an even *more* trivial admission control policy
>> which limits the number of active database transactions released
>> to execution might solve a lot of problems.
> 
> That wouldn't have any benefit over what you can already do with a
> connection pooler, though, I think.  In fact, it would probably be
> strictly worse, since enlarging the number of backends slows the
> system down even if they aren't actually doing anything much.
Agreed -- *if* the connection pool is on another computer. 
Otherwise it'll probably consume about the same resources on the same
machine as what I'm proposing, only with more setup and
configuration involved.  Until we build a connection pooler into the
base product, though, you know what an uphill battle it is to get
people to control contention that way. We can usually get someone to
tweak a GUC when they report a problem, and maybe the tuning tools
could start to take core count and effective spindle count into
consideration and suggest a good setting for this, if we had it.
With the right connection pooler built in to core, though, this one
could go right out the window, and they could tune at that layer
instead.  [thinks]  Actually, the transaction count limit doesn't
need the planner to run first, so it could be considered part of the
first-tier admission control.  Essentially, it *would be* a simple
but effective built-in connection pooler.
I still think an execution admission controller would be worthwhile,
but the policy I proposed doesn't belong there; it might be the
subject of a pretty simple patch which might solve a lot of
performance problems.  Gotta sleep on that....
-Kevin


Re: Admission Control

From
Martijn van Oosterhout
Date:
On Fri, Jun 25, 2010 at 03:15:59PM -0400, Robert Haas wrote:
>  A
> refinement might be to try to consider an inferior plan that uses less
> memory when the system is tight on memory, rather than waiting.  But
> you'd have to be careful about that, because waiting might be better
> (it's worth waiting 15 s if it means the execution time will decrease
> by > 15 s).

I think you could go a long way by doing something much simpler. We
already generate multiple plans and compare costs, why not just include
memory usage as a cost? If you start doing accounting for memory across
the cluster you can assign a "cost" to memory. When there are only a
few processes running it's cheap and you get plans like now. But as the
total memory usage increases you increase the "cost" of memory and
there will be increased pressure to produce lower memory usage plans.

I think this is better than just cutting plans out at a certain
threshold, since plans that *need* memory to work efficiently would
still be able to get it.

(It doesn't help in situations where you can't accurately predict
memory usage, like hash tables.)
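
Purely as an illustration of the scaling idea (the pressure counters and
the base cost constant below are invented names, not planner symbols;
only the shape of the calculation matters):

#include "postgres.h"

/*
 * Illustrative only: charge for estimated memory use at a rate that climbs
 * as cluster-wide reservations approach a configured budget.  The two
 * counter functions and the base cost constant are invented for the sketch.
 */
extern int64 cluster_workmem_reserved(void);    /* hypothetical */
extern int64 cluster_workmem_budget(void);      /* hypothetical */

static double base_memory_cost_per_byte = 1e-7; /* hypothetical GUC */

static double
memory_cost(double bytes)
{
    double      used = (double) cluster_workmem_reserved();
    double      budget = (double) cluster_workmem_budget();
    double      pressure = (budget > 0) ? used / budget : 0.0;

    /* cheap when the cluster is idle, sharply more expensive near the budget */
    return bytes * base_memory_cost_per_byte * (1.0 + 10.0 * pressure * pressure);
}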

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: Admission Control

From
Robert Haas
Date:
On Sat, Jun 26, 2010 at 11:03 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Fri, Jun 25, 2010 at 03:15:59PM -0400, Robert Haas wrote:
>>  A
>> refinement might be to try to consider an inferior plan that uses less
>> memory when the system is tight on memory, rather than waiting.  But
>> you'd have to be careful about that, because waiting might be better
>> (it's worth waiting 15 s if it means the execution time will decrease
>> by > 15 s).
>
> I think you could go a long way by doing something much simpler. We
> already generate multiple plans and compare costs, why not just include
> memory usage as a cost? If you start doing accounting for memory across
> the cluster you can assign a "cost" to memory. When there are only a
> few processes running it's cheap and you get plans like now. But as the
> total memory usage increases you increase the "cost" of memory and
> there will be increased pressure to produce lower memory usage plans.
>
> I think this is better than just cutting plans out at a certain
> threshold, since plans that *need* memory to work efficiently would
> still be able to get it.

That's an interesting idea.

> (It doesn't help in situations where you can't accurately predict
> memory usage, like hash tables.)

Not sure what you mean by this part.  We already predict how much
memory a hash table will use.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
Martijn van Oosterhout
Date:
On Sat, Jun 26, 2010 at 11:37:16AM -0400, Robert Haas wrote:
> On Sat, Jun 26, 2010 at 11:03 AM, Martijn van Oosterhout
> > (It doesn't help in situations where you can't accurately predict
> > memory usage, like hash tables.)
>
> Not sure what you mean by this part.  We already predict how much
> memory a hash table will use.

By this I mean where the memory usage of the HashAggregate depends on
how many groups there are, and it's sometimes very difficult to predict
that beforehand. Though maybe that got fixed.

Another issue is cached plans. Say there is increased memory pressure,
at what point do you start replanning existing plans?

While this does have the advantage of being relatively simple to
implement, I think it would be a bitch to tune...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: Admission Control

From
Robert Haas
Date:
On Sat, Jun 26, 2010 at 11:59 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Sat, Jun 26, 2010 at 11:37:16AM -0400, Robert Haas wrote:
>> On Sat, Jun 26, 2010 at 11:03 AM, Martijn van Oosterhout
>> > (It doesn't help in situations where you can't accurately predict
>> > memory usage, like hash tables.)
>>
>> Not sure what you mean by this part.  We already predict how much
>> memory a hash table will use.
>
> By this I mean where the memory usage of the HashAggregate depends on
> how many groups there are, and it's sometimes very difficult to predict
> that beforehand. Though maybe that got fixed.

Oh, I see.  Well, yeah, it's possible the estimates aren't that good
in that case.  I think it's a fairly rare query that has more than one
aggregate in it, though, so you're probably OK as long as you're not
TOO far off - where you can really use up a lot of memory, I think, is
on a query that has lots of sorts or hash joins.

> Another issue is cached plans. Say there is increased memory pressure,
> at what point do you start replanning existing plans?

The obvious thing to do would be to send an invalidation message
whenever you changed the system-wide cost value for use of memory, but
maybe if you're changing it in small increments you'd want to be a bit
more selective.
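
The blunt version of that would be something like the following sketch;
the setting itself is hypothetical, but ResetPlanCache() already exists
(it is what DISCARD PLANS uses):

#include "postgres.h"
#include "utils/plancache.h"

/*
 * Sketch only: when a (hypothetical) cluster-wide memory-cost setting
 * changes, the simplest reaction is to drop every cached plan so each one
 * is replanned at the new cost the next time it is used.  Something more
 * selective would be wanted if the setting moves in small increments.
 */
static void
memory_cost_changed(void)
{
    ResetPlanCache();
}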

> While this does have the advantage of being relatively simple to
> implement, I think it would be a bitch to tune...

I'm not sure.  What does seem clear is that it's fundamentally at odds
with the "admission control" approach Kevin is advocating.  When you
start to run short on a resource (perhaps memory), you have to decide
between (a) waiting for memory to become available and (b) switching
to a more memory-efficient plan.  The danger of (b) is that using less
memory probably means using more of some other resource, like CPU or
disk, and now you've just switched around which resource you're
overloading - but on the other hand, if the difference in CPU/disk is
small and the memory savings is large, maybe it makes sense.  Perhaps
in the end we'll find we need both capabilities.

I can't help feeling like some good instrumentation would be helpful
in answering some of these questions, although I don't know where to
put it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
"Ross J. Reedstrom"
Date:
On Sat, Jun 26, 2010 at 01:19:57PM -0400, Robert Haas wrote:
> 
> I'm not sure.  What does seem clear is that it's fundamentally at odds
> with the "admission control" approach Kevin is advocating.  When you
> start to run short on a resource (perhaps memory), you have to decide
> between (a) waiting for memory to become available and (b) switching
> to a more memory-efficient plan.  The danger of (b) is that using less
> memory probably means using more of some other resource, like CPU or
> disk, and now you've just switched around which resource you're
> overloading - but on the other hand, if the difference in CPU/disk is
> small and the memory savings is large, maybe it makes sense.  Perhaps
> in the end we'll find we need both capabilities.
> 
> I can't help feeling like some good instrumentation would be helpful
> in answering some of these questions, although I don't know where to
> put it.

One issue with this is that no matter how expensive you make a query,
it will run - it just may take a very long time (if the cost is a
reasonable estimate).

This is also an implied suggestion for a dynamically self-modifying cost
param, since the memory cost isn't absolute, but rather relative to free
memory. In addition, as Robert points out, the tradeoff between
resources is dynamic, as well.

Hmm, I'm suddenly struck by the idea of having a max_cost parameter,
that refuses to run (or delays?) queries that have "too high" a cost.
That might have some interactive-SQL uses, as well: catch the cases where
you forgot a join condition and so have an unintended cartesian explosion, etc.
Could also be a belt-and-suspenders last defense for DB admins who
aren't sure the client software completely stops the users from doing
something stupid.

Clearly, default to current behavior, -1 (infinity).
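
As a sketch of what such a check might look like (max_cost is an invented
GUC here; the error code is an existing one):

#include "postgres.h"
#include "nodes/plannodes.h"

/* Hypothetical GUC; -1 preserves current behavior (no limit). */
static double max_cost = -1.0;

/*
 * Sketch only: refuse to run a statement whose estimated total cost
 * exceeds max_cost.  Delaying instead of failing would mean queueing the
 * statement here rather than throwing an error.
 */
static void
check_statement_cost(PlannedStmt *stmt)
{
    if (max_cost >= 0.0 && stmt->planTree->total_cost > max_cost)
        ereport(ERROR,
                (errcode(ERRCODE_STATEMENT_TOO_COMPLEX),
                 errmsg("estimated cost %.0f exceeds max_cost (%.0f)",
                        stmt->planTree->total_cost, max_cost)));
}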

Ross
-- 
Ross Reedstrom, Ph.D.                                 reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
The Connexions Project      http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


Re: Admission Control

From
Tom Lane
Date:
"Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> Hmm, I'm suddenly struck by the idea of having a max_cost parameter,
> that refuses to run (or delays?) queries that have "too high" a cost.

That's been suggested before, and shot down on the grounds that the
planner's cost estimates are not trustworthy enough to rely on for
purposes of outright-failing a query.  If you didn't want random
unexpected failures, you'd have to set the limit so much higher than
your regular queries cost that it'd be pretty much useless.

Maybe it'd be all right if it were just used to delay launching the
query a bit, but I'm not entirely sure I see the point of that.
        regards, tom lane


Re: Admission Control

From
Josh Berkus
Date:
> While this does have the advantage of being relatively simple to
> implement, I think it would be a bitch to tune...

Precisely.  So, there's a number of issues to solve here:

1) We'd need to add accounting for total memory usage to explain plans 
(worth doing on its own, really, even without admission control).

2) Our memory usage estimates aren't that accurate, which would still 
force us to underallocate (as we do with work_mem) if we were letting 
queries through based on memory usage, or force us to abort queries 
whose actual memory usage was too far off estimated usage.

3) Due to concurrency, by the time the query executes, there will be a 
different set of queries executing on the server than were executing at 
evaluation time.   This will also cause underallocation to continue.

4) Many other processes can use major hunks of memory (autovacuum, for 
example) and would need to be brought under centralized management if 
this was to be a real fix for the underallocation issue.

5) Running out of CPU is, in my experience, more common than running out 
of memory on modern systems.  So it's questionable how useful managing 
overall memory usage at all would be.

Having tinkered with it, I'll tell you that (2) is actually a very hard 
problem, so any solution we implement should delay as long as possible 
in implementing (2).  In the case of Greenplum, what Mark did originally 
IIRC was to check against the global memory pool for each work_mem 
allocation.  This often resulted in 100's of global locking checks per 
query ... like I said, feasible for DW, not for OLTP.

The same is the case with (3).  In some ways, (3) is an even harder 
issue because it allows any kind of admission system to be its own 
enemy; you can imagine "admission storms" where the server tries to 
serve 150 queries which have been waiting for admission at once as soon 
as current load drops.

Given this, I think that actually solving the issue of underallocation, 
or even materially improving on memory underallocation compared to where 
we are now, is a much harder issue than a lot of people realize. 
However, it does point towards two areas for further work:

First, (1) would be valuable on its own as part of diagnostics, logging, 
pg_stat_statements, etc.  And seems like a good "easy" TODO.

We can go back to Kevin's originally proposed simple feature: just 
allowing the DBA to limit the number of concurrently executing queries 
by role and overall.   This would consist of two parameters, 
max_concurrent_statements and max_statement_wait; the second would say 
how long the connection would wait for a query slot before giving up and 
returning an error.
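
A rough sketch of the gate those two parameters imply (the shared struct
and the function are invented for illustration; a real version would wait
on a latch or semaphore rather than polling):

#include "postgres.h"
#include "miscadmin.h"
#include "storage/spin.h"

/* Hypothetical shared-memory gate; nothing here exists in the tree today. */
typedef struct StatementGate
{
    slock_t     mutex;
    int         active_statements;
} StatementGate;

static StatementGate *stmtGate;     /* set up at shared-memory init time */

/*
 * Sketch only: take a statement slot, waiting up to max_statement_wait
 * milliseconds.  Returns false on timeout, in which case the caller would
 * report an error to the client instead of running the query.
 */
static bool
acquire_statement_slot(int max_concurrent_statements, int max_statement_wait)
{
    int         waited_ms = 0;

    for (;;)
    {
        SpinLockAcquire(&stmtGate->mutex);
        if (stmtGate->active_statements < max_concurrent_statements)
        {
            stmtGate->active_statements++;
            SpinLockRelease(&stmtGate->mutex);
            return true;
        }
        SpinLockRelease(&stmtGate->mutex);

        if (waited_ms >= max_statement_wait)
            return false;

        pg_usleep(10 * 1000L);      /* poll every 10ms; crude but simple */
        waited_ms += 10;
    }
}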

This might help the overall underallocation issue,  and would be far 
simpler than schemes involving RAM, I/O and CPU accounting.  However, 
even this solution has two thorny issues to be solved:

a) the "waiting query storm" issue mentioned above, and

b) pending queries are sitting on idle connections, which could easily 
block higher-priority queries, so managing max_connections per role 
would become much more complex.

Overall, it sounds like a lot of work for improving underallocation 
issues for users who can't make effective use of connection pooling but 
nevertheless have high-performance needs.  Are there enough of these 
users to make it worthwhile?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: Admission Control

From
"Kevin Grittner"
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> We can go back to Kevin's originally proposed simple feature:
> just allowing the DBA to limit the number of concurrently
> executing queries by role and overall.
Well, that's more sophisticated than what I proposed, but it's an
interesting twist on it.
> This would consist of two parameters, 
> max_concurrent_statements and max_statement_wait; the second would
> say how long the connection would wait for a query slot before
> giving up and returning an error.
The timeout is also an embellishment to what I proposed, but another
interesting idea.
> even this solution has two thorny issues to be solved:
> 
> a) the "waiting query storm" issue mentioned above
I fleshed out the idea a bit more on the thread titled "Built-in
connection pool", since this would effectively function in a very
similar way to a connection pool.  If you look at that proposal, at
most one transaction would be released to execution when one
transaction completed.  I'm not seeing anything resembling a "storm"
in that, so you must be envisioning something rather different. 
Care to clarify?
> b) pending queries are sitting on idle connections, which could
> easily block higher-priority queries, so managing max_connections
> per role would become much more complex.
That is a good point.  The biggest difference between the
functionality of the proposal on the other thread and the connection
pool built in to our application framework is that the latter has a
prioritized FIFO queue, with ten levels of priority.  A small query
which is run as you tab between controls in a GUI window runs at a
much higher priority than a query which fills a list with a large
number of rows.  This involves both connections reserved for higher
priorities and having higher priority transactions "jump ahead" of
lower priority transactions.  This helps the user perception of
performance in the fat-client GUI applications.  I suppose if we had
limits by role, we could approach this level of functionality within
PostgreSQL.
On the other hand, our web apps run everything at the same priority,
so there would effectively be *no* performance difference between
what I proposed we build in to PostgreSQL and what our shop
currently puts in front of PostgreSQL for a connection pool.
-Kevin


Re: Admission Control

From
Jesper Krogh
Date:
On 2010-06-25 22:44, Robert Haas wrote:
> On Fri, Jun 25, 2010 at 3:52 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov>  wrote:
>    
>> Heck, I think an even *more* trivial admission control policy which
>> limits the number of active database transactions released to
>> execution might solve a lot of problems.
>>      
> That wouldn't have any benefit over what you can already do with a
> connection pooler, though, I think.  In fact, it would probably be
> strictly worse, since enlarging the number of backends slows the
> system down even if they aren't actually doing anything much.
>    

Sorry if I'm asking silly questions, but how do transactions and
connection poolers interact?

Say if you have 100 clients all doing "fairly inactive" database work
in transactions lasting a couple of minutes at the same time. If I
understand connection poolers, they don't help much in those
situations, where an "accounting" system on "limited resources"
across all backends definitely would help.

(yes, it's a real-world application here, whether it is clever or not...)

In a fully web environment where all transactions last 0.1s, a pooler
might make full sense (when traffic goes up).

-- 
Jesper


Re: Admission Control

From
"Kevin Grittner"
Date:
Jesper Krogh <jesper@krogh.cc> wrote:
> Sorry if I'm asking silly questions, but how do transactions and
> connection poolers interact?
That depends a great deal on the pooler and its configuration, as
well as your client architecture.  Our shop gathers up the
information needed for our database transaction and submits it to a
server application which has all the logic needed to use that data
to apply the transaction.  We determined long ago that it is a Very
Bad Idea for us to have an open database transaction which is
waiting for a user to do something before it can proceed.
> Say if you have 100 clients all doing "fairly inactive" database
> work in transactions lasting a couple of minutes at the same time.
I'm not sure what you mean by that.  If you mean that part of a
database transaction hits the database, and then it takes a while
for the rest of the statements for the transaction to be generated
(based on network latency or waits for user input), then it is hard
to see how you can effectively use a connection pool.  I have yet to
see an environment where I think that's a good approach, but I
haven't seen everything yet.  :-)
On the other hand, if the issue is that 100 transactions are fired
off at the same time and it takes two minutes for them all to
finish, unless you have *a lot* of CPUs and spindles, that's not
efficient use of your resources.  A connection pool might indeed
help with that; you might start getting transactions back in one
second, and have them all done in a minute and a half if you used a
properly configured connection pool.
-Kevin


Re: Admission Control

From
Jesper Krogh
Date:
On 2010-06-28 21:24, Kevin Grittner wrote:
> Jesper Krogh<jesper@krogh.cc>  wrote:
>
>    
>> Sorry if I'm asking silly questions, but how do transactions and
>> connection poolers interact?
>>      
>
> That depends a great deal on the pooler and its configuration, as
> well as your client architecture.  Our shop gathers up the
> information needed for our database transaction and submits it to a
> server application which has all the logic needed to use that data
> to apply the transaction.  We determined long ago that it is a Very
> Bad Idea for us to have an open database transaction which is
> waiting for a user to do something before it can proceed.
>    

The situation is more:
1) Grab a bunch of data (using pg_try_advisory_lock() to lock out other
   processes from grabbing the same).
2) Process the data (in external software).
3) Push results back into the database, including a flag telling that
   the data has been processed.
4) Release advisory locks.


Step 2 takes somewhere between a couple of seconds and a couple of
minutes, depending on the task to be done.

It might not be "optimal" but it is extremely robust and simple
to wrap 1 to 4 within a BEGIN / COMMIT block.
On the application side it is really nice not having to deal with
"partly processed" data in the database, which I get around
by just keeping the transaction open.
From my POV, a connection pooler doesn't buy anything, and
I cannot stop all processes from executing at the same time, although
it "most likely" will not happen. There is no "wait for user"
involved.

And that means somewhere over 100 backends, although they
are "mostly" idle, seen from a database perspective.

I have not hit any issues with the work_mem being too high, but
I'm absolutely sure that I could flood the system if they happened to
be working at the same time.

Jesper
-- 
Jesper


Re: Admission Control

From
"Kevin Grittner"
Date:
Jesper Krogh <jesper@krogh.cc> wrote:
> I have not hit any issues with the work_mem being too high, but
> I'm absolutely sure that I could flood the system if they happened
> to be working at the same time.
OK, now that I understand your workload, I agree that a connection
pool at the transaction level won't do you much good.  Something
which limited the concurrent *query* count, or an execution
admission controller based on resource usage, could save you from
occasional random incidents of resource over-usage.
-Kevin


Re: Admission Control

From
Mark Kirkwood
Date:
On 29/06/10 04:48, Tom Lane wrote:
> "Ross J. Reedstrom"<reedstrm@rice.edu>  writes:
>    
>> Hmm, I'm suddenly struck by the idea of having a max_cost parameter,
>> that refuses to run (or delays?) queries that have "too high" a cost.
>>      
> That's been suggested before, and shot down on the grounds that the
> planner's cost estimates are not trustworthy enough to rely on for
> purposes of outright-failing a query.  If you didn't want random
> unexpected failures, you'd have to set the limit so much higher than
> your regular queries cost that it'd be pretty much useless.
>
>    

I wrote something along the lines of this for Greenplum (it is probably 
still available in the Bizgres CVS). Yes, cost is not an ideal metric to 
use for bounding workload (but was perhaps better than nothing at all in 
the case it was intended for).

One difficulty with looking at things from the statement cost point of 
view is that all the requisite locks are already taken by the time you 
have a plan - so if you delay execution, these are still held, and 
deadlock likelihood is increased (unless you release locks for waiters 
and retry for them later - but then you possibly need to restart the 
executor from scratch to cope with possible table or schema changes).

> Maybe it'd be all right if it were just used to delay launching the
> query a bit, but I'm not entirely sure I see the point of that.
>    

I recall handling this by having a configurable option to let these 
queries run if nothing else was. Clearly to have this option on you 
would have to be confident that no single query could bring the system down.

Cheers

Mark


Re: Admission Control

From
Simon Riggs
Date:
On Fri, 2010-06-25 at 13:10 -0700, Josh Berkus wrote:

> The problem with centralized resource control

We should talk about the problem of lack of centralized resource control
as well, to balance.

Another well observed problem is that work_mem is user settable, so many
programs acting together with high work_mem can cause swapping.

The reality is that inefficient resource control leads to about 50%
resource wastage.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services



Re: Admission Control

From
Mark Kirkwood
Date:
On 29/06/10 05:36, Josh Berkus wrote:
>
> Having tinkered with it, I'll tell you that (2) is actually a very 
> hard problem, so any solution we implement should delay as long as 
> possible in implementing (2).  In the case of Greenplum, what Mark did 
> originally IIRC was to check against the global memory pool for each 
> work_mem allocation.  This often resulted in 100's of global locking 
> checks per query ... like I said, feasible for DW, not for OLTP.

Actually only 1 lock check per query, but certainly extra processing and 
data structures to maintain the pool information... so, yes certainly 
much more suitable for DW (AFAIK we never attempted to measure the 
additional overhead for non DW workload).

Cheers

Mark


Re: Admission Control

From
Josh Berkus
Date:
Simon, Mark,

> Actually only 1 lock check per query, but certainly extra processing and
> data structures to maintain the pool information... so, yes certainly
> much more suitable for DW (AFAIK we never attempted to measure the
> additional overhead for non DW workload).

I recall testing it when the patch was submitted for 8.2, and the 
overhead was substantial in the worst case ... like 30% for an in-memory 
one-liner workload.

I've been going over the greenplum docs and it looks like the  attempt 
to ration work_mem was dropped.  At this point, Greenplum 3.3 only 
rations by # of concurrent queries and total cost.  I know that work_mem 
rationing was in the original plans; what made that unworkable?

My argument in general is that in the general case ... where you can't 
count on a majority of long-running queries ... any kind of admission 
control or resource management is a hard problem (if it weren't, Oracle 
would have had it before 11).  I think that we'll need to tackle it, but 
I don't expect the first patches we make to be even remotely usable. 
It's definitely not an SOC project.

I should write more about this.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: Admission Control

From
Mark Kirkwood
Date:
On 09/07/10 05:10, Josh Berkus wrote:
> Simon, Mark,
>
>> Actually only 1 lock check per query, but certainly extra processing and
>> data structures to maintain the pool information... so, yes certainly
>> much more suitable for DW (AFAIK we never attempted to measure the
>> additional overhead for non DW workload).
>
> I recall testing it when the patch was submitted for 8.2., and the 
> overhead was substantial in the worst case ... like 30% for an 
> in-memory one-liner workload.
>

Interesting - quite high! However, I recall you tested the initial 
committed version; later additions dramatically reduced the overhead 
(what is in the Bizgres repo *now* is the latest).

> I've been going over the greenplum docs and it looks like the  attempt 
> to ration work_mem was dropped.  At this point, Greenplum 3.3 only 
> rations by # of concurrent queries and total cost.  I know that 
> work_mem rationing was in the original plans; what made that unworkable?
>

That certainly was my understanding too. I left Greenplum about the time 
this was being discussed, and I think the other staff member involved 
with the design left soon afterwards as well, which might have been a 
factor!

> My argument in general is that in the general case ... where you can't 
> count on a majority of long-running queries ... any kind of admission 
> control or resource management is a hard problem (if it weren't, 
> Oracle would have had it before 11).  I think that we'll need to 
> tackle it, but I don't expect the first patches we make to be even 
> remotely usable. It's definitely not an SOC project.
>
> I should write more about this.
>

+1

Cheers

Mark



Re: Admission Control

From
Mark Kirkwood
Date:
On 09/07/10 12:58, Mark Kirkwood wrote:
> On 09/07/10 05:10, Josh Berkus wrote:
>> Simon, Mark,
>>
>>> Actually only 1 lock check per query, but certainly extra processing 
>>> and
>>> data structures to maintain the pool information... so, yes certainly
>>> much more suitable for DW (AFAIK we never attempted to measure the
>>> additional overhead for non DW workload).
>>
>> I recall testing it when the patch was submitted for 8.2., and the 
>> overhead was substantial in the worst case ... like 30% for an 
>> in-memory one-liner workload.
>>
>
> Interesting - quite high! However I recall you tested the initial 
> committed version, later additions dramatically reduced the overhead 
> (what is in the Bizgres repo *now* is the latest).

Purely out of interest, since the old repo is still there, I had a quick 
look at measuring the overhead, using 8.4's pgbench to run two custom 
scripts: one consisting of a single 'SELECT 1', the other having 100 
'SELECT 1' - the latter being probably the worst case scenario. Running 
1,2,4,8 clients and 1000-10000 transactions gives an overhead in the 
5-8% range [1] (i.e. transactions/s decrease by this amount with the 
scheduler turned on [2]). While a lot better than 30% (!) it is 
certainly higher than we'd like.


Cheers

Mark

[1] I got the same range for pgbench select-only using its usual workload
[2] As compared to Bizgres(8.2.4) and also standard Postgres 8.2.12.


Re: Admission Control

From
Robert Haas
Date:
On Thu, Jul 8, 2010 at 10:21 PM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:
> Purely out of interest, since the old repo is still there, I had a quick
> look at measuring the overhead, using 8.4's pgbench to run two custom
> scripts: one consisting of a single 'SELECT 1', the other having 100 'SELECT
> 1' - the latter being probably the worst case scenario. Running 1,2,4,8
> clients and 1000-10000 transactions gives an overhead in the 5-8% range [1]
> (i.e. transactions/s decrease by this amount with the scheduler turned on
> [2]). While a lot better than 30% (!) it is certainly higher than we'd like.

Isn't the point here to INCREASE throughput?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
Mark Kirkwood
Date:
On 09/07/10 14:26, Robert Haas wrote:
> On Thu, Jul 8, 2010 at 10:21 PM, Mark Kirkwood
> <mark.kirkwood@catalyst.net.nz>  wrote:
>    
>> Purely out of interest, since the old repo is still there, I had a quick
>> look at measuring the overhead, using 8.4's pgbench to run two custom
>> scripts: one consisting of a single 'SELECT 1', the other having 100 'SELECT
>> 1' - the latter being probably the worst case scenario. Running 1,2,4,8
>> clients and 1000-10000 transactions gives an overhead in the 5-8% range [1]
>> (i.e. transactions/s decrease by this amount with the scheduler turned on
>> [2]). While a lot better than 30% (!) it is certainly higher than we'd like.
>>      
> Isn't the point here to INCREASE throughput?
>
>    

LOL - yes it is! Josh wanted to know what the overhead was for the queue 
machinery itself, so I'm running a test to show that (i.e. so I have a 
queue with the thresholds set higher than the test will reach).

In the situation where (say) 11 concurrent queries of a certain type 
make your system unusable, but 10 are fine, then constraining it 
to a max of 10 will tend to improve throughput. By how much is hard 
to say; for instance, preventing the Linux OOM killer from shutting 
postgres down would be an infinite improvement, I guess :-)

Cheers

Mark


Re: Admission Control

From
Robert Haas
Date:
On Thu, Jul 8, 2010 at 11:00 PM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:
> On 09/07/10 14:26, Robert Haas wrote:
>>
>> On Thu, Jul 8, 2010 at 10:21 PM, Mark Kirkwood
>> <mark.kirkwood@catalyst.net.nz>  wrote:
>>
>>>
>>> Purely out of interest, since the old repo is still there, I had a quick
>>> look at measuring the overhead, using 8.4's pgbench to run two custom
>>> scripts: one consisting of a single 'SELECT 1', the other having 100
>>> 'SELECT
>>> 1' - the latter being probably the worst case scenario. Running 1,2,4,8
>>> clients and 1000-10000 transactions gives an overhead in the 5-8% range
>>> [1]
>>> (i.e. transactions/s decrease by this amount with the scheduler turned on
>>> [2]). While a lot better than 30% (!) it is certainly higher than we'd
>>> like.
>>>
>>
>> Isn't the point here to INCREASE throughput?
>>
>>
>
> LOL - yes it is! Josh wanted to know what the overhead was for the queue
> machinery itself, so I'm running a test to show that (i.e. so I have a queue
> with the thresholds set higher than the test will reach).
>
> In the situation where (say) 11 concurrent queries of a certain type make
> your system unusable, but 10 are fine, then constraining it to a
> max of 10 will tend to improve throughput. By how much is hard to say; for
> instance, preventing the Linux OOM killer from shutting postgres down would
> be an infinite improvement, I guess :-)

Hmm.  Well, those numbers seem awfully high, for what you're doing,
then.  An admission control mechanism that's just letting everything
in shouldn't knock 5% off performance (let alone 30%).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
Mark Kirkwood
Date:
On 09/07/10 15:57, Robert Haas wrote:
> Hmm.  Well, those numbers seem awfully high, for what you're doing,
> then.  An admission control mechanism that's just letting everything
> in shouldn't knock 5% off performance (let alone 30%).
>
>    

Yeah it does; on the other hand, both Josh and I were trying to elicit 
the worst-case overhead.

Cheers

Mark


Re: Admission Control

From
Robert Haas
Date:
On Fri, Jul 9, 2010 at 12:03 AM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:
> On 09/07/10 15:57, Robert Haas wrote:
>>
>> Hmm.  Well, those numbers seem awfully high, for what you're doing,
>> then.  An admission control mechanism that's just letting everything
>> in shouldn't knock 5% off performance (let alone 30%).
>
> Yeah it does; on the other hand, both Josh and I were trying to elicit the
> worst-case overhead.

Even so...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Admission Control

From
"Kevin Grittner"
Date:
Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
> Purely out of interest, since the old repo is still there, I had a
> quick look at measuring the overhead, using 8.4's pgbench to run
> two custom scripts: one consisting of a single 'SELECT 1', the
> other having 100 'SELECT 1' - the latter being probably the worst
> case scenario. Running 1,2,4,8 clients and 1000-10000 transactions
> gives an overhead in the 5-8% range [1] (i.e transactions/s
> decrease by this amount with the scheduler turned on [2]). While a
> lot better than 30% (!) it is certainly higher than we'd like.
Hmmm...  In my first benchmarks of the serializable patch I was
likewise stressing a RAM-only run to see how much overhead was added
to a very small database transaction, and wound up with about 8%. 
By profiling where the time was going with and without the patch,
I narrowed it down to lock contention.  Reworking my LW locking
strategy brought it down to 1.8%.  I'd bet there's room for similar
improvement in the "active transaction" limit you describe. In fact,
if you could bring the code inside blocks of code already covered by
locks, I would think you could get it down to where it would be hard
to find in the noise.
-Kevin


Re: Admission Control

From
Mark Kirkwood
Date:
On 10/07/10 03:54, Kevin Grittner wrote:
> Mark Kirkwood<mark.kirkwood@catalyst.net.nz>  wrote:
>
>    
>> Purely out of interest, since the old repo is still there, I had a
>> quick look at measuring the overhead, using 8.4's pgbench to run
>> two custom scripts: one consisting of a single 'SELECT 1', the
>> other having 100 'SELECT 1' - the latter being probably the worst
>> case scenario. Running 1,2,4,8 clients and 1000-10000 transactions
>> gives an overhead in the 5-8% range [1] (i.e transactions/s
>> decrease by this amount with the scheduler turned on [2]). While a
>> lot better than 30% (!) it is certainly higher than we'd like.
>>      
>
> Hmmm...  In my first benchmarks of the serializable patch I was
> likewise stressing a RAM-only run to see how much overhead was added
> to a very small database transaction, and wound up with about 8%.
> By profiling where the time was going with and without the patch,
> I narrowed it down to lock contention.  Reworking my LW locking
> strategy brought it down to 1.8%.  I'd bet there's room for similar
> improvement in the "active transaction" limit you describe. In fact,
> if you could bring the code inside blocks of code already covered by
> locks, I would think you could get it down to where it would be hard
> to find in the noise.
>
>    

Yeah, excellent suggestion - I suspect there is room for considerable 
optimization along the lines you suggest... at the time the focus was 
heavily biased toward a purely DW workload where the overhead vanished 
against large plan and execute times, but this could be revisited. 
Having said that, I suspect a re-architecture is needed for a more 
wide-ranging solution suitable for Postgres (as opposed to Bizgres or 
Greenplum).

Cheers

Mark