Re: Admission Control - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Admission Control
Date
Msg-id AANLkTinB6CnWA9evgQqyBbf7B3GOrh4DhSsau7kPcpR2@mail.gmail.com
In response to Re: Admission Control  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Admission Control  ("Ross J. Reedstrom" <reedstrm@rice.edu>)
List pgsql-hackers
On Sat, Jun 26, 2010 at 11:59 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Sat, Jun 26, 2010 at 11:37:16AM -0400, Robert Haas wrote:
>> On Sat, Jun 26, 2010 at 11:03 AM, Martijn van Oosterhout
>> > (It doesn't help in situations where you can't accurately predict
>> > memory usage, like hash tables.)
>>
>> Not sure what you mean by this part.  We already predict how much
>> memory a hash table will use.
>
> By this I mean where the memory usage of the HashAggregate depends on
> how many groups there are, and it's sometimes very difficult to predict
> that beforehand. Though maybe that got fixed.

Oh, I see.  Well, yeah, it's possible the estimates aren't that good
in that case.  I think it's a fairly rare query that has more than one
aggregate in it, though, so you're probably OK as long as you're not
TOO far off - where you can really use up a lot of memory, I think, is
on a query that has lots of sorts or hash joins.
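
To make the "lots of sorts or hash joins" point concrete, here's a
rough sketch (table names are made up, and whether the planner
actually picks hash joins here is an assumption): each sort or hash
node in a plan is entitled to up to work_mem on its own, so a single
query can legitimately use several multiples of it even when every
estimate is dead on.

    SET work_mem = '64MB';
    -- If both joins are planned as hash joins and the ORDER BY needs a
    -- sort, each of those nodes can use up to work_mem, so this one
    -- query could need roughly 3 x 64MB before we even count a
    -- HashAggregate whose group count was underestimated.
    SELECT o.customer_id, sum(oi.amount) AS total
    FROM orders o
    JOIN order_items oi ON oi.order_id = o.id
    JOIN products p ON p.id = oi.product_id
    WHERE p.category = 'widgets'
    GROUP BY o.customer_id
    ORDER BY total DESC;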

> Another issue is cached plans. Say there is increased memory pressure,
> at what point do you start replanning existing plans?

The obvious thing to do would be to send an invalidation message
whenever you changed the system-wide cost value for use of memory, but
maybe if you're changing it in small increments you'd want to be a bit
more selective.
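
To illustrate why the cached-plan case matters (table name made up):
as far as I know, nothing today forces a re-plan when a memory-related
setting changes, which is exactly the gap an invalidation message
would close.

    SET work_mem = '512MB';
    PREPARE report AS
        SELECT customer_id, sum(amount) FROM order_items GROUP BY customer_id;
    EXECUTE report;   -- plan chosen while 512MB was on the table
    SET work_mem = '4MB';
    EXECUTE report;   -- reuses the cached plan; nothing triggers a re-plan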

> While this does have the advantage of being relatively simple to
> implement, I think it would be a bitch to tune...

I'm not sure.  What does seem clear is that it's fundamentally at odds
with the "admission control" approach Kevin is advocating.  When you
start to run short on a resource (perhaps memory), you have to decide
between (a) waiting for memory to become available and (b) switching
to a more memory-efficient plan.  The danger of (b) is that using less
memory probably means using more of some other resource, like CPU or
disk, and now you've just switched around which resource you're
overloading - but on the other hand, if the difference in CPU/disk is
small and the memory savings is large, maybe it makes sense.  Perhaps
in the end we'll find we need both capabilities.

I can't help feeling like some good instrumentation would be helpful
in answering some of these questions, although I don't know where to
put it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

