Re: Automatically setting work_mem - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Automatically setting work_mem
Date
Msg-id 200603171116.02841.josh@agliodbs.com
Whole thread Raw
In response to Re: Automatically setting work_mem  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Automatically setting work_mem
List pgsql-hackers
Tom,

> My own thoughts about the problems with our work_mem arrangement are
> that the real problem is the rule that we can allocate work_mem per sort
> or hash operation; this makes the actual total memory use per backend
> pretty unpredictable for nontrivial queries.  I don't know how to fix
> this though.  The planner needs to know the work_mem that will be used
> for any one of these operations in order to estimate costs, so simply
> trying to divide up work_mem among the operations of a completed plan
> tree is not going to improve matters.

Yes ... the unpredictability is the problem:
(1) We can only allocate the # of connections and default work_mem per 
operation.
(2) There are a variable # of concurrent queries per connection (0..1)
(3) Each query has a variable # of operations requiring work_mem, which 
will require a variable amount of work_mem.  If your malloc is good, this 
is limited to concurrent operations, but for some OSes this is all 
operations per query.  Thus the former uses 0..3xwork_mem per query and 
the latter 0..7x in general practice.

Overall, this means that based on a specific max_connections and work_mem, 
a variable amount between 1*work_mem and (connections*3*work_mem) memory 
may be needed at once.  Since the penalty for overallocating RAM is severe 
on most OSes, DBAs are forced to allow for the worst case.  This results 
in around 2/3 underallocation on systems with unpredictable loads.  This 
means that, even on heavily loaded DB systems, most of the time you're 
wasting a big chunk of your RAM.

Simon and I met about this (and other stuff) at the GreenPlum offices last 
summer.   The first plan we came up with is query queueing.  Query 
queueing would eliminate variability (2), which would then make the only 
variability one of how much work_mem would be needed per query, reducing 
(but not eliminating) the underallocation.   Additionally, we thought to 
tie the query queues to ROLES, which would allow the administrator to 
better control how much work_mem per type of query was allowed.  It would 
also allow admins to balance priorities better on mixed-load machines.

Mind you, I'm also thinking that on enterprise installations with 
multi-department use of the database, the fact that work_mem is 
inalienably USERSET is also an allocation problem.   One user with a SET 
command can blow all of your resource planning away.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Seperate command-line histories for seperate databases
Next
From: Tom Lane
Date:
Subject: Re: Automatically setting work_mem