Re: Allow sorts to use more available memory - Mailing list pgsql-performance

From Robert Schnabel
Subject Re: Allow sorts to use more available memory
Date
Msg-id 4E6E8606.9070201@missouri.edu
In response to Re: Allow sorts to use more available memory  (Andy Colson <andy@squeakycode.net>)
List pgsql-performance
On 9/12/2011 1:57 PM, Andy Colson wrote:
> On 9/12/2011 1:22 PM, Robert Schnabel wrote:
>> On 9/12/2011 12:57 PM, Shaun Thomas wrote:
>>> On 09/12/2011 12:47 PM, Andy Colson wrote:
>>>
>>>> work_mem is not the total a query can use. I believe each step can
>>>> use that much, and each backend can use it for multiple bits. So if
>>>> you had two backends, each doing 2 sorts, you'd use 2*2 = 4 * 2GB =
>>>> 8GB.
>>> Exactly. Find a big query somewhere in your system. Use EXPLAIN to
>>> examine it. Chances are, that one query has one or more sorts. Each one
>>> of those gets its own work_mem. Each sort. Does the query have four sorts? It
>>> may use 4*work_mem. On a whim a while back, I doubled our 8MB setting to
>>> 16MB on a test system. During a load test, the machine ran out of
>>> memory, swapped out, and finally crashed after the OOM killer went nuts.
>>>
>>> Set this value *at your own risk* and only after *significant* testing.
>>> Having it too high can have rather unexpected consequences. Setting it
>>> to 1 or 2GB, unless you have VERY few threads, or a TON of memory, is a
>>> very, very bad idea.
>>>
>> Yep, I know. But in the context of the data warehouse, where *I'm the
>> only user*, I have 32GB RAM, and a query does, say, 4 large sorts like
>> http://explain.depesz.com/s/BrAO, I'm not worried about using 8GB or
>> 16GB with work_mem = 4GB. I realize the query above only used 1.9GB for
>> the largest sort, but I know I have other queries with 1 or 2 sorts
>> that I've watched go to disk.
>>
>> Bob
>>
> Huge guess here (I can't see the select or DDL), but it looks like all
> the tables are sequential scans.  It might help to add an index or two;
> then the table joins could be done much more efficiently with a lot less
> memory.
>
> -Andy
In this case I doubt it.  Basically what these queries are doing is
taking table1 (~30M rows) and finding all the rows with a certain
condition.  This produces ~15M rows.  Then I have to find all of those
15M rows that are present in table2.  In the case of the query above
this results in 1.79M rows.  Basically, the 15M rows that meet the
condition for table1 have matching rows spread out over 10 different
tables (table2s).
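The query shape described above (filter table1, then keep only the rows that also appear in a table2) might look roughly like the sketch below. This is a guess at the structure only; the table names, the `id` join column, and the `some_condition` filter are placeholders, not the actual schema from the thread:

```sql
-- Hypothetical sketch of the query shape described above.
-- table1/table2, id, and some_condition are illustrative names.
SELECT t1.*
FROM table1 t1
WHERE t1.some_condition            -- ~30M rows filtered down to ~15M
  AND EXISTS (                     -- semi-join: keep only rows present in table2
        SELECT 1
        FROM table2 t2
        WHERE t2.id = t1.id);      -- ~1.79M rows survive in the example plan
```

Written as an EXISTS semi-join, an index on the table2 join column (e.g. `CREATE INDEX ON table2 (id);`) would let the planner probe the index instead of hashing or sorting the whole table, which is what Andy's suggestion is getting at.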

Actually, you just gave me an idea.  When I generate the "table1" I can
probably add a field that tells me which "table2" it came from for each
row that satisfies my criteria.  Sometimes just having someone else make
you think is very productive. :-)
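Bob's idea of recording the origin table could be sketched like this, again with purely illustrative names (a text tag column; in practice a smallint code or an enum would be more compact):

```sql
-- Illustrative only: tag each row of table1 with the table2 it matched,
-- so later queries can filter on source_tbl instead of re-joining all ten tables.
ALTER TABLE table1 ADD COLUMN source_tbl text;

UPDATE table1 t1
SET source_tbl = 'table2_a'
WHERE EXISTS (SELECT 1 FROM table2_a t2 WHERE t2.id = t1.id);
-- ...repeat for table2_b through table2_j,
-- or better, populate the column when table1 is generated.
```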

Thanks
Bob

