Re: PostgreSQL Tuning Results - Mailing list pgsql-hackers

From Ron Mayer
Subject Re: PostgreSQL Tuning Results
Date
Msg-id POEDIPIPKGJJLDNIEMBEGEDMCBAA.ron@intervideo.com
In response to Re: PostgreSQL Tuning Results  ("Christopher Kings-Lynne" <chriskl@familyhealth.com.au>)
Responses Re: PostgreSQL Tuning Results
List pgsql-hackers
Christopher Kings-Lynne wrote:
>
>I reckon that sort_mem is the hardest thing to optimise!
>

Agreed... in part because it depends a lot on the query.

Also, if I understand correctly, sort_mem affects not only sorts
but hash table operations as well, right?  If that's true for
the new hash aggregates, I think large sort_mem settings
will become even more useful for data-warehouse-type applications.
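For example, sort_mem can be raised for a single session before an
aggregate-heavy query and reset afterwards (the table and column names
below are made up for illustration, not from any real schema):

 -- sort_mem is specified in KB and applies per sort/hash operation
 SET sort_mem = 131072;   -- allow ~128MB for the big aggregate

 SELECT region, sum(amount)
 FROM sales_fact
 GROUP BY region;

 RESET sort_mem;          -- back to the server default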


One thing I've been wondering is whether "sort_mem" could be made
per connection/backend-process instead of per sorting operation,
so that it could be set more aggressively without running out
of memory so easily on large queries.

If that's not possible (i.e. one can't know beforehand how many
simultaneous sorts a query will need), how about letting only the
first one or two sorts get the full amount and making the rest use
a smaller limit?


Anecdote:
I have a reasonably large data warehouse (1e6 to 1e8 rows
in various tables) with quite a bit of data (500k rows) added
each day.  A lot of processing (7 hours/day) is spent loading
data and generating various aggregates.  In a couple places
in the ETL part of the data warehouse code I have:
 set sort_mem = 250000;
 ... something that only needs a single sort ...
 set sort_mem = 65536;
 ...
 set sort_mem = 4096;
 ... some ugly aggregate-creating join generated by a reporting tool ...
 set sort_mem = 65536;
 

Ron

