Re: Config parameters - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Config parameters
Date
Msg-id 1167767485.29707.25.camel@state.g2switchworks.com
Whole thread Raw
In response to Re: Config parameters  ("Jeremy Haile" <jhaile@fastmail.fm>)
Responses Re: Config parameters
List pgsql-performance
On Tue, 2007-01-02 at 13:19, Jeremy Haile wrote:
> Thanks for the information!
>
> Are there any rule-of-thumb starting points for these values that you
> use when setting up servers?  I'd at least like a starting point for
> testing different values.
>
> For example, I'm sure setting a default work_mem of 100MB is usually
> overkill - but is 5MB usually a reasonable number?  20MB?  My system
> does not have a huge number of concurrent users, but they are hitting
> large tables.  I'm not sure what numbers people usually use here
> successfully.

The setting for work_mem is very dependent on how many simultaneous
connections you'll be processing at the same time, and how likely they
are to be doing sorts.

If you'll only ever have 5 connections to a database on a machine with a
lot of memory, then setting it to 100M is probably fine.  Keep in mind,
the limit is PER SORT, not per query.  An upper limit of about 25% of
the machine's total memory is a good goal for how big to size work_mem.

So, on a 4 Gig machine you could divide 1G (25%) by the total possible
connections, then again by the average number of sorts you'd expect per
query / connection to get an idea.

Also, you can set it smaller than that, and for a given connection, set
it on the fly when needed.

<CONNECT>
set work_mem=1000000;
select .....
<DISCONNECT>

And you run less risk of blowing out the machine with joe user's random
query.

> For maintenance_work_mem, I turned off autovacuum to save on
> performance, but run a vacuum analyze once an hour.  My current database
> characteristics are heavy insert (bulk inserts every 5 minutes) and
> medium amount of selects on large, heavily indexed tables.

Did you turn off stats collection as well?  That's really the major
performance issue with autovacuum, not autovacuum itself.  Plus, if
you've got a table that really needs vacuuming every 5 minutes to keep
the database healthy, you may be working against yourself by turning off
autovacuum.

I.e. the cure may be worse than the disease.  OTOH, if you don't delete
/ update often, then don't worry about it.

> For temp_buffers - any rule of thumb starting point?  What's the best
> way to evaluate if this number is adjusted correctly?

Haven't researched temp_buffers at all.

> For random_page_cost - is the default of 4 pretty good for most drives?
> Do you usually bump it up to 3 on modern servers?  I've usually done
> internal RAID setups, but the database I'm currently working on is
> hitting a SAN over fiber.

random_page_cost is the hardest to come up with the proper setting.  If
you're hitting a RAID10 with 40 disk drives or some other huge drive
array, you might need to crank up random_page_cost to some very large
number, as sequential accesses are often preferred there.  I believe
there were some posts by Luke Lonergan (sp) a while back where he had
set random_page_cost to 20 or something even higher on a large system
like that.

On data sets that fit in memory, the cost nominally approaces 1.  On
smaller work group servers with a single mirror set for a drive
subsystem and moderate to large data sets, I've found values of 1.4 to
3.0 to be reasonable, depending on the workload.

> I realize that these values can vary a lot based on a variety of factors
> - but I'd love some more advice on what good rule-of-thumb starting
> points are for experimentation and how to evaluate whether the values
> are set correctly. (in the case of temp_buffers and work_mem especially)

To see if the values are good or not, run a variety of your worst
queries on the machine while varying the settings to see which run
best.  That will at least let you know if you're close.  While you can't
change buffers on the fly, you can change work_mem and random_page_cost
on the fly, per connection, to see the change.


pgsql-performance by date:

Previous
From: "Jeremy Haile"
Date:
Subject: Re: Config parameters
Next
From: "Jeremy Haile"
Date:
Subject: Re: Config parameters