Re: Query performance over a large proportion of data

From: Kevin Grittner
Subject: Re: Query performance over a large proportion of data
Date: ,
Msg-id: 49B69E1E.EE98.0025.0@wicourts.gov
(view: Whole thread, Raw)
In response to: Query performance over a large proportion of data  ("Steve McLellan")
Responses: Re: Query performance over a large proportion of data  (Tom Lane)
List: pgsql-performance

Tree view

Query performance over a large proportion of data  ("Steve McLellan", )
 Re: Query performance over a large proportion of data  (Scott Marlowe, )
 Re: Query performance over a large proportion of data  ("Kevin Grittner", )
  Re: Query performance over a large proportion of data  (Tom Lane, )
 Re: Query performance over a large proportion of data  (Scott Marlowe, )
 Re: Query performance over a large proportion of data  (Tom Lane, )
 Re: Query performance over a large proportion of data  (Steve McLellan, )
 Re: Query performance over a large proportion of data  (Steve McLellan, )
  Re: Query performance over a large proportion of data  (Scott Marlowe, )
 Re: Query performance over a large proportion of data  (Steve McLellan, )
 Re: Query performance over a large proportion of data  (decibel, )
  Re: Query performance over a large proportion of data  (Matteo Beccati, )

>>> "Steve McLellan" <> wrote:
> The server itself is a dual-core 3.7GHz Xeon Dell (each core
> reporting 2 logical CPUs) running an amd64 build of FreeBSD 6.2, and
> postgres 8.3.5 built from source. It's got 400GB storage in RAID-5
> (on 5 disks). It has 8GB of physical RAM. I'm able to use about 6GB
> of that for my own purposes; the server doesn't do much else but
> replicates a very low-usage mysql database.

> shared_buffers=1200MB

You might want to play with this -- that's not a bad starting point,
but your best performance with your load could be on either side of
that value.

> work_mem = 100MB

Probably kinda high, especially if you expect a lot of connections.
This much memory can be concurrently used, possibly more than once, by
each active connection.

> fsync = off

Don't use this setting unless you can afford to lose your entire
database cluster.  We use it for initial (repeatable) loads, but not
much else.

> enable_seqscan = off

Not a good idea; some queries will optimize better with seqscans.
You can probably get the behavior you want using other adjustments.

> effective_cache_size = 2000MB

From what you said above, I'd bump this up to 5GB or more.

You probably need to reduce random_page_cost.  If your caching is
complete enough, you might want to set it equal to seq_page_cost
(never set it lower that seq_page_cost!) and possibly reduce both of
these to 0.1.

Some people have had good luck with boosting cpu_tuple_cost and
cpu_index_tuple_cost.  (I've seen 0.5 for both recommended.)  I've
never had to do that, but if the earlier suggestions don't get good
plans, you might try that.

I hope that helps.

-Kevin


pgsql-performance by date:

From: Steve McLellan
Date:
Subject: Re: Query performance over a large proportion of data
From: Scott Marlowe
Date:
Subject: Re: Query performance over a large proportion of data