Thread: Got that new server, now it's time for config!

Got that new server, now it's time for config!

From: "Carlo Stonebanks"
Date:
Here we go again!

Based on recommendations made here, I got my client to migrate off of our
Windows 2003 Server x64 box to a new Linux box.

# CENTOS 5.4
# Linux mdx_octo 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
# pgsql 8.3.10, 8 CPUs, 48GB RAM
# RAID 10, 4 Disks

Below are the config values of this production server (settings not listed are
commented out). Sadly, in an attempt to improve the server's performance,
someone wiped out all of the changes I had made to date, along with comments
indicating previous values, the reason for each change, etc.

This is a data warehouse production server, used for ETL. 500 GB database,
approx 8000 tables and growing, although the vast majority of them are the
original import resource tables and are rarely accessed. The actual core
data is about 200 tables, consisting of millions of rows. Data importing and
content management is done via a 15,000-line TCL import script and
application base (as this is ETL with fuzzy logic, not just COPY ... FROM).

So, we have the hardware, we have the O/S - but I think our config leaves
much to be desired. Typically, our planner makes bad decisions, picking a seq
scan over an index scan even when the index scan would perform better.

Can anyone see any obvious faults?

Carlo

autovacuum = on
autovacuum_analyze_scale_factor = 0.05
autovacuum_analyze_threshold = 1000
autovacuum_naptime = 1min
autovacuum_vacuum_cost_delay =  50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 1000
bgwriter_lru_maxpages = 100
checkpoint_segments = 128
checkpoint_warning = 290s
client_min_messages =  debug1
datestyle = 'iso, mdy'
default_statistics_target = 250
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
listen_addresses = '*'
log_destination = 'stderr'
log_error_verbosity =  verbose
log_line_prefix = '%t '
log_min_error_statement =  debug1
log_min_messages = debug1
logging_collector = on
maintenance_work_mem = 256MB
max_connections = 100
max_fsm_relations = 1000
max_locks_per_transaction = 128
port = 5432
shared_buffers = 4096MB
shared_preload_libraries = '$libdir/plugins/plugin_debugger.so'
track_counts = on
vacuum_cost_delay = 5
wal_buffers = 4MB
wal_sync_method = open_sync
work_mem = 64MB


Re: Got that new server, now it's time for config!

From: Dan Harris
Date:
On 3/22/10 4:36 PM, Carlo Stonebanks wrote:
> Here we go again!
>
> Can anyone see any obvious faults?
>
> Carlo
>
> maintenance_work_mem = 256MB
I'm not sure how large your individual tables are, but you might want to
bump this value up to get faster vacuums.
> max_fsm_relations = 1000
I think this will definitely need to be increased.
> work_mem = 64MB
Most data warehousing loads I can think of will need more work_mem, but
this depends on how large the data sets you plan to sort are.
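
Pulling these three suggestions together, one possible starting point for a
48GB ETL box might be the fragment below. The exact numbers are my
assumptions to be tuned against the real workload, not tested values:

maintenance_work_mem = 1GB     # faster VACUUM and index builds on big tables
max_fsm_relations = 10000      # needs to cover ~8000 tables plus their indexes
work_mem = 256MB               # per sort/hash node; watch totals across sessions

On 8.3, the tail of a database-wide VACUUM VERBOSE reports whether the free
space map settings are large enough, which is a safer guide than guessing.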


Re: Got that new server, now it's time for config!

From: Greg Smith
Date:
Carlo Stonebanks wrote:
> So, we have the hardware, we have the O/S - but I think our config
> leaves much to be desired. Typically, our planner makes bad decisions,
> picking a seq scan over an index scan even when the index scan would
> perform better.
>

You're not setting effective_cache_size, so I wouldn't expect it to ever
choose an index scan given the size of your data set.  The planner
thinks that anything bigger than 128MB isn't likely to fit in RAM by
default, which favors sequential scans.  That parameter should probably
be >24GB on your server, so it's off by more than two orders of magnitude.
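
A quick way to verify this is to change the setting for one session and
compare plans; the figure and the query here are placeholders, not tested
values:

SHOW effective_cache_size;          -- stock default is 128MB
SET effective_cache_size = '36GB';  -- assumption: roughly 75% of 48GB RAM
EXPLAIN ANALYZE SELECT ...;         -- re-run a query that wrongly seq scans

If the plan flips to an index scan with the session-level setting, make the
same change in postgresql.conf and reload.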

> wal_sync_method = open_sync

This is a scary setting to be playing with on Linux when using ext3
filesystems due to general kernel bugginess in this area.  See
http://archives.postgresql.org/pgsql-hackers/2007-10/msg01310.php for an
example.  I wouldn't change this from the default in your position if
using that filesystem.

I'd drastically increase effective_cache_size, put wal_sync_method back
to the default, and then see how things go for a bit before tweaking
anything else.  Nothing else jumped out as bad in your configuration
besides the extremely high logging levels, though I haven't looked at it
that carefully yet.
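
In postgresql.conf terms, those two changes would look roughly like this
(the effective_cache_size figure is my assumption for a 48GB machine, not a
measured value):

effective_cache_size = 36GB     # was unset, i.e. the 128MB default
#wal_sync_method = open_sync    # comment out to return to the platform default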

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: Got that new server, now it's time for config!

From: Scott Mead
Date:

On Tue, Mar 23, 2010 at 12:12 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> Carlo Stonebanks wrote:
>> So, we have the hardware, we have the O/S - but I think our config leaves
>> much to be desired. Typically, our planner makes bad decisions, picking a
>> seq scan over an index scan even when the index scan would perform better.
>
> You're not setting effective_cache_size, so I wouldn't expect it to ever
> choose an index scan given the size of your data set.  The planner thinks
> that anything bigger than 128MB isn't likely to fit in RAM by default,
> which favors sequential scans.  That parameter should probably be >24GB
> on your server, so it's off by more than two orders of magnitude.

+1


I'm curious why you've set:
log_min_error_statement =  debug1
log_min_messages = debug1
client_min_messages =  debug1

Although not directly addressing the problem of using index scans, this is going to cause a lot of message verbosity, possibly (depending on your query rate) enough to clobber the disks more than you need to.
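
For reference, something close to the shipped defaults would look like the
following (exact defaults vary a little between versions, so treat this as a
sketch rather than an authoritative list):

client_min_messages = notice
log_min_messages = notice
log_min_error_statement = error
log_error_verbosity = default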

-Scott M


 





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance