Re: Feature Request --- was: PostgreSQL Performance Tuning - Mailing list pgsql-performance

From Josh Berkus
Subject Re: Feature Request --- was: PostgreSQL Performance Tuning
Date
Msg-id 200705010923.47951.josh@agliodbs.com
Whole thread Raw
In response to Re: Feature Request --- was: PostgreSQL Performance Tuning  (Greg Smith <gsmith@gregsmith.com>)
Responses Re: Feature Request --- was: PostgreSQL Performance Tuning  (Carlos Moreno <moreno_pg@mochima.com>)
Re: Feature Request --- was: PostgreSQL Performance Tuning  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-performance
Greg,

> 1) Collect up data about their system (memory, disk layout), find out a
> bit about their apps/workload, and generate a config file based on that.

We could start with this.  Where I bogged down is that collecting system
information about several different operating systems ... and in some cases
generating scripts for boosting things like shmmax ... is actually quite a
large problem from a slog perspective; there is no standard way even within
Linux to describe CPUs, for example. Collecting available disk space
information is even worse.   So I'd like some help on this portion.

I actually have algorithms which are "good enough to start with" for most of
the important GUCs worked out, and others could be set through an interactive
script ("Does your application require large batch loads involving thousands
or millions of updates in the same transaction?"  "How large (GB) do you
expect your database to be?")

> 2) Connect to the database and look around.  Study the tables and some
> their stats, make some estimates based on what your find, produce a new
> config file.

I'm not sure that much more for (2) can be done than for (1).  Tables-on-disk
don't tell us much.

> 3) Monitor the database while it's doing its thing.  See which parts go
> well and which go badly by viewing database statistics like pg_statio.
> From that, figure out where the bottlenecks are likely to be and push more
> resources toward them.  What I've been working on lately is exposing more
> readouts of performance-related database internals to make this more
> practical.

We really should collaborate on that.

> When first exposed to this problem, most people assume that (1) is good
> enough--ask some questions, look at the machine, and magically a
> reasonable starting configuration can be produced.  It's already been
> pointed out that anyone with enough knowledge to do all that can probably
> spit out a reasonable guess for the config file without help.

But that's actually more than most people already do.  Further, if you don't
start with a "reasonable" configuration, then it's difficult-impossible to
analyze where your settings are out-of-whack; behavior introduced by some
way-to-low settings will mask any other tuning that needs to be done.  It's
also hard/impossible to devise tuning algorithms that work for both gross
tuning (increase shared_buffers by 100x) and fine tuning (decrease
bgwriter_interval to 45ms).

So whether or not we do (3), we need to do (1) first.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

pgsql-performance by date:

Previous
From: Gregory Stark
Date:
Subject: Re: Query performance problems with partitioned tables
Next
From: Andreas Haumer
Date:
Subject: Re: sytem log audit/reporting and psql