Re: improvements to pgtune - Mailing list pgsql-hackers

From Greg Smith
Subject Re: improvements to pgtune
Msg-id 4DC6C9A3.8060700@2ndquadrant.com
In response to Re: improvements to pgtune  (Shiv <rama.theone@gmail.com>)
Responses Re: improvements to pgtune  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Shiv wrote:
>  So my exams are over now and am fully committed to the project in 
> terms of time. I have started compiling a sort of personal todo for 
> myself. I agree with your advice to start the project with small steps 
> first. (I have a copy of the code and am trying to glean as much of it 
> as I can)

I just fixed a couple of bugs in the program that were easier to correct 
than explain.  The code changes have been pushed to the github repo.  
I've also revised the output format to be a lot nicer.  There's a UI 
shortcut you may find useful too:  the program now accepts the input 
file as a single parameter and writes the result to standard output.

So a sample run might look like this now:

$ ./pgtune postgresql.conf.sample
[old settings]
#------------------------------------------------------------------------------
# pgtune wizard run on 2011-05-08
# Based on 2060728 KB RAM in the server
#------------------------------------------------------------------------------

default_statistics_target = 100
maintenance_work_mem = 120MB
checkpoint_completion_target = 0.9
effective_cache_size = 1408MB
work_mem = 12MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 480MB
max_connections = 80
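
For anyone reading along without the code handy, here is a minimal Python sketch of that calling convention:  one positional argument naming the input file, with the old settings and a dated block of new ones written to standard output.  It is not the actual pgtune source.  The names render_output and main are hypothetical, and the tuned values are placeholders copied from the sample run rather than computed.

```python
import sys
from datetime import date


def render_output(old_lines, tuned, ram_kb, run_date):
    """Return the old settings followed by a dated block of new settings."""
    rule = "#" + "-" * 78
    out = [line.rstrip("\n") for line in old_lines]
    out += [
        rule,
        f"# pgtune wizard run on {run_date.isoformat()}",
        f"# Based on {ram_kb} KB RAM in the server",
        rule,
        "",
    ]
    out += [f"{name} = {value}" for name, value in tuned]
    return "\n".join(out) + "\n"


def main(argv):
    # Exactly one positional argument: the path of the input configuration.
    if len(argv) != 2:
        sys.stderr.write(f"usage: {argv[0]} INPUT_CONFIG\n")
        return 2
    with open(argv[1]) as f:
        old_lines = f.readlines()
    # Placeholder values taken from the sample run; a real run computes them.
    tuned = [("work_mem", "12MB"), ("max_connections", "80")]
    sys.stdout.write(render_output(old_lines, tuned, 2060728, date.today()))
    return 0
```

Calling main(sys.argv) from a script gives the single-parameter behavior; keeping the formatting in render_output means it can be checked without touching any files.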

>  I would really appreciate your reply to Josh's thoughts. It would 
> help me understand the variety of tasks and a possible ordering for me 
> to attempt them.
> Josh's comments: "What would you list as the main things pgtune 
> doesn't cover right now?  I have my own list, but I suspect that yours 
> is somewhat different.
>
> I do think that autotuning based on interrogating the database is 
> possible.  However, I think the way to make it not be a tar baby is to 
> tackle it one setting at a time, and start with ones we have the most 
> information for.  One of the real challenges there is that some data 
> can be gleaned from pg_* views, but a *lot* of useful performance data 
> only shows up in the activity log, and then only if certain settings 
> are enabled."

I just revised the entire TODO file (which is now TODO.rst, formatted in 
ReST markup:  http://docutils.sourceforge.net/rst.html ; test with 
"rst2html TODO.rst > TODO.html" and look at the result).  It should be 
easier to follow now, and it's organized in approximately the order I 
think things need to get finished in.

There are a few major areas where this program might be expanded to 
choose from.  I was thinking about doing them in this order:

1) Fix the settings validation and limits.  I consider this a good place 
to start on hacking the code.  It's really necessary work eventually, 
and it's easier to get started with than the other ideas.

2) Improve internals related to tracking things like memory and 
connections so they're easier to pass around the program.  Adding a 
"platform" class is what I was thinking of.  See the "Estimating shared 
memory usage" section of the TODO for more information.  Add PostgreSQL 
version as another input to that.

3) Improve the settings model used for existing parameters.  Right now 
people have reported that the work_mem settings suggested in particular 
are too high for many servers.  Ideas about why that is are in the 
TODO.  (This really requires the platform change be done first, or the 
code will be too hard to write/maintain)

4) Estimate memory used by the configuration and output sysctl.conf 
files.  (Needs platform change too)

5) Add tuning suggestions for new parameters.  The most obvious ideas 
all involve adding common logging changes.

6) Create some new UIs for running the program.  A text-based program 
that asked questions (a 'wizard') or a GUI program doing the same are 
two common suggestions.
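
To make (1) and (2) a bit more concrete, here is a rough Python sketch of what a "platform" object and some simple limit checks could look like.  Everything in it is an assumption made for illustration, not the current pgtune code:  the Platform fields, the function names, and the small LIMITS table are all hypothetical, and real limits would need to vary by PostgreSQL version.

```python
from dataclasses import dataclass

# Memory units accepted in this sketch, expressed in kilobytes.
KB_PER_UNIT = {"kB": 1, "MB": 1024, "GB": 1024 * 1024}


def parse_memory_kb(text):
    """Convert a value such as '480MB' into kilobytes."""
    for unit, factor in KB_PER_UNIT.items():
        if text.endswith(unit):
            return int(text[: -len(unit)]) * factor
    raise ValueError(f"unrecognized memory value: {text!r}")


@dataclass(frozen=True)
class Platform:
    """Facts about the server, bundled so they are easy to pass around."""
    total_memory_kb: int
    max_connections: int
    pg_version: tuple  # for example (9, 0)


# Illustrative (low, high) limits only; None means no bound on that side.
LIMITS = {
    "checkpoint_completion_target": (0.0, 1.0),
    "max_connections": (1, None),
}


def check_range(name, value):
    """Return a list of problems; an empty list means the value passed."""
    low, high = LIMITS.get(name, (None, None))
    problems = []
    if low is not None and value < low:
        problems.append(f"{name} = {value} is below the minimum {low}")
    if high is not None and value > high:
        problems.append(f"{name} = {value} is above the maximum {high}")
    return problems


def check_fits_in_ram(name, text, platform):
    """Flag a memory setting that is larger than the server's total RAM."""
    if parse_memory_kb(text) > platform.total_memory_kb:
        return [f"{name} = {text} exceeds {platform.total_memory_kb} kB of RAM"]
    return []
```

With the sample run's numbers, Platform(2060728, 80, (9, 0)) accepts shared_buffers = 480MB and rejects 4GB.  The point of the design is that later steps, such as reworking the work_mem model or estimating shared memory, would take one Platform argument instead of a growing list of loose parameters.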

The ideas Josh was talking about for interrogating the database are all 
a long way off from anything the current code can support.  If (1) 
through (3) here were done, that whole direction starts with (5) and 
then runs further that way.  That might be a valid direction to move in 
next, instead of the (4) and (6) I've listed here.  By then you'd have 
finished something that taught you enough about how the existing program 
works to make some of the more difficult design decisions about fitting 
new features into it.

If you really want to get right into live server analysis, there's no 
way for that to fit into the current program yet.  And I don't think 
you'll get enough practice to see how it would without doing some more 
basic work first.  You might as well write something new if that's your 
goal, and expect that you may not finish anything useful by the end of 
the summer.  If you want to complete a project that results in code that 
people absolutely will use, the more boring plan I've outlined goes that 
way.  One of the secrets to software development is that ideas for 
complicated features rarely result in software that gets released, while 
working on simpler programs that don't aim so high leads to software 
that ships to the world and finds users.  The only reason pgtune is now 
available in packaged form on multiple operating systems is that I 
ignored all advice about aiming for a complicated tool and instead wrote 
a really simple one.  That was hard enough to finish.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



