Re: improvements to pgtune - Mailing list pgsql-hackers

From Shiv
Subject Re: improvements to pgtune
Date
Msg-id BANLkTikKJW11_HbQKdytJq28TmzH8WpA_A@mail.gmail.com
In response to Re: improvements to pgtune  (Shiv <rama.theone@gmail.com>)
Responses Re: improvements to pgtune  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-hackers
Hi Greg,
 My exams are over now, and I am fully committed to the project in terms of time. I have started compiling a personal to-do list. I agree with your advice to start the project with small steps. (I have a copy of the code and am trying to learn as much from it as I can.)
 I would really appreciate your reply to Josh's thoughts. It would help me understand the variety of tasks and a possible ordering for me to attempt them.
Josh's comments: "What would you list as the main things pgtune doesn't cover right now?  I have my own list, but I suspect that yours is somewhat different.

I do think that autotuning based on interrogating the database is possible.  However, I think the way to make it not be a tar baby is to tackle it one setting at a time, and start with ones we have the most information for.  One of the real challenges there is that some data can be gleaned from pg_* views, but a *lot* of useful performance data only shows up in the activity log, and then only if certain settings are enabled."
Regards,
Shiv


On Thu, Apr 28, 2011 at 9:34 PM, Shiv <rama.theone@gmail.com> wrote:
That's some great starting advice there. I have a couple of final exams in the next 36 hours. Will get to work almost immediately after that.
I will definitely take small steps before going for some of the tougher tasks. I would of course like this conversation to go on, so I can see a more comprehensive TODO list.
One of my first tasks on GSoC is to make sure I create a good project specification document, so that there are definite expectations and targets. This conversation helps me do that!
Regards,
Shiv


On Thu, Apr 28, 2011 at 9:50 AM, Greg Smith <greg@2ndquadrant.com> wrote:
Shiv wrote:
 In the program I hope to learn as much about professional software engineering principles as about PostgreSQL. My project aims to extend and, hopefully, improve upon pgtune. If any of you have ideas or thoughts to share, I am all ears!

Well, first step on the software engineering side is to get a copy of the code in a form you can modify.  I'd recommend grabbing it from https://github.com/gregs1104/pgtune ; while there is a copy of the program on git.postgresql.org, it's easier to work with the one on github instead.  I can push updates over to the copy on postgresql.org easily enough, and that way you don't have to worry about getting an account on that server.

There's a long list of suggested improvements to make at https://github.com/gregs1104/pgtune/blob/master/TODO

Where I would recommend getting started is doing some of the small items on there, some of which I have already put comments into the code about but just not finished yet.  Some examples:

-Validate against min/max
-Show original value in output
-Limit shared memory use on Windows (see notes on shared_buffers at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for more information)
-Look for postgresql.conf file using PGDATA environment variable
-Look for settings files based on path of the pgtune executable
-Save settings reference files for newer versions of PostgreSQL (right now I only target 8.4) and allow passing in the version you're configuring.
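
Two of the smaller items above (looking for postgresql.conf through PGDATA, and validating against min/max) can be sketched in a few lines. pgtune is a Python program, so the sketch uses Python. It is only an illustration under stated assumptions: the function names find_postgresql_conf and validate_range are hypothetical and do not come from the pgtune source, and the limits are assumed to be supplied by the caller, for example from a settings reference file.

```python
import os


def find_postgresql_conf(environ=None):
    """Return the postgresql.conf path implied by PGDATA, or None if unset.

    Hypothetical helper, not part of pgtune.
    """
    environ = os.environ if environ is None else environ
    pgdata = environ.get("PGDATA")
    if not pgdata:
        return None
    return os.path.join(pgdata, "postgresql.conf")


def validate_range(name, value, min_val=None, max_val=None):
    """Return value unchanged, or raise ValueError if it is out of range.

    Hypothetical helper; a limit of None means "no limit on that side".
    """
    if min_val is not None and value < min_val:
        raise ValueError(f"{name} = {value} is below the minimum {min_val}")
    if max_val is not None and value > max_val:
        raise ValueError(f"{name} = {value} is above the maximum {max_val}")
    return value
```

Raising an exception is just one possible policy; clamping the value to the nearest limit and printing a warning would be an equally reasonable choice.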

A common mistake made by GSoC students is to dive right into trying to make big changes.  You'll be more successful if you get practice at things like preparing and sharing patches on smaller changes first.

At the next level, there are a few larger features that I would consider valuable that are not really addressed by the program yet:

-Estimate how much shared memory is used by the combination of settings.  See Table 17-2 at http://www.postgresql.org/docs/9.0/static/kernel-resources.html ; those numbers aren't perfect, and improving that table is its own useful project.  But it gives an idea how they fit together.  I have some notes at the end of the TODO file on how I think the information needed to produce this needs to be passed around the inside of pgtune.
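
As a rough illustration of the estimate described above, the sketch below adds up per-setting terms in the style of Table 17-2. The function name, its parameters, and the default values are hypothetical, and the coefficients are transcribed from that table as best recalled, so they should be checked against the linked page before anyone relies on them. The buffer settings are expressed in blocks rather than bytes.

```python
def estimate_shared_memory(
    max_connections=100,
    max_locks_per_transaction=64,
    autovacuum_max_workers=3,
    max_prepared_transactions=0,
    shared_buffers=4096,   # in blocks of block_size bytes
    wal_buffers=8,         # in blocks of wal_block_size bytes
    block_size=8192,
    wal_block_size=8192,
):
    """Approximate shared memory use in bytes (hypothetical helper).

    Coefficients are assumptions taken from the 9.0 documentation table
    and are approximate even there.
    """
    per_backend = 1800 + 270 * max_locks_per_transaction
    per_prepared = 770 + 270 * max_locks_per_transaction
    total = per_backend * max_connections
    total += per_backend * autovacuum_max_workers
    total += per_prepared * max_prepared_transactions
    total += (block_size + 208) * shared_buffers
    total += (wal_block_size + 8) * wal_buffers
    total += 770 * 1024  # fixed space requirement, 770 kB
    return total


if __name__ == "__main__":
    print(estimate_shared_memory())
```

With the default arguments shown, shared_buffers dominates the total, which matches the usual experience that it is the setting to watch when sizing shared memory.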

-Use that estimate to produce a sysctl.conf file for one platform; Linux is the easiest one to start with.  I've attached a prototype showing how to do that, written in bash.
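
A minimal sketch of that step in Python, assuming the estimate is already available as a byte count. The function name sysctl_lines and its parameters are hypothetical. Unlike the bash prototype, which sizes the limits from total RAM, this version sizes them from the estimate passed in.

```python
import os


def sysctl_lines(shared_memory_bytes, page_size=None):
    """Return sysctl.conf lines sized for a shared memory estimate.

    Hypothetical helper, not part of pgtune.
    """
    if page_size is None:
        # Ask the running system for its page size.
        page_size = os.sysconf("SC_PAGE_SIZE")
    # Round up so the page count always covers the whole estimate.
    shmall = -(-shared_memory_bytes // page_size)
    return [
        "# Maximum shared segment size in bytes",
        f"kernel.shmmax = {shared_memory_bytes}",
        "# Total shared memory allowed, in pages",
        f"kernel.shmall = {shmall}",
    ]


if __name__ == "__main__":
    print("\n".join(sysctl_lines(40000000, page_size=4096)))
```

In practice some headroom above the raw estimate would probably be wanted, since other programs on the machine may also use System V shared memory.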

-Write a Python-TK or web-based front-end for the program.

Now that I know someone is going to work on this program again, I'll see what I can do to clean some parts of it up.  There are a couple of things it's easier for me to just fix rather than to describe, like the way I really want to change how it adds comments to the settings it changes.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



#!/bin/bash

# Output lines suitable for sysctl configuration based
# on total amount of RAM on the system.  The output
# will allow up to 50% of physical memory to be allocated
# into shared memory.

# On Linux, you can use it as follows (as root):
#
# ./shmsetup >> /etc/sysctl.conf
# sysctl -p

# Early FreeBSD versions do not support the sysconf interface
# used here.  The exact version where this works hasn't
# been confirmed yet.

page_size=`getconf PAGE_SIZE`
phys_pages=`getconf _PHYS_PAGES`

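# Send errors to stderr so they never end up in sysctl.conf
# when the output is redirected.
if [ -z "$page_size" ]; then
 echo "Error:  cannot determine page size" >&2
 exit 1
fi

if [ -z "$phys_pages" ]; then
 echo "Error:  cannot determine number of memory pages" >&2
 exit 2
fi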

shmall=`expr $phys_pages / 2`
shmmax=`expr $shmall \* $page_size`

echo "# Maximum shared segment size in bytes"
echo "kernel.shmmax = $shmmax"
echo "# Maximum total amount of shared memory in pages"
echo "kernel.shmall = $shmall"


