Thread: Overhauling GUCS
Magnus and I had the fortune to share a plane flight when leaving pgCon. This gave us a chance to discuss my ideas around reforming PostgreSQL configuration at some length. Here's what we came up with (Magnus will presumably correct the parts I got wrong):

Currently, postgresql.conf and our set of GUC configurations suffer from 4 large problems:

1. Most people have no idea how to set these.
2. The current postgresql.conf file is a huge mess of 194 options, the vast majority of which most users will never touch.
3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, and the settings.sgml), which are only synched with each other manually.
4. We don't seem to be getting any closer to autotuning.

We think all of these are solvable for 8.4. We also think it's vitally important that any overhaul of the GUCS be completed in one version, to minimize the pain involved.

Here's a list of the things we want to change. It's all a package and should make sense if you take all the changes as a whole.

1) Add several pieces of extra information to guc.c in the form of extra "gettext" commands: default value, subcategory, long description, recommendations, enum lists.
2) Incorporate this data into pg_settings.
3) Delete postgresql.conf.sample.
4) Add a script called pg_generate_conf to generate a postgresql.conf based on guc.c and command-line switches (rather than postgresql.conf.sample), which would be called automatically by initdb.

For (4), pg_generate_conf would take the following switches:

-b, --basic    = short conf file, listing only the 15-18 most commonly changed options
-a, --advanced = conf file listing all 196+ options
-t, --terse    = conf file lists only category headings and actual settings, no comments
-n, --normal   = conf file has category and subcategory settings, with short desc comments
-v, --verbose  = conf file lists full descriptions and recommendations in comments with each option
-c "option = value" = set specific option to specific value in the file
-f "filename"  = take options and values from file "filename"

The default would be "-b, -n" with specific settings for shared_buffers and wal_sync_method. The current postgresql.conf is a lot more like an "-a, -v" file.

This would help us in the following ways:

A. There would now only be 2 places to maintain GUCS lists, the docs and guc.c.
B. By having a generated postgresql.conf and an easy way to generate it, writing autoconfiguration scripts (as well as shortcuts like SET PERSISTENT) becomes vastly easier.
C. Most users would deal only with a limited set of 15-20 configuration variables.

There's obviously some refinements needed, but what do people think of the above general idea?
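[To make the proposed interface concrete, a hypothetical invocation of pg_generate_conf might look like the following. The tool does not exist yet; the switches are the ones listed above, the two override values are only illustrative, and whether the tool writes to stdout or directly into the data directory is one of the open details.]

    # write a short, lightly commented config with two explicit overrides
    pg_generate_conf --basic --normal \
        -c "shared_buffers = 512MB" \
        -c "wal_sync_method = fdatasync" > $PGDATA/postgresql.conf

initdb would presumably run something equivalent, substituting its own calculated values for shared_buffers and wal_sync_method.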
On Fri, 30 May 2008, Josh Berkus wrote:

> 1) Add several pieces of extra information to guc.c in the form of extra
> "gettext" commands: default value, subcategory, long description,
> recommendations, enum lists.
> 2) Incorporate this data into pg_settings

When you last brought this up in February (I started on a long reply to http://archives.postgresql.org/pgsql-hackers/2008-02/msg00759.php that I never quite finished) the thing I got stuck on was how to deal with the way people tend to comment in these files as they change things.

One problem I didn't really see addressed by the improvements you're suggesting is how to handle migrating customized settings to a new version (I'm talking about 8.4->9.0 after this is in place; 8.3->8.4 is a whole different problem). It would be nice to preserve a "history" of what people did, like in your examples (which look exactly like what I find myself running into in the field). Now, that will get a lot easier just by virtue of having a smaller config file, but I think that adding something into pg_settings that allows saving user-added commentary would be a nice step toward some useful standardization on that side of things. It would make future automated tools aimed at parsing and generating new files, as part of things like version upgrades, a lot easier if there was a standard way such comments were handled in addition to the raw data itself.

The other thing I'd like to see make its way into pg_settings, so that tools can operate on it just by querying the database, is noting what file the setting came from, so that you can track things like include file usage. I think with those two additions (comments and source file tracking) it would even be conceivable to clone a working facsimile of even a complicated postgresql.conf file set remotely just by reading pg_settings.

While a bit outside of the part you're specifically aiming to improve here, if you could slip these two additions in I think it would be a boon to future writers of multi-server management tools as well.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
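[As a sketch of what Greg is describing, a management tool could then rebuild a commented config file purely over a database connection with a query along these lines. The usercomment and sourcefile columns are the proposed additions and do not exist in pg_settings as of this thread; name, setting, and source already do.]

    -- clone a facsimile of the active configuration remotely
    SELECT name, setting, usercomment, sourcefile
      FROM pg_settings
     WHERE source = 'configuration file'
     ORDER BY sourcefile, name;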
On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote:

> There's obviously some refinements needed, but what do people think of
> the above general idea?

My initial reaction is that this is too much change, though I agree with many of the points and understand it is well intended.

We have many supporters, though 90% of them seldom touch the database from one release to the next. Many are dismayed that every time they do, we've fiddled with the knobs so some don't work anymore, some are missing or renamed, and there's a few new ones. So if they don't know what they're doing, it is frequently because we keep moving the goalposts.

We should also consider that most people with multiple databases are running multiple versions of PostgreSQL. The main reason for that is that we keep changing the behaviour of SQL between releases, so even if you had a magic superfast upgrade utility, in perhaps 50% of cases they won't use it because they have to do a full application re-test, which costs time and money.

Trying to be a Postgres DBA is hard when each new release is configured differently to the last one and we have a major release about 3-5 times more frequently than Oracle/SQLServer. That is probably the largest source of questions and annoyance from the students on the courses I run.

So my viewpoint is that we should be aggressively adding new features, yet be conservative in changing existing behaviour: provide options for behaves-like-previous-release and keep the administrative interfaces as similar as possible between releases.

If you wish to make changes, I would suggest that you simply reorder some of the parameters in the .conf file, so that the ones you think are important are grouped at the top.

Another suggestion would be to allow a #include style interface within postgresql.conf. We can then do this:

#include standard_postgresql.conf
# now we put Josh's favourite GUCs as overrides on the above
...

That keeps things simple because the standard_postgresql.conf looks almost identical to what people are used to. It also provides a new feature, allowing people to include site-wide defaults for various settings to allow easy implementation of local policy. It also demonstrates a best practice approach to managing GUCs.

Some other problems I see with GUCs:

* It's not possible to set one parameter depending upon the setting of another.

* It's always unclear which GUCs can be changed, and when. That is understood far less often than the meaning of them.

* We should rename effective_cache_size to something that doesn't sound like it does what shared_buffers does.

* There is no config verification utility, so if you make a change and then try to restart and it won't, you are in trouble.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs wrote:
> On Fri, 2008-05-30 at 17:34 -0700, Josh Berkus wrote:

> We have many supporters, though 90% of them seldom touch the database
> from one release to the next. Many are dismayed that every time they do,
> we've fiddled with the knobs so some don't work anymore, some are
> missing or renamed, and there's a few new ones.

I certainly agree we don't want to make it more difficult.

> So if they don't know what they're doing, it is frequently because we
> keep moving the goalposts.

No, it's because they don't read the docs. We are not talking about dumping SQL here :) I don't feel it is correct to not perform such a needed cleanup for fear that some user can't be bothered to read the documentation.

> We should also consider that most people with multiple databases are
> running multiple versions of PostgreSQL. The main reason for that is
> that we keep changing the behaviour of SQL between releases, so even if
> you had a magic superfast upgrade utility, in perhaps 50% of cases they
> won't use it because they have to do a full application re-test, which
> costs time and money.

This could certainly be a problem.

> Trying to be a Postgres DBA is hard when each new release is configured
> differently to the last one and we have a major release about 3-5 times
> more frequently than Oracle/SQLServer. That is probably the largest source
> of questions and annoyance from the students on the courses I run.

That is the nature of the beast though. We are still learning, improving, engineering. It's part of the progress of the database. I would suggest that your students (which is what I tell mine) not upgrade every release but instead try hanging out for 3 years per release.

> So my viewpoint is that we should be aggressively adding new features,
> yet be conservative in changing existing behaviour: provide options for
> behaves-like-previous-release and keep the administrative interfaces as
> similar as possible between releases.

I agree with this in principle but not on this argument.

> If you wish to make changes, I would suggest that you simply reorder
> some of the parameters in the .conf file, so that the ones you think are
> important are grouped at the top.

IMO, the only settings that should be in the postgresql.conf are the ones that require a restart. The rest should be gathered from pg_settings.

> Another suggestion would be to allow a #include style interface within
> postgresql.conf. We can then do this:
>
> #include standard_postgresql.conf
>
> # now we put Josh's favourite GUCs as overrides on the above
> ...

I think that could get very messy, but Apache allows this. It isn't exactly a new idea and I wouldn't fight against it.

> Some other problems I see with GUCs
>
> * It's not possible to set one parameter depending upon the setting of
> another.

That is getting a bit more complicated than I think we need. Unless I am misunderstanding what you are saying.

> * It's always unclear which GUCs can be changed, and when. That is
> understood far less often than the meaning of them.

This is a definite problem. Even the docs are a bit difficult on this one. It should be a nice simple grid :) Or, like I said above, a setting only goes in the conf if it requires a restart. Hmm:

SET effective_cache_size = '5000MB';
WARNING: You must issue a reload to the database for this to take effect.

> * We should rename effective_cache_size to something that doesn't sound
> like it does what shared_buffers does

Hmmm, I wonder if it is not the other way around.
Although I know that one thing I run into is that a lot of people think effective_cache_size is an allocation. shared_buffer_alloc? shared_mem?

> * There is no config verification utility, so if you make a change and
> then try to restart and it won't, you are in trouble.

+1 +1

Sincerely,
Joshua D. Drake
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > Many are dismayed that every time they do we've fiddled with the > knobs so some don't work anymore, some are missing or renamed and > there's a few new ones. ... > * We should rename effective_cache_size *raises eyebrow* - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200805311114 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAkhBa3UACgkQvJuQZxSWSsg6pACgu2EJOyzj+ik79QwzdTTfi8Qj aW0AoNayz4sniFVFxnQFgnxh7dPB6QOD =Uyap -----END PGP SIGNATURE-----
Josh Berkus <josh@agliodbs.com> writes:

> Currently, postgresql.conf and our set of GUC configurations suffer from
> 4 large problems:
> 1. Most people have no idea how to set these.
> 2. The current postgresql.conf file is a huge mess of 194 options, the
> vast majority of which most users will never touch.
> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and the settings.sgml), which are only synched with each other manually.
> 4. We don't seem to be getting any closer to autotuning.

The proposal doesn't actually solve any of those problems.

> Here's a list of the things we want to change. It's all a package and
> should make sense if you take all the changes as a whole.
> 1) Add several pieces of extra information to guc.c in the form of extra
> "gettext" commands: default value, subcategory, long description,
> recommendations, enum lists.
> 2) Incorporate this data into pg_settings
> 3) Delete postgresql.conf.sample
> 4) Add a script called pg_generate_conf to generate a postgresql.conf
> based on guc.c and command-line switches (rather than
> postgresql.conf.sample), which would be called automatically by initdb.

I disagree with doing any of this. It doesn't result in any useful reduction in maintenance effort, and what it does do is make it impossible to keep control over the detailed layout, formatting, commenting etc in a sample postgresql.conf. Nor do I think that "generate a whole config file from scratch" is going to be a useful behavior for tuning problems --- how will you merge it with what you had before?

regards, tom lane
On May 31, 2008, at 09:23, Tom Lane wrote:

>> 1. Most people have no idea how to set these.
>> 2. The current postgresql.conf file is a huge mess of 194 options, the
>> vast majority of which most users will never touch.
>> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
>> and the settings.sgml), which are only synched with each other manually.
>> 4. We don't seem to be getting any closer to autotuning.
>
> The proposal doesn't actually solve any of those problems.

It solves #2 at least.

> I disagree with doing any of this. It doesn't result in any useful
> reduction in maintenance effort, and what it does do is make it
> impossible to keep control over the detailed layout, formatting,
> commenting etc in a sample postgresql.conf. Nor do I think that
> "generate a whole config file from scratch" is going to be a useful
> behavior for tuning problems --- how will you merge it with what
> you had before?

I'd love to see these issues resolved. The current postgresql.conf is way over the top. Might you have a better idea?

Thanks,
David
Simon, Tom, Greg,

> Now, that will get a lot easier just by
> virtue of having a smaller config file, but I think that adding something
> into pg_settings that allows saving user-added commentary would be a nice
> step toward some useful standardization on that side of things. It would
> make future automated tools aimed at parsing and generating new files, as
> part of things like version upgrades, a lot easier if there was a standard
> way such comments were handled in addition to the raw data itself.

Hmmm. What about a COMMENT ON SETTING? That seems like it would serve the purpose ... and make preserving user comments easier than the current file-conversion approach.

> The other thing I'd like to see make its way into pg_settings, so that
> tools can operate on it just by querying the database, is noting what file
> the setting came from, so that you can track things like include file
> usage. I think with those two additions (comments and source file
> tracking) it would even be conceivable to clone a working facsimile of
> even a complicated postgresql.conf file set remotely just by reading
> pg_settings.

Hmmm. Any idea how to do this? It sounds like a good idea to me.

> So my viewpoint is that we should be aggressively adding new features,
> yet be conservative in changing existing behaviour: provide options for
> behaves-like-previous-release and keep the administrative interfaces as
> similar as possible between releases.

It's my viewpoint, based on a lot of user feedback, that the current postgresql.conf is fundamentally broken and a major roadblock to PostgreSQL adoption. This was a point with which there was pretty much universal agreement when I talked with people at pgCon. That is, I believe that you're arguing for keeping .conf stable for the 5% of users who understand it and ignoring the 95% of users who are baffled by it.

At this point, I think we are the only major SQL database without some form of basic autotuning for the most significant settings (certainly MySQL, Oracle, and SQL Server have it); we've been able to coast through that because autotuning features are new in other DBs, but it's going to start to hurt us pretty soon.

Now, I do believe that we should plan any GUCS overhaul to happen in one PostgreSQL version rather than phasing it in over multiple versions, so that administrators only need to get used to a new way once.

> > Currently, postgresql.conf and our set of GUC configurations suffer from
> > 4 large problems:
> >
> > 1. Most people have no idea how to set these.
> > 2. The current postgresql.conf file is a huge mess of 194 options, the
> > vast majority of which most users will never touch.
> > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> > and the settings.sgml), which are only synched with each other manually.
> > 4. We don't seem to be getting any closer to autotuning.
>
> The proposal doesn't actually solve any of those problems.

OK, let me draw some lines:

1 & 2) By not having the settings be defined in a 500-line file, new users would no longer be baffled by scores of settings which probably don't concern them, trying to find the handful of settings which do.

3) We'd consolidate the GUC lists down from 3 places to 2, which is one less area to synchronize. Magnus and I looked to see if it might be possible to generate the docs from the same list, but it's not practical.
4) By shifting from a model where postgresql.conf is document-formatted and hand-edited to one where it's machine-generated, it becomes vastly easier to write simple utilities to manage these settings. Right now, the big "obstacle" to things like SET PERSISTENT is "how do we preserve the hand-edited comments in the file" -- and the answer is we *don't.*

> I disagree with doing any of this. It doesn't result in any useful
> reduction in maintenance effort, and what it does do is make it
> impossible to keep control over the detailed layout, formatting,
> commenting etc in a sample postgresql.conf.

Have you *looked* at postgresql.conf.sample lately, Tom? It's a disaster. Maintenance is already difficult, and becoming more so as we add settings.

Further, you and Simon seem to think that the current "narrative docs inside the conf file" format has some kind of value which makes things easier for DBAs. I don't believe it does, and I have yet to meet a *single* new PostgreSQL user who wasn't confused and intimidated by the file.

> Nor do I think that
> "generate a whole config file from scratch" is going to be a useful
> behavior for tuning problems --- how will you merge it with what
> you had before?

Who's merging? I don't think you get the proposal. The whole "hand-edited" approach to postgresql.conf should go away. It's not useful, it's not educational, and it doesn't make PostgreSQL easy to manage. Further, the lack of an easy way to manage settings via port access to PostgreSQL is a significant inhibitor to adopting PostgreSQL in environments with large numbers of servers. See prior arguments by the CPANEL folks about why they don't support PostgreSQL, which is in turn a major reason why PostgreSQL web hosting is hard to find.

I agree that editing the data about settings in the guc.c file is not ideal; Magnus and I discussed that mainly because we wanted to preserve the translation framework with gettext strings. If someone can think of a better way to do this part, I'm all ears.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
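[For concreteness, the two interfaces mentioned in this exchange might look like the following. Neither statement exists at this point; the syntax is purely illustrative of the SET PERSISTENT and COMMENT ON SETTING ideas raised above.]

    -- persist a setting change into the generated config file
    SET PERSISTENT work_mem = '32MB';

    -- attach a user comment to a setting, per the COMMENT ON SETTING idea
    COMMENT ON SETTING work_mem IS 'raised for the nightly reporting jobs';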
"David E. Wheeler" <david@kineticode.com> writes: > I'd love to see these issues resolved. The current postgresql.conf is way over > the top. Might you have a better idea? I don't think fiddling with surface issues like the formatting of the postgresql.conf is productive. Hiding parameters because you don't think beginners need them is only going to frustrate those people who do need to adjust them. What might be productive is picking up a group of parameters and thinking hard about what they mean in terms of user-visible real-world effects. If they can be recast in terms of behaviour the user wants instead of internal implementation details then that would translate into a massive simplification as well as being easier to explain to users. I think we do a pretty good job of this already. Witness things like effective_cache_size -- imagine if this were "nested_loop_cache_hit_rate" for example, good luck figuring out what to set it to. The vacuum cost delay factors are probably ripe for such a recast though. I think we need just one parameter "vacuum_io_bandwidth" or something like that. The bgwriter parameters might also be a candidate but I'm less certain. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
"Josh Berkus" <josh@agliodbs.com> writes: > It's my viewpoint based on a lot of user feedback that the current > postgresql.conf is fundamentally broken and a major roadblock to PostgreSQL > adoption. This was a point with which there was pretty much universal > agreement when I talked with people at pgCon. Actually as a new DBA when I was first starting out with Postgres I found it very convenient to have all the common parameters in one place where I could just uncomment and adjust them. Instead of having to search through documentation and find the default value from which > 1 & 2) by not having the settings be defined in a 500-line file, new users > would no longer be baffled by scores of settings which probably don't concern > them, trying to find the handful of settings which do. I'm not sure how an empty file is any less "baffling" than one listing the default value for parameters they don't need yet. > 3) We'd consolidate the GUC lists down from 3 places to 2, which is one less > area to synchronize. Magnus and I looked to see if it might be possible to > generate the docs from the same list, but it's not practical. This seems like a trivial gain and one which is unlikely to outweigh the pain of having to massage the info into C data structures. > 4) By shifting from a model where postgresql.conf is document-formatted and > hand-edited to one where it's machine generated, it becomes vastly easier to > write simple utilities to manage these settings. Right now, the big > "obstacle" to things like SET PERSISTENT is "how to we preseve the > hand-edited comments in the file" -- and the answer is we *don't.* What this sounds like is a sly way to try to get rid of postgresql.conf entirely and replace it with parameters stored in the database so admins would adjust the parameters using an SQL syntax rather than a text file. There are pros and cons of such a system but I think for newbie admins that would be a thousand times *more* baffling. You would have to learn new commands and have no holistic view of what parameters had been set, what related parameters might exist. You also have no way to keep the file in a version control system or sync across servers etc. > Have you *looked* at postgresql.conf.sample lately, Tom? It's a disaster. > Maintenance is already difficult, and becoming more so as we add settings. What difficulties? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
On May 31, 2008, at 12:36, Gregory Stark wrote:

> What this sounds like is a sly way to try to get rid of postgresql.conf
> entirely and replace it with parameters stored in the database, so admins
> would adjust the parameters using an SQL syntax rather than a text file.
>
> There are pros and cons of such a system, but I think for newbie admins
> that would be a thousand times *more* baffling. You would have to learn new
> commands and have no holistic view of what parameters had been set, what
> related parameters might exist. You also have no way to keep the file in a
> version control system or sync across servers etc.

FWIW, this has not been a barrier to MySQL adoption.

Best,
David
Josh Berkus wrote:

> Currently, postgresql.conf and our set of GUC configurations suffer from
> 4 large problems:

As we have talked about before, you know that I agree that the GUC system could use some improvements. But I'm a bit surprised about some of your assessments.

> 1. Most people have no idea how to set these.

Could you clarify this? I can't really believe that people are incapable of editing a configuration file.

> 2. The current postgresql.conf file is a huge mess of 194 options, the
> vast majority of which most users will never touch.

My opinion has always been that we should provide a default file with only the essential options instead of all of them. I see this as the major problem, because people are overwhelmed and consequently don't set anything.

> 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf,
> and the settings.sgml), which are only synched with each other manually.

While this is not ideal, I can't really see how this is a major problem, at least from the perspective of the user.

> 4. We don't seem to be getting any closer to autotuning.

True. But how does your proposal address this?
On May 31, 2008, at 15:32, Peter Eisentraut wrote:

>> 1. Most people have no idea how to set these.
>
> Could you clarify this? I can't really believe that people are incapable
> of editing a configuration file.

I've been using PostgreSQL on and off, mostly on, for almost 10 years. I still have no idea what 75% of those settings in postgresql.conf mean or are for. There are an overwhelming number of them. I know that 5-8 of them I always touch, thanks largely to assistance now and then from Josh Berkus, but the rest are just complexity to me. I don't doubt that the vast majority of them are useful in one situation or another, but unless I'm in one of those situations, I really don't need to see them there and be confused by them.

Does that help?

>> 2. The current postgresql.conf file is a huge mess of 194 options, the
>> vast majority of which most users will never touch.
>
> My opinion has always been that we should provide a default file with only
> the essential options instead of all of them. I see this as the major
> problem, because people are overwhelmed and consequently don't set anything.

+1

Best,
David
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

In the interest of constructive criticism, here are some ways I think postgresql.conf could be improved.

* A weak +1 to the #include idea. I'm much more inclined to simply add a new section to the bottom of the file and use version control, but I could see ways in which include would be useful.

* Rearrange the settings to put the more "common" ones at the top, even if it means messing up the categories a bit. Having 'bonjour_name' at the top, and 'default_statistics_target' buried way down below is crazy. The sections should be looked at from a clean perspective: do we need them at all? Are there better ways to arrange things? What order should they be in?

* Much more verbose comments. The abovementioned default_statistics_target is a very important setting, but there is zero explanation in the file of what it is. The only thing we're told is that it ranges from 10 - 1000. We can do better than that. Users would absolutely love it if each item had a clear explanation, and it would be well worth a slightly increased file size. See the postfix main.cf file for a good example of such.

* Remove the confusing "commented out is default" bit entirely. Simply set each value explicitly. Why should new users have to confront this?:

# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.

* Lose the tabbed indenting, which looks bad, especially for multi-line comments. Just use spaces.

* Create a tool, or at least best practices, for controlling and tracking changes to the file.

* Put some doc URLs in the file, at the minimum per major section. At the very bare minimum, a real URL at the top.

* Indicate which values can be changed per-session or per-role.

* Fix the disparity between the syntax in the file and the SET interface. For example, work_mem = 16MB works in the conf file, but you have to write SET work_mem = '16MB'. Easiest is probably just to quote everything in the conf.

* Lose the post-value, end-of-line comments; they just get in the way when making changes and make the file harder to read by contributing to the wrap problem.

* I'm tempted by the argument of creating a separate file for the obscure settings, but I think it would be too much pain, and nobody would ever agree on which settings are 'important' and which are 'obscure'.

* It might be nice to mention other ways to reload the file, such as 'service postgresql reload', or whatever Windows uses.

* The word 'paramters' is still misspelled. :)

* That whole sentence about changing the parameters as command-line options needs to go away.

* Since the executable is now named "postgres" (thank goodness we got rid of "postmaster"), the file should be named 'postgres.conf'. This would also be a way to quickly distinguish 'old' vs 'new' style conf files if we end up making major changes to it.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200805311911
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkhB39sACgkQvJuQZxSWSshahQCg4V5QsO34HOhUDoPzT7STcR45
V5UAoPQxkmuk/oCYirTKxMAhV+Kh8Ytz
=7Lgk
-----END PGP SIGNATURE-----
Greg Sabino Mullane wrote: > > * A weak +1 to the #include idea. I'm much more inclined to simply > add a new section to the bottom of the file and use version > control, but I could see ways in which include would be useful. > > > > We already have include directives, and have had since 8.2. Thus spake the docs: "In addition to parameter settings, the postgresql.conf file can contain /include directives/, which specify another file to read and process as if it were inserted into the configuration file at this point. Include directives simply look like: include 'filename' If the file name is not an absolute path, it is taken as relative to the directory containing the referencing configuration file. Inclusions can be nested." cheers andrew
Greg Sabino Mullane wrote:

> * Much more verbose comments. The abovementioned default_statistics_target
> is a very important setting, but there is zero explanation in the file
> of what it is. The only thing we're told is that it ranges from 10 - 1000.
> We can do better than that. Users would absolutely love it if each item
> had a clear explanation, and it would be well worth a slightly increased
> file size. See the postfix main.cf file for a good example of such.

I kind of agree with this, but actually think we should have the bare minimum of comments in the file. Why? Because our documentation links are static. Each setting should have the URL to its full documentation.

> * Create a tool, or at least best practices, for controlling and tracking
> changes to the file.

This I disagree with. There are plenty of tools to handle this should someone really want to. SVN, CVS, parrot, etc... Let systems management be the domain of systems management.

> * Put some doc URLs in the file, at the minimum per major section. At the
> very bare minimum, a real URL at the top.

Hah! See above :)

> * Indicate which values can be changed per-session or per-role.

Agreed. Along with this, although offtopic for this post, is a grid in the docs that is explicit about this.

> * Fix the disparity between the syntax in the file and the SET interface.
> For example, work_mem = 16MB works in the conf file, but you have to write
> SET work_mem = '16MB'. Easiest is probably just to quote everything in the conf.

Agreed.

> * I'm tempted by the argument of creating a separate file for the obscure
> settings, but I think it would be too much pain, and nobody would ever agree on
> which settings are 'important' and which are 'obscure'.

Actually I could buy into this. There really are only about a dozen must-change settings (if that). I could see something like:

# memory settings
include 'etc/memory.conf'
# network settings
include 'etc/network.conf'
# logging settings
include 'etc/logging.conf'
etc...

> * It might be nice to mention other ways to reload the file, such as
> 'service postgresql reload', or whatever Windows uses.

I think a URL to the docs is a better idea here.

> * The word 'paramters' is still misspelled. :)

Heh.

> * Since the executable is now named "postgres" (thank goodness we got
> rid of "postmaster"), the file should be named 'postgres.conf'. This would
> also be a way to quickly distinguish 'old' vs 'new' style conf files if
> we end up making major changes to it.

It was never postmaster.conf (that I can recall). I don't see the issue here. Consider apache... It isn't apache.conf. I think postgresql.conf (as that is the name of the software) makes sense.

Sincerely,
Joshua D. Drake
Can I express one user's view:

Greg Sabino Mullane writes:

> * Much more verbose comments. The abovementioned default_statistics_target
> is a very important setting, but there is zero explanation in the file
> of what it is. The only thing we're told is that it ranges from 10 - 1000.
> We can do better than that. Users would absolutely love it if each item
> had a clear explanation, and it would be well worth a slightly increased
> file size. See the postfix main.cf file for a good example of such.

Absolutely agreed :-). A complete list of GUCs with a full explanation of each of them is sufficient. For an "easy start", there might be several postgresql.conf variants shipped with the distribution, like MySQL does, or postgresql.conf could be generated by a wizard dialog before initdb starts, for example. But I think that all GUCs must be present in the configuration file.

Thanks.
Gregory Stark wrote:

> I think we do a pretty good job of this already. Witness things like
> effective_cache_size -- imagine if this were "nested_loop_cache_hit_rate" for
> example, good luck figuring out what to set it to.

I think either of these is fine if we describe how to measure them. Ideally, if we had a GUC that said "log_nested_loop_cache_hit_rate" that enabled some timing code (understandably with lots of overhead) that made an attempt to measure the hit rate, it'd be easier to figure out than the effective cache size, no?

> The vacuum cost delay factors are probably ripe for such a recast though. I
> think we need just one parameter "vacuum_io_bandwidth" or something like that.

+1; though perhaps the inverse of that is more useful. When my machines are idle I'd be happy if they vacuum more. Wouldn't we be better served specifying the I/O bandwidth of each device/tablespace and letting vacuum use whatever portion would be otherwise idle?

> The bgwriter parameters might also be a candidate but I'm less certain.
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 Andrew Dunstan wrote: > We already have include directives, and have had since 8.2. Heh, thanks - which proves how useless they are to me. :) Joshua Drake wrote: > I kind of agree with this but actually think we should have the bare > minimum comments in the file. Why? Because our documentation links are > static. Each setting should have the URL to the full documentation on a > particular setting. Ugh, why so much context switching? Put the docs next to the setting. URLs are nice but not necessary. If you are arguing for minimum comments in conf files, please make a patch for pg_hba.conf ;) >> * Create a tool, or at least a best practices, for controlling and tracking >> changes to the file. > This I disagree with. There are plenty of tools to handle this should > someone really want to. SVN, CVS, parrot, etc... Let systems management > be the domain of systems management. Well, perhaps just a note in the docs at least that one might want to put postgresql.conf in version control. I've seen people not doing so more often than you would think. Perhaps because they are DBAs and not sysadmins? I also meant a tool to do things like verify that the changes are valid, as someone else mentioned elsewhere in this thread. >> * It might be nice to mention other ways to reload the file, such as >> 'service postgresql reload', or whatever Windows uses. > I think a url to the docs is a better idea here. Good point. Maybe a sort of "DBA basics" page in the docs is warranted for things like this. >> * Since the executable is now named "postgres" (thank goodness we got >> rid of "postmaster"), the file should be named 'postgres.conf'. This would >> also be a way to quickly distinguish 'old' vs 'new' style conf files if >> we end up making major changes to it. > It was never postmaster.conf (that I can recall). I don't see the issue > here. Consider apache... It isn't apache.conf. Not saying it ever was postmaster.conf: just that I'm glad we finally changed the name. As for the Apache project, the httpd executable reads the httpd.conf file. Hence, one might expect the postgres executable to read a postgres.conf file. - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation PGP Key: 0x14964AC8 200806011656 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAkhDEJMACgkQvJuQZxSWSsgeogCfT0g69NDoxyWGiWmDcB3PxH8h wJ8AnjzssA7aIk0rBdJzL+bB5vSQSeBV =lgZG -----END PGP SIGNATURE-----
Greg Sabino Mullane wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
> Joshua Drake wrote:
>
>> I kind of agree with this, but actually think we should have the bare
>> minimum of comments in the file. Why? Because our documentation links are
>> static. Each setting should have the URL to its full documentation.
>
> Ugh, why so much context switching? Put the docs next to the setting. URLs
> are nice but not necessary. If you are arguing for minimum comments in
> conf files, please make a patch for pg_hba.conf ;)

Hah! Well, I don't know that a minimum of comments is what I am arguing for as much as not too many comments. The comments in general in the postgresql.conf are useless unless you have previous knowledge. I really think that if we take advantage of the fact that we have static URLs, life would be easier overall.

>>> * Create a tool, or at least best practices, for controlling and tracking
>>> changes to the file.
>
>> This I disagree with. There are plenty of tools to handle this should
>> someone really want to. SVN, CVS, parrot, etc... Let systems management
>> be the domain of systems management.
>
> Well, perhaps just a note in the docs at least that one might want to put
> postgresql.conf in version control.

I could certainly buy into this. No reason we can't help people be better administrators. I would suggest a link to a static wiki page (on wiki.pg) that would link to each option?

> I've seen people not doing so more often
> than you would think. Perhaps because they are DBAs and not sysadmins? I also
> meant a tool to do things like verify that the changes are valid, as someone
> else mentioned elsewhere in this thread.

pg_ctl -D data check?

I would +1 that. Including (in later releases):

WARNING: You specify 66536 for shared_buffers but you only have 131072 of memory. Consider decreasing the parameter.

Obviously we would need friendlier, less math-heavy wording.

>>> * It might be nice to mention other ways to reload the file, such as
>>> 'service postgresql reload', or whatever Windows uses.
>
>> I think a URL to the docs is a better idea here.
>
> Good point. Maybe a sort of "DBA basics" page in the docs is warranted for
> things like this.

Yeah, I could buy into this.

>>> * Since the executable is now named "postgres" (thank goodness we got
>>> rid of "postmaster"), the file should be named 'postgres.conf'. This would
>>> also be a way to quickly distinguish 'old' vs 'new' style conf files if
>>> we end up making major changes to it.
>
>> It was never postmaster.conf (that I can recall). I don't see the issue
>> here. Consider apache... It isn't apache.conf.
>
> Not saying it ever was postmaster.conf: just that I'm glad we finally
> changed the name. As for the Apache project, the httpd executable reads the
> httpd.conf file. Hence, one might expect the postgres executable to read a
> postgres.conf file.

Maybe, but I think I would need more convincing.

Sincerely,
Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> writes: > Greg Sabino Mullane wrote: >> Ugh, why so much context switching? Put the docs next to the setting. URLs >> are nice but not necessary. If you are arguing for minimum comments in >> conf files, please make a patch for pg_hba.conf ;) > Hah! Well I don't know that a minimum of comments is what I am arguing > as much as not too much comments. The comments in general in the > postgresql.conf are useless unless you have previous knowledge. I really > think that if we take advantage of the fact that we have static URLs > that life would be easier overall. Yeah ... this thread started out with the idea of reducing duplication, so how did we arrive at wanting to duplicate the SGML documentation into postgresql.conf.sample? I think we need to integrate, not duplicate. (pg_hba.conf.sample is a pretty good example of what not to do, I think ... and it used to be even worse ...) regards, tom lane
On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd@commandprompt.com> wrote: >> I've seen people not doing so more often >> than you would think. Perhaps because they are DBAs and not sysadmins? I >> also >> meant a tool to do things like verify that the changes are valid, as >> someone >> else mentioned elsewhere in this thread. > > pg_ctl -D data check? > > I would +1 that. I would also really like to see that - though I'd also like to see an SQL interface so we can check a config before saving when editing via pgAdmin or similar. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Sun, 1 Jun 2008, Joshua D. Drake wrote:

> Well I don't know that a minimum of comments is what I am arguing as
> much as not too much comments.

Josh's proposal included making three levels of documentation-level comments available: terse, normal, and verbose. The verbose comment level probably should include a web link to the full documentation. The way the comments litter the existing file, the status quo that's called normal mode in this proposal, is IMHO a complete mess. Most use cases I can think of want either no comments or really verbose ones; the limited commentary in the current sample postgresql.conf seems a middle ground that's not right for anybody.

The key thing here in my mind is that it should be possible to transform between those three different verbosity levels without losing any settings or user-added comments. They're really just different views on the same data, and which view you're seeing should be easy to change without touching the data.

I just extracted the original design proposal and some of the relevant follow-up in this thread, made some additional suggestions, and put the result at http://wiki.postgresql.org/wiki/GUCS_Overhaul

I think reading that version makes it a bit clearer what the proposed overhaul is aiming for.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Sun, 1 Jun 2008, Peter Eisentraut wrote:

> Josh Berkus wrote:
>> 1. Most people have no idea how to set these.
> Could you clarify this? I can't really believe that people are incapable of
> editing a configuration file.

The big problem isn't the editing, it's knowing what to set the configuration values to. This is not to say that editing a configuration file should be considered reasonable. Any GUCS overhaul should include a design goal of being able to completely manage the configuration system using, say, pgadmin (the "manage settings via port access" part that Josh already mentioned). This is why I was suggesting additions aimed at assimilating all the things that are in the postgresql.conf file.

Joshua has been banging a drum for a while now that all this data needs to get pushed into the database itself. The GUCS data is clearly structured like a database table. Josh's suggested changes are basically adding all the columns needed to it in order to handle everything you'd want to do to the table. If you think of it in those terms and make it possible to manipulate that data using the tools already available for updating tables, you'll open up the potential to add a whole new class of user-friendly applications for making configuration easier to manage.

However, I don't fully agree with taking that idea as far as Joshua has suggested (only having the config data in the database), because having everything in a simple text file that can be managed with SCM etc. has significant value. It's nice to allow admins to be able to make simple changes with just a file edit. It's nice that you can look at all the parameters in one place and browse them. However, I do think that the internal database representation must be capable of holding everything in the original postgresql.conf file and producing an updated version of the file, either locally or remotely, as needed.

>> 4. We don't seem to be getting any closer to autotuning.
> True. But how does your proposal address this?

The idea that Josh's suggestions are working toward is simplifying the construction of tools that operate on the server configuration file, so that it's easier to write an autotuning tool. Right now, writing such a tool in a generic way gets so bogged down just in parsing/manipulating the postgresql.conf file that it's hard to focus on actually doing the tuning part. If we go back to his original suggestion: http://wiki.postgresql.org/wiki/GUCS_Overhaul

>> Add a script called pg_generate_conf to generate a postgresql.conf
>> based on guc.c and command-line switches (rather than
>> postgresql.conf.sample)

It's an easy jump from there to imagine a pg_generate_conf that provides a "wizard" interface to update a configuration file. I foresee a little GUI or web app that connects to a server on port 5432, finds out some basic information about the server, and gives something like this:

Parameter             Current   Recommended   Change?
shared_buffers        32MB      1024MB        [X]
effective_cache_size  128MB     3000MB        [ ]
work_mem              1MB       16MB          [ ]

Josh has the actual brains behind such an app all planned out if you look at his presentations, but without the larger overhaul it's just not possible to make the implementation elegant.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
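[A sketch of the query side of such a wizard: the current value, units, and allowed range for the settings shown above can already be read over an ordinary connection from pg_settings; the recommendation logic is the part that would still have to be written.]

    -- fetch what a tuning wizard needs in order to propose new values
    SELECT name, setting, unit, min_val, max_val, context
      FROM pg_settings
     WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');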
Greg Smith wrote:

> Joshua has been banging a drum for a while now that all this data needs to
> get pushed into the database itself. The GUCS data is clearly structured
> like a database table. [...] If you think of it in those terms and make it
> possible to manipulate that data using the tools already available for
> updating tables, you'll open up the potential to add a whole new class of
> user-friendly applications for making configuration easier to manage.
> However, I don't fully agree with taking that idea as far as Joshua has
> suggested (only having the config data in the database), because having
> everything in a simple text file that can be managed with SCM etc. has
> significant value. It's nice to allow admins to be able to make simple
> changes with just a file edit. It's nice that you can look at all the
> parameters in one place and browse them. However, I do think that the
> internal database representation must be capable of holding everything in
> the original postgresql.conf file and producing an updated version of the
> file, either locally or remotely, as needed.

Depending on the complexity you want to have inside the generator, one could imagine a middle-ground solution like:

include "database-generated.conf"
include "local-overrides.conf"

where the "database-generated.conf" does not necessarily need a lot of comments.

> Josh has the actual brains behind such an app all planned out if you look
> at his presentations, but without the larger overhaul it's just not
> possible to make the implementation elegant.

IMHO Greg's response is the most comprehensive and well-thought-through contribution in the whole GUC thread.

--
Sincerely,
Stephen R. van den Berg.
Greg Smith <gsmith@gregsmith.com> writes:
> Joshua has been banging a drum for a while now that all this data needs to
> get pushed into the database itself.

This is, very simply, not going to happen. Shall we go over the reasons why not, one more time?

1. Quite a few of the GUC parameters are needed *before* one can ever read the database; in particular the ones about file locations and shared memory sizing.

2. Suppose you change a parameter in a way that breaks the DB (eg, set shared_buffers to a value larger than your kernel allows) and try to restart. Database doesn't start. If the parameter can only be changed back within an operating database, you're hosed.

I have no objection to providing alternative ways to edit the configuration data, but the primary source of the settings is going to continue to be an editable text file. Any proposals for alternatives-to-a-text-editor have to work within that reality.

regards, tom lane
Tom Lane wrote:
>Greg Smith <gsmith@gregsmith.com> writes:
>> Joshua has been banging a drum for a while now that all this data needs to
>> get pushed into the database itself.

>This is, very simply, not going to happen. Shall we go over the reasons
>why not, one more time?

>1. Quite a few of the GUC parameters are needed *before* one can ever
>read the database; in particular the ones about file locations and
>shared memory sizing.

Obviously.

>2. Suppose you change a parameter in a way that breaks the DB (eg,
>set shared_buffers to a value larger than your kernel allows) and
>try to restart. Database doesn't start. If the parameter can only
>be changed back within an operating database, you're hosed.

That's why those parameters will always have to be read from a text file. Which doesn't prohibit that text file from being generated from within the database (once up and running). And yes, if someone changes the parameter through the DB, then makes the DB write out the config file, then restarts the DB remotely, and the change didn't work out (i.e. DB doesn't start, or crashes/hangs), he is hosed and needs to log in and change the text file with a text editor, no way around it.

>I have no objection to providing alternative ways to edit the
>configuration data, but the primary source of the settings is
>going to continue to be an editable text file. Any proposals for
>alternatives-to-a-text-editor have to work within that reality.

I think everyone is aware of that.

--
Sincerely,
Stephen R. van den Berg.
Simon Riggs wrote:

> Some other problems I see with GUCs
>
> * It's not possible to set one parameter depending upon the setting of
> another.

To me this is the more critical one. Most people I have seen will increase one or a few, but not all, parameters related to memory, which can result in lost performance and lost productivity spent figuring out why.

What happened to the idea of an AvailRAM setting, with all memory GUCs based on it? Ideally PostgreSQL should only create one big memory pool and allow all other variables to change at runtime, via the DBA, some tuner process, or a customized application, as long as the total is less than the allocated shared_memory and local_memory settings. (This would also reduce the need to restart Postgres when a value needs to be changed.)

-Jignesh

> * It's always unclear which GUCs can be changed, and when. That is
> understood far less often than the meaning of them.
>
> * We should rename effective_cache_size to something that doesn't sound
> like it does what shared_buffers does
>
> * There is no config verification utility, so if you make a change and
> then try to restart and it won't, you are in trouble.
On Mon, 2 Jun 2008, Tom Lane wrote: > Greg Smith <gsmith@gregsmith.com> writes: >> Joshua has been banging a drum for a while now that all this data needs to >> get pushing into the database itself. > > This is, very simply, not going to happen. Right, there are also technical challenges in the way of that ideal. I was only mentioning the reasons why it might not be the best idea even if it were feasible. However, I do not see why the limitations you bring up must get in the way of thinking about how to interact and manage the configuration data in a database context, even though it ultimately must be imported and exported to a flat file. The concerns you bring up again about leaving the database in an unstartable state are a particularly real danger in the "only has access to 5432" hosted provider case that this redesign is trying to satisfy. I added a "Gotchas" section to the wiki page so that this issue doesn't get forgotten about. The standard way to handle this situation is to have a known good backup configuration floating around. Adding something in that area may end up being a hard requirement before remote editing makes sense. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Mon, 2008-06-02 at 11:59 -0400, Jignesh K. Shah wrote:

> Simon Riggs wrote:
> >
> > Some other problems I see with GUCs
> >
> > * It's not possible to set one parameter depending upon the setting of
> > another.
>
> To me this is the more critical one. Most people I have seen will increase
> one or a few, but not all, parameters related to memory, which can result
> in lost performance and lost productivity spent figuring out why.
>
> What happened to the idea of an AvailRAM setting, with all memory GUCs based
> on it? Ideally PostgreSQL should only create one big memory pool and allow
> all other variables to change at runtime, via the DBA, some tuner process,
> or a customized application, as long as the total is less than the allocated
> shared_memory and local_memory settings. (This would also reduce the need
> to restart Postgres when a value needs to be changed.)

Agreed. Right now, we can't even do that in code, let alone in the config file. If we had a smart_memory_config = on, then we'd be able to say in the backend:

	if (smart_memory_config)
	{
		other_thing = 0.1 * Nbuffers;
	}

but the GUCs are evaluated in alphabetical order, without any way of putting dependencies between them. So they are notionally orthogonal.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Mon, 2 Jun 2008, Jignesh K. Shah wrote: > Most people I have seen will increase one or few but not all parameters > related to memory which can result in loss of performance and > productivity in figuring out. If it becomes easier to build a simple tool available to help people tune their configurations, that should help here without having to do anything more complicated than that. > What happened to AvailRAM setting and base all memory gucs on that. Like some of the other GUC simplification ideas that show up sometimes (unifying all I/O and limiting background processes based on that total is another), this is hard to do internally. Josh's proposal has a fair amount of work involved, but the code itself doesn't need to be clever or too intrusive. Unifying all the memory settings would require being both clever and intrusive, and I doubt you'll find anybody who could pull it off who isn't already overtasked with more important improvements for the 8.4 timeframe. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg, > Like some of the other GUC simplification ideas that show up sometimes > (unifying all I/O and limiting background processes based on that total > is another), this is hard to do internally. Josh's proposal has a fair > amount of work involved, but the code itself doesn't need to be clever > or too intrusive. Unifying all the memory settings would require being > both clever and intrusive, and I doubt you'll find anybody who could > pull it off who isn't already overtasked with more important > improvements for the 8.4 timeframe. Plus, I'm a big fan of "enable an API" rather than "write a feature". I think that there are people & companies out there who can write better autotuning tools than I can, and I'd rather give them a place to plug those tools in than trying to write autotuning into the postmaster. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
To add some complexity to this topic :-)

Please note, I admit upfront I am not familiar with every parameter out there, but during my quest to find bottlenecks while stressing the back-end I find many GUC parameters with names suggesting they should be interesting. I read the comments, the docs (that I know of), and go into the source to learn the parameters and their units. And I wonder whether altering a setting would do miracles in my case. Indeed, how can I measure the effect of a new setting? Enhanced throughput?

I learned in the past that fiddling with parameter A in a database that was not properly set up or used --and there were many of these-- had side effects in other parts of the engine. Yes, increased throughput was observed. A new hype was created. In the end it turned out that parameter A was not set correctly at all, that parameter B, once set to a sane value, cured the wrong behavior, and that parameter A was not optimal at all after the cure. We were just sidetracked because we did not know. Incorrect "knowledge" was born (the parameter A setting). Throughout the years this database product has matured, many more parameters have been added and much, much more instrumentation has been implemented. It still is quite a challenge to understand what is happening. But proper analysis is possible indeed. The black box is much more open now.

One current example: wal_writer_delay. In my team the advice is to set this parameter to 100. However, after implementing a counter (home-grown instrumentation) I now know that the background log flush routine is never called when stressing the database. Therefore I now think the best setting is 10000 (its maximum), since it does not do useful work (in my context) and therefore should wake up as few times as possible. Without this instrumentation I can only guess about the usability of this parameter and spend many tests in order to get an impression of its validity to me.

So overhauling the GUC parameters is one step, but adding proper instrumentation in order to really measure the impact of a new setting is necessary too. Especially when looking in the direction of auto tuning. Proper measurement is crucial to enable correct analysis. Of course I am in favor of doing this with DTrace, however not all platforms can benefit in that case :-)

--Paul

On 2 jun 2008, at 20:06, Josh Berkus wrote:

> Greg,
>
>> Like some of the other GUC simplification ideas that show up sometimes
>> (unifying all I/O and limiting background processes based on that total
>> is another), this is hard to do internally. Josh's proposal has a fair
>> amount of work involved, but the code itself doesn't need to be clever
>> or too intrusive. Unifying all the memory settings would require being
>> both clever and intrusive, and I doubt you'll find anybody who could
>> pull it off who isn't already overtasked with more important
>> improvements for the 8.4 timeframe.
>
> Plus, I'm a big fan of "enable an API" rather than "write a feature". I
> think that there are people & companies out there who can write better
> autotuning tools than I can, and I'd rather give them a place to plug
> those tools in than trying to write autotuning into the postmaster.
> > -- > --Josh > > Josh Berkus > PostgreSQL @ Sun > San Francisco Regards, Paul van den Bogaard --------------------------------------------------------------------------------------------- Paul van den Bogaard Paul.vandenBogaard@sun.com ISV-E -- ISV Engineering, Opensource Engineering group Sun Microsystems, Inc phone: +31 334 515 918 Saturnus 1 extension: x (70)15918 3824 ME Amersfoort mobile: +31 651 913 354 The Netherlands fax: +31 334 515 001
On Tue, 3 Jun 2008, Paul van den Bogaard wrote: > So overhauling the GUC parameters is one step, but adding proper > instrumentation in order to really measure the impact of the new setting > is necessary too. Correct, but completely off-topic regardless. One problem to be solved here is to take PostgreSQL tuning from zero to, say, 50% automatic. Wander the user lists for a few months; the number of completely misconfigured systems out there is considerable, partly because the default values for many parameters are completely unreasonable for modern hardware and there's no easy way to improve on that without someone educating themselves. Getting distracted by the requirements of the high-end systems will give you a problem you have no hope of executing in a reasonable time period. By all means bring that up as a separate (and much, much larger) project: "Database Benchmarking and Sensitivity Analysis of Performance Tuning Parameters" would make a nice PhD project for somebody, and there's probably a good patent in there somewhere. Even if you had such a tool, it wouldn't be usable by non-experts unless the mundane GUC generation issues are dealt with first, and that's where this is at right now. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Sat, May 31, 2008 at 08:36:54PM +0100, Gregory Stark wrote: > > 4) By shifting from a model where postgresql.conf is document-formatted and > > hand-edited to one where it's machine generated, it becomes vastly easier to > > write simple utilities to manage these settings. Right now, the big > > "obstacle" to things like SET PERSISTENT is "how to we preseve the > > hand-edited comments in the file" -- and the answer is we *don't.* > > What this sounds like is a sly way to try to get rid of postgresql.conf > entirely and replace it with parameters stored in the database so admins would > adjust the parameters using an SQL syntax rather than a text file. > > There are pros and cons of such a system but I think for newbie admins that > would be a thousand times *more* baffling. You would have to learn new > commands and have no holistic view of what parameters had been set, what > related parameters might exist. You also have no way to keep the file in a > version control system or sync across servers etc. It doesn't have to be in the database. I kinda like Oracle's method here; they default to storing config options in a binary file that (iirc) you can change via commands, but they also provide a way to turn that binary file into text (and back). That would allow for a lot of options when it comes to configuring stuff. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
On Mon, Jun 02, 2008 at 10:12:06AM -0400, Tom Lane wrote: > Greg Smith <gsmith@gregsmith.com> writes: > > Joshua has been banging a drum for a while now that all this data needs to > > get pushing into the database itself. > > This is, very simply, not going to happen. Shall we go over the reasons > why not, one more time? > <snip> > > I have no objection to providing alternative ways to edit the > configuration data, but the primary source of the settings is > going to continue to be an editable text file. Any proposals for > alternatives-to-a-text-editor have to work within that reality. There's no reason that the server has to deal with a text file. I completely agree that there must be a method to change settings even if the database isn't running, but that method does not necessarily need to be a text file. If we can come up with a standard API for reading and writing config changes, we (or anyone else) can write any number of tools to deal with the settings. And once we have an API, we can provide a SQL interface on top of it. Instead of focusing on exactly what the 'new postgresql.conf' is going to look like we should focus on creating a clean configuration API. Once we have that we can figure out what (if anything) we're doing with the existing .conf, and what a new one (if it exists) might look like. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Decibel! wrote: >> >> There's no reason that the server has to deal with a text file. I >> completely agree that there must be a method to change settings even if >> the database isn't running, but that method does not necessarily need to >> be a text file. If we can come up with a standard API for reading and >> writing config changes, we (or anyone else) can write any number of >> tools to deal with the settings. And once we have an API, we can provide >> a SQL interface on top of it. Once in a lifetime, a man should plant a tree, father a child, and write an editor... :-) Hiding the storage of config parameters opaquely behind an API is something I've been hating for a long time on win32. When reading this thread, I'm wondering if anybody ever saw a config file for a complex software product that was easily editable and understandable. I don't know of one. If there is one, it'd be nice to know about it so we can learn from it. IMHO the best compromise in machine and human readability is an XML format. It's easily decoratable with comments, easily interpreted, and a pg_settings view could enhance it with even more comments, so an editor using pgsql functions (to read and write postgresql.conf.xml) could be enabled to supply comprehensive help. Regards, Andreas
* Andreas Pflug <pgadmin@pse-consulting.de> [080604 10:20]: > Hiding the storage of config parameters opaquely behind an API is > something I've been hating for a long time on win32. ;-) > When reading this thread, I'm wondering if anybody ever saw a config > file for a complex software product that was easily editable and > understandable. I don't know of one. If there is one, it'd be nice to know > about it so we can learn from it. PostgreSQL, Apache, X.org They are all easily editable, and "understandable", in the sense that I understand that I'm supposed to edit the line, changing the value (following the comments' list of accepted values). They are "less understandable" if you mean that I know the implications of any change I make. But guess what, having those values input through some other mechanism (like a GUI config file editor, a SQL statement, or a nice pgadmin-SQL-hiding interface) isn't going to change that part of "understandable". That part of understandable only comes through good documentation and reference material, which is universally applicable to any config method. > IMHO the best compromise in machine and human readability is an XML > format. It's easily decoratable with comments, easily interpreted and a > pg_settings view could enhance it with even more comments, so an editor > using pgsql functions (to read and write postgresql.conf.xml) could be > enabled to supply comprehensive help. Well, in my past, I've generally not got around to installing and using software that required me to edit some jumble of XML. Ya, maybe I'm lucky. And since I've got a lot invested in PG, I'd be forced to if PG moved to an XML config, but I'd go kicking and screaming... I just *know* that I'd reload/restart the postmaster some time, and the config file wouldn't be quite correct, and I'd search for 10 minutes trying to find the extra (or missing) ", or the missing closing /... But maybe most people are better at parsing XML than me. And that also may be because I've actively avoided it for so long ;-) I don't know *how* I'd decorate my XML config file with comments and history the way I do my text-based #-commented config files. a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Aidan Van Dyk wrote: > * Andreas Pflug <pgadmin@pse-consulting.de> [080604 10:20]: > > >> Hiding the storage of config parameters opaquely behind an API is >> something I've been hating for a long time on win32. >> > > ;-) > > >> When reading this thread, I'm wondering if anybody ever saw a config >> file for a complex software product that was easily editable and >> understandable. I don't know of one. If there is one, it'd be nice to know >> about it so we can learn from it. >> > > PostgreSQL, Apache, X.org > > They are all easily editable, and "understandable", in the sense that I > understand that I'm supposed to edit the line, changing the value > (following the comments' list of accepted values). > > They are "less understandable" if you mean that I know the implications > of any change I make. But guess what, having those values input > through some other mechanism (like a GUI config file editor, a SQL statement, > or a nice pgadmin-SQL-hiding interface) isn't going to change that part > of "understandable". That part of understandable only comes through > good documentation and reference material, which is universally > applicable to any config method. > Right. On the editing side, a column "link" in pg_settings that can be used to construct a URL to postgresql.org/docs/xxx#yyy could help in creating editors that support the user. Whatever a text config file will look like, you need to know exactly which parameter to use and where to locate it; even structuring parameters won't help too much for the typical starter task "I installed pgsql, what to do next". > >> IMHO the best compromise in machine and human readability is an XML >> format. It's easily decoratable with comments, easily interpreted and a >> pg_settings view could enhance it with even more comments, so an editor >> using pgsql functions (to read and write postgresql.conf.xml) could be >> enabled to supply comprehensive help. >> > > Well, in my past, I've generally not got around to installing and using > software that required me to edit some jumble of XML. Ya, maybe I'm > lucky. And since I've got a lot invested in PG, I'd be forced to if PG > moved to an XML config, but I'd go kicking and screaming... > > I just *know* that I'd reload/restart the postmaster some time, and the > config file wouldn't be quite correct, and I'd search for 10 minutes > trying to find the extra (or missing) ", or the missing closing /... But maybe > most people are better at parsing XML than me. And that also may be > because I've actively avoided it for so long ;-) > Well, I'm not an XML evangelist either. But the usual "commenting out a parameter will reset it to default on reload, no?" caveat isn't funny either, nor are duplicate parameter settings scattered throughout your file. This may be avoided by *preferably* editing the parameters through pgsql itself; the current postgresql.conf file format isn't too machine-write friendly (as I know since I wrote the pgadmin config file editor). But having a config file that can't be used with simple editors at all is a nightmare. Regards, Andreas
Andreas Pflug <pgadmin@pse-consulting.de> writes: > Well I'm an XML evangelist either. But the usual "commenting out a > parameter will reset it to default on reload, no?" caveat isn't funny > either, or duplicate parameter settings scattered throughout your file. Surely everyone who is opining on this thread knows that we fixed that in 8.3. regards, tom lane
On Jun 3, 2008, at 20:48, Greg Smith wrote: > Correct, but completely off-topic regardless. One problem to be > solved here is to take PostgreSQL tuning from zero to, say, 50% > automatic. Wander the user lists for a few months; the number of > completely misconfigured systems out there is considerable, partly > because the default values for many parameters are completely > unreasonable for modern hardware and there's no easy way to improve > on that without someone educating themselves. Getting distracted by > the requirements of the high-end systems will give you a problem you > have no hope of executing in a reasonable time period. Exactly. The issue is that application developers, who are not DBAs, have no idea how to tune PostgreSQL, and postgresql.conf is daunting and confusing. So they use a different database that's "faster". I think that right now postgresql.conf is designed for full-time DBAs, rather than folks who might want to target PostgreSQL for an application they're developing. We want to attract the latter (without, of course, any expense to the former). Changing how configuration works so that it's easier to understand and, if possible, at least partly automatically tunable would go a long way towards making PostgreSQL friendlier for developers, IMHO. Best, David
On Jun 4, 2008, at 07:19, Andreas Pflug wrote: > IMHO the best compromise in machine and human readability is an XML > format. It's easily decorateable with comments, easily interpreted > and a pg_settings view could enhance it with even more comments, so > an editor using pgsql functions (to read and write > postresql.conf.xml) could be enabled to supply comprehensive help. I hope that was a joke… Best, David
> Exactly. The issue is that application developers, who are not DBAs, > have no idea how to tune PostgreSQL, and postgresql.conf is daunting > and confusing. So they use a different database that's "faster". Changing some of our defaults would go a long way as well.
"David E. Wheeler" <david@kineticode.com> writes: > Exactly. The issue is that application developers, who are not DBAs, > have no idea how to tune PostgreSQL, and postgresql.conf is daunting > and confusing. So they use a different database that's "faster". I take it you haven't looked at mysql's configuration file lately. They aren't actually in any better shape than we are, except that they supply several "preconfigured" sample files for people to choose from. regards, tom lane
On Jun 4, 2008, at 11:22, Tom Lane wrote: > "David E. Wheeler" <david@kineticode.com> writes: >> Exactly. The issue is that application developers, who are not DBAs, >> have no idea how to tune PostgreSQL, and postgresql.conf is daunting >> and confusing. So they use a different database that's "faster". > > I take it you haven't looked at mysql's configuration file lately. I'm not much into MySQL, but in the work I've done with it, I've had to create /etc/my.cnf myself. There *is* no configuration file configuring MySQL until that file is created, is there? So there is no configuration to learn at first. I'm not saying that this is necessarily admirable -- it's kind of the opposite end of the spectrum (PostgreSQL: "Here is every configuration tweak you could ever possibly want, have fun!" vs MySQL: "There is no configuration until you need one, then you have to find the docs for it."). > They aren't actually in any better shape than we are, except that > they supply several "preconfigured" sample files for people to choose > from. Which would be a good start, if nothing else… Best, David
David E. Wheeler wrote: > On Jun 4, 2008, at 11:22, Tom Lane wrote: > >> "David E. Wheeler" <david@kineticode.com> writes: >>> Exactly. The issue is that application developers, who are not DBAs, >>> have no idea how to tune PostgreSQL, and postgresql.conf is daunting >>> and confusing. So they use a different database that's "faster". >> >> I take it you haven't looked at mysql's configuration file lately. > > I'm not much into MySQL, but in the work I've done with it, I've had > to create /etc/my.cnf myself. There *is* no configuration file > configuring MySQL until that file is created, is there? So there is no > configuration to learn at first. I'm not saying that this is > necessarily admirable -- it's kind of the opposite end of the spectrum > (PostgreSQL: "Here is every configuration tweak you could ever > possibly want, have fun!" vs MySQL: "There is no configuration until > you need one, then you have to find the docs for it." Tell me how that's better. If that's what you want, simply remove all the comment lines from your config file. Problem solved. > >> They aren't actually in any better shape than we are, except that >> they supply several "preconfigured" sample files for people to choose >> from. > > Which would be a good start, if nothing else… It's been suggested in the past. It is highly debatable that it would actually be an advance. cheers andrew
On Jun 4, 2008, at 12:48, Andrew Dunstan wrote: >> I'm not much into MySQL, but in the work I've done with it, I've >> had to create /etc/my.cnf myself. There *is* no configuration file >> configuring MySQL until that file is created, is there? So there is >> no configuration to learn at first. I'm not saying that this is >> necessarily admirable -- it's kind of the opposite end of the >> spectrum (PostgreSQL: "Here is every configuration tweak you could >> ever possibly want, have fun!" vs MySQL: "There is no configuration >> until you need one, then you have to find the docs for it." > > Tell me how that's better. > > If that's what you want, simply remove all the comment lines from > your config file. Problem solved. I didn't say it was better. The point is that it seems to be less confusing to non-DBAs. >> Which would be a good start, if nothing else… > > It's been suggested in the past. It is highly debatable that it > would actually be an advance. Agreed; we've bandied around some better ideas here. Well, I haven't, I've just kibbitzed. But we can surely do better. Best, David
"David E. Wheeler" <david@kineticode.com> writes: > I'm not much into MySQL, but in the work I've done with it, I've had > to create /etc/my.cnf myself. There *is* no configuration file > configuring MySQL until that file is created, is there? So there is no > configuration to learn at first. Postgres will start happily with an empty configuration file, too. What's your point? regards, tom lane
2008/6/4 David E. Wheeler <david@kineticode.com>: > On Jun 3, 2008, at 20:48, Greg Smith wrote: > >> Correct, but completely off-topic regardless. One problem to be solved >> here is to take PostgreSQL tuning from zero to, say, 50% automatic. Wander >> the user lists for a few months; the number of completely misconfigured >> systems out there is considerable, partly because the default values for >> many parameters are completely unreasonable for modern hardware and there's >> no easy way to improve on that without someone educating themselves. >> Getting distracted by the requirements of the high-end systems will give >> you a problem you have no hope of executing in a reasonable time period. > > Exactly. The issue is that application developers, who are not DBAs, have no > idea how to tune PostgreSQL, and postgresql.conf is daunting and confusing. > So they use a different database that's "faster". > Do you think any better config can help? It's not possible. And you can't tune a database without good knowledge of the applications that use it. Any automatic tool is a toy for children. But some default PostgreSQL parameters are not optimal. > I think that right now postgresql.conf is designed for full-time DBAs, > rather than folks who might want to target PostgreSQL for an application > they're developing. We want to attract the latter (without, of course, any > expense to the former). Changing how configuration works so that it's > easier to understand and, if possible, at least partly automatically tunable > would go a long way towards making PostgreSQL friendlier for developers, > IMHO. > > Best, > > David > >
On Wed, 4 Jun 2008, Andreas Pflug wrote: > When reading this thread, I'm wondering if anybody ever saw a config file for > a complex software product that was easily editable and understandable. I would recommend Apache's httpd.conf as an example of something that's easy to edit and follow. Like any complex product, the comments in the configuration file itself can't possibly be sufficient by themselves. But in general I've found Apache's config file to have enough comments to jog my memory when I'm editing it while not being overwhelming. They provide enough context that when I run into a setting I don't understand, it's easy to search for more information. Poking around with Google for a bit, here's a reasonable sample: http://webdav.org/goliath/dav_on_x/apache.conf > IMHO the best compromise in machine and human readability is an XML format. If the primary PostgreSQL configuration file becomes XML I will quit working with the project. I'm not kidding. If you think XML is easy to generate, edit by hand, and use revision control on, we are at such a fundamental disagreement that I wouldn't even try and directly argue with you. Instead I'll quote Eric Raymond: "The most serious problem with XML is that it doesn't play well with traditional Unix tools. Software that wants to read an XML format needs an XML parser; this means bulky, complicated programs." http://www.catb.org/esr/writings/taoup/html/ch05s02.html#id2907018 Let me suggest the following requirement instead, which naturally rules it out: it should be possible for a DBA-level coder to write a simple shell script that does something useful with the configuration file, in order for a text-based configuration to be useful in this context. To give a simple example, I can write a single-line [sed|awk|perl] command that will let me update the value for one parameter in the current postgresql.conf file. When you can give me a one-liner that does that on an XML file in any shell language in that class, then we might have something to talk about. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
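For illustration, the kind of one-liner being described could look something like this (a sketch only; it assumes GNU sed, a stock single-file postgresql.conf, and an illustrative value):

    # update one parameter in place, whether or not it was commented out
    sed -i 's/^#*[[:space:]]*shared_buffers[[:space:]]*=.*/shared_buffers = 128MB/' postgresql.conf

Doing the equivalent on an XML file requires a real parser, which is the point of the requirement above.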
On Jun 4, 2008, at 13:31, Pavel Stehule wrote: > do you thing, so any better config can help? It's not possible. And > you can't tune database without well knowledge of applications that > use database. Any automatic tools are joy for child. But some default > PostgreSQL parameters are not optimal. Agreed. But, speaking as an app developer, I'm child-like in my DBA abilities. I could use a toy to help me with it. :-) Best, David
On Jun 4, 2008, at 13:12, Tom Lane wrote: > "David E. Wheeler" <david@kineticode.com> writes: >> I'm not much into MySQL, but in the work I've done with it, I've had >> to create /etc/my.cnf myself. There *is* no configuration file >> configuring MySQL until that file is created, is there? So there is >> no >> configuration to learn at first. > > Postgres will start happily with an empty configuration file, too. > What's your point? That it's less daunting for inexperienced users to start with that. I'm not talking about how things work, I'm talking about what configurations are present to start with. That's all. Best, David
Greg Smith <gsmith@gregsmith.com> writes: > On Wed, 4 Jun 2008, Andreas Pflug wrote: >> IMHO the best compromise in machine and human readability is an XML format. > If the primary PostgreSQL configuration file becomes XML I will quit > working with the project. I'm not kidding. I have no particular use for XML in this scenario either, but really this thread seems to be arguing about mostly-irrelevant details. There is not anything fundamentally broken about keeping configuration in a text file, as is proven by the fact that all those other packages do it. The real problem we need to solve is how to allow newbies to have the system auto-configured to something that more or less solves their problems. Putting the config settings in XML does not accomplish that, and neither does putting them inside the database. It might knock a day or two off the time needed to develop a tool that actually does solve the newbie's problem ... but it's unlikely that the effort of changing Postgres to use some other configuration representation would get repaid through easier tool development. So I think we should stop worrying about the file format and think about these two problems: * Can we present the config options in a more helpful way (this is 99% a documentation problem, not a code problem)? * Can we build a "configuration wizard" to tell newbies what settings they need to tweak? regards, tom lane
On Jun 4, 2008, at 1:57 PM, Tom Lane wrote: > * Can we build a "configuration wizard" to tell newbies what settings > they need to tweak? Probably. Given that a lot of the newbies are on Windows, this likely needs to be a pointy-clicky sort of thing if it's going to be widely useful. Doing it in a smart way is likely hard, but even a very simple, crude one would likely be helpful. Doing it as a standalone GUI editor / wizard wouldn't be too hard, but it sounds more like something that should be integrated in pgadmin, perhaps? Cheers, Steve
If I can add my 2 cents as the end user, IMHO having the configuration file in the xml format is unnecessary and only makes it less user-friendly. Thanks, Michael.
On Jun 4, 2008, at 13:57, Tom Lane wrote: > So I think we should stop worrying about the file format and think > about > these two problems: > > * Can we present the config options in a more helpful way (this is 99% > a documentation problem, not a code problem)? > > * Can we build a "configuration wizard" to tell newbies what settings > they need to tweak? Amen, Tom. Nicely put. Thanks, David
"Pavel Stehule" wrote: >2008/6/4 David E. Wheeler <david@kineticode.com>: >> >> Exactly. The issue is that application developers, who are not DBAs, have no >> idea how to tune PostgreSQL, and postgresql.conf is daunting and confusing. >> So they use a different database that's "faster". > >do you thing, so any better config can help? It's not possible. And >you can't tune database without well knowledge of applications that >use database. Any automatic tools are joy for child. But some default >PostgreSQL parameters are not optimal. I think it would be an enourmous help for beginners if they had a "simple" tuning tool which would tell them which values where altered (and possibly why) from Postgres' default settings based on some basic information. Like: - machine hardware (disk layout, OS, installed memory, etc.) - application usage (no. of clients, read/write activity, etc) I don't think that such a tool could tune the database perfectly (or even very good), but at least people new to Postgres would know where to start looking for tuning it to their needs. And I am not speaking about end users running an application that uses Postgres. I talk about application developers like me that port their application to use Postgres. Rainer
Tom Lane wrote: > > * Can we present the config options in a more helpful way (this is 99% > a documentation problem, not a code problem)? > > * Can we build a "configuration wizard" to tell newbies what settings > they need to tweak? It's certainly one thing to create an initial postgresql.conf from scratch after some inquiry, but a different level of problems to deal with when offering to change the settings. IMHO initial creation isn't enough, users will feel even more left alone if there are no tools helping them further. I guess most users will start tweaking after the server is already running for a while, with some config already in place. That's when file format and/or APIs come into play. Preserving comments and/or using them in a wizard isn't too easy with the current format. Regards, Andreas
Tom Lane wrote: > > * Can we build a "configuration wizard" to tell newbies what settings > they need to tweak? > > > That would trump all the other suggestions conclusively. Anyone good at expert systems? cheers andrew
Andreas Pflug <pgadmin@pse-consulting.de> writes: > Tom Lane wrote: >> * Can we build a "configuration wizard" to tell newbies what settings >> they need to tweak? > It's certainly one thing to create an initial postgresql.conf from > scratch after some inquiry, but a different level of problems to deal > with when offering to change the settings. IMHO initial creation isn't > enough, users will feel even more left alone if there are no tools > helping them further. I guess most users will start tweaking after the > server is already running for a while, with some config already in place. Indeed, the wizard should be designed to assist with tweaking an existing installation. I see no value at all to trying to run it before/during initdb. > That's when file format and/or APIs come into play. Preserving comments > and/or using them in a wizard isn't too easy with the current format. [ shrug... ] We have already wasted more effort arguing about this issue than it would take to do something good enough for a wizard tool. The great thing about simple text formats is that they're pretty robust. I think all we need to do is comment out any existing setting(s) of the target variable and add the new setting before the first such, perhaps with a comment indicating what added it and when. If the user finds this too ugly, he can tweak it with a text editor. This is even assuming that the tool needs to edit the file itself, rather than just give advice. The advice is the hard part, folks; could we stop obsessing about trivia? regards, tom lane
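The mechanical part of that really is small. A rough sketch of the comment-out-and-prepend approach, in plain awk (the parameter name and value here are only placeholders):

    awk -v var=work_mem -v val=16MB '
      $0 ~ "^[ \t]*" var "[ \t]*=" {
          if (!added) { print var " = " val "   # changed by config wizard"; added = 1 }
          print "#" $0
          next
      }
      { print }
      END { if (!added) print var " = " val "   # changed by config wizard" }
    ' postgresql.conf > postgresql.conf.new

As Tom says, the file surgery is trivia; deciding which value to write is the hard part.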
On Wed, 4 Jun 2008, Tom Lane wrote: > The real problem we need to solve is how to allow newbies to have the > system auto-configured to something that more or less solves their > problems. Putting the config settings in XML does not accomplish that, > and neither does putting them inside the database. The subtle issue here is that what makes sense for the database configuration changes over time; there's not just one initial generation and you're done. postgresql.conf files can end up moving from one machine to another, for example. I think something that doesn't recognize that reality and move toward a "tune-up" capability as well as initial generation wouldn't be as useful, and that's where putting the settings inside the database helps so much. Also, there's a certain elegance to having an optimization tool that works against either a new installation or an existing one. I personally have zero interest in a one-shot config generator. It just doesn't solve the problems I see in the field. Performance starts out just fine even with the default settings when people first start, and then goes to hell after the system has been running for a while (and possibly moved to another machine). By that point nobody wants to mess with their configuration file unless it's one simple change at a time. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> * Can we build a "configuration wizard" to tell newbies what settings >> they need to tweak? > That would trump all the other suggestions conclusively. Anyone good at > expert systems? How far could we get with the answers to just three questions: * How many concurrent queries do you expect to have? * How much RAM space are you willing to let Postgres use? * How much "overhead" disk space are you willing to let Postgres use? concurrent queries drives max_connections, obviously, and RAM space would drive shared_buffers and effective_cache_size, and both of them would be needed to size work_mem. The third one is a bit weird but I don't see any other good way to set the checkpoint parameters. If those aren't enough questions, what else must we ask? Or maybe they aren't the right questions at all --- maybe we should ask "is this a dedicated machine or not" and try to extrapolate everything else from what we (hopefully) can find out about the hardware. regards, tom lane
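Just to make the arithmetic concrete, here is a back-of-the-envelope sketch of how answers to those three questions might be turned into settings; the ratios are illustrative assumptions only, not recommendations:

    #!/bin/sh
    RAM_MB=4096             # RAM the user will let Postgres use
    MAX_CONN=100            # expected concurrent queries
    WAL_DISK_MB=2048        # "overhead" disk space allowed for WAL

    SHARED_BUFFERS_MB=$(( RAM_MB / 4 ))
    EFFECTIVE_CACHE_MB=$(( RAM_MB * 3 / 4 ))
    WORK_MEM_MB=$(( (RAM_MB - SHARED_BUFFERS_MB) / (MAX_CONN * 2) ))
    CHECKPOINT_SEGMENTS=$(( WAL_DISK_MB / 48 ))    # assumes roughly 3 * 16MB of WAL on disk per segment

    echo "max_connections = $MAX_CONN"
    echo "shared_buffers = ${SHARED_BUFFERS_MB}MB"
    echo "effective_cache_size = ${EFFECTIVE_CACHE_MB}MB"
    echo "work_mem = ${WORK_MEM_MB}MB"
    echo "checkpoint_segments = $CHECKPOINT_SEGMENTS"

Whether the dedicated-machine question could replace the RAM question is exactly the kind of decision a wizard author would have to make.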
Greg Smith <gsmith@gregsmith.com> writes: > On Wed, 4 Jun 2008, Tom Lane wrote: >> The real problem we need to solve is how to allow newbies to have the >> system auto-configured to something that more or less solves their >> problems. Putting the config settings in XML does not accomplish that, >> and neither does putting them inside the database. > The subtle issue here is that what makes sense for the database > configuration changes over time; there's not just one initial generation > and you're done. postgresql.conf files can end up moving from one machine > to another for example. I think something that doesn't recognize that > reality and move toward a "tune-up" capability as well as initial > generation wouldn't be as useful, As I just mentioned to someone else, I don't see any point in framing it as an "initial generation" problem at all. initdb will already give you settings that work, for some value of "work". The config wizard really only needs to deal with the case of tuning an existing installation. > and that's where putting the settings > inside the database helps so much. How does it help, pray tell? If you mean being able to see what the existing settings are, pg_settings already does that. regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [080604 20:46]: > If those aren't enough questions, what else must we ask? Or maybe they > aren't the right questions at all --- maybe we should ask "is this a > dedicated machine or not" and try to extrapolate everything else from > what we (hopefully) can find out about the hardware. For these three things: 1) max connections/workmem 2) shared buffers/effective cache 3) bgwriter/checkpoint parameters What are the stats or measures that PostgreSQL produces (or that it currently doesn't produce, but you would like it to) that the "masters" (i.e. people who tune PostgreSQL effectively, like you, Greg, Simon, Robert, Peter, Josh, Jim, etc - sorry if I missed others) actually use to decide whether to increase or decrease a value? I tune my postgresql.conf mainly on folklore, and "assimilated understanding" from reading the lists and blogs... But I haven't come across (or rather, haven't come across and remembered/bookmarked) anything that helps someone sample/read any stats or counts to find bottleneck points from which to start tuning. Stuff like (remembering that I've never had to really work at tuning because PG has always been "fast enough" for my needs, so take this with a grain of salt): * Are backends always writing out dirty buffers because there are no free ones? This might mean tweaking settings affecting the bgwriter. * Are the evicted buffers ones with really high usage counts? This might mean an increase in shared_buffers would help? * Are we always spilling small amounts of data to disk for sorting? A small work_mem increase might help... * Are all our reads from disk really quick? This probably means the OS page cache has our whole DB, and means random_page_cost could be tweaked? If we could get a definitive list of things like this, or maybe just comprehensive, or even at least agreed-to-not-be-wrong things to look at, that would go a long way to documenting *how* to tune PG effectively, and could lead to any projects that want to tackle examining a running cluster and suggesting some config changes... a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
On Jun 4, 2008, at 5:23 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Tom Lane wrote: >>> * Can we build a "configuration wizard" to tell newbies what >>> settings >>> they need to tweak? > >> That would trump all the other suggestions conclusively. Anyone >> good at >> expert systems? > > How far could we get with the answers to just three questions: > > * How many concurrent queries do you expect to have? > > * How much RAM space are you willing to let Postgres use? > > * How much "overhead" disk space are you willing to let Postgres use? > > concurrent queries drives max_connections, obviously, and RAM space > would drive shared_buffers and effective_cache_size, and both of them > would be needed to size work_mem. The third one is a bit weird but > I don't see any other good way to set the checkpoint parameters. > > If those aren't enough questions, what else must we ask? Or maybe > they > aren't the right questions at all --- maybe we should ask "is this a > dedicated machine or not" and try to extrapolate everything else from > what we (hopefully) can find out about the hardware. I'd be interested in putting together a framework+GUI client to do this cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare minimum) sort of way, if no-one else already has such a thing. A framework doesn't get you all the way there, but it makes it a whole lot easier to work on what base data and information you need, and how easy it is to map pgsql-performance and #postgresql gut feel onto something more algorithmic. Cheers, Steve
On Wed, 4 Jun 2008, Andrew Dunstan wrote: > Tom Lane wrote: >> * Can we build a "configuration wizard" to tell newbies what settings >> they need to tweak? > > That would trump all the other suggestions conclusively. Anyone good at > expert systems? Sigh. I guess we need to start over again. Last year around this time, there was one of the recurring retreads of this topic named "PostgreSQL Configuration Tool for Dummies": http://archives.postgresql.org/pgsql-performance/2007-06/msg00386.php Josh Berkus pointed out that he already had the "expert system" part of this problem solved pretty well with a spreadsheet: http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc (that's in the OpenOffice Calc format if you don't know the extension) That particular spreadsheet has more useful tuning suggestions in this area than 99.9% of PostgreSQL users have or will ever know. You can nitpick the exact recommendations, but the actual logic and thinking involved is pretty well solved. It could use a touch of tweaking and modernization but it's not too far off from being as good as you're likely to get at making guesses without asking the user too many questions. There is one ugly technical issue: you can't increase shared_buffers usefully in many situations because of SHMMAX restrictions, and that issue will haunt any attempt to be completely automatic. Where Josh got hung up, where I got hung up, where Lance Campbell stopped with his Dummies tool, and what some unknown number of other people have been thwarted by, is that taking that knowledge and turning it into a tool useful to users is surprisingly difficult. The reason for that is that the current postgresql.conf file and how it maps internally to GUC information isn't particularly well suited to automated generation, analysis, or updates. I think Josh got lost somewhere in the file-parsing stage. The parts I personally got stuck on were distinguishing user-added comments from ones the system put in, plus being completely dissatisfied with how lossy the internal GUC process was (I would like a lot more information out of pg_settings than is currently there). Lance's helper tool was hobbled by the limitations of being a simple web application. That's the background to Josh's proposal. It has about an 80% overlap with what I was working on suggesting, which is why I jumped on his bandwagon so fast. The outline at http://wiki.postgresql.org/wiki/GUCS_Overhaul includes the superset of our respective thinking on the first step here toward straightening out this mess, further expanded with observations made in this thread. I would respectfully point out that comments about the actual tuning itself have no bearing whatsoever on this proposal. This is trying to nail down all the features needed to support both doing an initial generation and subsequent incremental improvements to the postgresql.conf file, while also reducing some redundancy in the code itself. Reducing the scope to only handling initial generation would make this a smaller task. But it falls out that the work required to cut down on the redundancy and the work required to better support incremental updates are almost the same. Josh's stated agenda is to get this right in one swoop, with only one version's worth of disruption to the format, and that goal is served better IMHO as well by addressing all these changes as one batch.
I will attempt to resist further outbursts about non-productive comments here, and each time I am tempted instead work on prototyping the necessary code I think this really needs instead. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Jun 4, 2008, at 6:28 PM, Greg Smith wrote: > > > Josh Berkus pointed out that he already had the "expert system" part > of this problem solved pretty well with a spreadsheet: > > http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc > (that's in the OpenOffice Calc format if you don't know the extension) On Jun 4, 2008, at 6:20 PM, Steve Atkins wrote: > I'd be interested in putting together a framework+GUI client to do > this > cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare > minimum) sort of way, if no-one else already has such a thing. /me makes go together motions, if nobody has any objection Cheers, Steve
On Wed, 4 Jun 2008, Aidan Van Dyk wrote: > * Are backends always writing out dirty buffers because there are no free > ones? This might mean tweaking settings affecting bgwriter. What you mean on the first one is "are backends always writing out dirty buffers because there are no *clean* ones"; the server operates with no *free* buffers as standard operation. Figuring that out is now easy in 8.3 with the pg_stat_bgwriter view. > * Are the evicted buffers ones with really high usage counts? This > might mean an increase shared buffers would help? Evicted buffers must have a 0 usage count. The correct question to ask is "are buffers never getting high usage counts because they keep getting evicted too fast?". You can look at that in 8.3 using pg_buffercache; I've got suggested queries as part of my buffer cache presentation at http://www.westnet.com/~gsmith/content/postgresql/ > * Are we always spilling small amounts of data to disk for sorting? A > a small work_mem increase might help... I was just talking to someone today about building a monitoring tool for this. Not having a clear way to recommend people monitor use of work_mem and its brother, spilled-to-disk sorts, is an issue right now; I'll whack that one myself if someone doesn't beat me to it before I get time. > * Are all our reads from disk really quick? This probably means OS > pagecache has our whole DB, and means random_page_cost could be > tweaked? This is hard to do with low overhead in an OS-independent way. The best solution available now would use dtrace to try and nail it down. There's movement in this area (systemtap for Linux, recent discussion at the PGCon Developer Meeting of possibly needing more platform-specific code) but it's not quite there yet. So everything you mentioned is either recently added/documented or being actively worked on somewhere, and the first two were things I worked on myself after noticing they were missing. Believe me, I feel the items that still aren't there, but they're moving along at their own pace. There's already more tuning knowledge available than tools to help apply that knowledge to other people's systems, which is why I think a diversion to focus just on that part is so necessary. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
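For the first two items, the 8.3 views mentioned here can be read with nothing fancier than psql; the queries below are only a sketch of the idea:

    # are backends doing their own writes instead of leaving them to the bgwriter?
    psql -c "SELECT buffers_backend, buffers_clean, buffers_checkpoint, buffers_alloc
             FROM pg_stat_bgwriter;"

    # how are usage counts distributed across shared_buffers? (needs contrib/pg_buffercache)
    psql -c "SELECT usagecount, count(*) AS buffers
             FROM pg_buffercache GROUP BY usagecount ORDER BY usagecount;"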
* Greg Smith <gsmith@gregsmith.com> [080604 22:14]: > So everything you mentioned is either recently added/documented or being > actively worked on somewhere, and the first two were things I worked on > myself after noticing they were missing. Believe me, I feel the items > that still aren't there, but they're moving along at their own pace. > There's already more tuning knowledge available than tools to help apply > that knowledge to other people's systems, which is why I think a diversion > to focus just on that part is so necessary. But as an administrator/developer, I don't understand the focus on a new API for "writing my config" for me... I'd love a tool that helped me "analyze" my current running PG database (and yes, that includes getting *current* settings), and "suggest" config changes, ideally in the order the tool thinks will make a difference... I can make that change, and distribute it. That's the easy part. If I really trust the tool then I'll just blindly run it (depending on its output format): $TOOL > $PGDATA/postgresql.conf in which case, I don't care if it grokked any of my previous comments and cruft. Otherwise, I'll look at it, and integrate some (or all) of the changes into postgresql.conf using my preferred method of commenting/SCM/quirks. a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
On Wed, 4 Jun 2008, Aidan Van Dyk wrote: > I'd love a tool that helped me "analyze" my current running PG database > (and yes, that includes getting *current* settings), and "suggest" > config changes Fine. To demonstrate why the overhaul is needed, let's start designing a simple tool whose sole purpose in life is to suggest new settings for shared_buffers and work_mem. Say we want to show people their current setting and what we'd recommend instead. I just created a new cluster on my laptop. It has the following in the postgresql.conf: shared_buffers = 32MB # min 128kB or max_connections*16kB # (change requires restart) #work_mem = 1MB # min 64kB Say I first start designing such a tool by trying to read the postgresql.conf file to figure out what these values are set to. In order to accomplish that, I need to parse the whole file correctly, doing things like turning "32MB" into the actual numeric value so my program can make decisions based on its value[1]. This basically requires that someone writing a tuning tool replicate the GUC parsing code, which is crazy; at this point you've already lost most potential tool authors. But you're smarter than that; instead you use pg_settings: psql=# select name,setting,unit from pg_settings where name='shared_buffers' or name='work_mem'; name | setting | unit ----------------+---------+------ shared_buffers | 4096 | 8kB work_mem | 1024 | kB Now: what do you tell the user their current value is? The way the postgresql.conf is parsed into memory is lossy; at this point you don't know anymore what units were specified in the original file. If someone sees their current setting shown as "4096" but they know they set it to "32MB", they'll end up confused unless they understand the whole page size concept--and if they understood that, they'd automatically be disqualified from being the type of user our theoretical tool is targeted toward. If you try and make things more user-friendly by always showing the most human-readable version, what about the person who ended up setting this parameter because they copied an old config file that recommended setting it to 4096? You show it to them as "32MB"; they'll also be confused and blame the tool for being bad. And even if you work through all that, to give good advice here you have to know things like that shared_buffers is a server parameter requiring restart, while work_mem is a per-session parameter. Right now, the only way to know all that is for tool authors to assemble their own database and keep it up to date with each release. And you just lost another set of potential authors with that realization. Next up, we manage to work through all those issues, and someone happily follows our advice and gets their file updated with a much larger value for work_mem. Yeah, we are heroes! Or not. You see, in the config file we just helpfully updated for them was this comment just above that setting: # OMG don't set this too high or the Linux OOM killer will # take down the server! (This is not a theoretical example; I have found variations on that text in two postgresql.conf files and I vaguely recall Josh mentioning running into it as well). And now you just *crashed their server* by giving bad advice that was clearly against the invaluable comment history in the file already. Oh, but where are those comments located? Before the setting? After the setting? In a group at the top? Who can say? Since there's no strong standard, people put them all over the place.
I don't know about you, but I'm too busy to spend a bunch of time writing a tool to work around all these issues knowing it is always going to be fragile, incomplete, and in need of updating with every GUC change no matter what. Instead, reconsider http://wiki.postgresql.org/wiki/GUCS_Overhaul , having gotten a taste of the motivation behind those changes, and consider how this would play out after those improvements. The approach where you connect to the database and query is the easy path. There is never a reason to consider parsing postgresql.conf. Anybody who knows how to write a simple script that connects to a database and reads a table (basically anyone who's written the database client equivalent of "hello, world") can feel like a potential tool author. All the information about the postgresql.conf side of every setting is retained in case you want to show where the settings came from, or to generate a new file that's as similar as possible to the original. Any recommendations you suggest can be trivially annotated with whether you need to consider max_connections because it's per-session, and whether people need to restart the server or can just send it a signal, and that will continue to be the case in the future with minimal work on the tool author's part. The defaults are now available, so that it's easy to figure out what people changed. That is sometimes handy to include as part of this sort of analysis, and it's necessary to provide improvements like a "strip the unnecessary junk out of this file" feature that many people would like from this sort of tool. When you recommend increasing a value to something larger, any comments about that setting will be shown and people will know not to follow the tool's advice if there's a history there. This seems like such a better place to be that I'd rather drive toward the server-side changes necessary to support it rather than fight the difficult tool creation problems. That's why the "focus on a new API for 'writing my config' for me"; that particular goal is just one part of a set of revisions that streamline the tool creation process in a not necessarily obvious way. Unless, of course, you've tried to write a full-circle config tuning tool, in which case most of the proposed changes in this overhaul jump right out at you. [1] In the shared_buffers case, it may be possible to just recommend a value without caring one bit what the current one is. But for work_mem, you really need to actually understand the value if you want any real intelligence that combines that information with the maximum connections, so that you can compute how much memory is left over for things like effective_cache_size. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Tom Lane wrote: > * How much "overhead" disk space are you willing to let Postgres use? > > ... The third one is a bit weird but > I don't see any other good way to set the checkpoint parameters. The way I think about the checkpoint settings is: 1. Set checkpoint_timeout to the max. time you're willing to spend in recovery in case of crash or power loss. 2. Set checkpoint_segments to a high value. "High" meaning high enough that you'll never reach it in practice. The purpose is just to keep you from running out of disk space if something weird happens. The amount of downtime one is willing to accept in case of power loss is a good question to ask because it doesn't require any knowledge of how PostgreSQL works; it can be answered directly from the application requirements. And if the DBA/developer doesn't know the answer, he needs to figure it out, because it's a very important question not only for the database but in general. I believe checkpoint_timeout correlates quite well with the max. time required in recovery. If it took 10 minutes to generate X amount of WAL, replaying that WAL will need to do at most the same amount of I/O, which should take roughly the same amount of time, regardless of whether the I/O was sequential or random. If the system wasn't busy doing updates between the checkpoints, it will of course take less time. As with all settings, the tool will need to explain the tradeoff. A smaller checkpoint_timeout means more checkpointing, which means more I/O, which means lower average TPS and more WAL generated (because of full_page_writes). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
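Put in postgresql.conf terms, that rule of thumb boils down to something like the following; the numbers are purely illustrative:

    checkpoint_timeout = 5min      # roughly the longest crash recovery you will tolerate
    checkpoint_segments = 100      # high enough that the timeout, not the segment count,
                                   # triggers checkpoints; also caps WAL disk use at about
                                   # 3 * 16MB per segment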
Guys, A configuration wizard would be nice, but it would be a good start to add a section to the manual on how to do the basic tuning. AFAICS we don't have one. Clear instructions on how to set the few most important settings like shared_buffers and checkpoint_timeout/segments would probably be enough, with a link to the main configuration section that explains the rest of the settings. If people don't read the manual, we can add a link to it from postgresql.conf.sample, add a screen to the Windows installer suggesting to read it, or even open postgresql.conf in Notepad. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, 5 Jun 2008, Heikki Linnakangas wrote: > A configuration wizard would be nice, but it would be a good start to add a > section to the manual on how to do the basic tuning. AFAICS we don't have > one. Clear instructions on how to set the few most important settings like > shared_buffers and checkpoint_timeout/segments would probably be enough, with > a link to the main configuration section that explains the rest of the > settings. It hasn't gelled yet but I'm working on that. Most of the text needed is now linked to at http://wiki.postgresql.org/wiki/Performance_Optimization I already talked with Chris Browne about merging his document I put first in that list with useful pieces from some of mine into one more comprehensive document on the Wiki, covering everything you mention here. If we took a snapshot of that when it's done and dumped that into the manual, I don't think that would be a problem to wrap up before 8.4 is done. I'd like to include a link to the above performance page in that section of the manual as well, both so that people are more likely to find fresh content as well as to give them pointers toward more resources than the manual can possibly cover. > If people don't read the manual, we can add a link to it from > postgresql.conf.sample, add a screen to the Windows installer suggesting > to read it, or even open postgresql.conf in Notepad. They don't. Putting pointers toward a relatively simple performance tuning document a bit more in people's faces might help lower some of the criticism the project takes over providing low defaults for so many things. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, Jun 05, 2008 at 01:23:53AM +0200, Rainer Bauer wrote: > I think it would be an enormous help for beginners if they had a "simple" > tuning tool which would tell them which values were altered (and possibly > why) from Postgres' default settings based on some basic information. > Like: > - machine hardware (disk layout, OS, installed memory, etc.) > - application usage (no. of clients, read/write activity, etc) It would be possible to make a program that worked like: # pg_autotune Detected 4GB memory, 2.3GB free Measured random_page_cost=2.3 Select expected usage: (d)edicated, (n)ormal, (t)iny > t Shared_buffers 64MB Configuration stored to /etc/postgresql/8.3/postgresql.conf # This would probably solve most issues. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
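A minimal sketch of the kind of interactive helper Martijn is describing; the memory probe is Linux/Unix-specific and the sizing fractions are placeholders invented for the example, not recommendations from this thread:

# Hypothetical pg_autotune-style helper (sketch only).
import os

def total_ram_mb():
    # sysconf-based probe; works on Linux and most Unixes, not on Windows
    return os.sysconf('SC_PHYS_PAGES') * os.sysconf('SC_PAGE_SIZE') // (1024 * 1024)

ram_mb = total_ram_mb()
print("Detected %d MB of RAM" % ram_mb)

usage = input("Select expected usage: (d)edicated, (n)ormal, (t)iny > ").strip().lower()
fraction = {'d': 0.25, 'n': 0.10, 't': 0.02}.get(usage, 0.10)  # placeholder rules

print("shared_buffers = %dMB" % int(ram_mb * fraction))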
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > A configuration wizard would be nice, but it would be a good start to > add a section to the manual on how to do the basic tuning. +1. If we can't write an explanation of what to do, we certainly aren't going to be able to implement it in a wizard. Agreeing on what to put in the manual would also go a long way towards providing an implementation spec for the wizard ... regards, tom lane
Greg Smith wrote: > Where Josh got hung up, where I got hung up, where Lance Campbell stopped > at with his Dummies tool, and what some unknown number of other people > have been thwarted by, is that taking that knowledge and turning it into a > tool useful to users is surprisingly difficult. The reason for that is > the current postgresql.conf file and how it maps internally to GUC > information isn't particularly well suited to automated generation, > analysis, or updates. What I think this says is that we should be pushing Magnus more to continue work on the configuration API thing he was designing. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Greg Smith wrote: > >> Where Josh got hung up, where I got hung up, where Lance Campbell stopped >> at with his Dummies tool, and what some unknown number of other people >> have been thwarted by, is that taking that knowledge and turning it into a >> tool useful to users is surprisingly difficult. The reason for that is >> the current postgresql.conf file and how it maps internally to GUC >> information isn't particularly well suited to automated generation, >> analysis, or updates. > > What I think this says is that we should be pushing Magnus more to > continue work on the configuration API thing he was designing. There's some magic in pgAdmin to parse and write the file. You could take a look at that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Alvaro Herrera wrote: >> What I think this says is that we should be pushing Magnus more to >> continue work on the configuration API thing he was designing. > > There's some magic in pgAdmin to parse and write the file. You could > take a look at that. That's what they want to get rid of. It's a problem that every GUI writer needs to reimplement that functionality; ISTM this is something worth having in the backend. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Heikki Linnakangas wrote: > Alvaro Herrera wrote: >> Greg Smith wrote: >> >>> Where Josh got hung up, where I got hung up, where Lance Campbell >>> stopped at with his Dummies tool, and what some unknown number of >>> other people have been thwarted by, is that taking that knowledge and >>> turning it into a tool useful to users is surprisingly difficult. >>> The reason for that is the current postgresql.conf file and how it >>> maps internally to GUC information isn't particularly well suited to >>> automated generation, analysis, or updates. >> >> What I think this says is that we should be pushing Magnus more to >> continue work on the configuration API thing he was designing. > > There's some magic in pgAdmin to parse and write the file. You could > take a look at that. Not really. You don't want to go there. We really need a "proper API" for it, and the stuff in pgAdmin isn't even enough to base one on. //Magnus
On Thu, 5 Jun 2008, Magnus Hagander wrote: > We really need a "proper API" for it, and the stuff in pgAdmin isn't > even enough to base one on. I would be curious to hear your opinion on whether the GUC overhaul discussed in this thread is a useful precursor to building such a proper API. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith wrote: > On Thu, 5 Jun 2008, Magnus Hagander wrote: > >> We really need a "proper API" for it, and the stuff in pgAdmin isn't >> even enough to base one on. > > I would be curious to hear your opinion on whether the GUC overhaul > discussed in this thread is a useful precursor to building such a > proper API. Since I'm the guy who initially wrote that config file editing stuff, I feel somehow addressed. The answer is a clear ABSOLUTELY. - The current implementation is able to edit the file directly or through pgsql functions; any format change will affect that function immediately. - If documentation is enhanced by adding more comments in the postgresql.conf file, this won't help the editor because it can't rely on it to present help and hints to the user. It needs the comments/help in pg_settings or alike. Regards, Andreas
Greg Smith wrote: > On Thu, 5 Jun 2008, Magnus Hagander wrote: > >> We really need a "proper API" for it, and the stuff in pgAdmin isn't >> even enough to base one on. > > I would be curious to hear your opinion on whether the GUC overhaul > discussed in this thread is a useful precursor to building such a proper > API. I must say that I am confused by this thread. What's the discussed GUC overhaul? Things that I vaguely recall being proposed are (in the order they came to mind): (1) Add a lot more comments to each setting (2) Add documentation links to each setting (3) Move more frequently used settings to the top of the file (4) Ship different sample config files (5) Create an expert system to suggest tuning (6) Other random ideas (XML, settings in database, others?) To me, there are two ideas that are doable right now, which are (2) and (4). (1) seems to be a step backwards in pg_hba.conf experience, and we would have to maintain duplicate documentation. (3) seems messy. (5) is a lot of work; do we have volunteers? As for (6), the two examples I give can be easily dismissed. (2) and (4) do not seem necessary to get the config API built. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Steve Atkins wrote: > > I'd be interested in putting together a framework+GUI client to do this > cleanly in a cross-platform (Windows, Linux, Solaris, OS X as a bare > minimum) sort of way, if no-one else already has such a thing. > This is a great idea, and I was thinking along the same line. The framework can provide generic interfaces for the GUI/Web client, and leave it up to the OS to provide the needed data. -Robert
Tom Lane wrote: > This is even assuming that the tool needs to edit the file itself, > rather than just give advice. The advice is the hard part, folks; > could we stop obsessing about trivia? +1. IMHO, the tool doesn't need to worry about generating a prettied version of postgresql.conf. It should just inform the user about the appropriate settings or create a postgresql.conf.recommend and have the user update postgresql.conf himself. -Robert
On Thu, 5 Jun 2008, Alvaro Herrera wrote: > I must say that I am confused by this thread. What's the discussed GUC > overhaul? http://wiki.postgresql.org/wiki/GUCS_Overhaul I drop that URL in every other message in hopes that people might start commenting on it directly if they see it enough; the fact that you're confused says I may need to keep that up :( > (1) Add a lot more comments to each setting > (2) Add documentation links to each setting > (3) Move more frequently used settings to the top of the file > (4) Ship different sample config files > (5) Create an expert system to suggest tuning > (6) Other random ideas (XML, settings in database, others?) > > To me, there are two ideas that are doable right now, which are (2) and > (4). (1) seems to be a step backwards in pg_hba.conf experience, and we > would have to maintain duplicate documentation. (3) seems messy. (5) > is a lot of work; do we have volunteers? As for (6), the two examples I > give can be easily dismissed. > (2) and (4) do not seem necessary to get the config API built. (1) is in that proposal but is strictly optional as something to put in the configuration file itself. The idea behind (2) is to enable tool authors to have an easier way to suggest where to head for more information. I'd like for it to be trivial for a tool to say "Suggested value for <x> is <y>; see http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html for more information". I know what most of the settings I tinker with do, but even I'd like it to be easier to find the right spot in the manual; for newbies it's vital. You are correct that (2) isn't strictly necessary here, but it's valuable and will be easier to wrap into this than to bolt on later. (3) (4) (5) and (6) were off-topic diversions. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Tom Lane wrote: > If those aren't enough questions, what else must we ask? Or maybe they > aren't the right questions at all --- maybe we should ask "is this a > dedicated machine or not" and try to extrapolate everything else from > what we (hopefully) can find out about the hardware. > I think probably a combination of high- and low-level questions would work, making the low-level (more specific) questions optional since some users may not be able to provide low-level info. Here's a rough idea of how I envision this tool should work. $ pg_config_wizard Is this machine dedicated to Postgres? (y/n) n (now the tool auto-discovers available HW resources) Your system has 32GB memory. What percentage do you want to allocate to Postgres? (1=50%, 2=33%, 3=25%, etc) 1 What type of workload? (OLTP, DSS, etc) ... At the end, the tool runs for a while checking to see if certain thresholds are reached to determine which parameters need to be increased. The tool would change the parameters causing the bottlenecks, rerun Postgres/workload, and iterate a few times until the results are satisfactory. Write the recommended settings to postgresql.conf.recommend and let the user update postgresql.conf himself or whatever. I update my postgresql.conf, rerun the app and see a 100% increase in throughput :-) -Robert
Tom Lane wrote: > How far could we get with the answers to just three questions: > > * How many concurrent queries do you expect to have? > * How much RAM space are you willing to let Postgres use? > * How much "overhead" disk space are you willing to let Postgres use? +1 to this approach - these are the kinds of questions that make sense to me when first setting up a new installation. They sound useful for both large servers and tiny (salesguy laptop for demos) installations. > If those aren't enough questions, what else must we ask? * Perhaps something to guess FSM settings? I think FSM is the tunable I most often get wrong, with more painful consequences (bloat) than other tunables. My approach is to have cron run database-wide vacuums even on systems with autovacuum just to see the log messages about FSM. * Something to tune vacuum delay? Perhaps: How much I/O bandwidth can be dedicated to Postgres background activities?
Steve Atkins wrote: > ... cross-platform (Windows, Linux, Solaris, OS X as a bare > minimum) I wonder how cross-platform the tuning algorithm itself is. I could also imagine that decisions like "do I let the OS page cache, or postgres's buffer cache get most of the memory" are extremely OS dependent. Also, the configuration tool would be even more useful if it could offer extra platform-specific advice like "hey, I see you're using this filesystem. You should be using this journaling option for WAL and this other one for data", or "on your system you should be using this fsync method", or "use relatime or noatime when mounting your disk".
* Greg Smith <gsmith@gregsmith.com> [080605 15:17]: > On Thu, 5 Jun 2008, Alvaro Herrera wrote: > > >I must say that I am confused by this thread. What's the discussed GUC > >overhaul? > > http://wiki.postgresql.org/wiki/GUCS_Overhaul > > I drop that URL in every other message in hopes that people might start > commenting on it directly if they see it enough; the fact that you're > confused says I may need to keep that up :( I've read it. A couple times now. And I like lots of it. > >(1) Add a lot more comments to each setting > >(2) Add documentation links to each setting > >(3) Move more frequently used settings to the top of the file > >(4) Ship different sample config files > >(5) Create an expert system to suggest tuning > >(6) Other random ideas (XML, settings in database, others?) > (3) (4) (5) and (6) were off-topic diversions. But, right from the above mentioned page: *Goals* By shifting from a model where postgresql.conf is document-formatted and hand-edited to one where it's machine generated, it becomes vastly easier to write simple utilities to manage these settings. Right now, the big "obstacle" to things like SET PERSISTENT is "how do we preserve the hand-edited comments in the file" -- and the answer is we don't. This little goal leads to: * By having a generated postgresql.conf and an easy way to generate it, writing autoconfiguration scripts (as well as shortcuts like SET PERSISTENT) become vastly easier. And later: There needs to be a way to modify the underlying settings and save that into a new machine-generated postgresql.conf file. Is implementing SET PERSISTENT sufficient for that? I think that these parts, seemingly "snuck into" the GUC overhaul proposal, are what people like me are wary of. People like me don't want to have postgresql.conf be *only* a machine-generated file, which I am not allowed to edit anymore because the next DBA doing a "SET PERSISTENT" type of command is going to cause postgres to write out something else, over-writing my carefully documented reason for some particular setting. But the big issue I have (not that it really matters, because I'm not one of the ones working on it, so please don't take this as me telling anyone what they can or can't do) is that that goal doesn't solve any of the listed problems stated in the proposal: 1. Most people have no idea how to set these. 2. The current postgresql.conf file is a huge mess of 194 options, the vast majority of which most users will never touch. 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, and settings.sgml), which are only synched with each other manually. 4. We don't seem to be getting any closer to autotuning. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
On Thu, 5 Jun 2008, Aidan Van Dyk wrote: > People like me don't want to have postgresql.conf be *only* a > machine-generated file, which I am not allowed to edit anymore because > the next DBA doing a "SET PERSISTENT" type of command is going to cause > postgres to write out something else, over-writing my carefully > documented reason for some particular setting. This is why there's the emphasis on preserving comments as they pass into the GUC structure and back to an output file. This is one of the implementation details I haven't fully made up my mind on: how to clearly label user comments in the postgresql.conf to distinguish them from verbose ones added to the file. I have no intention of letting manual user edits go away; what I'm trying to do here (and this part is much more me than Josh) is make them more uniform such that they can co-exist with machine edits without either stomping on the other. Right now doing that is difficult, because it's impossible to tell the default comments from the ones the users added and the current comment structure bleeds onto the same lines as the settings. > But the big issue I have (not that it really matters, because I'm not > one of the ones working on it, so please don't take this as me telling > anyone what they can or can't do) is that that goal doesn't solve any of > the listed problems stated in the proposal > 1. Most people have no idea how to set these. Making it much easier to build recommendation tools is how this helps here. > 2. The current postgresql.conf file is a huge mess of 194 options, > the vast majority of which most users will never touch. The proposed pg_generate_conf tool includes options to spit out a basic configuration file instead of the complete one. > 3. GUCS lists are kept in 3 different places (guc.c, postgresql.conf, > and settings.sgml), which are only synched with each other manually. The proposal throws away having a separate postgresql.conf file, so that reduces it from 3 places to 2. That's moving in the right direction. > 4. We don't seem to be getting any closer to autotuning. If you try to build a tuning tool, these areas end up being the unnecessarily hard parts. Thanks for the comments on the proposal. I'm only bothering to respond to messages like yours now, and am deleting all of the continuing attempts to divert the discussion over to parameter tuning details or expanding the scope here. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Jun 5, 2008, at 14:47, Greg Smith wrote: > This is why there's the emphasis on preserving comments as they pass > into the GUC structure and back to an output file. This is one of > the implementation details I haven't fully made up my mind on: how > to clearly label user comments in the postgresql.conf to distinguish > them from verbose ones added to the file. I have no intention of > letting manual user edits go away; what I'm trying to do here (and > this part is much more me than Josh) is make them more uniform such > that they can co-exist with machine edits without either stomping on > the other. Right now doing that is difficult, because it's > impossible to tell the default comments from the ones the users > added and the current comment structure bleeds onto the same lines > as the settings. How about a simple rule, such as that machine-generated comments start with "##", while user comments start with just "#"? I think that I've seen such a rule used before. At any rate, I think that, unless you have some sort of line marker for machine-generated comments, there will be no way to tell them apart from user comments. Other possibilities for machine-comments: ## Machine comment ### Machine comment #! Machine comment #@ Machine comment #$ Machine comment #^ Machine comment # Machine comment I actually kinda like "#!". It's distinctive and unlikely to appear in a user comment. Anyway, just food for thought. Best, David
David E. Wheeler wrote: > How about a simple rule, such as that machine-generated comments start > with "##", while user comments start with just "#"? I think that I've > seen such a rule used before. At any rate, I think that, unless you have > some sort of line marker for machine-generated comments, there will be no > way to tell them apart from user comments. FWIW smb.conf uses ; for one purpose and # for the other. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, 5 Jun 2008, Alvaro Herrera wrote: > FWIW smb.conf uses ; for one purpose and # for the other. They're actually combining the way UNIX files use # with how Windows INI files use ; in a config file context, which I personally find a little weird. I was already considering keeping user comments as # while making all system-inserted ones #! ; many people are already used to #! having a special system-related meaning from its use in UNIX shell scripting which makes it easier to remember. I think the next step to this whole plan is to generate a next-gen postgresql.conf mock-up showing what each of the outputs from the pg_generate_conf tool might look like to get feedback on that; it will make what is planned here a bit easier to understand as well. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
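To illustrate why a fixed marker helps tools, here is a small sketch that separates the proposed #! system comments from user comments and live settings; the marker follows Greg's suggestion above, everything else is made up for the example:

# Illustrative classifier for the comment convention discussed above.
import re

SETTING = re.compile(r'^\s*(\w+)\s*=\s*(.+?)\s*(?:#.*)?$')

def classify(line):
    stripped = line.strip()
    if stripped.startswith('#!'):
        return 'machine-comment'
    if stripped.startswith('#'):
        return 'user-comment'
    if SETTING.match(stripped):
        return 'setting'
    return 'blank-or-other'

for line in ["#! Suggested range: 8MB to 1/4 of RAM",
             "# raised after load testing  -- DBA",
             "shared_buffers = 1024MB  # see note above",
             ""]:
    print("%-16s %s" % (classify(line), line))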
On Jun 5, 2008, at 17:53, Greg Smith wrote: > I was already considering keeping user comments as # while making > all system-inserted ones #! ; many people are already used to #! > having a special system-related meaning from its use in UNIX shell > scripting which makes it easier to remember. Oooh, yeah. I hadn't even thought of that! I was just looking at characters on my keyboard and typing them in to see which ones I thought were most distinctive. This may be part of the reason I thought that #! was distinctive. :-) Best, David
David E. Wheeler wrote: > On Jun 5, 2008, at 14:47, Greg Smith wrote: > >> This is why there's the emphasis on preserving comments as they pass >> into the GUC structure and back to an output file. This is one of the >> implementation details I haven't fully made up my mind on: how to >> clearly label user comments in the postgresql.conf to distinguish them >> from verbose ones added to the file. I have no intention of letting >> manual user edits go away; what I'm trying to do here (and this part >> is much more me than Josh) is make them more uniform such that they >> can co-exist with machine edits without either stomping on the other. >> Right now doing that is difficult, because it's impossible to tell the >> default comments from the ones the users added and the current comment >> structure bleeds onto the same lines as the settings. > > How about a simple rule, such as that machine-generated comments start > with "##", while user comments start with just "#"? I think that I've > seen such a rule used before. At any rate, I think that, unless you have > some sort of line marker for machine-generated comments, there will be > no way to tell them apart from user comments. What comments do we consider machine-generated? Just the ones used to comment out settings, like #shared_buffers = 32MB or something else? If the automatic tool leaves alone all other kinds of comments, I think we're fine. In fact, it wouldn't necessarily need to modify those comments either, it could simply add a new setting line below that: #shared_buffers = 32MB shared_buffers = 1024MB For extra safety, it could comment out old settings, perhaps with something like this: #shared_buffers = 32MB #shared_buffers = 1024MB # commented out by wizard on 2008-06-05 shared_buffers = 2048MB This would preserve a full change history in the file. It would become quite messy after a lot of changes, of course, but a user can trim the history by hand if he wants to. Or perhaps we should explicitly mark the settings the tool has generated, and comment out: #shared_buffers = 32MB # commented out by wizard on 2008-06-05 shared_buffers = 1024MB # automatically set by wizard on 2008-06-05 That way the tool could safely replace automatically set settings without touching manually set ones, while leaving a clear trace of what happened. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
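A minimal sketch of the "comment out the old line, append an annotated new one" edit Heikki describes; the wizard name and date format are placeholders, and real code would also have to cope with include files, quoting, and units:

# Sketch only: comment out any live line for the setting, then append a
# new line annotated with who changed it and when.
import datetime, re

def set_option(lines, name, value, actor="wizard"):
    today = datetime.date.today().isoformat()
    live = re.compile(r'^\s*%s\s*=' % re.escape(name))
    out = []
    for line in lines:
        if live.match(line):
            out.append("#%s # commented out by %s on %s" % (line.rstrip(), actor, today))
        else:
            out.append(line.rstrip())
    out.append("%s = %s # automatically set by %s on %s" % (name, value, actor, today))
    return out

conf = ["#shared_buffers = 32MB", "shared_buffers = 1024MB"]
print("\n".join(set_option(conf, "shared_buffers", "2048MB")))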
David E. Wheeler wrote: > > How about a simple rule, such as that machine-generated comments start > with "##", while user comments start with just "#"? I think that I've > seen such a rule used before. At any rate, I think that, unless you > have some sort of line marker for machine-generated comments, there > will be no way to tell them apart from user comments. Two heretical questions: Do we need user generated comments at all? I can't remember ever having used any comment in postgresql.conf. Why do so many people here insist on editing postgresql.conf as primary means of changing config params? Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing postgresql.conf or doing it manually? Looking around for different approaches, network appliances come to my mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix: - use a command line (using ssh or telnet, equivalent to psql); WRITE MEMORY to make the changes survive a reboot. - use a web interface (or similar tool) - use tftp to up/download the complete config in and out, editing the file. User comments will be lost, with the exception of those that have been applied with special comment commands (equivalent to "comment on"). Regards, Andreas
"Andreas Pflug" <pgadmin@pse-consulting.de> writes: > Why do so many people here insist on editing postgresql.conf as primary means > of changing config params? > Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing > postgresql.conf or doing it manually? no, it's awful. > Looking around for different approaches, network appliances come to my mind, > e.g. Cisco routers and PIX. You have 3 ways to configure a pix: > - use a command line (using ssh or telnet, eqivalent to psql); WRITE MEMORY to > make the changes survive a reboot. > - use a web interface (or similar tool) > - use tftp to up/download the complete config in and out, editing the file. > User comments will be lost, with the exception of those that have been applied > with special comment commands (equivalent to "comment on"). And in every major installation I've seen people use the last option. They treat the original text file which is kept elsewhere -- normally checked into some revision control system, tracked and managed like source code -- as the canonical and authoritative version. The fact that you have to go through an especially complicated and awkward process to load it onto the switches is just a technical detail to be worked around. And the fact that somebody could have altered the configuration directly on the switch and forgotten to write it to the configuration file is a danger to be worked around with policy, not a benefit. Essentially what good cisco network admins are doing here is working around the awkward interface using policy and replacing it with a static text configuration file kept elsewhere. Experience with other attempts at automatic tools to edit configuration files like webmin etc are that editing text files which are also under user control is just a terrible idea. You can make it almost work but it's always fragile and can't keep up with the flexibility of free-form text and the creativity of users. What I would suggest is going back to the plan when we added include files. Have an automatically generated config file which is freshly re-generated each time and doesn't try to preserve any user-formatting. Users then include that file at the top of postgresql.conf and override any of the settings by setting them in postgresql.conf after the include. The important thing here is to separate files which are "under postgres control" and those "under user control". That doesn't necessarily mean we have to ban users from touching the automatically generated config file or switch formats, but it relieves us of any responsibility for maintaining free-form text. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Gregory Stark wrote: > "Andreas Pflug" <pgadmin@pse-consulting.de> writes: > > >> Why do so many people here insist on editing postgresql.conf as primary means >> of changing config params? >> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing >> postgresql.conf or doing it manually? >> > > no, it's awful. > > So I wonder why you accept it when configuring schemas. What's the big difference between setting a config param, and creating a table? And ultimately, the config param file format may well look like an SQL command file, restricted to SET only. > And in every major installation I've seen people use the last option. They > treat the original text file which is kept elsewhere -- normally checked into > some revision control system, tracked and managed like source code -- as the > canonical and authoritative version. > That's how you'd have to manage the schema sources too, no? Your comments are lost as well after schema creation scripts are executed, and manual changes may interfere with that. Regards, Andreas
"Andreas Pflug" <pgadmin@pse-consulting.de> writes: > Gregory Stark wrote: >> "Andreas Pflug" <pgadmin@pse-consulting.de> writes: >> >>> Why do so many people here insist on editing postgresql.conf as primary means >>> of changing config params? >>> Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing >>> postgresql.conf or doing it manually? >> >> no, it's awful. > > So I wonder why you accept it when configuring schemas. What's the big > difference between setting a config param, and creating a table? Yes, this is a big impedence mismatch between the old-school DBMS (with emphasis on the MS part of the acronym) architecture and more modern approaches. You don't configure your web site with "ALTER PAGE home.html SET FORM contactus INPUT TYPE SUBMIT" either... This is partly out of necessity though as those database objects contain *data*. So merely replacing them with new data objects doesn't give the system enough information to understand what to do with that data. The DDL commands which modify the schema give that kind of data mutating instruction. And I'll note that DBAs go to *great* lengths to do exactly the same kind of thing I described the CISCO DBAs doing. Usually that means doing an ALTER command and separately editing a creation script and trying to keep the two in sync. It's a huge burden. > And ultimately, the config param file format may well look like an SQL command > file, restricted to SET only. So all you have is our existing file except with an additional layer of quoting to deal with, a useless SET keyword to annoy users, and a file that you need a bison parser to deal instead of a simple keyword-value syntax that sed can manipulate. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Gregory Stark wrote: > So all you have is our existing file except with an additional layer of > quoting to deal with, a useless SET keyword to annoy users, and a file that > you need a bison parser Don't you think that's a little over the top, throwing bison at the simple task to extend postgresql.conf scanning so it accepts --, /**/ and SET? > to deal instead of a simple keyword-value syntax that > sed can manipulate. > > sed seems to be THE killer application... Apart from the fact that sed easily could identify SET, my answer would be to use psql to modify the config, not sed. Text config files are NOT friendly for beginner and mediocre users. IMHO the current restriction on GUC changes is a major obstacle towards pgsql tuning tools, e.g. written as a Google SoC project. Graphic tools aren't too popular at pgsql-hackers, but please contemplate a little how much pgadmin may have contributed to the pgsql usage boost, esp. on windows. Regards, Andreas
On Wednesday, June 4, 2008, Aidan Van Dyk wrote: > > When reading this thread, I'm wondering if anybody ever saw a config > > file for a complex software product that was easily editable and > > understandable. I don't know one. If there was one, it'd be nice to know > > it so we can learn from it. > > PostgreSQL, Apache, X.org And note that one of the major advances in X.org over XFree86 was that all the useless garbage was removed from the configuration file, so that the final and usable configuration fits on one screen, and you can even write it from memory if you dare to.
On Wednesday, June 4, 2008, Tom Lane wrote: > * Can we present the config options in a more helpful way (this is 99% > a documentation problem, not a code problem)? ack > * Can we build a "configuration wizard" to tell newbies what settings > they need to tweak? Some questions to clarify this: - What settings do "newbies" (or anyone else) typically need to change? Please post a list. - What values would you set those settings to? Please provide a description for arriving at a value, which can later be transformed into code. Note that in some cases, not even the documentation provides more than handwaving help. - If we know better values, why don't we set them by default? Another orthogonal stumbling block on the way to making all of this automatic is that the surely critical shared_buffers setting will in any useful configuration require messing around with kernel settings that no PostgreSQL tool can really help with.
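To make the kernel-settings point concrete: on Linux the server's shared memory request (roughly shared_buffers plus some overhead) has to fit under kernel.shmmax, so the most a tool can do is compute and print the change for the admin. A sketch, with the overhead fraction as a rough placeholder rather than a documented formula:

# Sketch only: estimate the SysV shared memory needed for a given
# shared_buffers setting and print the Linux sysctl an admin would run.
def required_shmmax(shared_buffers_mb, overhead_fraction=0.20):
    # 20% overhead is a rough placeholder, not a documented formula
    return int(shared_buffers_mb * (1 + overhead_fraction)) * 1024 * 1024

target_mb = 2048
print("# shared_buffers = %dMB needs roughly:" % target_mb)
print("sysctl -w kernel.shmmax=%d" % required_shmmax(target_mb))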
On Thursday, June 5, 2008, Tom Lane wrote: > How far could we get with the answers to just three questions: > > * How many concurrent queries do you expect to have? > > * How much RAM space are you willing to let Postgres use? > > * How much "overhead" disk space are you willing to let Postgres use? This is surely a good start. We could optimize this even more by saying, disk space is cheap, so let's just use a much higher default setting for checkpoint_segments. (If PostgreSQL is installed but not actually used, not all the space is actually going to be used anyway.) Then, increase max_connections a bit; that should be OK for most users. Then you are left with the memory settings, and those need kernel tuning in most cases, so any automation tool loses. Hmm.
* Andreas Pflug <pgadmin@pse-consulting.de> [080606 04:50]: > David E. Wheeler wrote: > > > >How about a simple rule, such as that machine-generated comments start > >with "##", while user comments start with just "#"? I think that I've > >seen such a rule used before. At any rate, I think that, unless you > >have some sort of line marker for machine-generated comments, there > >will be no way to tell them apart from user comments. > > Two heretical questions: > Do we need user generated comments at all? > I can't remember ever having used any comment in postgresql.conf. Well, I have, mainly to leave traces as to what and why I've changed something from a default/previous value, based on "change, hope, and test" style tuning. And the one that Greg brought up earlier: ## Don't make this too high, or linux OOM will kill the server!!! I'm guessing that comment was put in for a reason too. > Why do so many people here insist on editing postgresql.conf as primary > means of changing config params? > Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as sed'ing > postgresql.conf or doing it manually? I would guess mainly for 3 reasons: 1) It's always been that way, it's the traditional "unix" way, and many admins are comfortable being able to comment/make quick test changes with simple text files. 2) Their postgresql.conf files are distributed/synced/generated from a central provisioning/SCM system 3) PostgreSQL the server isn't even running > Looking around for different approaches, network appliances come to my > mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix: > - use a command line (using ssh or telnet, equivalent to psql); WRITE > MEMORY to make the changes survive a reboot. > - use a web interface (or similar tool) > - use tftp to up/download the complete config in and out, editing the > file. User comments will be lost, with the exception of those that have > been applied with special comment commands (equivalent to "comment on"). And, of course, other server software comes to mind too: apache, bind, postfix, sendmail, dhcpd, sshd, cron, xinetd... ;-) a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
* Peter Eisentraut <peter_e@gmx.net> [080606 08:25]: > On Wednesday, June 4, 2008, Aidan Van Dyk wrote: > > > When reading this thread, I'm wondering if anybody ever saw a config > > > file for a complex software product that was easily editable and > > > understandable. I don't know one. If there was one, it'd be nice to know > > > it so we can learn from it. > > > > PostgreSQL, Apache, X.org > > And note that one of the major advances in X.org over XFree86 was that all the > useless garbage was removed from the configuration file, so that the final > and usable configuration fits on one screen, and you can even write it from > memory if you dare to. Ya, and unfortunately, I haven't got my dual-head to work with recent distro X.org either, probably because I haven't found out how/where/what to set in Xorg.conf for the new version (and my old one doesn't "just work" on new X.org either). So I've got an old x.org version pinned, even though I'm pretty sure that a newer X.org is probably better, despite my lack of ability to configure it. Go figure ;-) -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
"Andreas Pflug" <pgadmin@pse-consulting.de> writes: > Gregory Stark wrote: >> So all you have is our existing file except with an additional layer of >> quoting to deal with, a useless SET keyword to annoy users, and a file that >> you need a bison parser > > Don't you think that's a little over the top, throwing bison at the simple task > to extend postgresql.conf scanning so it accepts --, /**/ and SET? Well you're the one who proposed generalising the simple text file format to SET commands. The whole point of a machine-editable format would be to make it more restricted and predictable, not less so. We had a proposal to do a postgresql.conf.auto which was automatically generated included from a free-form postgresql.conf. That file could generated in a simple format without worrying about formatting, comments, ordering, etc. > Text config files are NOT friendly for beginner and mediocre users. IMHO the > current restriction on GUC changes is a major obstacle towards pgsql tuning > tools, e.g. written as a Google SoC project. Graphic tools aren't too popular > at pgsql-hackers, but please contemplate a little how much pgadmin may have > contributed to the pgsql usage boost, esp. on windows. Like it or not computers actually have to store state when you're done entering it via the GUI. If you design around the GUI you end up with system that can *only* be used via a GUI and spend years trying to work around that (witness Windows which is only now with a lot of effort recovering from that mistake). People got all focused on comments and sed, but in fact there are a host of reasons to want to be able to directly edit the state stored in a simple text file. You might like to group together related options, for example, perhaps grouping together the options that your site have to adjust between dev and production -- and probably have instructions on how to make the transition in a commented section. I happen to think being presented with 200 options is less confusing than being presented with an empty file and being told to go read the documentation to find out whether I might want to put anything in it. The documentation is a good reference but isn't good as a "here's what you'll want to check before you start" guide. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
"Peter Eisentraut" <peter_e@gmx.net> writes: > And note that one of the major advances in X.org over XFree86 was that all the > useless garbage was removed from the configuration file, so that the final > and usable configuration fits on one screen, and you can even write it from > memory if you dare to. This is a good point, though I think the big advance wasn't in actually removing the garbage so much as making it irrelevant. That is, the advances were in making the server actually work automatically with less manual configuration. That's always a noble goal. I would be interested in what parameters people thought we could eliminate from the initial config file because the server normally gets it right anyways and users shouldn't have to adjust it anyways. (I'm specifically *not* referring to sections like the kerberos parameters which even if most users can leave them alone those who do use that feature *do* need to adjust those parameters because the server does *not* get them right automatically) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Andreas Pflug <pgadmin@pse-consulting.de> writes: > Text config files are NOT friendly for beginner and mediocre users. IMHO > the current restriction on GUC changes is a major obstacle towards pgsql > tuning tools, e.g. written as a Google SoC project. Graphic tools aren't > too popular at pgsql-hackers, but please contemplate a little how much > pgadmin may have contributed to the pgsql usage boost, esp. on windows. I grow weary of this thread. I will say it once more: I do not believe for one instant that the current formatting of postgresql.conf is the major impediment, or even a noticeable impediment, to producing a useful configuration wizard. If you wish to prove otherwise, provide a complete wizard except for the parts that touch the config file, and I will promise to finish it. I will not read or respond to any further discussion of changing the config file format. It's a waste of bandwidth. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes: > - If we know better values, why don't we set them by default? The problem is: better for what? In particular, I'm uncomfortable with any changes in the direction of trying to make Postgres take over the entire machine by default. I'd want some fairly explicit permission from the user for that ... regards, tom lane
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> - If we know better values, why don't we set them by default? > > The problem is: better for what? In particular, I'm uncomfortable with > any changes in the direction of trying to make Postgres take over the > entire machine by default. I'd want some fairly explicit permission > from the user for that ... That is where some 80% solution sample config files come in. Joshua D. Drake > > regards, tom lane >
On Friday, June 6, 2008, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > - If we know better values, why don't we set them by default? > > The problem is: better for what? In particular, I'm uncomfortable with > any changes in the direction of trying to make Postgres take over the > entire machine by default. I'd want some fairly explicit permission > from the user for that ... Yes, those are decisions we are going to have to make, eventually. But recall the three step process: 1. What values need changing? 2. What to change them to? 3. Could that be the new default value? I haven't seen a proposal for item 1 yet, so the rest is idle discussion at this time.
Peter Eisentraut wrote: > On Friday, June 6, 2008, Tom Lane wrote: >> Peter Eisentraut <peter_e@gmx.net> writes: >>> - If we know better values, why don't we set them by default? >> The problem is: better for what? In particular, I'm uncomfortable with >> any changes in the direction of trying to make Postgres take over the >> entire machine by default. I'd want some fairly explicit permission >> from the user for that ... > > Yes, those are decisions we are going to have to make, eventually. But recall > the three step process: > > 1. What values need changing? shared_buffers work_mem maintenance_work_mem checkpoint_segments wal_sync_method effective_cache_size > I haven't seen a proposal for item 1 yet, so the rest is idle discussion at > this time. I think those cover the biggest low hanging fruit; async_commit is arguable. Sincerely, Joshua D. Drake
On Friday, June 6, 2008, Joshua D. Drake wrote: > That is where some 80% solution sample config files come in. Considering that writing a sample configuration file is trivial, yet I haven't seen a single one posted in the six or more years of GUC, I have no faith in this plan until I actually see it implemented.
Peter Eisentraut wrote: > Am Freitag, 6. Juni 2008 schrieb Joshua D. Drake: >> That is where some 80% solution sample config files come in. > > Considering that writing a sample configuration file is trivial, yet I haven't > seen a single one posted in the six or more years of GUC, I have no faith in > this plan until I actually see it implemented. I fail to see why anyone would bother with the amount of noise on this thread. If people start leaning toward an actual solution, I am sure the work will get done. Joshua D. Drake
On Jun 5, 2008, at 23:08, Heikki Linnakangas wrote: > What comments do we consider machine-generated? Just the ones used > to comment out settings, like > > #shared_buffers = 32MB > > or something else? Those and documentation comments. > If the automatic tool leaves alone all other kinds of comments, I think > we're fine. In fact, it wouldn't necessarily need to modify those > comments either, it could simply add a new setting line below that: > > #shared_buffers = 32MB > shared_buffers = 1024MB Well, we've been talking about having varying levels of documentation in the comments of the file based on the options passed to the configuration program. I think that these are the primary concern, though Greg, please do correct me if I'm mistaken. > For extra safety, it could comment out old settings, perhaps with > something like this: > > #shared_buffers = 32MB > #shared_buffers = 1024MB # commented out by wizard on 2008-06-05 > shared_buffers = 2048MB > > This would preserve a full change history in the file. It would > become quite messy after a lot of changes, of course, but a user can > trim the history by hand if he wants to. I guess that could be a feature. Personally, I use a vcs system for that. Best, David
On Jun 6, 2008, at 01:50, Andreas Pflug wrote: > Two heretical questions: > Do we need user generated comments at all? > I can't remember ever having used any comment in postgresql.conf. That's a valid point. I've used comments to note by whom and when a setting was changed. > Why do so many people here insist on editing postgresql.conf as > primary means of changing config params? > Isn't a psql -c "SET foo=bar; MAKE PERSISTENT" just as good as > sed'ing postgresql.conf or doing it manually? I think that there has been enough pushback against housing all the settings in the database, not to mention that it calls for an API, that just starting with something simpler to parse the file and rewrite it from the command-line might be a better first step. > Looking around for different approaches, network appliances come to > my mind, e.g. Cisco routers and PIX. You have 3 ways to configure a > pix: > - use a command line (using ssh or telnet, equivalent to psql); WRITE > MEMORY to make the changes survive a reboot. > - use a web interface (or similar tool) > - use tftp to up/download the complete config in and out, editing > the file. User comments will be lost, with the exception of those > that have been applied with special comment commands (equivalent to > "comment on"). I think the first option there is the one that's been getting the most support here. Best, David
On Wednesday 04 June 2008 15:48:47 Andrew Dunstan wrote: > simply remove all the comment lines from your > config file. +1. That would clear up a lot of confusion on its own. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Saturday 31 May 2008 17:34:27 David E. Wheeler wrote: > On May 31, 2008, at 12:36, Gregory Stark wrote: > > What this sounds like is a sly way to try to get rid of > > postgresql.conf > > entirely and replace it with parameters stored in the database so > > admins would > > adjust the parameters using an SQL syntax rather than a text file. > > > > There are pros and cons of such a system but I think for newbie > > admins that > > would be a thousand times *more* baffling. You would have to learn new > > commands and have no holistic view of what parameters had been set, > > what > > related parameters might exist. You also have no way to keep the > > file in a > > version control system or sync across servers etc. > > FWIW, this has not been a barrier to MySQL adoption. > Agreed. Using SQL-like syntax is often a more familiar operation for DBAs than editing a text file. Consider the number of DBAs that manage other DBs strictly through graphical interfaces (something we are rather lacking in the ability to do, comparatively). Consider also that if this is built atop the current pg_settings, all the normal sql tools can still help... i.e. pg_dump -t if you want to check something into svn. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
* David E. Wheeler <david@kineticode.com> [080606 12:22]: > I guess that could be a feature. Personally, I use a vcs system for > that. Bugger... Now we only need to make postgresql check postmaster.conf into git everytime it makes a change... ;-) -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
On Fri, 2008-06-06 at 13:07 -0400, Aidan Van Dyk wrote: > * David E. Wheeler <david@kineticode.com> [080606 12:22]: > > > I guess that could be a feature. Personally, I use a vcs system for > > that. > > Bugger... Now we only need to make postgresql check postmaster.conf > into git everytime it makes a change... Just work with the FAM api to automate it :P Joshua D. Drake > > ;-) > >
> Bugger... Now we only need to make postgresql check postmaster.conf > into git everytime it makes a change... Been there, wrote that. (for postgresql.conf anyway). - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200806061351 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Gregory Stark wrote: > >> Text config files are NOT friendly for beginner and mediocre users. IMHO the >> current restriction on GUC changes is a major obstacle towards pgsql tuning >> tools, e.g. written as a Google SoC project. Graphic tools aren't too popular >> at pgsql-hackers, but please contemplate a little how much pgadmin may have >> contributed to the pgsql usage boost, esp. on windows. >> > > Like it or not computers actually have to store state when you're done > entering it via the GUI. If you design around the GUI you end up with system > that can *only* be used via a GUI and spend years trying to work around that > (witness Windows which is only now with a lot of effort recovering from that > mistake). > I never advocated a file format that isn't editable any more; au contraire. And the statement that a GUI configuration contradicts manual editing is plain wrong, even with most windows software (if written nicely to the registry, you could dump the key, edit and reload it, or use regedit). OTOH, you can't make most windows users happy with a text file version only. Regards, Andreas
Tom Lane wrote: > I grow weary of this thread. I will say it once more: I do not believe > for one instant that the current formatting of postgresql.conf is the > major impediment, or even a noticeable impediment, to producing a useful > configuration wizard. If you wish to prove otherwise, provide a > complete wizard except for the parts that touch the config file, and > I will promise to finish it. > > I will not read or respond to any further discussion of changing the > config file format. It's a waste of bandwidth. > Your statement doesn't really surprise me. Apparently you kind of celebrate misunderstanding my point, which isn't primarily about the file format, but about config param accessibility via api/SQL. I personally wouldn't even think about starting such a wizard, unless I have an idea how to push the result into the database. No, not a file, but via SQL! So your statement that you won't react unless a wizard is almost ready is prohibitive, apart from the fact that not only wizards (featuring AI) are interesting, but simple config tools as well. Regards, Andreas
On Monday 02 June 2008 10:12:06 Tom Lane wrote: > I have no objection to providing alternative ways to edit the > configuration data, but the primary source of the settings is > going to continue to be an editable text file. Any proposals for > alternatives-to-a-text-editor have to work within that reality. > I think there is some disagreement that using text files needs to be the "primary" way to edit configurations. Some people are of the opinion that we should focus on text files as a secondary method, generally to be used only when a more sql/gui oriented way won't work (i.e. your shared buffer restart scenario). I think most of the user base would like to approach administration from that point-of-view, and as of yet I haven't seen a technical reason why that world view is wrong, only philosophical ones. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Fri, 6 Jun 2008, Peter Eisentraut wrote: > - What settings do "newbies" (or anyone else) typically need to change? > Please post a list. > - What values would you set those settings to? Please provide a description > for arriving at a value, which can later be transformed into code. Note that > in some cases, not even the documentation provides more than handwaving help. Josh's spreadsheet at http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc provides five different models for setting the most critical parameters based on different types of workloads. Everyone can quibble over the fine tuning, but having a good starter set of reasonable settings for these parameters is a solved problem. It's just painful to build a tool to apply the available expert knowledge that is already around. > - If we know better values, why don't we set them by default? Because there's not enough information available; the large differences between how you tune for different workloads is one example. Another is that people tune for peak and projected activity rather than just what's happening right now. Every model suggested for a tuning wizard recognizes you need to ask some set of questions to nail things down. I continue to repeat in broken-record style, exactly what a tuning tool will ask about and what settings it will suggest is not important, and getting into that is an entirely different discussion (one that gets hashed out every single day on pgsql-performance). The fact that writing such a tool is harder than it should be is the issue here. > Another orthogonal stumbling block on the way to making all of this > automatic is that the surely criticial shared_buffers setting will in > any useful configuration require messing around with kernel settings > that no PostgreSQL tool can really help with. Yes. So? All you can do is point this out to users. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Wednesday 04 June 2008 22:04:54 Greg Smith wrote: > On Wed, 4 Jun 2008, Aidan Van Dyk wrote: > > * Are we always spilling small amounts of data to disk for sorting? A > > small work_mem increase might help... > > I was just talking to someone today about building a monitoring tool for > this. Not having a clear way to recommend people monitor use of work_mem > and its brother spilled to disk sorts is an issue right now; I'll whack > that one myself if someone doesn't beat me to it before I get time. > I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and print out any time something showed up... you could probably find that in the archives if you look around. of course to me this sounds like an excellent idea for a dtrace probe ;-) -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Jun 6, 2008, at 12:22 PM, Greg Smith wrote: > On Fri, 6 Jun 2008, Peter Eisentraut wrote: > >> - What settings do "newbies" (or anyone else) typically need to >> change? >> Please post a list. >> - What values would you set those settings to? Please provide a >> description >> for arriving at a value, which can later be transformed into code. >> Note that >> in some cases, not even the documentation provides more than >> handwaving help. > > Josh's spreadsheet at http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc > provides five different models for setting the most critical > parameters based on different types of workloads. Everyone can > quibble over the fine tuning, but having a good starter set of > reasonable settings for these parameters is a solved problem. > It's just painful to build a tool to apply the available expert > knowledge that is already around. I hope to have something in a week or so that's a first cut at that tool. I'm aiming at a GUI tool at first, as I think that's the main need, though once the basics are done an interactive or non- interactive CLI version shouldn't be a big deal. Cheers, Steve
"Andreas Pflug" <pgadmin@pse-consulting.de> writes: > I personally wouldn't even think about starting such a wizard, unless I have an > idea how to push the result into the database. No, not a file, but via SQL! So > your statement you won't react unless a wizard is almost ready is prohibitive, > apart from the fact that not only wizards (featuring AI) are interesting, but > simple config tools as well. Well there's a perfectly good place to start today. Dump out a config file with the settings your wizard or config tool wants to change from their current values. We could come up with a canonical filename for such a file and include an include directive for it in the standard configuration file or you could just include instructions saying what include line to add to your config file in the installation instructions for your tool. Great, I'm glad we've resolved that issue. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
On Fri, 6 Jun 2008, Heikki Linnakangas wrote:

> Or perhaps we should explicitly mark the settings the tool has generated, and
> comment out:
>
> #shared_buffers = 32MB  # commented out by wizard on 2008-06-05
> shared_buffers = 1024MB # automatically set by wizard on 2008-06-05

What I would like to do is make the tool spit out a revision history in the same way I find all "big IT" shops handling this already: by putting a revision history style commentary above the current setting. Here's a sample:

# 2008-03-02 : 32MB   : postgres    : Database default
# 2008-05-02 : 512MB  : pg_autotune : Wizard update
# 2008-05-15 : 1024MB : gsmith      : Increased after benchmark tests
shared_buffers = 1024MB

If the first tuning tool that comes into existence used this format, and the format was reasonable, I think it would be possible to get people making manual edits to adopt it as well. The exact details of how this should look are off-topic for the main discussion here, though, so I'd prefer if this whole line of discussion died off. Anyone who wants to comment on this whole area, feel free to contact me off-list or edit the Wiki page (which has a section on this topic now) to hash out suggestions in this area; I'm trying to keep this thread somewhat focused now. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thursday 05 June 2008 15:15:14 Greg Smith wrote: > (1) is in that proposal but is strictly optional as something to put in > the configuration file itself. The idea behind (2) is to enable tool > authors to have an easier way to suggest where to head for more > information. I'd like for it to be trivial for a tool to say "Suggested > value for <x> is <y>; see > http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html > for more information". I know what most of the settings I tinker with do, > but even I'd like it to be easier to find the right spot in the manual; > for newbies it's vital. You are correct that (2) isn't strictly necessary > here, but it's valuable and will be easier to wrap into this than to bolt > on later. > One idea I have been kicking around is having every guc have an anchor in the website (rather than anchoring on parameter family). It might be enough to just populate the search bot with every guc anchored to family though... -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Friday 06 June 2008 08:35:00 Peter Eisentraut wrote: > On Wednesday, 4 June 2008, Tom Lane wrote: > > * Can we present the config options in a more helpful way (this is 99% > > a documentation problem, not a code problem)? > > ack > > > * Can we build a "configuration wizard" to tell newbies what settings > > they need to tweak? > > Some questions to clarify this: > > - What settings do "newbies" (or anyone else) typically need to change? > Please post a list. > i do have a defined set (sort of a "top ten you need to know") that i tell people about in my postgres 101 tutorials... off the top of my head it looks like this:

shared_buffers
effective_cache_size
default_stats_target
work_mem
maintenance_work_mem
listen_addresses
max_connections
the fsm parameters
checkpoint_segments
random_page_cost

i think others close to the list are constraint_exclusion (unless it defaults to on now), max_prepared_transactions (set it to 0 unless you use it... i may stop recommending this after noticing its effects on max # system locks allowed), and then the logging parameters (which is to say, you need to set up logging that works, however you want to do it). There are a couple more that might go on the list, like synchronous_commit, but it's on the fence for now. > - What values would you set those settings to? Please provide a > description for arriving at a value, which can later be transformed into > code. Note that in some cases, not even the documentation provides more > than handwaving help. > I can provide this if you want, will need to look over my notes. one issue we've faced in the past with this is something like shared_buffers, where the setting is based on 1) dedicated server?, 2) available ram, 3) amount of read/write traffic, 4) disk subsystem. Those types of input are hard to quantify in code. > - If we know better values, why don't we set them by default? > There is a saying, something like "The accumulation of anecdotes is not data". Well, we seem to have a high bar on what proof we need to actually change a default GUC setting. default_statistics_target is a prime example, where almost no one i know has ever recommended 10 as a default, or suggests setting it to 10 as a way to improve performance, but the effort to get it changed to something more reasonable has been monumental. > Another orthogonal stumbling block on the way to making all of this > automatic is that the surely criticial shared_buffers setting will in any > useful configuration require messing around with kernel settings that no > PostgreSQL tool can really help with. yep. seems it might be possible to just compare the shared_buffers setting with the kernel parameters before making the change though... not sure in which way you would slant the output though. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Fri, 6 Jun 2008, Tom Lane wrote: > I grow weary of this thread. If we keep it up for, oh, another three years, then maybe you'll be as weary as I am of struggling with problems in this area. Striking a balance between the wants and needs of people who want a fancy GUI tool for configuring database settings with those who want to edit things manually is a difficult problem that is not going away. If this didn't keep coming back to haunt me all the time I'd like to forget about it myself. > I will say it once more: I do not believe for one instant that the > current formatting of postgresql.conf is the major impediment, or even a > noticeable impediment, to producing a useful configuration wizard. Arguments about formatting changes to postgresql.conf are a tangent to the central questions here, and having just closed some open comments on that I am with you on ignoring those as off-topic the same way I keep minimizing "what are the parameters to tune?" comments. Here are the relevant questions, around since the first message, that are not attracting discussion:

1) Is it worthwhile to expand the information stored in the GUC structure to make it better capable of supporting machine generation and to provide more information for tool authors via pg_settings? The exact fields that should or shouldn't be included remains controversial; consider "default value", "per-session/runtime/restart", and "enum lists" as the list of things that are most needed there.

2) Should the sample postgresql.conf file be replaced by a program that generates it using that beefed up structure instead, therefore removing one file that has to be manually kept in sync with the rest of the code base right now?

3) What now makes sense for a way to update database parameters for users whose primary (or only, in some cases) access to the server is over the database port, given the other changes have improved automatic config file generation?

> If you wish to prove otherwise, provide a complete wizard except for the > parts that touch the config file, and I will promise to finish it. You do realize that if I provided you with such a sample, the not-yet-implemented "config API" stubs it needs to work would be exactly what the proposal page suggests adding, right? I (and Josh) didn't just make them all up out of nowhere, you know. I wrote a message here already about the seemingly inevitable path the budding "wizard tool hacker" follows and why that leads into some of the changes suggested. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Robert Treat wrote: > One idea I have been kicking around is having every guc have a anchor in the > website (rahter than anchoring on parameter family). It might be enough to > just populate the search bot with every guc anchored to family though... +1 on the anchor per variable. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Joshua D. Drake wrote: > Tom Lane wrote: >> Peter Eisentraut <peter_e@gmx.net> writes: >>> - If we know better values, why don't we set them by default? >> >> The problem is: better for what? > > That is where some 80% solution sample config files come in. +1. At work I use 3 templates:

* One for salespeople's demo laptops.
* One for a developer's desktop.
* One for our bigger production servers.

The old, old default postgresql.conf used to be nice for the first group. The newer set of defaults is nicer for the second group. Emailing the lists is currently the best way of tuning for that last case. I wonder if the fastest way to generate the configurator would be to simply ask everyone to post their tuned postgresql.conf files along with a brief description of the use case for that file. Then we could group the use-cases into various classes and average the values of the submitted files. The configurator would then ask just one question: "choose which use case most closely matches yours from this list".
"Greg Smith" <gsmith@gregsmith.com> writes: > 1) Is it worthwhile to expand the information stored in the GUC structure to > make it better capable of supporting machine generation and to provide more > information for tool authors via pg_settings? The exact fields that should or > shouldn't be included remains controversial; consider "default value", > "per-session/runtime/restart", and "enum lists" as the list of things that are > most needed there. Isn't that a list of what's *already* there? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
On Thu, 05 Jun 2008 12:53:55 -0700 Ron Mayer wrote: > Steve Atkins wrote: > > ... cross-platform (Windows, Linux, Solaris, OS X as a bare > > minimum) > > I wonder how cross-platform the tuning algorithm itself is. > > I could also imagine that decisions like "do I let the OS page > cache, or postgres's buffer cache get most of the memory" are > extremely OS dependent. But you can hide most of the internal stuff from the user, either by generating the config file for the platform the tool is running on or by an option like "generate config for platform xyz". If you have cross-platform already in place, this should not be much overhead. Kind regards -- Andreas 'ads' Scherbaum German PostgreSQL User Group
Robert Treat wrote: > On Wednesday 04 June 2008 22:04:54 Greg Smith wrote: > >> I was just talking to someone today about building a monitoring tool for >> this. Not having a clear way to recommend people monitor use of work_mem >> and its brother spilled to disk sorts is an issue right now, I'll whack >> that one myself if someone doesn't beat me to it before I get time. >> > > I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and > print out anytime something showed up... you could probably find that in the > archives if you look around. > > of course to me this sounds like an excellent idea for a dtrace probe ;-) > > Actually, you can find out from the sort-end probe now whether or not the sort spilled to disk and number of disk blocks used. This is one of the probes from Simon. TRACE_POSTGRESQL_SORT_END(state->tapeset, (state->tapeset ? LogicalTapeSetBlocks(state->tapeset) : (state->allowedMem - state->availMem + 1023) / 1024)); -Robert
On Fri, 6 Jun 2008, Gregory Stark wrote: > "Greg Smith" <gsmith@gregsmith.com> writes: > >> 1) Is it worthwhile to expand the information stored in the GUC structure to >> make it better capable of supporting machine generation and to provide more >> information for tool authors via pg_settings? The exact fields that should or >> shouldn't be included remains controversial; consider "default value", >> "per-session/runtime/restart", and "enum lists" as the list of things that are >> most needed there. > > Isn't that a list of what's *already* there? I should have been clearer there. Some of the items suggested are already in the structure, but aren't visible via pg_settings. In those cases it's just exporting information that's already there. In others (like the suggestion to add a URL to the documentation) it is actually a new field being added as well as its corresponding entry in the settings view. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
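[For reference, the sort of query a tool can already issue against the view as it stands (column names as in 8.3); the proposal under discussion is about widening what a query like this can return, for example adding a recommendation or documentation field alongside these:

    SELECT name, setting, unit, context, source, min_val, max_val, short_desc
      FROM pg_settings
     WHERE name IN ('shared_buffers', 'work_mem', 'wal_sync_method');
]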
gsmith@gregsmith.com (Greg Smith) writes:

> On Fri, 6 Jun 2008, Heikki Linnakangas wrote:
>
>> Or perhaps we should explicitly mark the settings the tool has
>> generated, and comment out:
>>
>> #shared_buffers = 32MB # commented out by wizard on 2008-06-05
>> shared_buffers = 1024MB # automatically set by wizard on 2008-06-05
>
> What I would like to do is make the tool spit out a revision history
> in the same way I find all "big IT" shops handling this already: by
> putting a revision history style commentary above the current setting.
> Here's a sample:
>
> # 2008-03-02 : 32MB : postgres : Database default
> # 2008-05-02 : 512MB : pg_autotune : Wizard update
> # 2008-05-15 : 1024MB : gsmith : Increased after benchmark tests
> shared_buffers = 1024MB
>
> If the first tuning tool that comes into existence used this format,
> and the format was reasonable, I think it would be possible to get
> people making manual edits to adopt it as well.

My first reaction to this is... "And why is it that you seem to imagine it a good idea to recreate RCS, badly?" While that may be an unkind reaction, I don't think it's a particularly wrong one! Getting into "comment formatting" certainly *does* seem like an attempt to do, poorly, what RCS does. And I am pointing at a tool from 1985 as opposed to "modern" ones to point out that there were better approaches to managing versioning of simple text documents *THIRTY YEARS AGO*. Indeed, CVS, which we are still using, doesn't fundamentally do anything, in this context, that RCS didn't. I kind of wish that there was some sort of library for "doing RCS"; even in the absence of that, it would be perfectly logical for a tool that manages this to DEMAND that there be a "diff" utility or some equivalent available. In the absence of that, let me suggest that it would be preferable to follow VMS's lead and, if you "automagically" make any changes, create a version called "postgresql.conf;[n]" where "[n]" is either: a) 1, if "postgresql.conf;1" does not exist, or b) 1 + the largest integer found. Replace ";" with some other character as need be. Add a line (or a few lines) at the top of the current version that indicates some bit of commented-out metadata that is obviously formatted and easily recognizable, such as:

### [pgauto] postgresql.conf-autotuner - ran 2008-07-02
### [pgauto] little bit of safely-turfable header
### [pgauto] further little bit of safely-turfable header

In effect, what it does is thus:
- Keep the old copy as postgresql.conf;24 (because we have some 24 elder versions)
- Strip off any leading lines that begin with "### [pgauto]"
- Put any interesting comments in a set of leading lines that begin with "### [pgauto]"

Merits, over some wacky rewrite-comment-formatting:
- This requires only the thinnest bit of "special handling" of comments.
- It doesn't force in a dependency on [pick-someone's-pet-SCM that you happen to hate]
- Even in the absence of an SCM, it keeps versioning information around
- If you have a "diff," it'll work perfectly well, and be reasonably informative
- It doesn't throw ANY old data away!

Sorry, that's more than enough rant for today! :-) -- "cbbrowne","@","linuxdatabases.info" http://cbbrowne.com/info/rdbms.html Rules of the Evil Overlord #220. "Whatever my one vulnerability is, I will fake a different one. For example, ordering all mirrors removed from the palace, screaming and flinching whenever someone accidentally holds up a mirror, etc. In the climax when the hero whips out a mirror and thrusts it at my face, my reaction will be ``Hmm...I think I need a shave.''" <http://www.eviloverlord.com/>
Gregory Stark wrote: > "Andreas Pflug" <pgadmin@pse-consulting.de> writes: > >> I personally wouldn't even think about starting such a wizard, unless I have an >> idea how to push the result into the database. No, not a file, but via SQL! So >> your statement you won't react unless a wizard is almost ready is prohibitive, >> apart from the fact that not only wizards (featuring AI) are interesting, but >> simple config tools as well. > > Well there's a perfectly good place to start today. Dump out a config file I think I made my point very clear when stating "not a file, but via SQL". Though I'm not a native English speaker, and I'm sure you understood. I must assume you're polluting this thread deliberately in order to sabotage the original intention of this thread. I find this disgusting. Ok, trying to contribute gui tools for pgsql is for masochists. We have vi, sed and grep, no more tools required, right? Regards, Andreas
On Sat, 2008-06-07 at 01:30 +0200, Andreas Pflug wrote: > Gregory Stark wrote: > > "Andreas Pflug" <pgadmin@pse-consulting.de> writes: > I think I made my point very clear when stating "not a file, but via > SQL". Though I'm not a native English speaker, and I'm sure you > understood. I must assume you're polluting this thread deliberately in > order to sabotage the original intention of this thread. I find this > disgusting. > > Ok, trying to contribute gui tools for pgsql is for masochists. We have > vi, sed and grep, no more tools required, right? Hold on guys, let's not get all out of whack. No one in their right mind is going to suggest that vi, sed and grep are the only tools we need to do this. However a GUI wizard isn't really a way to go either. SQL? Yes it would be very nice to be able to:

BEGIN;
SET shared_buffers to 65536
    COMMENT IS 'Since I have 4G of ram I want 512 megs of shared buffers';
SET effective_cache_size to 262144
    COMMENT IS 'Since I have 4G of ram I want 1 gig of effective_cache';
COMMIT;

When that is done, it would write out the postgresql.conf. Regardless of all of this, this thread is way out of hand. We went from overhauling the configuration file to wizards, sample postgresql.conf files and possibly even half of us migrating to MySQL :P. Can we refine the topic a bit? Joshua D. Drake
Robert Treat <xzilla@users.sourceforge.net> writes: > There is a saying, something like "The accumulation of annecdotes is not > data". Well, we seem to have a high bar on what proof we need to actually > change a default GUC settings. default_statistics_target is a prime example, > where almost no one i know has ever recommended 10 as a default, or suggests > setting it to 10 as an way to improve performance, but the effort to get it > changed to something more reasonable has been monumental. Actually, the reason it's still 10 is that the effort expended to get it changed has been *ZERO*. I keep asking for someone to make some measurements, do some benchmarking, anything to make a plausible case for a specific higher value as being a reasonable place to set it. The silence has been deafening. regards, tom lane
Greg Smith <gsmith@gregsmith.com> writes: > On Fri, 6 Jun 2008, Gregory Stark wrote: >> "Greg Smith" <gsmith@gregsmith.com> writes: >>> 1) Is it worthwhile to expand the information stored in the GUC structure to >>> make it better capable of supporting machine generation and to provide more >>> information for tool authors via pg_settings? The exact fields that should or >>> shouldn't be included remains controversial; consider "default value", >>> "per-session/runtime/restart", and "enum lists" as the list of things that are >>> most needed there. >> >> Isn't that a list of what's *already* there? > I should have been clearer there. Some of the items suggested are already > in the structure, but aren't visible via pg_settings. Well, you can't see the default or reset values in pg_settings, only the current value. However, I fail to see the use of either of those for a configure wizard. It'll presumably be attached to a fresh connection so the reset value is not different from the current; and any decent wizard is going to know perfectly well what the defaults are. > ... In others (like the > suggestion to add a URL to the documentation) it is actually a new field > being added as well as its corresponding entry in the settings view. Offhand I would argue that we should choose the URLs in such a way that they can be derived automatically if you know the PG version number and GUC variable name. (We're almost there already, except that you have to know which section of the chapter it's in.) So the need to have a pg_settings column seems questionable --- seems like it would mostly be clutter. Perhaps instead of a view column, it would be useful to encapsulate the algorithm as a function? pg_variable_help_url(name) regards, tom lane
On Fri, 6 Jun 2008, Tom Lane wrote: > Well, you can't see the default or reset values in pg_settings, only the > current value. However, I fail to see the use of either of those for > a configure wizard. I'm under the impression that the primary reason to put the default in there is to make it easier for a file generator program to be decoupled a bit from the internal representation. Regardless, these values should be exposed for tool writers. If you build a prototype interface for an interactive settings changing tool, you quickly discover that showing the default, range, and recommended setting are all valuable things people would like to see when deciding what to change a setting to. And there's no reason accumulating all that info should be the responsibility of a tool writer when it's easy to expose and keep up to date inside the database itself. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote: > Robert Treat <xzilla@users.sourceforge.net> writes: > Actually, the reason it's still 10 is that the effort expended to get it > changed has been *ZERO*. I keep asking for someone to make some > measurements, do some benchmarking, anything to make a plausible case > for a specific higher value as being a reasonable place to set it. > The silence has been deafening. Not surprising really. It is a simple adjustment to make and it also is easy to spot when it's a problem. However it is not trivial to test for (in terms of time and effort). I know 10 is wrong and so do you. If you don't I am curious why I see so many posts from you saying, "Your estimates are off, what is your default_statistics_target?" with yet even more responses saying, "Uhh 10." Sincerely, Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> writes: > Not surprising really. It is a simple adjustment to make and it also is > easy to spot when its a problem. However it is not trivial to test for > (in terms of time and effort). I know 10 is wrong and so do you. Sure. But what is right? I'm afraid to just push it to (say) 100 because of the possibility of O(N^2) behavior in eqjoinsel. Somebody needs to do some measurements on somewhat realistic scenarios. regards, tom lane
"Joshua D. Drake" <jd@commandprompt.com> writes: > On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote: >> Robert Treat <xzilla@users.sourceforge.net> writes: > >> Actually, the reason it's still 10 is that the effort expended to get it >> changed has been *ZERO*. I keep asking for someone to make some >> measurements, do some benchmarking, anything to make a plausible case >> for a specific higher value as being a reasonable place to set it. > >> The silence has been deafening. > > Not surprising really. It is a simple adjustment to make and it also is > easy to spot when its a problem. However it is not trivial to test for > (in terms of time and effort). I know 10 is wrong and so do you. If you > don't I am curious why I see so many posts from you saying, "Your > estimates are off, what is your default_statistics_target?" with yet > even more responses saying, "Uhh 10." Ah, but we only ever hear about the cases where it's wrong of course. In other words even if we raised it to some optimal value we would still have precisely the same experience of seeing only posts on list about it being insufficient. What's needed is some speed benchmarks for complex queries with varying size statistics so we can see how badly large statistic tables hurt planning time. The flip side of seeing how much larger tables help planning accuracy is much harder to measure. Offhand I don't see any systematic way to go about it. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
On Sunday 08 June 2008 19:07:21 Gregory Stark wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: > > On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote: > >> Robert Treat <xzilla@users.sourceforge.net> writes: > >> > >> Actually, the reason it's still 10 is that the effort expended to get it > >> changed has been *ZERO*. I keep asking for someone to make some > >> measurements, do some benchmarking, anything to make a plausible case > >> for a specific higher value as being a reasonable place to set it. > >> > >> The silence has been deafening. > > > > Not surprising really. It is a simple adjustment to make and it also is > > easy to spot when it's a problem. However it is not trivial to test for > > (in terms of time and effort). I know 10 is wrong and so do you. If you > > don't I am curious why I see so many posts from you saying, "Your > > estimates are off, what is your default_statistics_target?" with yet > > even more responses saying, "Uhh 10." > > Ah, but we only ever hear about the cases where it's wrong of course. In > other words even if we raised it to some optimal value we would still have > precisely the same experience of seeing only posts on list about it being > insufficient. > The flip side to this is that we're not trying to find the perfect setting, we're just trying to determine a number that will cause more benefit than harm compared to the number we have now. While I am sure there are cases where 100 is too low as well, I cannot recall ever having seen someone suggest lowering the default_stats_target to something less than 100. (I now sit back and wait for someone to comb the archives, just to find that 1 time). -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Tom, >>> Actually, the reason it's still 10 is that the effort expended to get it >>> changed has been *ZERO*. I keep asking for someone to make some >>> measurements, do some benchmarking, anything to make a plausible case >>> for a specific higher value as being a reasonable place to set it. >>> The silence has been deafening. >> Not surprising really. It is a simple adjustment to make and it also is >> easy to spot when it's a problem. However it is not trivial to test for >> (in terms of time and effort). I know 10 is wrong and so do you. If you >> don't I am curious why I see so many posts from you saying, "Your >> estimates are off, what is your default_statistics_target?" with yet >> even more responses saying, "Uhh 10." I tried (back in 7.4) to do some systematic testing of this. The problem is that the cases where higher d_s_t is required are specifically ones with complex, unbalanced data distributions and/or very large databases. This makes test cases extremely difficult and time-consuming to generate; further, I found that the test cases I had from my clients' databases were not portable (in addition to being confidential). Also, I'd actually assert that "10" seems to be perfectly adequate for the majority of users. That is, the number of users where I've recommended increasing d_s_t for the whole database is smaller than the number where I don't, and of course we never hear from most users at all. So I'm pretty happy recommending "Leave the default. If you encounter problem queries, increase it to 100, and analyse the database. If you're running a data warehouse, increase it to 1000." Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. --Josh
"Josh Berkus" <josh@agliodbs.com> writes: > Where analyze does systematically fall down is with databases over 500GB in > size, but that's not a function of d_s_t but rather of our tiny sample size. Speak to the statisticians. Our sample size is calculated using the same theory behind polls which sample 600 people to learn what 250 million people are going to do on election day. You do NOT need (significantly) larger samples for larger populations. In fact where those polls have difficulty is the same place we have some problems. For *smaller* populations like individual congressional races you need to have nearly the same 600 sample for each of those small races. That adds up to a lot more than 600 total. In our case it means when queries cover a range much less than a whole bucket then the confidence interval increases too. Also, our estimates for n_distinct are very unreliable. The math behind sampling for statistics just doesn't work the same way for properties like n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
On 6/9/08, Gregory Stark <stark@enterprisedb.com> wrote:
"Josh Berkus" <josh@agliodbs.com> writes:
> Where analyze does systematically fall down is with databases over 500GB in
> size, but that's not a function of d_s_t but rather of our tiny sample size.
n_distinct. For that Josh is right, we *would* need a sample size proportional
to the whole data set which would practically require us to scan the whole
table (and have a technique for summarizing the results in a nearly constant
sized data structure).
Hi,
Is this (summarizing results in a constant-sized data structure) something which could be achieved by Bloom filters?
http://archives.postgresql.org/pgsql-general/2008-06/msg00076.php
Kind regards
Hakan Kocaman
"Hakan Kocaman" <hkocam@googlemail.com> writes: > On 6/9/08, Gregory Stark <stark@enterprisedb.com> wrote: >> >> n_distinct. For that Josh is right, we *would* need a sample size >> proportional to the whole data set which would practically require us to >> scan the whole table (and have a technique for summarizing the results in a >> nearly constant sized data structure). > > is this (summarizing results in a constant sized data structure) something > which could be achived by Bloom-Filters ? Uhm, it would be a bit of a strange application of them but actually it seems to me that would be a possible approach. It would need a formula for estimating the number of distinct values given the number of bits set in the bloom filter. That should be a tractable combinatorics problem (in fact it's pretty similar to the combinatorics I posted a while back about getting all the drives in a raid array busy). And if you have a dynamic structure where the filter size grows then it would overestimate because extra copied bits would be set. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Greg, > Speak to the statisticians. Our sample size is calculated using the same > theory behind polls which sample 600 people to learn what 250 million > people are going to do on election day. You do NOT need (significantly) > larger samples for larger populations. Your analogy is bad. For elections, the voters have only a few choices. In a 300 million row table, there could be 300 million different values, and the histogram becomes less accurate for every order of magnitude smaller than 300 million it is. > Also, our estimates for n_distinct are very unreliable. The math behind > sampling for statistics just doesn't work the same way for properties > like n_distinct. For that Josh is right, we *would* need a sample size > proportional to the whole data set which would practically require us to > scan the whole table (and have a technique for summarizing the results > in a nearly constant sized data structure). Actually, a number of papers have shown block-based algorithms which can arrive at reasonably confident (between 50% and 250% of accurate) estimates based on scanning only 5% of *blocks*. Simon did some work on this a couple years ago, but he and I had difficulty convincing -hackers that a genuine problem existed. You're correct that we'd need to change pg_statistic, though. For one thing, we need to separate the sample size from the histogram size. Also, we seem to be getting pretty far away from the original GUC discussion. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
"Josh Berkus" <josh@agliodbs.com> writes: > Greg, > >> Speak to the statisticians. Our sample size is calculated using the same >> theory behind polls which sample 600 people to learn what 250 million >> people are going to do on election day. You do NOT need (significantly) >> larger samples for larger populations. > > Your analogy is bad. For elections, the voters have only a few choices. > In a 300 million row table, there could be 300 million different values, > and the histogram becomes less accurate for every order of magnitude > smaller than 300 million it is. I think you're right that you need, for example, 600 people *for each answer* to give good poll results. So for a two-way election which is about even that's about 1,200 people. If one of the candidates is much less popular you might have to sample many more people before you have 600 people in that bucket. The analogous case in our situation is not having 300 million distinct values, since we're not gathering info on specific values, only the buckets. We need, for example, 600 samples *for each bucket*. Each bucket is chosen to have the same number of samples in it. So that means that we always need the same number of samples for a given number of buckets. >> Also, our estimates for n_distinct are very unreliable. The math behind >> sampling for statistics just doesn't work the same way for properties >> like n_distinct. For that Josh is right, we *would* need a sample size >> proportional to the whole data set which would practically require us to >> scan the whole table (and have a technique for summarizing the results >> in a nearly constant sized data structure). > > Actually, a number of papers have shown block-based algorithms which can > arrive a reasonably confident (between 50% and 250% of accurate) estimates > based on scanning only 5% of *blocks*. Simon did some work on this a > couple years ago, but he and I had difficultly convincing -hackers that a > genuine problem existed. Really? Could you send references? The paper I read surveyed previous work and found that you needed to scan up to 50% of the table to get good results. 50-250% is considerably looser than what I recall it considering "good" results so these aren't entirely inconsistent but I thought previous results were much worse than that. > You're correct that we'd need to change pg_statistic, though. For one > thing, we need to separate the sample size from the histogram size. That amounts to giving users control over the sample size per bucket. Which allows them to get a more or less accurate estimate for a range covering a single bucket without changing the size of the bucket. I'm a bit puzzled which direction you want to go. > Also, we seem to be getting pretty far away from the original GUC > discussion. Thank heavens :) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Greg, > The analogous case in our situation is not having 300 million distinct > values, since we're not gathering info on specific values, only the > buckets. We need, for example, 600 samples *for each bucket*. Each bucket > is chosen to have the same number of samples in it. So that means that we > always need the same number of samples for a given number of buckets. I think that's plausible. The issue is that in advance of the sampling we don't know how many buckets there *are*. So we first need a proportional sample to determine the number of buckets, then we need to retain a histogram sample proportional to the number of buckets. I'd like to see someone with a PhD in this weighing in, though. > Really? Could you send references? The paper I read surveyed previous work > and found that you needed to scan up to 50% of the table to get good > results. 50-250% is considerably looser than what I recall it considering > "good" results so these aren't entirely inconsistent but I thought previous > results were much worse than that. Actually, based on my several years selling performance tuning, I generally found that as long as estimates were correct within a factor of 3 (33% to 300%) the correct plan was generally chosen. There are papers on block-based sampling which were already cited on -hackers; I'll hunt through the archives later. -- Josh Berkus PostgreSQL @ Sun San Francisco
Ron, > I wonder if the fastest way to generate the configurator > would be to simply ask everyone to post their tuned > postgresql.conf files along with a brief description of > the use case for that file. Then we could group the > use-cases into various classes; and average the values > of the submitted files. Then the configurator's one > question "choose which use case most closely matches > yours from this list". This has been proposed a number of times, but is still infeasible for the simple reason that a lot of settings are dependent on how much RAM you have on the machine. "Sample configs" would just result in people trying to set their shared_buffers to 150% of system RAM if they have a lightweight system, or in having 500 different sample configurations, which would produce even *more* user confusion. A generator makes more sense. -- Josh Berkus PostgreSQL @ Sun San Francisco
Robert,

> shared_buffers
> effective_cache_size
> default_stats_target
> work_mem
> maintenance_work_mem
> listen_addresses
> max_connections
> the fsm parameters
> checkpoint_segments
> random_page_cost

My list is very similar, except that I drop random_page_cost and add synchronous_commit, autovacuum (on or off) and offer 4 "packages" of log settings. So I think that we can get consensus on what are the "newbie" parameters pretty easily. -- Josh Berkus PostgreSQL @ Sun San Francisco
On Tuesday 10 June 2008 09:37, Josh Berkus wrote:
> Robert,
>
> > shared_buffers
> > effective_cache_size
> > default_stats_target
> > work_mem
> > maintenance_work_mem
> > listen_addresses
> > max_connections
> > the fsm parameters
> > checkpoint_segments
> > random_page_cost
>
> My list is very similar, except that I drop random_page_cost and add
> synchronous_commit, autovacuum (on or off) and offer 4 "packages" of log
> settings.

Oh, and wal_buffers, the default for which we should just change if it weren't for SHMMAX. -- Josh Berkus PostgreSQL @ Sun San Francisco
Josh Berkus <josh@agliodbs.com> writes: > Oh, and wal_buffers, the default for which we should just change if it > weren't for SHMMAX. Uh, why? On a workload of mostly small transactions, what value is there in lots of wal_buffers? regards, tom lane
Gregory Stark wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> ...default_statistics_target?"..."Uhh 10." > > Ah, but we only ever hear about the cases where it's wrong of course. In other > words even if we raised it to some optimal value we would still have precisely > the same experience of seeing only posts on list about it being insufficient. Yet if it were raised above the optimum, wouldn't you expect to hear about cases where it was too high? It seems a slow, painful way of finding a good setting would be to keep increasing it until the lists start seeing cases where people are told to lower the parameter.
Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: >> Oh, and wal_buffers, the default for which we should just change if it >> weren't for SHMMAX. > > Uh, why? On a workload of mostly small transactions, what value is > there in lots of wal_buffers? None. But there's also little to no harm in having a higher setting; at worst you waste a few megabytes of memory. Besides, most databases are initialized from some outside source in the beginning, and data loading does benefit from a higher wal_buffers setting. Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Greg Sabino Mullane wrote: > * The word 'paramters' is still misspelled. :) Corrected for 8.4. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: > Greg Sabino Mullane wrote: > > * The word 'paramters' is still misspelled. :) > > Corrected for 8.4. Technically this is a bug fix... why not backpatch it too? > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + >
Heikki, > Ideally, of course, there would be no wal_buffers setting, and WAL > buffers would be allocated from shared_buffers pool on demand... > +1 --Josh
Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: >> Oh, and wal_buffers, the default for which we should just change if it >> weren't for SHMMAX. > > Uh, why? On a workload of mostly small transactions, what value is > there in lots of wal_buffers? Actually, it's also useful for any workload with many connections. Any time you have high throughput, really. We've seen this on DBT2, SpecJAppserver and iGen. --Josh
Josh Berkus wrote: > Heikki, > >> Ideally, of course, there would be no wal_buffers setting, and WAL >> buffers would be allocated from shared_buffers pool on demand... Same for pg_subtrans, pg_clog, etc (as previously discussed) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Robert Lor wrote: > Robert Treat wrote: > > On Wednesday 04 June 2008 22:04:54 Greg Smith wrote: > > > >> I was just talking to someone today about building a monitoring tool for > >> this. Not having a clear way to recommend people monitor use of work_mem > >> and its brother spilled to disk sorts is an issue right now, I'll whack > >> that one myself if someone doesn't beat me to it before I get time. > >> > > > > I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and > > print out anytime something showed up... you could probably find that in the > > archives if you look around. > > > > of course to me this sounds like an excellent idea for a dtrace probe ;-) > > > > > > Actually, you can find out from the sort-end probe now whether or not > the sort spilled to disk and number of disk blocks used. This is one of > the probes from Simon. > > TRACE_POSTGRESQL_SORT_END(state->tapeset, > (state->tapeset ? > LogicalTapeSetBlocks(state->tapeset) : > (state->allowedMem - state->availMem + 1023) / > 1024)); 8.3 has GUC log_temp_files. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
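[A usage note on the parameter Bruce mentions, since it is easy to miss: log_temp_files takes a threshold in kilobytes, with 0 logging every temporary file and -1 (the default) disabling the logging; it can be set in postgresql.conf or, by a superuser, per session:

    SET log_temp_files = 0;   -- log every sort/hash spill file when it is removed
]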
Joshua D. Drake wrote: > > > On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: > > Greg Sabino Mullane wrote: > > > * The word 'paramters' is still misspelled. :) > > > > Corrected for 8.4. > > Technically this is a bug fix... why not backpatch it too? That might show up as a diff for people doing upgrades where the minor version changed the spelling. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Joshua D. Drake wrote: >> >> On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: >>> Greg Sabino Mullane wrote: >>>> * The word 'paramters' is still misspelled. :) >>> Corrected for 8.4. >> Technically this is a bug fix... why not backpatch it too? > > That might show up as a diff for people doing upgrades where the minor > version changed the spelling. People upgrading won't see it, I think. You only see it when you do a new initdb.. //Magnus
Josh Berkus wrote: > Heikki, > >> Ideally, of course, there would be no wal_buffers setting, and WAL >> buffers would be allocated from shared_buffers pool on demand... >> > > +1 > > --Josh > +1 -Jignesh
Dave Page wrote: > On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd@commandprompt.com> wrote: > > >> I've seen people not doing so more often > >> than you would think. Perhaps because they are DBAs and not sysadmins? I > >> also > >> meant a tool to do things like verify that the changes are valid, as > >> someone > >> else mentioned elsewhere in this thread. > > > > pg_ctl -D data check? > > > > I would +1 that. > > I would also really like to see that - though I'd also like to see an > SQL interface so we can check a config before saving when editing via > pgAdmin or similar. Should this be a TODO? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce@momjian.us> wrote: > Dave Page wrote: >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd@commandprompt.com> wrote: >> >> > pg_ctl -D data check? >> > >> > I would +1 that. >> >> I would also really like to see that - though I'd also like to see an >> SQL interface so we can check a config before saving when editing via >> pgAdmin or similar. > > Should this be a TODO? Yes please. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Dave Page wrote: > On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce@momjian.us> wrote: > > Dave Page wrote: > >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd@commandprompt.com> wrote: > >> > >> > pg_ctl -D data check? > >> > > >> > I would +1 that. > >> > >> I would also really like to see that - though I'd also like to see an > >> SQL interface so we can check a config before saving when editing via > >> pgAdmin or similar. > > > > Should this be a TODO? > > Yes please. Added to TODO: * Add pg_ctl option to do a syntax check of postgresql.conf -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Alvaro Herrera <alvherre@commandprompt.com> writes: > Josh Berkus wrote: >> Ideally, of course, there would be no wal_buffers setting, and WAL >> buffers would be allocated from shared_buffers pool on demand... > Same for pg_subtrans, pg_clog, etc (as previously discussed) I agree with that for pg_clog and friends, but I'm much more leery of folding WAL into the same framework. Its access pattern is *totally* unlike standard caches, so the argument that this would be good for performance is resting on nothing but imagination. Also I'm concerned about possible deadlocks, because WAL is customarily accessed while holding one or more exclusive buffer locks. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Tom Lane wrote: > >> * Can we build a "configuration wizard" to tell newbies what settings > >> they need to tweak? > > > That would trump all the other suggestions conclusively. Anyone good at > > expert systems? > > How far could we get with the answers to just three questions: > > * How many concurrent queries do you expect to have? > > * How much RAM space are you willing to let Postgres use? > > * How much "overhead" disk space are you willing to let Postgres use? > > concurrent queries drives max_connections, obviously, and RAM space > would drive shared_buffers and effective_cache_size, and both of them > would be needed to size work_mem. The third one is a bit weird but > I don't see any other good way to set the checkpoint parameters. > > If those aren't enough questions, what else must we ask? Or maybe they > aren't the right questions at all --- maybe we should ask "is this a > dedicated machine or not" and try to extrapolate everything else from > what we (hopefully) can find out about the hardware. Having returned from Japan, I read through this thread. It had lots of ideas (new format for postgresql.conf, more/less comments in postgresql.conf) but I didn't see any of the ideas getting a majority. I think we do a good job of making many settings automatic (meaning no one even sees them), but we don't do a great job of making the visible settings easy to set, both in the process (no GUI) and in knowing the proper value. There are two ideas I did think had merit. First, using ## for system-supplied comments, so user comments would be easier to identify. There might be value in doing that even if it were not helpful for scripts. The second idea is the idea of having one parameter depend on another. Not only could we do that for some of our existing parameters, but we could have pseudo-parameters like concurrent_queries, memory_usage, and extra_disk_space that could be at the top of postgresql.conf and then affect the other settings. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
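[As a back-of-the-envelope illustration of how far the three questions Tom quotes could already get a generator, using nothing more than the rules of thumb that have circulated in this thread; the percentages and the assumed machine (dedicated, 4GB of RAM, 100 concurrent queries, ~512MB of acceptable WAL overhead) are illustrative assumptions, not an official formula:

    -- purely illustrative arithmetic on the three answers
    SELECT 100                              AS max_connections,
           (4096 / 4)::text || 'MB'         AS shared_buffers,        -- ~25% of RAM
           (4096 * 3 / 4)::text || 'MB'     AS effective_cache_size,  -- ~75% of RAM
           (4096 / 4 / 100)::text || 'MB'   AS work_mem,              -- the buffer share split per connection
           (512 / (16 * 3))                 AS checkpoint_segments;   -- 16MB segments, roughly 3x on disk
]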
Bruce Momjian <bruce@momjian.us> writes: > The second idea is the idea of having one parameter depend on another. > Not only could we do that for some of our existing parameters, but we > could have pseudo-parameters like concurrent_queries, memory_usage, and > extra_disk_space that could be at the top of postgresql.conf and then > affect the other settings. We have tried to do that in the past, and it didn't work well *at all*. The idea has a fundamental logical flaw, which is that it's not clear which parameter wins if the user changes both. regards, tom lane
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > The second idea is the idea of having one parameter depend on another. > > Not only could we do that for some of our existing parameters, but we > > could have pseudo-parameters like concurrent_queries, memory_usage, and > > extra_disk_space that could be at the top of postgresql.conf and then > > affect the other settings. > > We have tried to do that in the past, and it didn't work well *at all*. We have? When? > The idea has a fundamental logical flaw, which is that it's not clear > which parameter wins if the user changes both. Yes, you could get into problems by having variable dependency loops, but I see no way to easily improve configuration without it. My idea was to have:

	memory_usage = 128MB	## pseudo-parameter
	shared_buffers = $memory_usage * 0.75

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Josh Berkus wrote: >>> Ideally, of course, there would be no wal_buffers setting, and WAL >>> buffers would be allocated from shared_buffers pool on demand... > >> Same for pg_subtrans, pg_clog, etc (as previously discussed) > > I agree with that for pg_clog and friends, but I'm much more leery of > folding WAL into the same framework. Its access pattern is *totally* > unlike standard caches, so the argument that this would be good for > performance is resting on nothing but imagination. Also I'm concerned > about possible deadlocks, because WAL is customarily accessed while > holding one or more exclusive buffer locks. Well it may still be worthwhile stealing buffers from shared_buffers even if we set a special flag marking them as owned by WAL and out of bounds for the normal buffer manager. At least that way we could always steal more if we want or return some, as long as we're careful about when we do it. That would open the door to having these parameters be dynamically adjustable. That alone would be worthwhile even if we bypass all bells and whistles of the buffer manager. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Greg, > At least that way we could always steal more if we want or return some, as > long as we're careful about when we do it. That would open the door to having > these parameters be dynamically adjustable. That alone would be worthwhile > even if we bypass all bells and whistles of the buffer manager. > One hitch, though, is that asynchronous commit could consume big chunks of shared_buffers. So we might still need a limit for people who are using async. --Josh
Magnus Hagander wrote: > Bruce Momjian wrote: > > Joshua D. Drake wrote: > >> > >> On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: > >>> Greg Sabino Mullane wrote: > >>>> * The word 'paramters' is still misspelled. :) > >>> Corrected for 8.4. > >> Technically this is a bug fix... why not backpatch it too? > > > > That might show up as a diff for people doing upgrades where the minor > > version changed the spelling. > > People upgrading won't see it, I think. You only see it when you do a > new initdb.. The problem is that people doing initdb with different minor versions will have different stock postgresql.conf files. That isn't a huge problem, but I don't see a need to create the problem just to fix a spelling mistake that few have observed. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > Tom Lane wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> The second idea is the idea of having one parameter depend on another. >> We have tried to do that in the past, and it didn't work well *at all*. > We have? When? Just a couple months ago we had to give up enforcing an interrelationship between NBuffers and MaxConnections, because it got too complicated and un-explainable. I seem to recall some other interactions in the distant past, but a quick look through the CVS logs didn't find any smoking guns. >> The idea has a fundamental logical flaw, which is that it's not clear >> which parameter wins if the user changes both. > Yes, you could get into problems by having variable dependency loops, Who said anything about loops? What I am talking about is what happens duringset memory_usage = X; // implicitly sets work_mem = X/100, sayset work_mem = Y;set memory_usage = Z; What is work_mem now, and what's your excuse for saying so, and how will you document the behavior so that users can understand it? (Just to make things interesting, assume that some of the above SETs happen via changing postgresql.conf rather than directly.) If the objective is to make configuration easier to understand, I don't believe that behind-the-scenes changes of configuration values will advance that goal. > but I see no way to easily improve configuration without it. The higher-level concepts should be things that a configuration wizard works with, and then tells you how to set the postmaster parameters. They should not end up in the configure file (unless maybe as comments?) regards, tom lane
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> I agree with that for pg_clog and friends, but I'm much more leery of >> folding WAL into the same framework. > Well it may still be worthwhile stealing buffers from shared_buffers even if > we set a special flag marking them as owned by WAL and out of bounds for > the normal buffer manager. > At least that way we could always steal more if we want or return some, as > long as we're careful about when we do it. ... and as long as you can acquire the WAL per-buffer management space out of nowhere ... regards, tom lane
"Josh Berkus" <josh@agliodbs.com> writes: > Greg, > >> At least that way we could always steal more if we want or return some, as >> long as we're careful about when we do it. That would open the door to having >> these parameters be dynamically adjustable. That alone would be worthwhile >> even if we bypass all bells and whistles of the buffer manager. >> > > One hitch, though, is that asynchronous commit could consume big chunks of > shared_buffers. So we might still need a limit for people who are using async. Well currently we use a fixed number of fixed-sized buffers, no? I doubt we'll change that even if we take this tact of making wal_buffers resizable by stealing buffers from the buffer manager for precisely the reasons Tom was describing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
On Wed, 11 Jun 2008, Tom Lane wrote: > Who said anything about loops? What I am talking about is what happens > during > set memory_usage = X; // implicitly sets work_mem = X/100, say > set work_mem = Y; > set memory_usage = Z; > What is work_mem now, and what's your excuse for saying so, and how > will you document the behavior so that users can understand it? > (Just to make things interesting, assume that some of the above SETs > happen via changing postgresql.conf rather than directly.) People are already exposed to issues in this area via things like the include file mechanism. You can think of that two ways. You can say, "there's already problems like this so who cares if there's another one". Or, you can say "let's not add even more confusion like that". Having a mini programming language for setting parameters is interesting and all, and it might be enough to do a good job of handling the basic newbie setup chores. But I don't think it's a complete solution and therefore I find moving in that direction a bit of a distraction; your concerns about ambiguity just amplify that feeling. It's unlikely that will get powerful enough to enable the "one true config file" that just works for everybody. There's too many things that depend a bit on both data access pattern and on overall database size/structure no matter what you do. [If only there were some technology that did workload profiling and set the server parameters based on that. Some sort of dynamic tuning tool; wouldn't that be great? Oh well, that's just a dream right now I guess.] I'm not sure if I've stated this explicitly yet, but I personally have no interest in just solving the newbie problem. I want a tool to help out tuning medium to large installs, and generating a simple config file is absolutely something that should come out of that as a bonus. Anything that just targets the simple installs, though, I'm not very motivated to chase after. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Tom Lane wrote: > >> The idea has a fundamental logical flaw, which is that it's not clear > >> which parameter wins if the user changes both. > > > Yes, you could get into problems by having variable dependency loops, > > Who said anything about loops? What I am talking about is what happens > during > set memory_usage = X; // implicitly sets work_mem = X/100, say > set work_mem = Y; > set memory_usage = Z; My initial thought was that this would behave like a shell script variable, meaning once you set something it would affect all references to it below in postgresql.conf. The problem with that is that we comment out all settings, so there isn't a logical order like you would have in a shell script. I was not thinking of memory_usage implicitly changing anything. I figured postgresql.conf would have:
    memory_usage = 100
    work_mem = $memory_usage * 0.75
If you change memory_usage via SET, it will not change work_mem at all because you are not re-initializing the variables. I am kind of lost how this would work logically and am willing to think about it some more, but I do think we aren't going to simplify postgresql.conf without such a facility. The big problem I see is that right now everything has a constant default. If we allowed memory_usage to change some of the defaults, how would we signal that we want the variables based on it to change their values? This is your behind-the-scenes problem you mentioned. > What is work_mem now, and what's your excuse for saying so, and how > will you document the behavior so that users can understand it? > (Just to make things interesting, assume that some of the above SETs > happen via changing postgresql.conf rather than directly.) > > If the objective is to make configuration easier to understand, > I don't believe that behind-the-scenes changes of configuration values > will advance that goal. > > > but I see no way to easily improve configuration without it. > > The higher-level concepts should be things that a configuration wizard > works with, and then tells you how to set the postmaster parameters. > They should not end up in the configure file (unless maybe as comments?) I am concerned that each wizard is going to have to duplicate the same logic each time, and adjust to release-based changes. I thought having the configuration file itself help with setting some values would be helpful. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Greg Smith wrote: > On Wed, 11 Jun 2008, Tom Lane wrote: > > > Who said anything about loops? What I am talking about is what happens > > during > > set memory_usage = X; // implicitly sets work_mem = X/100, say > > set work_mem = Y; > > set memory_usage = Z; > > What is work_mem now, and what's your excuse for saying so, and how > > will you document the behavior so that users can understand it? > > (Just to make things interesting, assume that some of the above SETs > > happen via changing postgresql.conf rather than directly.) > > People are already exposed to issues in this area via things like the > include file mechanism. You can think of that two ways. You can say, > "there's already problems like this so who cares if there's another one". > Or, you can say "let's not add even more confusion like that". > > Having a mini programming language for setting parameters is interesting > and all, and it might be enough to do a good job of handling the basic > newbie setup chores. But I don't think it's a complete solution and > therefore I find moving in that direction a bit of a distraction; your > concerns about ambiguity just amplify that feeling. It's unlikely that > will get powerful enough to enable the "one true config file" that just > works for everybody. There's too many things that depend a bit on both > data access pattern and on overall database size/structure no matter what > you do. You are right that the complete solution is going to have to ask users questions, and my idea of using variables is not going to get us that far. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
> Also, I'd actually assert that "10" seems to be perfectly adequate for > the majority of users. That is, the number of users where I've > recommended increasing d_s_t for the whole database is smaller than the > number where I don't, and of course we never hear from most users at > all. So I'm pretty happy recommending "Leave the default. If you > encounter problem queries, increase it to 100, and analyse the database." Really? I'm the opposite: I never leave a client's setting at 10, that's just asking for trouble. Making it 100 *after* you encounter problem queries is reactive; I prefer being proactive. Nor is a setting of 10 "perfectly adequate": I think you might be the last person on the lists who thinks so. That train has left the station; we've been trying to decide what a better default should be other than 10, and, more to the point, how to quantitatively measure it. The problem is, you really can't. Sure, you can graph a tiny increase in ANALYZE time and disk space, but there are no stock queries we can use to measure an increase in planning time. Frankly, I'd be shocked if there is any significant difference at all compared to the actual query run time. The orders of magnitude speed up of certain queries when the d_s_t goes above 98 is what spawned my original thread proposing a change to 100: http://markmail.org/message/tun3a3juxlsyjbsw While it's easy to get bogged down in theory about what things d_s_t should measure, the optimal size of buckets, etc., it's still a severe performance regression bug that should be fixed, IMO. Changing the subject line as well: this is only tangentially related to overhauling GUCS, although I'll point out again that this particular config is a good example of one that needs more comments. -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200806121213 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
"Greg Sabino Mullane" <greg@turnstep.com> writes: > The orders of magnitude speed up of certain queries when the d_s_t goes > above 98 is what spawned my original thread proposing a change to 100: > http://markmail.org/message/tun3a3juxlsyjbsw That was a pretty special case (LIKE/regex estimation), and we've since eliminated the threshold change in the LIKE/regex estimates anyway, so there's no longer any reason to pick 100 as opposed to any other number. So we're still back at "what's a good value and why?". > Frankly, I'd be shocked if there is any significant difference and all > compared to the actual query run time. I'm still concerned about the fact that eqjoinsel() is O(N^2). Show me some measurements demonstrating that a deep nest of equijoins doesn't get noticeably more expensive to plan --- preferably on a datatype with an expensive equality operator, eg numeric --- and I'm on board. regards, tom lane
"Greg Sabino Mullane" <greg@turnstep.com> writes: >> Also, I'd actually assert that "10" seems to be perfectly adequate for >> the majority of users. That is, the number of users where I've >> recommended increasing d_s_t for the whole database is smaller than the >> number where I don't, and of course we never hear from most users at >> all. So I'm pretty happy recommending "Leave the default. If you >> encounter problem queries, increase it to 100, and analyse the database. > > Really? I'm the opposite: I never leave a client's setting at 10, that's > just asking for trouble. Making it 100 *after* you encounter problem > queries is reactive; I prefer being proactive. Have you ever measured the system speed before and after? > Nor is a setting of 10 "perfectly adequate": What percentage of your plans actually change with the larger statistics? How many for the better? How many were massively improved? I suspect you're looking at some single-digit percentage slowdown for planning across the board. In exchange if you a) have simple queries you probably see none improving. If you b) have moderately complex queries you probably get some single-digit percentage of them with improvements. And if you c) have very complex queries you probably have a handful of them which see massive improvements. Across the internet there are a whole lot more applications of type (a) than the others... > Frankly, I'd be shocked if there is any significant difference and all > compared to the actual query run time. Well you might start preparing to be shocked. Note that retrieving the statistics is a query itself so it's not hard for it to be comparable to a similarly simple query. It's not hard for a simple query using multiple columns to be using more records of statistics than it is from the actual data. And things can look much worse if that data is TOASTed and requires further lookups and/or decompression... > The orders of magnitude speed up of certain queries when the d_s_t goes > above 98 is what spawned my original thread proposing a change to 100: > > http://markmail.org/message/tun3a3juxlsyjbsw > > While it's easy to get bogged down in theory about what things > d_s_t should measure, the optimal size of buckets, etc., it's still > a severe performance regression bug that should be fixed, IMO. It was, three months ago: http://archives.postgresql.org/pgsql-committers/2008-03/msg00129.php 100 is no longer a magic threshold for LIKE queries (in CVS HEAD) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Bruce, > I am concerned that each wizard is going to have to duplicate the same > logic each time, and adjust to release-based changes. I think that's a feature, not a bug. Right now, I'm not at all convinced that my algorithms for setting the various major dials are great (I just think that nobody yet has better). So I think we should *encourage* people to write their own wizards until we find one that works reasonably well. -- Josh Berkus PostgreSQL @ Sun San Francisco
Josh Berkus wrote: > Bruce, > > > I am concerned that each wizard is going to have to duplicate the same > > logic each time, and adjust to release-based changes. > > I think that's a feature, not a bug. Right now, I'm not at all convinced that > my algorithms for setting the various major dials are great (I just think > that nobody yet has better). So I think we should *encourage* people to > write their own wizards until we find one that works reasonably well. I am thinking a web-based wizard would make the most sense. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce, > I am thinking a web-based wizard would make the most sense. I'd prefer command-line, so that people could run it on their own servers. For one thing, we need to generate at least two files on many platforms; a postgresql.conf and a sysctl. -- Josh Berkus PostgreSQL @ Sun San Francisco
Josh Berkus wrote: > Bruce, > > > I am thinking a web-based wizard would make the most sense. > > I'd prefer command-line, so that people could run it on their own servers. > For one thing, we need to generate at least two files on many platforms; a > postgresql.conf and a sysctl. They can just download the files they need from the web page, no? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Jun 12, 2008, at 11:21 AM, Bruce Momjian wrote: > Josh Berkus wrote: >> Bruce, >> >>> I am concerned that each wizard is going to have to duplicate the >>> same >>> logic each time, and adjust to release-based changes. >> >> I think that's a feature, not a bug. Right now, I'm not at all >> convinced that >> my algorithms for setting the various major dials are great (I just >> think >> that nobody yet has better). So I think we should *encourage* >> people to >> write their own wizards until we find one that works reasonably well. > > I am thinking a web-based wizard would make the most sense. There's a definite need for an interactive GUI wizard (bundle with the Windows and OS X installers, at least). And a commandline wizard would certainly be nice, both interactive and non-interactive. Mostly for including in install scripts on unix platforms. And a web-based wizard would be useful too. And all of them would benefit from being able to both modify an existing configuration file, and to generate one from scratch. It looks like it's going to be reasonably easy to abstract away the interface to the user such that the first two (and likely the third) can be built from the same codebase, meaning that the smarts about how to set the various GUC settings (based on RAM available, estimates of database size and usage) can be maintained in one place. Cheers, Steve
On Thu, 12 Jun 2008, Bruce Momjian wrote: > I am thinking a web-based wizard would make the most sense. I have not a single customer I work with who could use an external web-based wizard. Way too many companies have privacy policy restrictions that nobody dare cross by giving out any info about their server, or sometimes that they're even using PostgreSQL inside the firewall. If it's not a tool that you can run on the same server you're running PostgreSQL on, I'd consider that another diversion that's not worth pursuing. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Jun 11, 2008, at 9:34 PM, Bruce Momjian wrote: > Tom Lane wrote: >> Bruce Momjian <bruce@momjian.us> writes: >>> Tom Lane wrote: >>>> The idea has a fundamental logical flaw, which is that it's not >>>> clear >>>> which parameter wins if the user changes both. >> >>> Yes, you could get into problems by having variable dependency >>> loops, >> >> Who said anything about loops? What I am talking about is what >> happens >> during >> set memory_usage = X; // implicitly sets work_mem = X/100, say >> set work_mem = Y; >> set memory_usage = Z; > > My initial thought was that this would behave like a shell script > variable, meaning once you set something it would affect all > references > to it below in postgresql.conf. The problem with that is that we > comment out all settings, so there isn't a logical order like you > would > have in a shell script. > > I was not thinking of memory_usage implicitly changing anything. I > figured postgresql.conf would have: > > memory_usage = 100 > work_mem = $memory_usage * 0.75 > > If you change memory_usage via SET, it will not change work_mem at all > because you are not re-initializing the variables. Why? That's the exact opposite of what I'd expect. If I want a setting's value to be $memory_usage * .75, that's what I want it to be. Not some value based on whatever $memory_usage was set to when work_mem happened to be changed. Of course, if you set something to a hard value with no variables, then that's what it's set to. > I am kind of lost how this would work logically and am willing to > think > about it some more, but I do think we aren't going to simplify > postgresql.conf without such a facility. Agreed. And I think it's a lot more elegant for dealing with day-to-day tuning than some kind of external wizard. > The big problem I see is that right now everything has a constant > default. If we allowed memory_usage to change some of the > defaults, how > would we signal that we want the variables based on it to change their > values? This is your behind-the-scenes problem you mentioned. I would suggest that we just re-evaluate everything whenever any setting is changed (this assumes that we store postgresql.conf internally in some fashion, so that we're not actually hitting the file all the time and possibly picking up random edits). -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Decibel! <decibel@decibel.org> writes: > On Jun 11, 2008, at 9:34 PM, Bruce Momjian wrote: >> I am kind of lost how this would work logically and am willing to >> think >> about it some more, but I do think we aren't going to simplify >> postgresql.conf without such a facility. > Agreed. And I think it's a lot more elegant for dealing with day-to- > day tuning than some kind of external wizard. You guys call this "simplification"? You're out of your minds. This proposal is ridiculously complicated, and yet it still fails even to consider adjusting non-numeric parameters. And what about things that require more than a trivial arithmetic expression to compute? It's not hard at all to imagine wanting log, sqrt, etc. We do not need to put computational capability into GUC. Any computations needed to determine a parameter setting should be done by a wizard. regards, tom lane
> That was a pretty special case (LIKE/regex estimation), and we've since > eliminated the threshold change in the LIKE/regex estimates anyway, so > there's no longer any reason to pick 100 as opposed to any other number. > So we're still back at "what's a good value and why?". Glad to hear that, although I think this is only in HEAD, not backpatched, right? Well at any rate, I withdraw my strong support for 100 and join in the quest for a good number. The "anything but 10" campaign. > I'm still concerned about the fact that eqjoinsel() is O(N^2). Show me > some measurements demonstrating that a deep nest of equijoins doesn't > get noticeably more expensive to plan --- preferably on a datatype with > an expensive equality operator, eg numeric --- and I'm on board. I hope someone else on the list can do this, because I can't. :) -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200806122054 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
>> Really? I'm the opposite: I never leave a client's setting at 10, that's >> just asking for trouble. Making it 100 *after* you encounter problem >> queries is reactive; I prefer being proactive. > Have you ever measured the system speed before and after? Yes. No change (see below on caching) or faster (better plans). >> Nor is a setting of 10 "perfectly adequate": > What percentage of your plans actually change with the larger statistics? How > many for the better? How many were massively improved? It matters not if there is a slight increase in planning time: the penalty of choosing a Bad Plan far outweighs any increased analyze or planning cost, period. Are you arguing that 10 is a good default, or just against larger values in general? > I suspect you're looking at some single-digit percentage slowdown for planning > across the board. In exchange if you a) have simple queries you probably see > none improving. If you b) have moderately complex queries you probably get > some single-digit percentage of them with improvements. And if you c) have > very complex queries you probably have a handful of them which see massive > improvements. Across the internet there are a whole lot more applications of > type (a) than the others... I'm still skeptical that it's the case, but I wouldn't mind seeing some figures about how slowed down a "simple" database gets going from 10 to 100 (or larger). Robert, any chance we can use Pagila for some sort of test for that? >> Frankly, I'd be shocked if there is any significant difference at all >> compared to the actual query run time. > Well you might start preparing to be shocked. Note that retrieving the > statistics is a query itself so it's not hard for it to be comparable to a > similarly simple query. It's not hard for a simple query using multiple > columns to be using more records of statistics than it is from the actual > data. And things can look much worse if that data is TOASTed and requires > further lookups and/or decompression... Even if all you say above is true, and I think we'll have to agree to disagree on that, there's an important point to remember: query plans can be (and very often are) cached. Queries and query results cannot (although I'm working on that... :) Plans to queries is a 1-N relationship, where N can be very, very large, and further boosts the query time vs planning time ratio. ... > 100 is no longer a magic threshold for LIKE queries (in CVS HEAD) That's great, but doesn't help existing releases (unless that was backpatched, but I don't recall it if so) But that's a battle I'm going to stop fighting, and concentrate on helping to find a replacement for 10 that may or may not be 100. -- Greg Sabino Mullane greg@turnstep.com End Point Corporation PGP Key: 0x14964AC8 200806122100 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
> You guys call this "simplification"? You're out of your minds. > This proposal is ridiculously complicated, and yet it still fails > even to consider adjusting non-numeric parameters. And what about > things that require more than a trivial arithmetic expression to > compute? It's not hard at all to imagine wanting log, sqrt, etc. > > We do not need to put computational capability into GUC. Any > computations needed to determine a parameter setting should be done > by a wizard. +1 (save the comment speculating about anyone's location relative to their mind ;) Additionally, obvious as it may be, there's nothing stopping anyone from developing a tool to generate the configuration file from a more "interesting" source. Whether that's XML or some DSL that supports computations, doesn't matter. I would think if such a tool showed dramatic merit it would provoke another discussion about core integration, but ISTM that leaving it dead simple is best. [mm, hook into the postgres startup script, shouldn't be that hard to administer..]
On Wednesday 11 June 2008 16:54:23 Bruce Momjian wrote: > Dave Page wrote: > > On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce@momjian.us> wrote: > > > Dave Page wrote: > > >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd@commandprompt.com> wrote: > > >> > pg_ctl -D data check? > > >> > > > >> > I would +1 that. > > >> > > >> I would also really like to see that - though I'd also like to see an > > >> SQL interface so we can check a config before saving when editing via > > >> pgAdmin or similar. > > > > > > Should this be a TODO? > > > > Yes please. > > Added to TODO: > > * Add pg_ctl option to do a syntax check of postgresql.conf > ISTM we need something that can run inside the db as well, i'm thinking something like pg_check_conf() to go with pg_reload_conf(). Also, these should probably check validity of the pg_hba.conf as well. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat wrote: > On Wednesday 11 June 2008 16:54:23 Bruce Momjian wrote: > > Dave Page wrote: > > > On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian <bruce@momjian.us> wrote: > > > > Dave Page wrote: > > > >> On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake <jd@commandprompt.com> > wrote: > > > >> > pg_ctl -D data check? > > > >> > > > > >> > I would +1 that. > > > >> > > > >> I would also really like to see that - though I'd also like to see an > > > >> SQL interface so we can check a config before saving when editing via > > > >> pgAdmin or similar. > > > > > > > > Should this be a TODO? > > > > > > Yes please. > > > > Added to TODO: > > > > * Add pg_ctl option to do a syntax check of postgresql.conf > > > > ISTM we need something that can run inside the db as well, i'm thinking > something like pg_check_conf() to go with pg_reload_conf(). Also, these > should probably check validity of the pg_hba.conf as well. Agreed, TODO updated: o Add functions to syntax check configuration files -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
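Purely as an illustration of the in-database half of that TODO item, this is what it could look like from a client; pg_check_conf() is only the name suggested above and does not exist yet, while pg_reload_conf() already does:

    SELECT pg_check_conf();   -- hypothetical: true only if postgresql.conf (and pg_hba.conf) parse cleanly
    SELECT pg_reload_conf();  -- existing: signal the postmaster to re-read its configuration files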
On Jun 12, 2008, at 17:55, Greg Sabino Mullane wrote: > Glad to hear that, although I think this is only in HEAD, not > backpatched, > right? Well at any rate, I withdraw my strong support for 100 and > join in > the quest for a good number. The "anything but 10" campaign I vote for 11. That's one louda, in'it? Best, David
Greg Smith wrote: > On Thu, 5 Jun 2008, Magnus Hagander wrote: > >> We really need a "proper API" for it, and the stuff in pgAdmin isn't >> even enough to base one on. > > I would be curious to hear your opinion on whether the GUC overhaul > discussed in this thread is a useful precursor to building such a proper > API. I'm getting back to this one now that the discussion has died down a bit. As usual, the discussion spread out into these huge changes that would need to be done to satisfy everyone from day one. I don't think that's the way to go - we have to do it piece by piece if that's ever going to be done. IMHO the first thing to do is to create a stable API for modifying config variables remotely. *at this time* it doesn't matter how this API modifies the configuration, if it's in a file or in the db or whatever. We can always change that later... Which is why I will be proceeding with this one - make an API that requires the least possible change for now, per discussions earlier this year (not in this thread) :-) Now, this in itself is unrelated to the stuff Josh was talking about, which is collecting the information in one place and being able to generate config files that don't contain everything, and being able to auto-tune things. It would be nice for such a tool to be able to get the full descriptions and such from the pg_settings view or such, which means it needs to go in the code and not in a standalone file, but it's not *as necessary*. I still think the config file we ship today is horrible, I get lots of questions around it, and I see lots of examples of people who tweak settings they have no idea what they do, just because it's there. That's a separate issue that could *either* be solved by shipping more than one default config file, or it could be solved by the config file generator Josh proposed. //Magnus
Added to TODO: o Add external tool to auto-tune some postgresql.conf parameters http://archives.postgresql.org/pgsql-hackers/2008-06/msg00000.php --------------------------------------------------------------------------- Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Tom Lane wrote: > >> * Can we build a "configuration wizard" to tell newbies what settings > >> they need to tweak? > > > That would trump all the other suggestions conclusively. Anyone good at > > expert systems? > > How far could we get with the answers to just three questions: > > * How many concurrent queries do you expect to have? > > * How much RAM space are you willing to let Postgres use? > > * How much "overhead" disk space are you willing to let Postgres use? > > concurrent queries drives max_connections, obviously, and RAM space > would drive shared_buffers and effective_cache_size, and both of them > would be needed to size work_mem. The third one is a bit weird but > I don't see any other good way to set the checkpoint parameters. > > If those aren't enough questions, what else must we ask? Or maybe they > aren't the right questions at all --- maybe we should ask "is this a > dedicated machine or not" and try to extrapolate everything else from > what we (hopefully) can find out about the hardware. > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
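As a toy illustration of how answers to those questions might be turned into starting values -- the 25%/75% ratios below are common rules of thumb assumed only for this example, not anything agreed in this thread, and the disk-space answer (checkpoint sizing) is left out:

    -- answers(max_connections, ram_mb) stands in for the wizard's inputs
    SELECT max_connections,
           (ram_mb * 0.25)::int AS shared_buffers_mb,
           (ram_mb * 0.75)::int AS effective_cache_size_mb,
           greatest(1, (ram_mb * 0.25 / max_connections)::int) AS work_mem_mb
      FROM (VALUES (100, 4096)) AS answers(max_connections, ram_mb);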
<div dir="ltr">Hi there... Configuration autotuning is something I am really interested in.<br />I have seen this page: <ahref="http://wiki.postgresql.org/wiki/GUCS_Overhaul">http://wiki.postgresql.org/wiki/GUCS_Overhaul</a> and<br /> a coupleof emails mentioning this, so I wanted to ask is someone already<br />on it? If yes, I'd like to contribute.<br /><br/>Ideally, an external little app should also provide recommendations based<br />on current database usage statistics-- wouldn't this constitute something<br /> akin to application-specific advice? In this regard, the tool couldalso tell<br />you things like how much RAM you should order if you were to replace your<br />existing database server:-)<br /><br />Michael<br /><br /></div>
On Wed, 13 Aug 2008, Michael Nacos wrote: > Hi there... Configuration autotuning is something I am really interested in. > I have seen this page: http://wiki.postgresql.org/wiki/GUCS_Overhaul and > a couple of emails mentioning this, so I wanted to ask is someone already > on it? If yes, I'd like to contribute. Good time to give a status report on what's been going on with all this. With some help I just finished off an answer to problem #1 there recently, "Most people have no idea how to set these". There was some concern here that work was being done on config tools without a clear vision of what was going to be tuned. See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for an intro on how to set the 18 most important parameters (+7 logging parameters) based on the best information I'm aware of. Circa June, Steve Atkins was looking into writing a C++/Qt GUI tuning interface application, with the idea that somebody else would figure out the actual smarts to the tuning effort. Don't know where that's at. Josh Berkus and I have been exchanging some ideas for the GUC internals overhaul and had a quick discussion about that in person last month. We've been gravitating toward putting all the extra information we'd like to push into there in an extra catalog table (pg_settings_info or something). The stuff the server needs to start can stay right where it is right now, all the other decoration can move to the table. > Ideally, an external little app should also provide recommendations based > on current database usage statistics -- wouldn't this constitute something > akin to application-specific advice? Yes, there's a grand plan for a super-wizard that queries the database for size, index, and statistics information to figure out what to do; I've been beating that drum for a while now. Unfortunately, the actual implementation is blocked behind the dreadfully boring work of sorting out how to organize and manage the GUC information a bit better, and the moderately boring work of building a UI for modifying things. If you were hoping to work on the sexy autotuning parts without doing some of the grunt work, let me know if you can figure out how so I can follow you. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith <gsmith@gregsmith.com> writes: > Josh Berkus and I have been exchanging some ideas for the GUC internals > overhaul and had a quick discussion about that in person last month. > We've been gravitating toward putting all the extra information we'd like > to push into there in an extra catalog table (pg_settings_info or > something). The stuff the server needs to start can stay right where it > is right now, all the other decoration can move to the table. Somehow, the attraction of that idea escapes me. What we have now was named Grand Unified Configuration for a reason: it centralized the handling of what had been a mess of different things configured in different ways. I'm not eager to go backwards on that. I'm also interested to know exactly what such a table would provide that isn't already available in the form of the pg_settings view. regards, tom lane
On Sun, 17 Aug 2008, Tom Lane wrote: > What we have now was named Grand Unified Configuration for a reason: > it centralized the handling of what had been a mess of different things > configured in different ways. I'm not eager to go backwards on that. No need to change anything related to how the configuration is done. There's really only two things wrong with what's there right now IMHO and they don't require any changes to the internals, just what's shown: 1) The view should show both how the user defined the setting and how it's represented internally. Basically something that looks like this: select name,current_setting(name) as input_setting,setting from pg_settings; 2) Expose the default value. > I'm also interested to know exactly what such a table would provide > that isn't already available in the form of the pg_settings view. Links to the relevant documentation and a place to save both default and user comments about the setting were two things being considered that seemed a really bad fit to tack onto the GUC structure. There's some others. The main point is that that nobody wants to have to tinker with the core GUC itself just to decorate it with more information, that is complicated enough as it is. One might make a case that the stuff the GUC must handle (settings, units, type, defaults, etc.) could be usefully separated from all the more documentation-oriented bits stored there right now (category, descriptions), and that the existing documentation bits could move over to the table along with the hyperlinks and such. Doing that adds another place to have to edit, but I think there's an even exchange available there because it enables easy auto-generation of the postgresql.conf file at initdb time from that table + pg_settings. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
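A sketch of roughly what the expanded view output could look like, assuming pg_settings grows a default-value column (spelled boot_val here only for illustration); current_setting() supplies the value the way an admin would write it, while setting stays in internal units:

    SELECT name,
           current_setting(name) AS input_setting,  -- value with units, as the admin would type it
           setting,                                 -- internal representation
           boot_val AS default_value                -- assumed default-value column
      FROM pg_settings
     WHERE setting IS DISTINCT FROM boot_val;       -- list only settings that were changed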
On Aug 17, 2008, at 1:48 PM, Greg Smith wrote: > On Wed, 13 Aug 2008, Michael Nacos wrote: > >> Hi there... Configuration autotuning is something I am really >> interested in. >> I have seen this page: http://wiki.postgresql.org/wiki/ >> GUCS_Overhaul and >> a couple of emails mentioning this, so I wanted to ask is someone >> already >> on it? If yes, I'd like to contribute. > > Good time to give a status report on what's been going on with all > this. > > With some help I just finished off an answer to problem #1 there > recently, "Most people have no idea how to set these". There was > some concern here that work was being done on config tools without a > clear vision of what was going to be tuned. See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server > for an intro on how to set the 18 most important parameters (+7 > logging parameters) based on the best information I'm aware of. > > Circa June, Steve Atkins was looking into writing a C++/Qt GUI > tuning interface application, with the idea that somebody else would > figure out the actual smarts to the tuning effort. Don't know where > that's at. First pass is done. Needs a little cleanup before sharing. I spent a fair while down OS-specific-hardware-queries rathole, but I'm better now. Cheers, Steve
Steve, > First pass is done. Needs a little cleanup before sharing. I spent a > fair while down OS-specific-hardware-queries rathole, but I'm better now. Gods, I hope you gave up on that. You want to use SIGAR or something. --Josh
Josh Berkus wrote: > Steve, > >> First pass is done. Needs a little cleanup before sharing. I spent a >> fair while down OS-specific-hardware-queries rathole, but I'm better now. > > Gods, I hope you gave up on that. You want to use SIGAR or something. If it's going to be C++, and reasonably cross platform, and a pg tool, why not try to build something as a module in pgAdmin? Certainly going to get you a larger exposure... And I'm sure the pgAdmin team would be happy to have it! //Magnus
<div dir="ltr">What I'm interested in is auto-tuning, not necessarily overhauling GUCS, which happens to be the subject ofthis thread :-)<br />Having done a SELECT * FROM pg_settings, all the information you need seems to be there...<br /> MaybeI'm being over-simplistic here, but the important bit is knowing how you should tune stuff - and this is what I'm hopingto learn through this process.<br />Now, you could probably sidestep UI and GUCS concerns by moving the auto-tuningprocess inside the database. You don't need fancy GUIs for guessing configuration parameters, and if you can alreadydo that, coming up with a GUI should be pretty straightforward.<br /> For example, I see no reason why you couldn'tcapture the logic of tuning in a couple of PL/Python functions to look up usage stats, size of indices etc. PL/Pythonbeing an "untrusted" language, you could even write a new postgresql.conf file to disk, with the suggested alterations.Cheap, quick and cheerful!<br /><br />Perhaps the auto-tuning conversation should take place in a separate thread,how do you feel about changing the subject line? The most insteresting bit is discussing and testing tuning strategies.This, of course, is related to the [admin] and [perform] side of things, but there is also a development dimension.As soon as there is a tuning strategy published, a number of tools will certainly follow.<br /><br />Michael<br/><br /><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204,204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Yes, there's a grand plan for a super-wizard that queries the databasefor size, index, and statistics information for figure out what to do; I've been beating that drum for a while now. Unfortunately, the actual implementation is blocked behind the dreadfully boring work of sorting out how to organizeand manage the GUC information a bit better, and the moderately boring work of building a UI for modifying things. If you were hoping to work on the sexy autotuning parts without doing some of the grunt work, let me know if youcan figure out how so I can follow you.<br /><font color="#888888"><br /></font></blockquote></div><br /></div>
On Aug 18, 2008, at 1:05 AM, Magnus Hagander wrote: > Josh Berkus wrote: >> Steve, >> >>> First pass is done. Needs a little cleanup before sharing. I spent a >>> fair while down OS-specific-hardware-queries rathole, but I'm >>> better now. >> >> Gods, I hope you gave up on that. You want to use SIGAR or >> something. > > If it's going to be C++, and reasonably cross platform, and a pg tool, > why not try to build something as a module in pgAdmin? Certainly going > to get you a larger exposure... And I'm sure the pgAdmin team would be > happy to have it! I'm attempting to build it as something that can be used in several places. Where there's most need for it is as an install time option in installers, particularly on Windows. There's no reason the same underlying code couldn't also go into pgAdmin, of course. At the moment the code is a bit Qt specific, reducing that is part of the cleanup. Cheers, Steve
Steve Atkins wrote: > > On Aug 18, 2008, at 1:05 AM, Magnus Hagander wrote: > >> Josh Berkus wrote: >>> Steve, >>> >>>> First pass is done. Needs a little cleanup before sharing. I spent a >>>> fair while down OS-specific-hardware-queries rathole, but I'm better >>>> now. >>> >>> Gods, I hope you gave up on that. You want to use SIGAR or something. >> >> If it's going to be C++, and reasonably cross platform, and a pg tool, >> why not try to build something as a module in pgAdmin? Certainly going >> to get you a larger exposure... And I'm sure the pgAdmin team would be >> happy to have it! > > I'm attempting to build it as something that can be used in several > places. Where there's most need for it is as an install time option > in installers, particularly on Windows. Well, if it was in pgadmin, it would be there more or less by default on Windows. And very easy to get in on other platforms, since pgadmin is already packaged there. Plus, all the dependencies are already there on said platforms. //Magnus
On Mon, 18 Aug 2008, Michael Nacos wrote: > Having done a SELECT * FROM pg_settings, all the information you need > seems to be there... See http://archives.postgresql.org/pgsql-hackers/2008-06/msg00209.php You sound like you're at rung 2 on the tool author ladder I describe there, still thinking about the fun part of tuning but not yet aware of the annoying postgresql.conf management issues that show up in the field that motivate many of the GUCS changes suggested. Coping with user and system-generated comments is one difficult part that people normally don't consider; dealing with bad settings the server won't start with is another. I did make one mistake in that message, which is that the "context" field of pg_settings already exposes when a setting can be changed. And it is possible to get the value for a setting as entered by the admin by joining pg_settings against what current_setting returns, which is one part of handling the import/change/export cycle while keeping useful units intact. > Maybe I'm being over-simplistic here, but the important bit is knowing > how you should tune stuff - and this is what I'm hoping to learn through > this process. The tuning references at the bottom of http://wiki.postgresql.org/wiki/GUCS_Overhaul provide more detail here than anyone has been able to automate so far. There's certainly room to improve on the suggestions there with some introspection of the database, but I'm trying to stay focused on something to help newbies whose settings are completely wrong first. > As soon as there is a tuning strategy published, a number of tools will > certainly follow. Josh Berkus published one in 2005 and zero such tools have been produced since then, even though it looked to him then (like it does to you now and like it did to me once) that such a tool would easily follow: http://pgfoundry.org/docman/?group_id=1000106 The bright side here is that you don't have to waste time tinkering in this area to find out where the dead ends are like Josh and I independently did. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
"Greg Smith" <gsmith@gregsmith.com> writes: > On Mon, 18 Aug 2008, Michael Nacos wrote: > >> Having done a SELECT * FROM pg_settings, all the information you need seems >> to be there... > > See http://archives.postgresql.org/pgsql-hackers/2008-06/msg00209.php You sound > like you're at rung 2 on the tool author ladder I describe there, still > thinking about the fun part of tuning but not yet aware of the annoying > postgresql.conf management issues that show up in the field that motivate many > of the GUCS changes suggested. Coping with user and system-generated comments > is one difficult part that people normally don't consider, Because coping with free-form user-edited text is a losing game. People don't consider it because it's a dead-end. Instead you have one file for user-edited configuration and a separate file for computer generated configuration. You never try to automatically edit a user-edited file -- that way lies madness. > dealing with bad settings the server won't start with is another. A tuning interface can't be turing complete and detect all possible misconfigurations. To do that it would have to be as complex as the server. In any case worrying about things like this before you have a tuning interface that can do the basics is putting the cart before the horse. >> As soon as there is a tuning strategy published, a number of tools will >> certainly follow. > > Josh Berkus published one in 2005 and zero such tools have been produced since > then, even though it looked to him then (like it does to you now and like it > did to me once) that such a tool would easily follow: > http://pgfoundry.org/docman/?group_id=1000106 The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. Perhaps the overlap between that and the people who can write a server-side module which dumps out a config file according to some rules is just too small? I do think you and others make it less likely every time you throw up big insoluble problems like above though. As a consequence every proposal has started with big overly-complex solutions trying to solve all these incidental issues which never go anywhere instead of simple solutions which directly tackle the main problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Greg, > The entire target market for such a thing is DBAs stuck on hosted > databases which don't have shell access to their machines. That's incorrect. The main reason for having a port-based API (such as the SQL command line) for managing your server is that it makes it much easier to manage a large number of servers. Right now, if you want to survey your databases, tables, approx disk space, query activity, etc., you can do that all through port 5432. You can't manage most of your server settings that way, and definitely can't manage the *persistent* settings. When you're trying to manage 1000 PostgreSQL servers, this is not a minor issue. With the growing "cloud" sector, the lack of easy server parameter management is hurting PostgreSQL's adoption for hosted applications. This isn't a new complaint, and is a big part of the reason why 90% of web hosts still don't offer PostgreSQL. I've heard complaints about our manageability problems from more vendors than I can count. HOWEVER, it's completely possible to get a 1st-generation config tool out there without first implementing port-based config access. For one thing, there's Puppet. So that's what I'm intending to do. > I do think you and others make it less likely every time you throw up > big insoluble problems like above though. It's not an insoluble problem. It's a political problem; several people don't want to add this functionality to the project. > As a consequence every > proposal has started with big overly-complex solutions trying to solve > all these incidental issues which never go anywhere instead of simple > solutions which directly tackle the main problem. What, in your opinion, is "the main problem"? I'm not sure we agree on that. -- --Josh Josh Berkus PostgreSQL San Francisco
Gregory Stark <stark@enterprisedb.com> writes: > The entire target market for such a thing is DBAs stuck on hosted databases > which don't have shell access to their machines. Perhaps the overlap between > that and the people who can write a server-side module which dumps out a > config file according to some rules is just too small? There's a veritable boatload of stuff we do that assumes shell access (how many times have you seen cron jobs recommended, for instance?). So I'm unconvinced that "modify the config without shell access" is really a goal that is worth lots of effort. In any case, there's already adequate support for sucking postgresql.conf out of the machine and putting it back: pg_read_file(), pg_file_write(), pg_reload_conf(). So at the end of the day remote access isn't a factor in this at all. > I do think you and others make it less likely every time you throw up big > insoluble problems like above though. As a consequence every proposal has > started with big overly-complex solutions trying to solve all these incidental > issues which never go anywhere instead of simple solutions which directly > tackle the main problem. The impression I get every time this comes up is that various people have different problems they want to solve that (they think) require redesign of the way GUC works. Those complicated solutions arise from attempting to satisfy N different demands simultaneously. The fact that many of these goals aren't subscribed to by the whole community to begin with doesn't help to ease resolution of the issues. regards, tom lane
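The round trip Tom is referring to, roughly; pg_read_file() and pg_reload_conf() are built in, pg_file_write() comes from the adminpack contrib module, all three are superuser-only, and the file name and contents below are placeholders only:

    SELECT pg_read_file('postgresql.conf', 0, 1000000);   -- pull the current file out
    SELECT pg_file_write('postgresql.conf.new',           -- push an edited copy back into the data directory
                         'work_mem = 32MB' || E'\n', false);
    SELECT pg_reload_conf();                               -- ask the postmaster to re-read its files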
"Josh Berkus" <josh@agliodbs.com> writes: > Greg, > >> The entire target market for such a thing is DBAs stuck on hosted >> databases which don't have shell access to their machines. > > That's incorrect. The main reason for having a port-based API (such as the > SQL command line) for managing your server is that it makes it much easier > to manage a large number of servers. Right now, if you want to survey > your databases, tables, approx disk space, query activity, etc., you can > do that all through port 5432. You can't manage most of your server > settings that way, and definitely can't manage the *persistent* settings. > When you're trying to manage 1000 PostgreSQL servers, this is not a minor > issue. This I don't understand. If you're managing lots of servers running lots of software the last thing you want to have to do is write a custom method for updating the configuration of each service. In that use case you would prefer to just use rsync/svn/git to push the new config file to all the machines anyways. > With the growing "cloud" sector, the lack of easy server parameter > management is hurting PostgreSQL's adoption for hosted applications. This > isn't a new complaint, and is a big part of the reason why 90% of web > hosts still don't offer PostgreSQL. I've heard complaints about our > manageability problems from more vendors than I can count. These are both use cases which fall in the category I described where you want to allow users to configure the system through an automated interface. We can do that today by generating the automatically generated section and including that in postgresql.conf as an include file. >> As a consequence every proposal has started with big overly-complex >> solutions trying to solve all these incidental issues which never go >> anywhere instead of simple solutions which directly tackle the main >> problem. > > What, in your opinion, is "the main problem"? I'm not sure we agree on > that. The main problem that I've seen described is what I mentioned before: allowing adjusting the postgresql.conf GUC settings by remote users who don't have shell access. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Greg, > The main problem that I've seen described is what I mentioned before: > allowing adjusting the postgresql.conf GUC settings by remote users who > don't have shell access. Oh, ok. I think we're in agreement, though. I don't think that's the *1st* problem to be solved, but it's definitely important. -- --Josh Josh Berkus PostgreSQL San Francisco
On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark <stark@enterprisedb.com> wrote: > The main problem that I've seen described is what I mentioned before: allowing > adjusting the postgresql.conf GUC settings by remote users who don't have > shell access. Which pgAdmin has done perfectly well for years, as long as the config is all in one file. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
On Mon, 18 Aug 2008, Gregory Stark wrote: > Because coping with free-form user-edited text is a losing game. People > don't consider it because it's a dead-end. Right, that's impossible technology to build, which is why I had to plan all these screen shots showing tools that handle that easily for Apache's very similar configuration file: http://www.apache-gui.com/apache-windows.html http://kochizz.sourceforge.net/quelques-captures-decran/ > Instead you have one file for user-edited configuration and a separate > file for computer generated configuration. It wouldn't be so difficult if the system generated postgresql.conf didn't have all this extra junk in it, which is part of what an overhaul plans to simplify. The way the file gets spit out right now encourages some of the awful practices people develop in the field. > A tuning interface can't be turing complete and detect all possible > misconfigurations. To do that it would have to be as complex as the server. Thank you for supporting the case for why changes need to be to the server code itself, to handle things like validating new postgresql.conf files before they get loaded. I try not to bring that up lest it complicate things further. > The entire target market for such a thing is DBAs stuck on hosted > databases which don't have shell access to their machines. I've never setup a hosted database on a system I don't have shell access to, so I have no idea where you get the impression that was a primary goal of anything I've said. It just so happens that improving what tuning you can do over port 5432 helps that crowd out too, that's a bonus as I see it. > Ask me about EnterpriseDB's On-Demand Production Tuning ...nah, too easy, I'll just let that go. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Dave Page wrote: > On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark <stark@enterprisedb.com> wrote: >> The main problem that I've seen described is what I mentioned before: allowing >> adjusting the postgresql.conf GUC settings by remote users who don't have >> shell access. > > Which pgAdmin has done perfectly well for years, as long as the config > is all in one file. I'll argue it's not done it perfectly well (it's not particularly user-friendly), but it has certainly *done* it... //Magnus
On 8/18/08, Magnus Hagander <magnus@hagander.net> wrote: > Dave Page wrote: >> On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark <stark@enterprisedb.com> >> wrote: >>> The main problem that I've seen described is what I mentioned before: >>> allowing >>> adjusting the postgresql.conf GUC settings by remote users who don't have >>> shell access. >> >> Which pgAdmin has done perfectly well for years, as long as the config >> is all in one file. > > I'll argue it's not done it perfectly well (it's not particularly > user-friendly), but it has certainly *done* it... I mean it's able to read & write the config file correctly. I agree the ui is, umm, sub-optimal. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Tom Lane wrote: > Gregory Stark <stark@enterprisedb.com> writes: > > The entire target market for such a thing is DBAs stuck on hosted databases > > which don't have shell access to their machines. Perhaps the overlap between > > that and the people who can write a server-side module which dumps out a > > config file according to some rules is just too small? > > There's a veritable boatload of stuff we do that assumes shell access > (how many times have you seen cron jobs recommended, for instance?). > So I'm unconvinced that "modify the config without shell access" > is really a goal that is worth lots of effort. Actually, lots of people are discouraged by suggestions of using cron to do anything. The only reason cron is suggested is because we don't have any other answer, and for many people it's a half-solution. An integrated task scheduler in Pg would be more than welcome. Also, remember that pgAdmin already comes with a pgAgent thing. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Tom Lane wrote: >> Gregory Stark <stark@enterprisedb.com> writes: >>> The entire target market for such a thing is DBAs stuck on hosted databases >>> which don't have shell access to their machines. Perhaps the overlap between >>> that and the people who can write a server-side module which dumps out a >>> config file according to some rules is just too small? >> There's a veritable boatload of stuff we do that assumes shell access >> (how many times have you seen cron jobs recommended, for instance?). >> So I'm unconvinced that "modify the config without shell access" >> is really a goal that is worth lots of effort. > > Actually, lots of people are discouraged by suggestions of using cron to > do anything. The only reason cron is suggested is because we don't have > any other answer, and for many people it's a half-solution. An > integrated task scheduler in Pg would be more than welcome. Yes, I hear a lot of people complaining about that too. Now that we have a working autovacuum, some of it goes away though - no need to cron your VACUUMs in most cases anymore. But there are still backups - though they are often managed by the scheduler of enterprise backup software. > Also, remember that pgAdmin already comes with a pgAgent thing. Yeah, it's a real life-saver on Windows where the builtin task-scheduler isn't as readily accessible or easy to use.. //Magnus
On Monday, 18 August 2008, Josh Berkus wrote: > Right now, if you want to survey > your databases, tables, approx disk space, query activity, etc., you can > do that all through port 5432. You can't manage most of your server > settings that way, and definitely can't manage the *persistent* settings. > When you're trying to manage 1000 PostgreSQL servers, this is not a minor > issue. Some of that effort could go into making fewer settings persistent.
On Monday, 18 August 2008, Tom Lane wrote: > The impression I get every time this comes up is that various people > have different problems they want to solve that (they think) require > redesign of the way GUC works. Those complicated solutions arise from > attempting to satisfy N different demands simultaneously. Which may be the reason that I have been getting the impression that the "Problems" and the proposed resolutions on http://wiki.postgresql.org/wiki/GUCS_Overhaul are not really closely related. I can agree with the Problems, but then I am lost.