Thread: postgresql.conf

postgresql.conf

From
mlw
Date:
I have been thinking about postgresql.conf and I have made a few posts already
about it. My concern is that Postgres, by default, is not very well tuned. One
could even say the default is pretty much a bad configuration.

My idea of a postgresql.conf "cookbook" seemed not such a good idea: some liked
it, some did not, and the thread degenerated into a discussion about the
specific values.

How about this: we have just two or three default configuration files?
Compact, Workstation, and Server.

"Compact" would be the current postgresql.conf.

"Workstation" would boost the number of buffers, sort and vacuum memory, in
essence Postgres would be configured to use about 64M~128M efficiently, maybe
limited to 16 backends. Say 4096 buffers, Sort mem setting of 4096.

"Server" would have a huge number of buffers, large numbers for sort, and a
boosted vacuum memory. A server would be assumed to have lots of memory and be
limited to 128 backends. Say 65536 buffers, Sort memory of 32768.

We could also tune some of the optimizer parameters. For instance, on "Server"
random_page_cost would be lower because of concurrent disk operations. We could
even try to tune some of the WAL settings accordingly.

I know these things are all documented, and shame on the DBA for not reading
the documentation, but all the help we can give to someone new to PostgreSQL
makes it that much more likely that they will be able to use it successfully.
Besides, if we put out a small number of specific versions of postgresql.conf,
more focused feedback about performance issues and optimization can be
obtained.


Re: postgresql.conf

From
Tom Lane
Date:
mlw <markw@mohawksoft.com> writes:
> "Server" would have a huge number of buffers,

Do you have any evidence whatsoever that that's actually a good idea?

Certainly the existing default configuration is ridiculously cramped
for modern machines.  But I've always felt that NBuffers somewhere in
the low thousands should be plenty.  If you have lots of main memory
then the kernel can be expected to use it for kernel-level disk
buffering, which should be nearly as good as buffering inside Postgres.
(Maybe better, considering that we have some routines that scan all our
buffers linearly ...)  Moreover, if you request a huge chunk of shared
memory then you run a significant risk that the kernel will decide to
start swapping parts of it, at which point it definitely becomes a
loser.  Swapping a dirty buffer out and back in before it finally gets
written to disk is counterproductive.  You want to keep the number of
buffers small enough that they all stay pretty "hot" in the swapper's
eyes.

Basically, I think that it's best to give the kernel plenty of elbow
room to deal with memory pressures on its own terms.  Even on a machine
that's nominally dedicated to running Postgres.

Awhile back I suggested raising the default configuration to 1000 or
so buffers, which would be slightly less silly than the current default
even if it's not optimal.  Didn't get much feedback about the idea
though.
        regards, tom lane


Re: postgresql.conf

From
mlw
Date:
Tom Lane wrote:

> mlw <markw@mohawksoft.com> writes:
> > "Server" would have a huge number of buffers,
>
> Do you have any evidence whatsoever that that's actually a good idea?
>
> Certainly the existing default configuration is ridiculously cramped
> for modern machines.  But I've always felt that NBuffers somewhere in
> the low thousands should be plenty.  If you have lots of main memory
> then the kernel can be expected to use it for kernel-level disk
> buffering, which should be nearly as good as buffering inside Postgres.
> (Maybe better, considering that we have some routines that scan all our
> buffers linearly ...)  Moreover, if you request a huge chunk of shared
> memory then you run a significant risk that the kernel will decide to
> start swapping parts of it, at which point it definitely becomes a
> loser.  Swapping a dirty buffer out and back in before it finally gets
> written to disk is counterproductive.  You want to keep the number of
> buffers small enough that they all stay pretty "hot" in the swapper's
> eyes.

I can't speak to the routines that scan buffers linearly, but I have noticed
HUGE performance gains by increasing the number of buffers. Queries that
normally hit a few thousand blocks can have a cache hit rate of about 80%.
(Very query specific, I know.)

Sort memory is also a huge gain for large queries.

I understand that there is a point of diminishing returns: cache management
vs. disk access. A cache that is too large and poorly managed costs more than
disk. I'm not sure I've hit that point yet. On an SMP machine, it seems that a
CPU bottleneck is better than an I/O bottleneck. The CPU bottleneck is
scalable, whereas an I/O bottleneck is not. Perhaps on a single-processor
machine, fewer buffers would be more appropriate.

>
> Basically, I think that it's best to give the kernel plenty of elbow
> room to deal with memory pressures on its own terms.  Even on a machine
> that's nominally dedicated to running Postgres.

In our database systems we have 1G of RAM. Postgres is configured to use
about 1/4~1/2. (Our number of buffers is 32768.) Our sort memory is 32768.

> Awhile back I suggested raising the default configuration to 1000 or
> so buffers, which would be slightly less silly than the current default
> even if it's not optimal.  Didn't get much feedback about the idea
> though.

(I am really glad we are talking about this. )

Surely your recommendation of 1000 buffers is a great step. 8M of
shared memory on a modern system is trivial, and would make a huge impact.
Sort memory seems exceedingly small as well: when machines ship with a
practical minimum of 256M RAM, and more probably 512M~1G, 512K is a very
small sort size.
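The arithmetic behind these figures can be sketched out (a quick illustration,
assuming the default 8 KB PostgreSQL block size; the helper function is
hypothetical, not anything in the Postgres source):

```python
# Back-of-the-envelope sizing for the settings discussed above.
# Assumes the default 8 KB block size (BLCKSZ); custom builds may differ.

BLOCK_SIZE_KB = 8  # size of one shared buffer page, in KB

def shared_buffers_mb(n_buffers: int) -> float:
    """Shared memory consumed by n_buffers buffer pages, in MB."""
    return n_buffers * BLOCK_SIZE_KB / 1024

# Tom Lane's suggested default of 1000 buffers:
print(shared_buffers_mb(1000))   # 7.8125 -- the "8M" figure above

# The long-standing default of 64 buffers, for comparison:
print(shared_buffers_mb(64))     # 0.5
```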

Regardless of the actual numbers, I still think that more than one
"default" needs to be defined. I would bet that Postgres runs as a
stand-alone server as often as it does as a workstation database a la Access.



Re: postgresql.conf

From
Peter Eisentraut
Date:
mlw writes:

> I have been thinking about postgresql.conf and I have made a few posts already
> about it. My concern is that Postgres, by default, is not very well tuned. One
> could even say the default is pretty much a bad configuration.

The default configuration is mainly guided by three sometimes
contradictory aspects: it should be reasonably secure, "unusual" or
non-standard features are turned off, and resources are regulated so that
it is easy to "try out" PostgreSQL without having to do major kernel
tuning first or bringing other applications to their knees.  I think
the default settings for most parameters are not really disputed; it's
only the performance- and resource-related settings that you want to work
on.

> How about this: we have just two or three default configuration files?
> Compact, Workstation, and Server.

Trying to eliminate the one-size-does-not-fit-all problem with
N-sizes-fit-all cannot be an optimal idea, considering the dimensionality
of the space of possible configurations.  If all you're concerned about is
buffers and sort memory, it's much easier to say "configure buffers to use
1/4 of available memory" than to make arbitrary guesses about the
available memory and attach arbitrary labels to them.

Theoretically, it should be possible to determine optimal values for all
performance-related settings from a combination of benchmarks, a few
questions asked of the user about the system configuration and the
expected work load, and a dynamic analysis of the nature of the data.
But a system of formulas describing these relationships is incredibly
difficult to figure out and solve.  If it weren't, we could get rid of all
these settings and allocate resources dynamically at run time.

What we ought to do, however, is to collect and document empirical methods
for tuning, such as the above "1/4 of available memory" rule (which does
not claim to be correct, btw.).

-- 
Peter Eisentraut   peter_e@gmx.net



Re: postgresql.conf

From
Bruce Momjian
Date:
> What we ought to do, however, is to collect and document empirical methods
> for tuning, such as the above "1/4 of available memory" rule (which does
> not claim to be correct, btw.).

What would be interesting is to have a program that prompted the user,
checked some system values, and modified postgresql.conf accordingly.

The major problem with this idea is that I don't know of a good way to
determine the proper values programmatically, let alone portably.
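As a sketch of what such a program might look like, here is an illustrative,
Linux-only probe applying the "1/4 of available memory" rule mentioned
upthread. The function names, the /proc/meminfo parsing, and the rule itself
are assumptions for illustration, not a recommended method:

```python
# Hypothetical, Linux-only sketch of the kind of probe discussed above.
# Reads total RAM from /proc/meminfo and applies the "1/4 of available
# memory" rule of thumb quoted upthread. Purely illustrative.

BLOCK_SIZE_KB = 8  # assumed default block size (BLCKSZ)

def total_ram_kb(meminfo_text: str) -> int:
    """Parse the MemTotal line of /proc/meminfo (value is in kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("no MemTotal line found")

def suggest_shared_buffers(ram_kb: int, fraction: float = 0.25) -> int:
    """Number of 8 KB buffers covering `fraction` of total RAM."""
    return int(ram_kb * fraction / BLOCK_SIZE_KB)

# On a real system one would do something like:
#   with open("/proc/meminfo") as f:
#       n = suggest_shared_buffers(total_ram_kb(f.read()))
#   print(f"shared_buffers = {n}")
```

Note that for a 1 GB machine this rule yields 32768 buffers, which happens to
match the figure reported earlier in this thread. The portability problem
Bruce raises is visible here: /proc/meminfo exists only on Linux.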

--  Bruce Momjian                        |  http://candle.pha.pa.us
    pgman@candle.pha.pa.us               |  (610) 853-3000
  + If your life is a hard drive,        |  830 Blythe Avenue
  + Christ can be your backup.           |  Drexel Hill, Pennsylvania 19026


Re: postgresql.conf

From
Rene Pijlman
Date:
On Mon, 19 Nov 2001 16:04:10 -0500 (EST), you wrote:

>> What we ought to do, however, is to collect and document empirical methods
>> for tuning, such as the above "1/4 of available memory" rule (which does
>> not claim to be correct, btw.).
>
>What would be interesting is to have a program that prompted the user,
>checked some system values, and modified postgresql.conf accordingly.

I vaguely remember Oracle had an out-of-the-box choice for a
small, medium or large installation, with small being the
default. This changed a bunch of parameters, including the RDBMS
block buffers, shared global area, max open cursors and such.
This can be implemented as an (installation time) config utility
on top of the current fundamental parameters.

Regards,
René Pijlman <rene@lab.applinet.nl>


Re: postgresql.conf (Proposed settings)

From
mlw
Date:
Since I proposed three postgresql.conf configuration files, I will start by
suggesting some settings different from the default: (Any additions or corrections
would be greatly appreciated.)


Compact:
The current postgresql.conf


Workstation:
tcpip_socket = true
max_connections = 32
shared_buffers = 1024
sort_mem  =  8192
random_page_cost = 2


Server:
tcpip_socket = true
max_connections = 128
shared_buffers = 8192
sort_mem  =  16384
random_page_cost = 1



The random_page_cost is changed on the assumption that the bigger systems
will be busier. The busier a machine is doing I/O, the lower the differential
between sequential and random access. ("Sequential" to the application is less
likely to be sequential on the physical disk.)

I'd like to open a debate about the benefit/cost of shared_buffers. The
questions are: "Will Postgres' management of shared buffers outperform the OS
cache? Is there a point of diminishing returns on the number of buffers? If
so, where?"

Sort memory makes a huge impact on queries. If you've got the memory, use it.

These are just ballpark settings; I don't even know how good they are. The problem
is that server environments differ so greatly that there is no right answer. I am
just really concerned that the newbie PostgreSQL user will assume the performance
they see with the default settings is what they will judge PostgreSQL by.


Re: postgresql.conf (Proposed settings)

From
Peter Eisentraut
Date:
mlw writes:

> These are just ballpark settings; I don't even know how good they are. The problem
> is that server environments differ so greatly that there is no right answer.

Which is why this is clearly not a solution.

> I am just really concerned that the newbie PostgreSQL user will assume
> the performance they see with the default settings is what they will
> judge PostgreSQL by.

For this kind of "newbie", the kind that doesn't read the documentation,
this would only make it worse, because they'd assume that by making the
choice between three default configurations they've done an adequate
amount of tuning.  Basically, you'd exchange, "I can't find any tuning
information, but it's slow" for "I did all the tuning and it's still
slow".  Not a good choice.

The bottom line is that you *must* edit postgresql.conf in order to tune
your server.  If this editing is simplified it doesn't matter what the
default is.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: postgresql.conf (Proposed settings)

From
Bruce Momjian
Date:
> > I am just really concerned that the newbie PostgreSQL user will assume
> > the performance they see with the default settings is what they will
> > judge PostgreSQL by.
> 
> For this kind of "newbie", the kind that doesn't read the documentation,
> this would only make it worse, because they'd assume that by making the
> choice between three default configurations they've done an adequate
> amount of tuning.  Basically, you'd exchange, "I can't find any tuning
> information, but it's slow" for "I did all the tuning and it's still
> slow".  Not a good choice.
> 
> The bottom line is that you *must* edit postgresql.conf in order to tune
> your server.  If this editing is simplified it doesn't matter what the
> default is.

Is it possible to probe the machine and update postgresql.conf
automatically?

--  Bruce Momjian                        |  http://candle.pha.pa.us
    pgman@candle.pha.pa.us               |  (610) 853-3000
  + If your life is a hard drive,        |  830 Blythe Avenue
  + Christ can be your backup.           |  Drexel Hill, Pennsylvania 19026


Re: postgresql.conf (Proposed settings)

From
mlw
Date:
Peter Eisentraut wrote:
> 
> mlw writes:
> 
> > These are just ballpark settings; I don't even know how good they are. The problem
> > is that server environments differ so greatly that there is no right answer.
> 
> Which is why this is clearly not a solution.

Sometimes an incomplete solution, or even a grossly poor solution, that
addresses a problem is better than no solution whatsoever.

> 
> > I am just really concerned that the newbie PostgreSQL user will assume
> > the performance they see with the default settings is what they will
> > judge PostgreSQL by.
> 
> For this kind of "newbie", the kind that doesn't read the documentation,
> this would only make it worse, because they'd assume that by making the
> choice between three default configurations they've done an adequate
> amount of tuning.  Basically, you'd exchange, "I can't find any tuning
> information, but it's slow" for "I did all the tuning and it's still
> slow".  Not a good choice.

I think this sort of thinking will not help the end user at all. Offering a
choice of three less badly tuned configuration files will probably produce a
better user experience than one very badly tuned file.

> 
> The bottom line is that you *must* edit postgresql.conf in order to tune
> your server.  If this editing is simplified it doesn't matter what the
> default is.

I don't think this is true at all. Making the buffer and sort settings larger
will improve performance dramatically. I would also bet that most users NEVER
see the postgresql.conf file, and just blame poor performance on bad design
and start using MySQL.


Re: postgresql.conf (Proposed settings)

From
Horst Herb
Date:
On Thursday 22 November 2001 06:52, mlw wrote:

> > For this kind of "newbie", the kind that doesn't read the
> > documentation, this would only make it worse, because they'd assume
> > that by making the choice between three default configurations they've
> > done an adequate amount of tuning.  Basically, you'd exchange, "I
> > can't find any tuning information, but it's slow" for "I did all the
> > tuning and it's still slow".  Not a good choice.

I am the other kind of newbie, the one that reads documentation.
However, I fail to find much regarding tuning in the documentation 
delivered with the original version 7.1.2 tarball. Even when searching the 
postgresql website or the mailing list archives, the information is still 
sporadic. In the whole "Administrator's Guide" section there is no chapter 
on "tuning", and even if you grep your way through it, you won't find much. 
And yes, I have read Bruce's book, from first to last page. And I have 
read Stinson's PostgreSQL Essential Reference the same way. Still, I am no 
wiser.

I am not complaining. PostgreSQL, after all, is free. But it is kind of 
strange to blame the users for not reading documentation when that 
documentation is so hard to find (does it even exist?) that "Joe Average" 
can't find it.

The other domain where I could hardly find any information at all is the 
Postgres log settings.

I would be more than happy to write the documentation if I could get hold 
of the necessary information, as I badly need it myself.

The other thing I am thinking about is that tuning can't be any "magic". 
If I can tune it, why shouldn't a configuration script be able to do the 
same? After all, even I would follow some algorithm to do it; it would 
probably involve a few educated trial-and-error experiments - but hey, a 
script can do that too, can't it? I know too little about Postgres to do 
that part myself, but I don't think it is valid just to shove the idea of 
"autotuning" aside like this.

Horst


Re: postgresql.conf (Proposed settings)

From
Peter Eisentraut
Date:
Horst Herb writes:

> I am the other kind of newbie, the one that reads documentation.
> However, I fail to find much regarding tuning within the documentation
> delivered with the original version 7.1.2 tarball.

Yes, there is clearly a deficit in this area.  But it's obviously not
going to get better with more buffers by default.

> The other thing I am thinking abouty is that tuning can't be any "magic".
> If I can tune it, why shouldn't a configuration script be able to do the
> same?

Theoretically, you're right of course.  In practice this could be quite
complicated to set up.  Let's say, the optimal configuration depends
primarily on four groups of parameters:

1. hardware setup
2. load/desired load on the server ("dedicated" vs something else)
3. nature of the data
4. nature of the clients/query workload

#1 is easy to figure out by asking a few questions or perhaps a few
nonportable peeks into /proc. #2 is more difficult; it's not simply yes or
no or loadavg = X, because it depends on the nature of the other
applications.  #3 is also not quite that easy.  It'd require a hypersmart
version of ANALYZE, but in reality you would want to configure your server
before the data starts arriving.  So it's not really feasible to run an
automatic "benchmark" or something.  The same with #4; the queries really
arrive only after the tuning is done.  You'd hardly want to tune while the
application is live and ask the users "how fast was it?".

So what would be required is to parametrize these four factors (and others
we come up with) accurately into questions the user can answer easily.
This would be an enormous task, but I'm not saying it can't be done.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: postgresql.conf (Proposed settings)

From
Peter Eisentraut
Date:
mlw writes:

> I don't think this is true at all. Making the buffer and sort settings larger
> will improve performance dramatically.

I think we want to make the buffers bigger by default but we need to put
some thought into the numbers.  Currently, the shared buffers are set so
that they presumably fit under the default shared memory limit on most
systems.  We already know this isn't actually true anymore.

Tom Lane thinks that "a few thousand" buffers is enough, so let's say 2048
= 16MB.  This is certainly a lot less than the 64MB that were proposed or
the 512MB that some people use.  However, I feel we should have *some*
data points before we commit to a number that, as you say, most users will
implicitly be stuck with.  Even more so if some users claim "the more the
better" and the leading developer in the field disagrees.

(Perhaps we could arrange it so that by default the system attempts to
allocate X amount of shared memory and, if that fails, tries smaller sizes
until it succeeds (down to a reasonable minimum)?  This could combine
convenience and an optimal default.)
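The fallback idea can be sketched like this; `try_alloc` is a hypothetical
stand-in for a real shmget()-style call, and the halving strategy is just an
assumption for illustration, not the actual postmaster logic:

```python
# Sketch of "try X, then smaller sizes until it works" for shared memory.
# `try_alloc` stands in for a shmget()-style call that raises OSError when
# a kernel limit (e.g. SHMMAX) is exceeded. Illustrative only.

def allocate_with_fallback(try_alloc, requested: int, minimum: int) -> int:
    """Halve the request until try_alloc succeeds; give up below `minimum`.

    Returns the size (in bytes) that was actually allocated.
    """
    size = requested
    while size >= minimum:
        try:
            try_alloc(size)
            return size
        except OSError:
            size //= 2  # back off and retry with a smaller request
    raise OSError(f"could not allocate even the minimum of {minimum} bytes")

# Example with a fake kernel limit of 32 MB:
def fake_shmget(size):
    if size > 32 * 1024 * 1024:
        raise OSError("shmget: request exceeds SHMMAX")

print(allocate_with_fallback(fake_shmget, 128 * 1024 * 1024, 1024 * 1024))
# 33554432 (i.e. it settles on 32 MB)
```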

As for sort memory, I have no idea why this isn't much larger by default.

> I would also bet that most users NEVER see the postgresql.conf file,
> and just blame poor performance on bad design and start using MySQL.

This could be an information deficit more than anything else.  I don't
know.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: postgresql.conf (Proposed settings)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> ...  However, I feel we should have *some*
> data points before we commit to a number that, as you say, most users will
> implicitly be stuck with.

Data points would be a good thing.  I will freely admit that I have made
no measurements to back up my opinion :-(

> As for sort memory, I have no idea why this isn't much larger by default.

The problem with sort memory is that you don't know what the multiplier
is for it.  SortMem is per sort/hash/whatever plan step, which means
that not only might one backend be consuming several times SortMem on
a complex query, but potentially all MaxBackend backends might be doing
the same.  In practice that seems like a pretty unlikely scenario, but
surely you should figure *some* function of SortMem * MaxBackends as
the number you need to compare to available RAM.
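The worst-case arithmetic can be made concrete (a sketch; the number of sort
steps per query is an arbitrary assumption, since it depends entirely on the
query plans):

```python
# Worst-case memory for SortMem, per the reasoning above: SortMem is
# per sort/hash plan step, so multiply by steps per query and by the
# number of backends that might all be sorting at once.

def worst_case_sort_mb(sort_mem_kb: int, max_backends: int,
                       sort_steps_per_query: int = 1) -> float:
    """Upper bound on total sort memory, in MB."""
    return sort_mem_kb * sort_steps_per_query * max_backends / 1024

# The "Server" profile proposed upthread (sort_mem = 16384, 128 backends):
print(worst_case_sort_mb(16384, 128))      # 2048.0 -- a full 2 GB

# With two sort steps per query it doubles again:
print(worst_case_sort_mb(16384, 128, 2))   # 4096.0
```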

The present 512K default is on the small side for current hardware,
no doubt, but that doesn't mean we should crank it up without thought.
We just recently saw a trouble report from someone who had pushed it
to the moon and found out the hard way not to do that.
        regards, tom lane


Re: postgresql.conf (Proposed settings)

From
Guy Fraser
Date:
What!

I don't remember ever seeing docs pointing me to this configuration
file.

When did this file appear in Postgres?