Thread: postgresql.conf

postgresql.conf

From

John Cunningham

Date:

22 December 2004, 15:15:59

Hey Guys,

I am setting up a new dedicated Postgres server, and will serve about
60 databases to a web site serving 250,000 people at the rate of about
20,000 a day.  That may all be irrellevent though for the purposes of
this conversation.

The main thing about the application is that we're talking about lots
and lots of little transactions and onyl a few big ones.

The machine in question will do nothing but serve databases.  It's a
dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM.  I'd
like to configure it to get the most out of the server possible as far
as shared memory, sort memore, etc.  I haven't found a lot of
documentation on this.

By the way, for some reason Postgres 7.4.x wouldn't install properly -
had a problem with initdb - had to use 7.3.6

Any help is greatly appreciated.

-John

Re: postgresql.conf

From

Scott Marlowe

Date:

22 December 2004, 15:41:31

On Wed, 2004-12-22 at 09:15, John Cunningham wrote:
> Hey Guys,
>
> I am setting up a new dedicated Postgres server, and will serve about
> 60 databases to a web site serving 250,000 people at the rate of about
> 20,000 a day.  That may all be irrellevent though for the purposes of
> this conversation.
>

If you're gonna handle a lot of connections at the same time, look at
pgpool.  Also, the tuning docs at varlena are a must:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

> The main thing about the application is that we're talking about lots
> and lots of little transactions and onyl a few big ones.
>
> The machine in question will do nothing but serve databases.  It's a
> dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM.  I'd
> like to configure it to get the most out of the server possible as far
> as shared memory, sort memore, etc.  I haven't found a lot of
> documentation on this.

Battery backed cache on the hardware RAID controller is a must. RAID 1+0
may be a better choice than RAID 5, depending on your usage patterns.

> By the way, for some reason Postgres 7.4.x wouldn't install properly -
> had a problem with initdb - had to use 7.3.6

Please post the error messages you got and what OS / version of that OS
you're running, and any other relevant information you can think of.
7.4 is fairly stable, a failed install is something that generally
shouldn't happen, and when it does, it's usually not 7.4's fault
nowadays.

Re: postgresql.conf

From

Steve Wampler

Date:

22 December 2004, 16:01:41

John Cunningham wrote:
...
> The machine in question will do nothing but serve databases.  It's a
> dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM.  I'd
> like to configure it to get the most out of the server possible as far
> as shared memory, sort memore, etc.  I haven't found a lot of
> documentation on this.

What OS?

> By the way, for some reason Postgres 7.4.x wouldn't install properly -
> had a problem with initdb - had to use 7.3.6

I had a similar problem under Fedora Core 3 that Tom Lane solved quickly
for me.  From my notes:

    Run '/usr/sbin/setenforce 0' before intializing database.
    Ok to turn back on with '/usr/sbin/setenforce 1' afterwards.

Has to do with the Fedora using SELinux by default now.

Dunno if that is your situation...

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

Re: postgresql.conf

From

"Frank D. Engel, Jr."

Date:

22 December 2004, 16:03:31

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Dec 22, 2004, at 10:15 AM, John Cunningham wrote:

> like to configure it to get the most out of the server possible as far
> as shared memory, sort memore, etc.  I haven't found a lot of
> documentation on this.

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

- -----------------------------------------------------------
Frank D. Engel, Jr.  <fde101@fjrhome.net>

$ ln -s /usr/share/kjvbible /usr/manual
$ true | cat /usr/manual | grep "John 3:16"
John 3:16 For God so loved the world, that he gave his only begotten
Son, that whosoever believeth in him should not perish, but have
everlasting life.
$
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iD8DBQFByZnw7aqtWrR9cZoRAgl8AJ9PcpktGBeA/Oboh5+ZKPwS1niP9QCfZaL3
yf1Yvav5baMeQZOjGmvmk6E=
=Hnti
-----END PGP SIGNATURE-----



___________________________________________________________
$0 Web Hosting with up to 120MB web space, 1000 MB Transfer
10 Personalized POP and Web E-mail Accounts, and much more.
Signup at www.doteasy.com

Re: postgresql.conf

From

John Cunningham

Date:

22 December 2004, 16:38:14

I'm running Red Hat Enterprise ES with all the most recent updates.

The error - in initdb - was that the system couldn't find
"ascii_and_mic" libraries.  7.3.6 ran without a hitch.

The RAID 1+0 - is that a stripped / mirrored condifuration?  How big
of a difference will that make in performance do you think - are we
talking 10 % or leaps and bounds?

The help is much appreciated - I'm reading that tuning manual now.

-John




On Wed, 22 Dec 2004 10:59:43 -0500, Frank D. Engel, Jr.
<fde101@fjrhome.net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On Dec 22, 2004, at 10:15 AM, John Cunningham wrote:
>
> > like to configure it to get the most out of the server possible as far
> > as shared memory, sort memore, etc.  I haven't found a lot of
> > documentation on this.
>
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
>
> - -----------------------------------------------------------
> Frank D. Engel, Jr.  <fde101@fjrhome.net>
>
> $ ln -s /usr/share/kjvbible /usr/manual
> $ true | cat /usr/manual | grep "John 3:16"
> John 3:16 For God so loved the world, that he gave his only begotten
> Son, that whosoever believeth in him should not perish, but have
> everlasting life.
> $
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (Darwin)
>
> iD8DBQFByZnw7aqtWrR9cZoRAgl8AJ9PcpktGBeA/Oboh5+ZKPwS1niP9QCfZaL3
> yf1Yvav5baMeQZOjGmvmk6E=
> =Hnti
> -----END PGP SIGNATURE-----
>
> ___________________________________________________________
> $0 Web Hosting with up to 120MB web space, 1000 MB Transfer
> 10 Personalized POP and Web E-mail Accounts, and much more.
> Signup at www.doteasy.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>      joining column's datatypes do not match
>

Re: postgresql.conf

From

Lonni J Friedman

Date:

22 December 2004, 17:00:31

On Wed, 22 Dec 2004 10:38:01 -0600, John Cunningham <fatbobo@gmail.com> wrote:
> I'm running Red Hat Enterprise ES with all the most recent updates.
>
> The error - in initdb - was that the system couldn't find
> "ascii_and_mic" libraries.  7.3.6 ran without a hitch.

That's very odd, cause I'm using 7.4.6 RPMs on RHEL-3 without any
problems.  Is there anything unusual about your installation?  You
tried with the RPMs from the postgresql.org FTP server, right?

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman                                    netllama@gmail.com
LlamaLand                       http://netllama.linux-sxs.org

Re: postgresql.conf

From

John Cunningham

Date:

22 December 2004, 17:31:02

OK Guys - here's the config file as I've writtten it.

I'll paste in the whole thing before, but this is the important stuff:

max_connections = 256
shared_buffers = 32768   # (256 MB)
sort_mem = 1024 # min 64, size in KB
fsync = No
wal_sync_method = fsync # the default varies across platforms:
effective_cache_size = 786432   # (6 GB)
random_page_cost = 2            # units are one sequential page fetch cost

I am seriously considering breaking the machine all the way down and
changing to a stripped / mirrored config if that will be the fastest
way to run it.  Need advice on that.  Here's the config file:

#========================================================================


#
#       Connection Parameters
#
tcpip_socket = false
#ssl = false

max_connections = 256
superuser_reserved_connections = 2

port = 5432
#hostname_lookup = false
#show_source_port = false

#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal

#virtual_host = ''

#krb_server_keyfile = ''


#
#       Shared Memory Size
#
shared_buffers = 32768          # min max_connections*2 or 16, 8KB each
#max_fsm_relations = 1000       # min 10, fsm is free space map, ~40 bytes
#max_fsm_pages = 10000          # min 1000, fsm is free space map, ~6 bytes
#max_locks_per_transaction = 64 # min 10
#wal_buffers = 8                # min 4, typically 8KB each

#
#       Non-shared Memory Sizes
#
sort_mem = 1024 # min 64, size in KB
#vacuum_mem = 8192              # min 1024, size in KB

#
#       Write-ahead log (WAL)
#
#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000
#
fsync = No
wal_sync_method = fsync # the default varies across platforms:
#                               # fsync, fdatasync, open_sync, or open_datasync
#wal_debug = 0                  # range 0-16


#
#       Optimizer Parameters
#
#enable_seqscan = true
#enable_indexscan = true
#enable_tidscan = true
#enable_sort = true
#enable_nestloop = true
#enable_mergejoin = true
#enable_hashjoin = true

effective_cache_size = 786432   # typically 8KB each
random_page_cost = 2            # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)

#default_statistics_target = 10 # range 1-1000

#
#       GEQO Optimizer Parameters
#
#geqo = true
#geqo_selection_bias = 2.0      # range 1.5-2.0
#geqo_threshold = 11
#geqo_pool_size = 0             # default based on tables in statement,
                                # range 128-1024
#geqo_effort = 1
#geqo_generations = 0
#geqo_random_seed = -1          # auto-compute seed

#
#       Message display
#
#server_min_messages = notice   # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   info, notice, warning, error, log, fatal,
                                #   panic
#client_min_messages = notice   # Values, in order of decreasing detail:
                                #   debug5, debug4, debug3, debug2, debug1,
                                #   log, info, notice, warning, error
#silent_mode = false

#log_connections = false
#log_pid = false
#log_statement = false
#log_duration = false
#log_timestamp = false

#log_min_error_statement = panic # Values in order of increasing severity:
                                 #   debug5, debug4, debug3, debug2, debug1,
                                 #   info, notice, warning, error, panic(off)

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false

#explain_pretty_print = true

# requires USE_ASSERT_CHECKING
#debug_assertions = true


#
#       Syslog
#
#syslog = 0                     # range 0-2
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


#
#       Statistics
#
#show_parser_stats = false
#show_planner_stats = false
#show_executor_stats = false
#show_statement_stats = false

# requires BTREE_BUILD_STATS
#show_btree_build_stats = false

#
#       Access statistics collection
#
#stats_start_collector = true
#stats_reset_on_server_start = true
#stats_command_string = false
#stats_row_level = false
#stats_block_level = false


#
#       Lock Tracing
#
#trace_notify = false

# requires LOCK_DEBUG
#trace_locks = false
#trace_userlocks = false
#trace_lwlocks = false
#debug_deadlocks = false
#trace_lock_oidmin = 16384
#trace_lock_table = 0


#
#       Misc
#
#autocommit = true
#dynamic_library_path = '$libdir'
#search_path = '$user,public'
#datestyle = 'iso, us'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#client_encoding = sql_ascii    # actually, defaults to database encoding
#authentication_timeout = 60    # 1-600, in seconds
#deadlock_timeout = 1000        # in milliseconds
#default_transaction_isolation = 'read committed'
#max_expr_depth = 10000         # min 10
#max_files_per_process = 1000   # min 25
#password_encryption = true
#sql_inheritance = true
#transform_null_equals = false
#statement_timeout = 0          # 0 is disabled, in milliseconds
#db_user_namespace = false

#
#       Locale settings
#
# (initialized by initdb -- may be changed)
LC_MESSAGES = 'en_US.UTF-8'
LC_MONETARY = 'en_US.UTF-8'
LC_NUMERIC = 'en_US.utf-8'
LC_TIME = 'en_US.UTF-8'

Re: postgresql.conf

From

Scott Marlowe

Date:

22 December 2004, 18:08:27

On Wed, 2004-12-22 at 11:30, John Cunningham wrote:
> OK Guys - here's the config file as I've writtten it.
>
> I'll paste in the whole thing before, but this is the important stuff:
>
> max_connections = 256

Are you using a connection pooling scheme (jdbc based pooling, pgpool,
etc...)?  If not, you probably should, and then drop the max connections
to something much smaller, like 30 or 40 or so.  If you can.

> shared_buffers = 32768   # (256 MB)

That's really high, even for a machine with 6+ gigs of ram.  Unless
you're working data set is that big, it's too big.  IF, on average,
you're working with smaller amounts of data at a time, it might be
better to drop it down to 5000 to 10000.  Few, if any benchmarks have
shown an improvement at settings over 10000.  OTOH, you might be the one
person out of a thousand or so who needs larger shared_buffers.  Note
that shared buffers aren't cache, and when the last backend referencing
a particular data set in memory stops referencing it, the data it
dropped and the buffer memory released back to the pool, so to speak.
The kernel is generally better at caching than postgresql anyway.

With 8.0's ARC cache algorithm in place, it might be time for someone to
start testing postgresql with a persistant buffer cache (i.e. make it
hold on to the old data sets intead of freeing up the space.)

> sort_mem = 1024 # min 64, size in KB

You can probably up this a bit, especially if you pool your
connections.  Try 8192 for a starting point.  Setting this too large can
be dangerous to the health of your OS, since you can starve the OS for
memory and make it start swapping processes out to come up with sort_mem

> fsync = No

Not really safe, and not the performance gain it once was, if I remember
a post from Tom recently correctly.  On the other hand, IDE disks do
this by design, so if you were using those (with their cache enabled)
you'd be in the same boat.

> wal_sync_method = fsync # the default varies across platforms:
> effective_cache_size = 786432   # (6 GB)
> random_page_cost = 2            # units are one sequential page fetch cost

You can probably drop this down to 1.2 to 1.4 or so on a machine with a
fast disk subsystem and caching controller.

> I am seriously considering breaking the machine all the way down and
> changing to a stripped / mirrored config if that will be the fastest
> way to run it.  Need advice on that.  Here's the config file:

Do you have a battery backed caching raid controller?  If not, that's
one of the first steps to better performance.  After that, if you've got
lots of disks, a RAID 5 or RAID 1+0 should both be pretty fast.  If
you've got <8 or so disks, the RAID 1+0 will normally be faster,
assuming your RAID controller handles that configuration well.  Some
older / cheaper controllers can't parallelize their I/O and run the same
speed in 1+0 as they would in plain old 1.

> #max_fsm_relations = 1000       # min 10, fsm is free space map, ~40 bytes
> #max_fsm_pages = 10000          # min 1000, fsm is free space map, ~6 bytes

Are you sure you're vacuuming often enough and that these settings are
high enough?  What does 'vacuum verbose' on your database say?  You
might want to use pg_autovacuum to ensure sufficient vacuuming is taking
place.

> #
> #       Locale settings
> #
> # (initialized by initdb -- may be changed)
> LC_MESSAGES = 'en_US.UTF-8'
> LC_MONETARY = 'en_US.UTF-8'
> LC_NUMERIC = 'en_US.utf-8'
> LC_TIME = 'en_US.UTF-8'

Are you doing a lot of text searching?  If so, you might be better off
initing the database with locale=C instead.

Re: postgresql.conf

From

John Cunningham

Date:

22 December 2004, 19:37:16

The server is a DELL Poweredge 2650 with it's built in RAID - 4 disks
currently in a RAID 5 config.  I will check on the battery backup.

I'm putting this server together and rebuilding our overall db
structure all at the same time, so I have a good amount of flexiblity.
 I realized I was not taking as much advantage of the machine as
possible before - hence the call out to the list.

I'll look in to the persistent connections - last time we built this
it was several versions ago and the overall attitude was that it
didn't work terrible well.  Is this a PG setting or something in PHP?

The shared buffers was a big concern - I've read that there's a limit
that helps, but as the machine will only do DB transactions, I don't
know what else to do with the RAM.  It's intended for PG's use.

Obviously I'd rather have fsync on - I was really looking for some
opinions on this.  Better safe than sorry - but I am trying to sqeeze
every bit of juice possible out of this machine.

The machine has 4 drives - should I do a RAID 1+0 or a 5?

Most of the searching is char fields then linking ids from one table to another.

BTW - this is one of the best discussions I've been on - glad everyone
can be so helpful.

Thanks!
-John



On Wed, 22 Dec 2004 12:08:10 -0600, Scott Marlowe
<smarlowe@g2switchworks.com> wrote:
> On Wed, 2004-12-22 at 11:30, John Cunningham wrote:
> > OK Guys - here's the config file as I've writtten it.
> >
> > I'll paste in the whole thing before, but this is the important stuff:
> >
> > max_connections = 256
>
> Are you using a connection pooling scheme (jdbc based pooling, pgpool,
> etc...)?  If not, you probably should, and then drop the max connections
> to something much smaller, like 30 or 40 or so.  If you can.
>
> > shared_buffers = 32768   # (256 MB)
>
> That's really high, even for a machine with 6+ gigs of ram.  Unless
> you're working data set is that big, it's too big.  IF, on average,
> you're working with smaller amounts of data at a time, it might be
> better to drop it down to 5000 to 10000.  Few, if any benchmarks have
> shown an improvement at settings over 10000.  OTOH, you might be the one
> person out of a thousand or so who needs larger shared_buffers.  Note
> that shared buffers aren't cache, and when the last backend referencing
> a particular data set in memory stops referencing it, the data it
> dropped and the buffer memory released back to the pool, so to speak.
> The kernel is generally better at caching than postgresql anyway.
>
> With 8.0's ARC cache algorithm in place, it might be time for someone to
> start testing postgresql with a persistant buffer cache (i.e. make it
> hold on to the old data sets intead of freeing up the space.)
>
> > sort_mem = 1024 # min 64, size in KB
>
> You can probably up this a bit, especially if you pool your
> connections.  Try 8192 for a starting point.  Setting this too large can
> be dangerous to the health of your OS, since you can starve the OS for
> memory and make it start swapping processes out to come up with sort_mem
>
> > fsync = No
>
> Not really safe, and not the performance gain it once was, if I remember
> a post from Tom recently correctly.  On the other hand, IDE disks do
> this by design, so if you were using those (with their cache enabled)
> you'd be in the same boat.
>
> > wal_sync_method = fsync # the default varies across platforms:
> > effective_cache_size = 786432   # (6 GB)
> > random_page_cost = 2            # units are one sequential page fetch cost
>
> You can probably drop this down to 1.2 to 1.4 or so on a machine with a
> fast disk subsystem and caching controller.
>
> > I am seriously considering breaking the machine all the way down and
> > changing to a stripped / mirrored config if that will be the fastest
> > way to run it.  Need advice on that.  Here's the config file:
>
> Do you have a battery backed caching raid controller?  If not, that's
> one of the first steps to better performance.  After that, if you've got
> lots of disks, a RAID 5 or RAID 1+0 should both be pretty fast.  If
> you've got <8 or so disks, the RAID 1+0 will normally be faster,
> assuming your RAID controller handles that configuration well.  Some
> older / cheaper controllers can't parallelize their I/O and run the same
> speed in 1+0 as they would in plain old 1.
>
>
> > #max_fsm_relations = 1000       # min 10, fsm is free space map, ~40 bytes
> > #max_fsm_pages = 10000          # min 1000, fsm is free space map, ~6 bytes
>
> Are you sure you're vacuuming often enough and that these settings are
> high enough?  What does 'vacuum verbose' on your database say?  You
> might want to use pg_autovacuum to ensure sufficient vacuuming is taking
> place.
>
> > #
> > #       Locale settings
> > #
> > # (initialized by initdb -- may be changed)
> > LC_MESSAGES = 'en_US.UTF-8'
> > LC_MONETARY = 'en_US.UTF-8'
> > LC_NUMERIC = 'en_US.utf-8'
> > LC_TIME = 'en_US.UTF-8'
>
> Are you doing a lot of text searching?  If so, you might be better off
> initing the database with locale=C instead.
>

Re: postgresql.conf

From

"Frank D. Engel, Jr."

Date:

22 December 2004, 20:38:46

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Dec 22, 2004, at 2:36 PM, John Cunningham wrote:

> The shared buffers was a big concern - I've read that there's a limit
> that helps, but as the machine will only do DB transactions, I don't
> know what else to do with the RAM.  It's intended for PG's use.

Try it both ways: with 32000 and with 10000.  If 32000 gives you a
perceptible performance increase when numerous users are connected, use
it.  Otherwise, stick with the 10000.  I suspect that depending on the
queries themselves and the number of users involved, 10000 may actually
be faster than 32000, since you are freeing memory for use by other
activities within the server.

> Obviously I'd rather have fsync on - I was really looking for some
> opinions on this.  Better safe than sorry - but I am trying to sqeeze
> every bit of juice possible out of this machine.

Keep it on.  The tiny amount of extra performance will mean nothing if
you lose important data over it.

> The machine has 4 drives - should I do a RAID 1+0 or a 5?

1+0, definitely.

- -----------------------------------------------------------
Frank D. Engel, Jr.  <fde101@fjrhome.net>

$ ln -s /usr/share/kjvbible /usr/manual
$ true | cat /usr/manual | grep "John 3:16"
John 3:16 For God so loved the world, that he gave his only begotten
Son, that whosoever believeth in him should not perish, but have
everlasting life.
$
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iD8DBQFBydpA7aqtWrR9cZoRAsgpAKCG1X+9K/5dNv9boIEdnUdEljYINACggnoD
V0rpgscodJUBWcKaIG9uEGg=
=13j7
-----END PGP SIGNATURE-----

___________________________________________________________
$0 Web Hosting with up to 120MB web space, 1000 MB Transfer
10 Personalized POP and Web E-mail Accounts, and much more.
Signup at www.doteasy.com

Re: postgresql.conf

From

Greg Stark

Date:

22 December 2004, 21:01:18

John Cunningham <fatbobo@gmail.com> writes:

> The shared buffers was a big concern - I've read that there's a limit
> that helps, but as the machine will only do DB transactions, I don't
> know what else to do with the RAM.  It's intended for PG's use.

The kernel will use it for disk caching which is exactly what you want it used
for anyways.

--
greg