Thread: postgresql.conf
Hey Guys, I am setting up a new dedicated Postgres server, and will serve about 60 databases to a web site serving 250,000 people at the rate of about 20,000 a day. That may all be irrellevent though for the purposes of this conversation. The main thing about the application is that we're talking about lots and lots of little transactions and onyl a few big ones. The machine in question will do nothing but serve databases. It's a dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM. I'd like to configure it to get the most out of the server possible as far as shared memory, sort memore, etc. I haven't found a lot of documentation on this. By the way, for some reason Postgres 7.4.x wouldn't install properly - had a problem with initdb - had to use 7.3.6 Any help is greatly appreciated. -John
On Wed, 2004-12-22 at 09:15, John Cunningham wrote: > Hey Guys, > > I am setting up a new dedicated Postgres server, and will serve about > 60 databases to a web site serving 250,000 people at the rate of about > 20,000 a day. That may all be irrellevent though for the purposes of > this conversation. > If you're gonna handle a lot of connections at the same time, look at pgpool. Also, the tuning docs at varlena are a must: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html > The main thing about the application is that we're talking about lots > and lots of little transactions and onyl a few big ones. > > The machine in question will do nothing but serve databases. It's a > dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM. I'd > like to configure it to get the most out of the server possible as far > as shared memory, sort memore, etc. I haven't found a lot of > documentation on this. Battery backed cache on the hardware RAID controller is a must. RAID 1+0 may be a better choice than RAID 5, depending on your usage patterns. > By the way, for some reason Postgres 7.4.x wouldn't install properly - > had a problem with initdb - had to use 7.3.6 Please post the error messages you got and what OS / version of that OS you're running, and any other relevant information you can think of. 7.4 is fairly stable, a failed install is something that generally shouldn't happen, and when it does, it's usually not 7.4's fault nowadays.
John Cunningham wrote: ... > The machine in question will do nothing but serve databases. It's a > dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM. I'd > like to configure it to get the most out of the server possible as far > as shared memory, sort memore, etc. I haven't found a lot of > documentation on this. What OS? > By the way, for some reason Postgres 7.4.x wouldn't install properly - > had a problem with initdb - had to use 7.3.6 I had a similar problem under Fedora Core 3 that Tom Lane solved quickly for me. From my notes: Run '/usr/sbin/setenforce 0' before intializing database. Ok to turn back on with '/usr/sbin/setenforce 1' afterwards. Has to do with the Fedora using SELinux by default now. Dunno if that is your situation... -- Steve Wampler -- swampler@noao.edu The gods that smiled on your birth are now laughing out loud.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Dec 22, 2004, at 10:15 AM, John Cunningham wrote: > like to configure it to get the most out of the server possible as far > as shared memory, sort memore, etc. I haven't found a lot of > documentation on this. http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html - ----------------------------------------------------------- Frank D. Engel, Jr. <fde101@fjrhome.net> $ ln -s /usr/share/kjvbible /usr/manual $ true | cat /usr/manual | grep "John 3:16" John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life. $ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iD8DBQFByZnw7aqtWrR9cZoRAgl8AJ9PcpktGBeA/Oboh5+ZKPwS1niP9QCfZaL3 yf1Yvav5baMeQZOjGmvmk6E= =Hnti -----END PGP SIGNATURE----- ___________________________________________________________ $0 Web Hosting with up to 120MB web space, 1000 MB Transfer 10 Personalized POP and Web E-mail Accounts, and much more. Signup at www.doteasy.com
I'm running Red Hat Enterprise ES with all the most recent updates. The error - in initdb - was that the system couldn't find "ascii_and_mic" libraries. 7.3.6 ran without a hitch. The RAID 1+0 - is that a stripped / mirrored condifuration? How big of a difference will that make in performance do you think - are we talking 10 % or leaps and bounds? The help is much appreciated - I'm reading that tuning manual now. -John On Wed, 22 Dec 2004 10:59:43 -0500, Frank D. Engel, Jr. <fde101@fjrhome.net> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > On Dec 22, 2004, at 10:15 AM, John Cunningham wrote: > > > like to configure it to get the most out of the server possible as far > > as shared memory, sort memore, etc. I haven't found a lot of > > documentation on this. > > http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html > > - ----------------------------------------------------------- > Frank D. Engel, Jr. <fde101@fjrhome.net> > > $ ln -s /usr/share/kjvbible /usr/manual > $ true | cat /usr/manual | grep "John 3:16" > John 3:16 For God so loved the world, that he gave his only begotten > Son, that whosoever believeth in him should not perish, but have > everlasting life. > $ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.4 (Darwin) > > iD8DBQFByZnw7aqtWrR9cZoRAgl8AJ9PcpktGBeA/Oboh5+ZKPwS1niP9QCfZaL3 > yf1Yvav5baMeQZOjGmvmk6E= > =Hnti > -----END PGP SIGNATURE----- > > ___________________________________________________________ > $0 Web Hosting with up to 120MB web space, 1000 MB Transfer > 10 Personalized POP and Web E-mail Accounts, and much more. > Signup at www.doteasy.com > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match >
On Wed, 22 Dec 2004 10:38:01 -0600, John Cunningham <fatbobo@gmail.com> wrote: > I'm running Red Hat Enterprise ES with all the most recent updates. > > The error - in initdb - was that the system couldn't find > "ascii_and_mic" libraries. 7.3.6 ran without a hitch. That's very odd, cause I'm using 7.4.6 RPMs on RHEL-3 without any problems. Is there anything unusual about your installation? You tried with the RPMs from the postgresql.org FTP server, right? -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ L. Friedman netllama@gmail.com LlamaLand http://netllama.linux-sxs.org
OK Guys - here's the config file as I've writtten it. I'll paste in the whole thing before, but this is the important stuff: max_connections = 256 shared_buffers = 32768 # (256 MB) sort_mem = 1024 # min 64, size in KB fsync = No wal_sync_method = fsync # the default varies across platforms: effective_cache_size = 786432 # (6 GB) random_page_cost = 2 # units are one sequential page fetch cost I am seriously considering breaking the machine all the way down and changing to a stripped / mirrored config if that will be the fastest way to run it. Need advice on that. Here's the config file: #======================================================================== # # Connection Parameters # tcpip_socket = false #ssl = false max_connections = 256 superuser_reserved_connections = 2 port = 5432 #hostname_lookup = false #show_source_port = false #unix_socket_directory = '' #unix_socket_group = '' #unix_socket_permissions = 0777 # octal #virtual_host = '' #krb_server_keyfile = '' # # Shared Memory Size # shared_buffers = 32768 # min max_connections*2 or 16, 8KB each #max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes #max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes #max_locks_per_transaction = 64 # min 10 #wal_buffers = 8 # min 4, typically 8KB each # # Non-shared Memory Sizes # sort_mem = 1024 # min 64, size in KB #vacuum_mem = 8192 # min 1024, size in KB # # Write-ahead log (WAL) # #checkpoint_segments = 3 # in logfile segments, min 1, 16MB each #checkpoint_timeout = 300 # range 30-3600, in seconds # #commit_delay = 0 # range 0-100000, in microseconds #commit_siblings = 5 # range 1-1000 # fsync = No wal_sync_method = fsync # the default varies across platforms: # # fsync, fdatasync, open_sync, or open_datasync #wal_debug = 0 # range 0-16 # # Optimizer Parameters # #enable_seqscan = true #enable_indexscan = true #enable_tidscan = true #enable_sort = true #enable_nestloop = true #enable_mergejoin = true #enable_hashjoin = true effective_cache_size = 786432 # typically 8KB each random_page_cost = 2 # units are one sequential page fetch cost #cpu_tuple_cost = 0.01 # (same) #cpu_index_tuple_cost = 0.001 # (same) #cpu_operator_cost = 0.0025 # (same) #default_statistics_target = 10 # range 1-1000 # # GEQO Optimizer Parameters # #geqo = true #geqo_selection_bias = 2.0 # range 1.5-2.0 #geqo_threshold = 11 #geqo_pool_size = 0 # default based on tables in statement, # range 128-1024 #geqo_effort = 1 #geqo_generations = 0 #geqo_random_seed = -1 # auto-compute seed # # Message display # #server_min_messages = notice # Values, in order of decreasing detail: # debug5, debug4, debug3, debug2, debug1, # info, notice, warning, error, log, fatal, # panic #client_min_messages = notice # Values, in order of decreasing detail: # debug5, debug4, debug3, debug2, debug1, # log, info, notice, warning, error #silent_mode = false #log_connections = false #log_pid = false #log_statement = false #log_duration = false #log_timestamp = false #log_min_error_statement = panic # Values in order of increasing severity: # debug5, debug4, debug3, debug2, debug1, # info, notice, warning, error, panic(off) #debug_print_parse = false #debug_print_rewritten = false #debug_print_plan = false #debug_pretty_print = false #explain_pretty_print = true # requires USE_ASSERT_CHECKING #debug_assertions = true # # Syslog # #syslog = 0 # range 0-2 #syslog_facility = 'LOCAL0' #syslog_ident = 'postgres' # # Statistics # #show_parser_stats = false #show_planner_stats = false #show_executor_stats = false #show_statement_stats = false # requires BTREE_BUILD_STATS #show_btree_build_stats = false # # Access statistics collection # #stats_start_collector = true #stats_reset_on_server_start = true #stats_command_string = false #stats_row_level = false #stats_block_level = false # # Lock Tracing # #trace_notify = false # requires LOCK_DEBUG #trace_locks = false #trace_userlocks = false #trace_lwlocks = false #debug_deadlocks = false #trace_lock_oidmin = 16384 #trace_lock_table = 0 # # Misc # #autocommit = true #dynamic_library_path = '$libdir' #search_path = '$user,public' #datestyle = 'iso, us' #timezone = unknown # actually, defaults to TZ environment setting #australian_timezones = false #client_encoding = sql_ascii # actually, defaults to database encoding #authentication_timeout = 60 # 1-600, in seconds #deadlock_timeout = 1000 # in milliseconds #default_transaction_isolation = 'read committed' #max_expr_depth = 10000 # min 10 #max_files_per_process = 1000 # min 25 #password_encryption = true #sql_inheritance = true #transform_null_equals = false #statement_timeout = 0 # 0 is disabled, in milliseconds #db_user_namespace = false # # Locale settings # # (initialized by initdb -- may be changed) LC_MESSAGES = 'en_US.UTF-8' LC_MONETARY = 'en_US.UTF-8' LC_NUMERIC = 'en_US.utf-8' LC_TIME = 'en_US.UTF-8'
On Wed, 2004-12-22 at 11:30, John Cunningham wrote: > OK Guys - here's the config file as I've writtten it. > > I'll paste in the whole thing before, but this is the important stuff: > > max_connections = 256 Are you using a connection pooling scheme (jdbc based pooling, pgpool, etc...)? If not, you probably should, and then drop the max connections to something much smaller, like 30 or 40 or so. If you can. > shared_buffers = 32768 # (256 MB) That's really high, even for a machine with 6+ gigs of ram. Unless you're working data set is that big, it's too big. IF, on average, you're working with smaller amounts of data at a time, it might be better to drop it down to 5000 to 10000. Few, if any benchmarks have shown an improvement at settings over 10000. OTOH, you might be the one person out of a thousand or so who needs larger shared_buffers. Note that shared buffers aren't cache, and when the last backend referencing a particular data set in memory stops referencing it, the data it dropped and the buffer memory released back to the pool, so to speak. The kernel is generally better at caching than postgresql anyway. With 8.0's ARC cache algorithm in place, it might be time for someone to start testing postgresql with a persistant buffer cache (i.e. make it hold on to the old data sets intead of freeing up the space.) > sort_mem = 1024 # min 64, size in KB You can probably up this a bit, especially if you pool your connections. Try 8192 for a starting point. Setting this too large can be dangerous to the health of your OS, since you can starve the OS for memory and make it start swapping processes out to come up with sort_mem > fsync = No Not really safe, and not the performance gain it once was, if I remember a post from Tom recently correctly. On the other hand, IDE disks do this by design, so if you were using those (with their cache enabled) you'd be in the same boat. > wal_sync_method = fsync # the default varies across platforms: > effective_cache_size = 786432 # (6 GB) > random_page_cost = 2 # units are one sequential page fetch cost You can probably drop this down to 1.2 to 1.4 or so on a machine with a fast disk subsystem and caching controller. > I am seriously considering breaking the machine all the way down and > changing to a stripped / mirrored config if that will be the fastest > way to run it. Need advice on that. Here's the config file: Do you have a battery backed caching raid controller? If not, that's one of the first steps to better performance. After that, if you've got lots of disks, a RAID 5 or RAID 1+0 should both be pretty fast. If you've got <8 or so disks, the RAID 1+0 will normally be faster, assuming your RAID controller handles that configuration well. Some older / cheaper controllers can't parallelize their I/O and run the same speed in 1+0 as they would in plain old 1. > #max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes > #max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes Are you sure you're vacuuming often enough and that these settings are high enough? What does 'vacuum verbose' on your database say? You might want to use pg_autovacuum to ensure sufficient vacuuming is taking place. > # > # Locale settings > # > # (initialized by initdb -- may be changed) > LC_MESSAGES = 'en_US.UTF-8' > LC_MONETARY = 'en_US.UTF-8' > LC_NUMERIC = 'en_US.utf-8' > LC_TIME = 'en_US.UTF-8' Are you doing a lot of text searching? If so, you might be better off initing the database with locale=C instead.
The server is a DELL Poweredge 2650 with it's built in RAID - 4 disks currently in a RAID 5 config. I will check on the battery backup. I'm putting this server together and rebuilding our overall db structure all at the same time, so I have a good amount of flexiblity. I realized I was not taking as much advantage of the machine as possible before - hence the call out to the list. I'll look in to the persistent connections - last time we built this it was several versions ago and the overall attitude was that it didn't work terrible well. Is this a PG setting or something in PHP? The shared buffers was a big concern - I've read that there's a limit that helps, but as the machine will only do DB transactions, I don't know what else to do with the RAM. It's intended for PG's use. Obviously I'd rather have fsync on - I was really looking for some opinions on this. Better safe than sorry - but I am trying to sqeeze every bit of juice possible out of this machine. The machine has 4 drives - should I do a RAID 1+0 or a 5? Most of the searching is char fields then linking ids from one table to another. BTW - this is one of the best discussions I've been on - glad everyone can be so helpful. Thanks! -John On Wed, 22 Dec 2004 12:08:10 -0600, Scott Marlowe <smarlowe@g2switchworks.com> wrote: > On Wed, 2004-12-22 at 11:30, John Cunningham wrote: > > OK Guys - here's the config file as I've writtten it. > > > > I'll paste in the whole thing before, but this is the important stuff: > > > > max_connections = 256 > > Are you using a connection pooling scheme (jdbc based pooling, pgpool, > etc...)? If not, you probably should, and then drop the max connections > to something much smaller, like 30 or 40 or so. If you can. > > > shared_buffers = 32768 # (256 MB) > > That's really high, even for a machine with 6+ gigs of ram. Unless > you're working data set is that big, it's too big. IF, on average, > you're working with smaller amounts of data at a time, it might be > better to drop it down to 5000 to 10000. Few, if any benchmarks have > shown an improvement at settings over 10000. OTOH, you might be the one > person out of a thousand or so who needs larger shared_buffers. Note > that shared buffers aren't cache, and when the last backend referencing > a particular data set in memory stops referencing it, the data it > dropped and the buffer memory released back to the pool, so to speak. > The kernel is generally better at caching than postgresql anyway. > > With 8.0's ARC cache algorithm in place, it might be time for someone to > start testing postgresql with a persistant buffer cache (i.e. make it > hold on to the old data sets intead of freeing up the space.) > > > sort_mem = 1024 # min 64, size in KB > > You can probably up this a bit, especially if you pool your > connections. Try 8192 for a starting point. Setting this too large can > be dangerous to the health of your OS, since you can starve the OS for > memory and make it start swapping processes out to come up with sort_mem > > > fsync = No > > Not really safe, and not the performance gain it once was, if I remember > a post from Tom recently correctly. On the other hand, IDE disks do > this by design, so if you were using those (with their cache enabled) > you'd be in the same boat. > > > wal_sync_method = fsync # the default varies across platforms: > > effective_cache_size = 786432 # (6 GB) > > random_page_cost = 2 # units are one sequential page fetch cost > > You can probably drop this down to 1.2 to 1.4 or so on a machine with a > fast disk subsystem and caching controller. > > > I am seriously considering breaking the machine all the way down and > > changing to a stripped / mirrored config if that will be the fastest > > way to run it. Need advice on that. Here's the config file: > > Do you have a battery backed caching raid controller? If not, that's > one of the first steps to better performance. After that, if you've got > lots of disks, a RAID 5 or RAID 1+0 should both be pretty fast. If > you've got <8 or so disks, the RAID 1+0 will normally be faster, > assuming your RAID controller handles that configuration well. Some > older / cheaper controllers can't parallelize their I/O and run the same > speed in 1+0 as they would in plain old 1. > > > > #max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes > > #max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes > > Are you sure you're vacuuming often enough and that these settings are > high enough? What does 'vacuum verbose' on your database say? You > might want to use pg_autovacuum to ensure sufficient vacuuming is taking > place. > > > # > > # Locale settings > > # > > # (initialized by initdb -- may be changed) > > LC_MESSAGES = 'en_US.UTF-8' > > LC_MONETARY = 'en_US.UTF-8' > > LC_NUMERIC = 'en_US.utf-8' > > LC_TIME = 'en_US.UTF-8' > > Are you doing a lot of text searching? If so, you might be better off > initing the database with locale=C instead. >
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Dec 22, 2004, at 2:36 PM, John Cunningham wrote: > The shared buffers was a big concern - I've read that there's a limit > that helps, but as the machine will only do DB transactions, I don't > know what else to do with the RAM. It's intended for PG's use. Try it both ways: with 32000 and with 10000. If 32000 gives you a perceptible performance increase when numerous users are connected, use it. Otherwise, stick with the 10000. I suspect that depending on the queries themselves and the number of users involved, 10000 may actually be faster than 32000, since you are freeing memory for use by other activities within the server. > Obviously I'd rather have fsync on - I was really looking for some > opinions on this. Better safe than sorry - but I am trying to sqeeze > every bit of juice possible out of this machine. Keep it on. The tiny amount of extra performance will mean nothing if you lose important data over it. > The machine has 4 drives - should I do a RAID 1+0 or a 5? 1+0, definitely. - ----------------------------------------------------------- Frank D. Engel, Jr. <fde101@fjrhome.net> $ ln -s /usr/share/kjvbible /usr/manual $ true | cat /usr/manual | grep "John 3:16" John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life. $ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iD8DBQFBydpA7aqtWrR9cZoRAsgpAKCG1X+9K/5dNv9boIEdnUdEljYINACggnoD V0rpgscodJUBWcKaIG9uEGg= =13j7 -----END PGP SIGNATURE----- ___________________________________________________________ $0 Web Hosting with up to 120MB web space, 1000 MB Transfer 10 Personalized POP and Web E-mail Accounts, and much more. Signup at www.doteasy.com
John Cunningham <fatbobo@gmail.com> writes: > The shared buffers was a big concern - I've read that there's a limit > that helps, but as the machine will only do DB transactions, I don't > know what else to do with the RAM. It's intended for PG's use. The kernel will use it for disk caching which is exactly what you want it used for anyways. -- greg