Thread: Need advice on postgresql.conf settings
Thanks in advance for anything you can do to help.

The real issue is this: we have THE SAME queries taking anywhere from .001 to 90.0 seconds... the server is using 98% of the available RAM at all times (because of the persistent connections via PHP), and I don't know what to do. Every time I change a .conf setting I feel like it slows things down even more, and I can't seem to find the balance. I'll give you everything I've got, and I hope to god someone can point out some areas where I could improve the speed of queries overall.

For months I've been optimizing my queries, and working around places where I don't need them. They are so incredibly honed, I couldn't even begin to explain... and when there are fewer than 10 users browsing our sites, they are LIGHTNING fast, even with 5x the amount of dummy data in the database(s). So basically the LARGEST factor in this whole performance issue that I can find is the number of users browsing the sites at any given time... but lowering shared_buffers to raise max_connections is hurting performance immensely, so I'm totally lost. Please help!! Bless you!

THE DETAILS:

(For the databases, I'll list only the 'main' tables, as the others are fairly small.)

Database 1:
  5,000 'users'
  20,000 'threads'
  500,000 'posts'
  ...

Database 2: (just starting out)
  150 'users'
  150 'entries'
  ...

Hardware:
  Pentium 4 2.44GHz
  1.5GB RAM
  7200rpm SATA disk

Software:
  Red Hat Linux (kernel 2.4.21-9.EL)
  PostgreSQL 7.4.2
  PHP 4.3.6 (using persistent connections to pgsql)

Usage:
  uptime: 12:23:08 up 132 days, 19:16, 2 users, load average: 19.75, 17.34, 18.86
  roughly 100-200 users connected to our server at any given moment
  roughly 10-15 queries per HTTP page load

----------------------------------------------------------
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.)  White space may be used.  Comments are introduced
# with '#' anywhere on a line.  The complete list of option names and
# allowed values can be found in the PostgreSQL documentation.  The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'.  Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP.  If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload".

#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

tcpip_socket = true
max_connections = 75
        # note: increasing max_connections costs about 500 bytes of shared
        # memory per connection slot, in addition to costs from shared_buffers
        # and max_locks_per_transaction.
#superuser_reserved_connections = 2
port = 5432
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''              # what interface to listen on; defaults to any
#rendezvous_name = ''           # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60    # 1-600, in seconds
ssl = true
password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false

#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 8192           # min 16, at least max_connections*2, 8KB each
sort_mem = 8192                 # min 64, size in KB
vacuum_mem = 4096               # min 1024, size in KB

# - Free Space Map -

max_fsm_pages = 20000           # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000        # min 100, ~50 bytes each

# - Kernel Resource Usage -

max_files_per_process = 3052    # min 25
#preload_libraries = ''

#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

fsync = true                    # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
wal_buffers = 192               # min 4, 8KB each

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000

#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Enabling -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
#enable_mergejoin = true
#enable_nestloop = true
enable_seqscan = false
#enable_sort = true
#enable_tidscan = true

# - Planner Cost Constants -

effective_cache_size = 131072   # typically 8KB each
random_page_cost = 2            # units are one sequential page fetch cost
cpu_tuple_cost = 0.01           # (same) default 0.01
cpu_index_tuple_cost = 0.001    # (same) default 0.001
cpu_operator_cost = 0.0025      # (same) default 0.0025

# - Genetic Query Optimizer -

geqo = true
geqo_threshold = 20
#geqo_effort = 1
#geqo_generations = 0
#geqo_pool_size = 0             # default based on tables in statement,
                                # range 128-1024
#geqo_selection_bias = 2.0      # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10 # range 1-1000
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit JOINs

#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Syslog -

#syslog = 0                     # range 0-2; 0=stdout; 1=both; 2=syslog
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

client_min_messages = error     # Values, in order of decreasing detail:
                                # debug5, debug4, debug3, debug2, debug1,
                                # log, info, notice, warning, error
log_min_messages = error        # Values, in order of decreasing detail:
                                # debug5, debug4, debug3, debug2, debug1,
                                # info, notice, warning, error, log, fatal,
                                # panic
log_error_verbosity = default   # terse, default, or verbose messages
log_min_error_statement = panic # Values in order of increasing severity:
                                # debug5, debug4, debug3, debug2, debug1,
                                # info, notice,
                                # warning, error, panic(off)
log_min_duration_statement = -1 # Log all statements whose
                                # execution time exceeds the value, in
                                # milliseconds.  Zero prints all queries.
                                # Minus-one disables.
#silent_mode = false            # DO NOT USE without Syslog!

# - What to Log -

debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_connections = false
log_duration = false
log_pid = false
log_statement = false
log_timestamp = false
log_hostname = false
log_source_port = false

#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false

# - Query/Index Statistics Collector -

stats_start_collector = false
stats_command_string = false
stats_block_level = false
stats_row_level = false
stats_reset_on_server_start = false

#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'   # schema names
#check_function_bodies = true
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = false
#statement_timeout = 0          # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown             # actually, defaults to TZ environment setting
#australian_timezones = false
#extra_float_digits = 0         # min -15, max 2
#client_encoding = sql_ascii    # actually, defaults to database encoding

# These settings are initialized by initdb -- they may be changed
lc_messages = 'en_US.UTF-8'     # locale for system error message strings
lc_monetary = 'en_US.UTF-8'     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'      # locale for number formatting
lc_time = 'en_US.UTF-8'         # locale for time formatting

# - Other Defaults -

explain_pretty_print = true
#dynamic_library_path = '$libdir'
#max_expr_depth = 10000         # min 10

#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000        # in milliseconds
#max_locks_per_transaction = 64 # min 10, ~260*max_connections bytes each

#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = true
regex_flavor = advanced         # advanced, extended, or basic
#sql_inheritance = true

# - Other Platforms & Clients -

#transform_null_equals = false
------

I've been tweaking the postgresql.conf file for about 5 hours... just today. We've had problems in the past (and I've also emailed this list in the past, but perhaps I failed to ask the right questions). I guess I need some help with the postgresql configuration file. I would like to start off by asking that you not link me to the same basic .conf overview; what I really need at this point is real-world experience and wisdom, as opposed to cold, poorly documented, and incredibly abstract trial-and-error manual entries.
"Shane | SkinnyCorp" <shanew@skinnycorp.com> writes: > The real issue is this, we have THE SAME queries taking anywhere from .001 - > 90.0 seconds... the server is using 98% of the available RAM at all times > (because of the persistant connections via php), and I don't know what to > do. I have a feeling that the answer is going to boil down to "buy more RAM" --- it sounds a lot like you're just overstressing your server. The more active backends you have, the more RAM goes to process-local memory, and the less is available for kernel disk cache. Even if you don't go into outright swapping, the amount of disk I/O needed goes up the smaller the kernel disk cache gets. Another possible line of attack is to use persistent (pooled) connections to cut down the number of live backend processes you need. However, depending on what your application software is, that might take more time/effort (= money) than dropping in some more RAM. You can investigate this theory by watching "top" output (the first few lines about memory usage, not the process listing) as well as "vmstat" output. > uptime: 12:23:08 up 132 days, 19:16, 2 users, load average: 19.75, > 17.34, 18.86 Load averages approaching 20 are not good either ... what sort of box are you running on anyway? As for the postgresql.conf settings, the only ones I'd seriously question are max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 1000 # min 100, ~50 bytes each These are the defaults, and are probably too small for a DB exceeding a hundred meg or so. max_files_per_process = 3052 # min 25 You really have your kernel set to support 3052 * 75 simultaneously open files? Back this off. I doubt values beyond a couple hundred buy anything except headaches. wal_buffers = 192 This is an order-of-magnitude overkill too, especially if your transactions are mostly small. I know it's only a megabyte or two, but you evidently need that RAM more elsewhere. enable_seqscan = false I don't think this is a good idea in general. regards, tom lane
>> The real issue is this: we have THE SAME queries taking anywhere from
>> .001 to 90.0 seconds... the server is using 98% of the available RAM
>> at all times (because of the persistent connections via PHP), and I
>> don't know what to do.
>
> Another possible line of attack is to use persistent (pooled)
> connections to cut down the number of live backend processes you need.
> However, depending on what your application software is, that might
> take more time/effort (= money) than dropping in some more RAM.

This particular feature is pure evilness. Using all of my fingers and
toes, I can't count the number of times I've had a client do this and
get themselves into a world of hurt. Somewhere in the PHP
documentation, there should be a big warning wrapped in the blink tag
that steers people away from setting this. The extra time necessary to
set up a TCP connection is less than the performance drag induced on
the backend when persistent connections are enabled. Reread that last
sentence until it sinks in. On a local network, this is premature
optimization that's hurting you.

> max_files_per_process = 3052    # min 25
>
> You really have your kernel set to support 3052 * 75 simultaneously
> open files?  Back this off.  I doubt values beyond a couple hundred
> buy anything except headaches.

This, on the other hand, has made a large difference for me. The time
necessary to complete open(2) calls can be expensive, especially when
the database is poorly designed and a backend is touching many
different parts of the database spread across multiple files. 3000 is
high, but I've found 500 to be vastly too low in some cases... in
others, it's just fine. My rule of thumb has become: if you're doing
lots of aggregate functions (e.g., SUM(), COUNT()) more than once in
the lifetime of a backend, increasing this value helps; otherwise it
buys you little (if so, 1500 is generally sufficient).

Faster IO, however, is going to save you here. If you can, increase
your disk caching in the OS. On FreeBSD, increase your KVA_PAGES and
NBUFs. Since you've freed up more RAM by disabling persistent
connections, this shouldn't be a problem.

-sc

--
Sean Chittenden
On Nov 9, 2004, at 2:01 PM, Shane | SkinnyCorp wrote:
> Thanks in advance for anything you can do to help.
>
> The real issue is this: we have THE SAME queries taking anywhere from
> .001 to 90.0 seconds... the server is using 98% of the available RAM
> at all times (because of the persistent connections via PHP), and I
> don't know what to do.

I'd strongly recommend ditching pconnect and using pgpool + regular
connect. It is a terrific combination that provides pooled connections
the way you'd think they should work (a pool of N connections to PG
shared by Y processes, instead of a 1:1 mapping).

Curiously, have you noticed any pattern to the slowdown? It could be
induced by a checkpoint or vacuum. Are you swapping at all? Are your
PHP scripts leaking at all, etc.? Your load average is high; how does
your CPU idle look? (If load is high and the CPUs are pretty idle,
that is an indicator of being IO-bound.)

good luck.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
In PostgreSQL 7.2.2 / Linux 2.4.27, dual-processor Pentium III 900MHz, we have this table:

create table testtable (
    id        SERIAL PRIMARY KEY,
    coni      VARCHAR(255),
    date      TIMESTAMP,
    direction VARCHAR(255),
    partner   VARCHAR(255),
    type      VARCHAR(255),
    block     VARCHAR(255)
);

We are using Java with the JDBC driver pg72jdbc2.jar. Our Java test program is:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Date;
import java.util.Properties;

public class Stresser implements Runnable {
    // host, port, dbname, user, pwd are assumed to be loaded elsewhere
    static Properties prop = new Properties();

    public void run() {
        System.out.println("-> start");
        try {
            Class.forName("org.postgresql.Driver");
            Connection con = DriverManager.getConnection(
                "jdbc:postgresql://" + prop.getProperty("host") + ":"
                    + prop.getProperty("port") + "/" + prop.getProperty("dbname"),
                prop.getProperty("user"), prop.getProperty("pwd"));
            con.setAutoCommit(true);
            Statement st = con.createStatement();
            java.sql.Timestamp datum = new java.sql.Timestamp(new Date().getTime());
            Date start = new Date();
            System.out.println(start);
            for (int i = 0; i < 100; ++i) {
                st.executeUpdate("insert into testtable(coni,date,direction,partner,type) "
                    + "values('content','" + datum + "','dir','partner','type')");
            }
            Date end = new Date();
            System.out.println(end);
            con.close();
        } catch (Exception e) {
            System.out.println("Exception!");
            e.printStackTrace();
        }
        System.out.println("-> ende");
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; ++i) {
            Stresser s = new Stresser();
            Thread t = new Thread(s);
            t.start();
        }
    }
}

It is running in 10 threads; each thread makes 100 inserts. For the 1000 inserts (10 threads x 100 inserts) we need 8 seconds. That's 125 inserts/second. How could we make it faster? Inserting 1000 rows via INSERT ... SELECT is much faster.

regards
Michael
On Nov 10, 2004, at 8:51 AM, Michael Kleiser wrote:
> It is running in 10 threads; each thread makes 100 inserts.
>
> For the 1000 inserts (10 threads x 100 inserts)
> we need 8 seconds.
> That's 125 inserts/second.
> How could we make it faster?

Batch the inserts up into a transaction. So you'd have:

BEGIN
insert
insert
insert
...
COMMIT

Your numbers will suddenly skyrocket.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
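[In JDBC terms, a minimal sketch of Jeff's suggestion against the body of Michael's Stresser.run() — the standard way to get BEGIN ... COMMIT from JDBC is the autocommit API rather than literal BEGIN/COMMIT statements:]

            con.setAutoCommit(false);   // one transaction for the whole loop
            Statement st = con.createStatement();
            for (int i = 0; i < 100; ++i) {
                st.executeUpdate("insert into testtable(coni,date,direction,partner,type) "
                    + "values('content','" + datum + "','dir','partner','type')");
            }
            con.commit();               // one COMMIT (one WAL fsync) for all 100 rows
            con.close();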
Couple of things:

1) That is a fairly old version of Postgres; there are considerable
performance improvements in the two releases since, and even more in
the pending release.

2) If you are going to insert many more rows than that, consider
dropping the index before the insert and recreating it afterwards
(see the sketch after this message).

Dave

Michael Kleiser wrote:
> In PostgreSQL 7.2.2 / Linux 2.4.27, dual-processor Pentium III 900MHz,
> we have this table:
>
> create table testtable (id SERIAL PRIMARY KEY, coni VARCHAR(255),
> date TIMESTAMP, direction VARCHAR(255), partner VARCHAR(255),
> type VARCHAR(255), block VARCHAR(255));
>
> [Java test program snipped]
>
> It is running in 10 threads; each thread makes 100 inserts.
> For the 1000 inserts (10 threads x 100 inserts) we need 8 seconds.
> That's 125 inserts/second.
> How could we make it faster?
> Inserting 1000 rows via INSERT ... SELECT is much faster.
>
> regards
> Michael

--
Dave Cramer
http://www.postgresintl.com
519 939 0336
ICQ#14675561
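[For illustration, Dave's point 2 as SQL. The secondary index here is hypothetical — as posted, testtable only has its implicit primary-key index, which can't simply be dropped:]

    -- hypothetical secondary index; not part of Michael's schema
    create index testtable_partner_idx on testtable (partner);

    -- bulk-load pattern: drop the index, load, then rebuild it once
    drop index testtable_partner_idx;
    -- ... run the bulk inserts here ...
    create index testtable_partner_idx on testtable (partner);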
On Wed, 10 Nov 2004 14:51:57 +0100, Michael Kleiser <mkl@webde-ag.de> wrote:
> Statement st = con.createStatement();
> java.sql.Timestamp datum = new java.sql.Timestamp(new Date().getTime());
> Date start = new Date();
> System.out.println(start);
> for (int i = 0; i < 100; ++i) {
>     st.executeUpdate("insert into testtable(coni,date,direction,partner,type) "
>         + "values('content','" + datum + "','dir','partner','type')");
> }

How about using a PreparedStatement? That's on the Java end. On the PG
end, maybe do a BEGIN before the for loop and an END at the end of the
for loop.

--
i'm not flying. i'm falling... in style.
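[A minimal sketch of the PreparedStatement version, reusing con and datum from Michael's program (add import java.sql.PreparedStatement). Note that with the 7.2-era driver this mostly saves query-string building on the client; server-side prepared statements came with later driver/server versions:]

            // parse the statement once, bind fresh values on each iteration
            PreparedStatement ps = con.prepareStatement(
                "insert into testtable(coni,date,direction,partner,type) values(?,?,?,?,?)");
            for (int i = 0; i < 100; ++i) {
                ps.setString(1, "content");
                ps.setTimestamp(2, datum);
                ps.setString(3, "dir");
                ps.setString(4, "partner");
                ps.setString(5, "type");
                ps.executeUpdate();
            }
            ps.close();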
On Thu, Nov 11, 2004 at 04:04:06PM +0800, Edwin Eyan Moragas wrote:
> How about using a PreparedStatement? That's on the Java end. On the PG
> end, maybe do a BEGIN before the for loop and an END at the end of the
> for loop.

You don't even need a "BEGIN" and "END"; his code has a
setAutoCommit(true) before the for loop, which just has to be changed
to setAutoCommit(false) (and add an explicit commit() after the for
loop, of course).

/* Steinar */
--
Homepage: http://www.sesse.net/
On Thu, 11 Nov 2004 11:04:18 +0100, Steinar H. Gunderson
<sgunderson@bigfoot.com> wrote:
> You don't even need a "BEGIN" and "END"; his code has a
> setAutoCommit(true) before the for loop, which just has to be changed
> to setAutoCommit(false) (and add an explicit commit() after the for
> loop, of course).

Amen. I stand corrected.

-eem