Thread: PostgreSQL performance in production environment
[Sorry for the length of this post. It stretched as I provided as much info as possible.]

So the rubber meets the road. We've put PostgreSQL into a production environment with some heavy simultaneous usage. It works well in general, but often PG doesn't respond. How should I test what is going wrong? All tables are vacuumed, analyzed and so on. INSERTs and UPDATEs are happening, but not much; it's mostly SELECTs. Is PostgreSQL running out of connections? We can temporarily fix this by restarting it, but I'd like a more tenable solution.

Speculating that it could be some conf variable somewhere (max_fsm_pages in particular), I am including three things at the bottom of this post:

1. Our ps output (for "postgres")
2. *Verbose* vacuum info for a table that shows the max_fsm warning
3. Our postgresql.conf settings

My question 1 -- how should we test and tweak our production installation? Where should we look? In MySQL we could do a "show status" at the console and it would give a mountain of information, and then there was that handy little "tuning-primer" script that made it all come alive. I suppose this stuff is also available in pg_catalog, but is there any website that goes in depth into HOW to tune, what the different values mean, and such?

My question 2 -- in production, we're constantly seeing this message while vacuuming one table with fewer than 3 million rows, but one that we expect to keep growing:

[-------------
WARNING: relation "public.links" contains more than "max_fsm_pages" pages with useful free space
HINT: Consider compacting this relation or increasing the configuration parameter "max_fsm_pages".
VACUUM
-------------]

I can merrily increase the "max_fsm_pages" directive, but the manual also adds the caveat that "this can use more System V memory than available on your system". My full verbose vacuum info below includes the line:

[-------------
INFO: "traders": scanned 3000 of 199396 pages, containing 40775 live rows and 0 dead rows; 3000 rows in sample, 2710124 estimated total rows
-------------]

Does this mean my table needs nearly 200,000 pages, and that that should be the setting of max_fsm_pages? This server is a fairly common setup these days: dual AMD Opterons, 4GB memory, SATA RAID 1, 250GB each. I don't mind letting Postgres use up to 1GB of the memory for itself, but the rest is needed for others.

From http://www.postgresql.org/docs/8.2/static/runtime-config-resource.html it seems that "max_fsm_relations" is about how many tables and indexes can be tracked in the free space map. Does this mean the number of actual tables and indexes in the Postgres databases, or instances of these tables? For example, if I run only 5 databases, each of which has about 10 tables and 20 indexes, then I have only 150 (5 * 30) actual "relations" in PostgreSQL lingo. So my max_fsm_relations setting can be 150? (Which seems a little low compared to what I see in several posts online.)

Assuming 150 is OK: that manual page, and other tuning write-ups such as http://www.revsys.com/writings/postgresql-performance.html, suggest that "max_fsm_pages" is even more critical. The manual says this should be at least 16 times max_fsm_relations, so in my example it should be at least 150 * 16, which is 2400. This seems abysmally low! If I up this figure to, say, 24000 instead, I still keep seeing the kind of warning posted above.

My question 3 -- for a SELECT-heavy database, is there any tried-and-tested caching tool that could be of use? I'd like to skip connection pooling if possible, but would be very interested in good caching products or contribs. I noticed pgmemcache ( http://pgfoundry.org/projects/pgmemcache/ ) but it's in beta 1.2. Would love some thoughts from people who have used it... is it worth the effort?

More of my info below. TIA for your thoughts and advice!

-/Phoenix


========== EXHIBIT 1: PS OUTPUT ==========

> ps auxnm | grep postgres
26 20665 0.0 0.0 11760 612 ?     - Aug18 0:00 postgres: logger process
26 20670 0.0 1.1 188684 48312 ?  - Aug18 0:00 postgres: writer process
26 20671 0.0 0.0 12032 804 ?     - Aug18 0:28 postgres: stats collector process
26 14497 0.0 4.1 452108 172656 ? - 02:05 0:02 postgres: traders_traders traders 127.0.0.1(56204) VACUUM
0  9444  0.0 0.0 5008 656 pts/0  - 02:53 0:00 grep postgres


========== EXHIBIT 2: POSTGRESQL.CONF ==========

listen_addresses = 'localhost,*'

max_connections = 250
shared_buffers = 21000          # Not much more: http://snipr.com/pgperf
effective_cache_size = 32000
max_fsm_relations = 500
max_fsm_pages = 60000
sort_mem = 4096                 # Low when not needed: http://snipr.com/pgperf
work_mem = 4096
temp_buffers = 4096
authentication_timeout = 10s
ssl = off

#VACUUM SETTINGS
autovacuum = on
vacuum_cost_delay = 10
stats_start_collector = on
stats_row_level = on
autovacuum_vacuum_threshold = 300
autovacuum_analyze_threshold = 100

#FOR BACKGROUND TASKS PERFORMANCE
wal_buffers = 64
checkpoint_segments = 128
checkpoint_timeout = 900
fsync = on
maintenance_work_mem = 256MB    # Too high? Well, watch for it...


========== EXHIBIT 3: VACUUM VERBOSE OUTPUT ==========

mydbuser=# vacuum analyze verbose traders;

INFO: vacuuming "public.traders"
INFO: scanned index "traders_pkey" to remove 6 row versions
DETAIL: CPU 0.07s/0.04u sec elapsed 140.61 sec.
INFO: scanned index "idx_traders_userid" to remove 6 row versions
DETAIL: CPU 0.05s/0.04u sec elapsed 49.70 sec.
INFO: scanned index "idx_traders_mdate" to remove 6 row versions
DETAIL: CPU 0.02s/0.04u sec elapsed 32.66 sec.
INFO: scanned index "traders_unique_alias" to remove 6 row versions
DETAIL: CPU 0.10s/0.11u sec elapsed 167.20 sec.
INFO: "traders": removed 6 row versions in 5 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "traders_pkey" now contains 2780925 row versions in 22821 pages
DETAIL: 6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "idx_traders_userid" now contains 2780925 row versions in 11785 pages
DETAIL: 6 index row versions were removed.
127 index pages have been deleted, 127 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "idx_traders_mdate" now contains 2780925 row versions in 7912 pages
DETAIL: 6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "traders_unique_alias" now contains 2780925 row versions in 9342 pages
DETAIL: 6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "traders": found 6 removable, 2780925 nonremovable row versions in 199396 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 2959732 unused item pointers.
137272 pages contain useful free space.
0 pages are entirely empty.
CPU 0.74s/0.40u sec elapsed 1335.71 sec.
WARNING: relation "public.traders" contains more than "max_fsm_pages" pages with useful free space
HINT: Consider compacting this relation or increasing the configuration parameter "max_fsm_pages".
INFO: vacuuming "pg_toast.pg_toast_41513"
INFO: index "pg_toast_41513_index" now contains 26 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_41513": found 0 removable, 26 nonremovable row versions in 5 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
4 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: analyzing "public.traders"
INFO: "traders": scanned 3000 of 199396 pages, containing 40775 live rows and 0 dead rows; 3000 rows in sample, 2710124 estimated total rows
VACUUM
Time: 1533601.235 ms
On 19/08/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> [... original post snipped ...]

Well, based on some past posts, I looked into my pg_log output and found a number of these lines:

[----------------
LOG: could not fork new process for connection: Resource temporarily unavailable
LOG: could not fork new process for connection: Resource temporarily unavailable
LOG: could not fork new process for connection: Resource temporarily unavailable
LOG: could not fork new process for connection: Resource temporarily unavailable
LOG: could not fork new process for connection: Resource temporarily unavailable
LOG: could not fork new process for connection: Resource temporarily unavailable
----------------]

Which suggests that our guess of running out of connections is the right one. So, we have three options (to begin with) --

1. Increase the number of max_connections. This seems to be a voodoo art and a complex calculation involving database size (which in our case is difficult to predict; it grows very fast), hardware, and such. I cannot put the other apps running on this same machine at risk.

2. Use connection pooling. I've found pgpool2 and pgbouncer from the Skype group. Does anyone have experience using either? The latter looks good, although we're usually skeptical about connection pooling in general (or is that just the mysqli_pconnect() hangover?).

3. Use caching of queries. Memcached comes recommended, but there's a discussion as recent as Jan 2007 on this list about race conditions and such (most of which I don't quite understand) which cautions against its use. We do expect plenty of transactions, and if something that has been updated is not correctly and promptly invalidated in the cache, it has huge business repercussions for us.

I'd love to hear other recommendations. This is for the connections bit. I'm also looking for advice on the max_fsm_pages stuff. That's another voodoo!
On Sun, Aug 19, 2007 at 05:15:34PM +0800, Phoenix Kiula wrote:
> Well, based on some past posts, I looked into my pg_log output and found
> a number of these lines:
>
> LOG: could not fork new process for connection: Resource temporarily unavailable

Usually this message means that the *kernel* refused to let you fork a new process: some resource was unavailable. So this usually means one of two things:

- You have a maximum on the number of processes on the system and you're exceeding it.
- You have a limit on the amount of memory. If you have overcommit disabled, this may be causing the issue.

I couldn't quickly find the details of your setup, but if you're running Linux with overcommit disabled, make sure you have at least twice as much swap space allocated as you have real memory. Also make sure you don't have lots of idle postgres processes lying around.

You'll have to provide more detail about your system before getting any better recommendations.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
Phoenix Kiula wrote:
> Well, based on some past posts, I looked into my pg_log output and found
> a number of these lines:
>
> LOG: could not fork new process for connection: Resource temporarily unavailable
>
> Which suggests that our guess of running out of connections is the right one.

No, you're not running out of "connections". You are, however, running into some kernel limit.

> 1. Increase the number of max_connections. This seems to be a voodoo
> art and a complex calculation involving database size [...], hardware, and such.

No. You are not yet reaching max_connections - that would give you an error message which actually says so. This message indicates that the account PostgreSQL runs under has a ulimit on some resource - in this case most likely the number of processes - and that limit is not "compatible" with your setting for max_connections. You need to find this ulimit and at least raise it, or remove it altogether.

> 2. Use connection pooling. I've found pgpool2 and pgbouncer from the
> Skype group. Does anyone have experience using either? The latter
> looks good, although we're usually skeptical about connection pooling
> in general (or is that just the mysqli_pconnect() hangover?).

Connection pooling "in general" is something that pretty much *every* larger app will use. It may be implemented in the app itself (something which has often been troublesome in PHP solutions, but is certainly the norm for Java or .NET apps) or in middleware like pgpool or pgbouncer. There should be no need to be sceptical about it in general ;-) I can't speak for either of those tools specifically, as I haven't used them in production.

> 3. Use caching of queries. Memcached comes recommended, but there's a
> discussion as recent as Jan 2007 on this list about race conditions
> and such which cautions against its use.

There are ways to do this, but if you can't just use timeouts to expire entries from the cache, things can become pretty complicated pretty fast. But perhaps you can isolate some kinds of queries that can be cached for <n> minutes, and keep the rest without caching?

//Magnus
Phoenix Kiula wrote:
> [... original post snipped ...]
>
> I can merrily increase the "max_fsm_pages" directive, but the manual
> also adds the caveat that "this can use more System V memory than
> available on your system". My full verbose vacuum info below includes
> the line:

Do you actually run VACUUM FULL, or do you just mean you run VACUUM over the full database? If you run VACUUM FULL, you need to stop doing that :-) However, you will need to run it at least once over the whole database after you've fixed your max_fsm_pages setting.

> [-------------
> INFO: "traders": scanned 3000 of 199396 pages, containing 40775 live
> rows and 0 dead rows; 3000 rows in sample, 2710124 estimated total rows
> -------------]

There should be a block like this at the end of a "VACUUM VERBOSE" command:

INFO: free space map contains 33 pages in 74 relations
DETAIL: A total of 1184 page slots are in use (including overhead).
1184 page slots are required to track all free space.
Current limits are: 153600 page slots, 1000 relations, using 965 kB.
VACUUM

(Note that my numbers are for a more or less empty database; yours will be much higher.)

If your database size is reasonably stable, pick a value a bit above the numbers it suggests. If you expect it to grow a lot, add some more headroom, but keep monitoring this value.

> Does this mean my table needs nearly 200,000 pages, and that should be
> the setting of max_fsm_pages?

No. It means that the table "traders" is using 199396 pages - most of them aren't free, so they are not tracked in the FSM.

//Magnus
Phoenix Kiula wrote:
> On 19/08/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
[... ]
> Well, based on some past posts, I looked into my pg_log output and found
> a number of these lines:
>
> LOG: could not fork new process for connection: Resource temporarily unavailable
>
> Which suggests that our guess of running out of connections is the right one.
>
> So, we have three options (to begin with) --
>
> 1. Increase the number of max_connections. This seems to be a voodoo
> art and a complex calculation involving database size (which in our case is
> difficult to predict; it grows very fast), hardware, and such.

This error is a sign that the OS(!) is running out of resources (or at least won't allow pg to fork another process) - either you hit a ulimit for the user postgresql runs under, or you need to flip some kernel setting to increase the number of processes. Increasing max_connections will NOT help, because you are not even hitting the current one yet ...

> 2. Use connection pooling. I've found pgpool2 and pgbouncer from the
> Skype group. Does anyone have experience using either? The latter
> looks good, although we're usually skeptical about connection pooling
> in general (or is that just the mysqli_pconnect() hangover?).

pgbouncer works quite fine here.

Stefan
On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
> Phoenix Kiula wrote:

.....snipped....

> > I can merrily increase the "max_fsm_pages" directive, but the manual
> > also adds the caveat that "this can use more System V memory than
> > available on your system". My full verbose vacuum info below includes
> > the line:
>
> Do you actually run VACUUM FULL, or do you just mean you run VACUUM over
> the full database? If you run VACUUM FULL, you need to stop doing that
> :-) However, you will need to run it at least once over the whole
> database after you've fixed your max_fsm_pages setting.

No, we only do a "vacuum analyze" when we do something manually. Otherwise, it's all autovacuum. We've never done a "vacuum full" -- should we do one now, given that we've overrun our max_fsm_pages?

> > [-------------
> > INFO: "traders": scanned 3000 of 199396 pages, containing 40775 live
> > rows and 0 dead rows; 3000 rows in sample, 2710124 estimated total rows
> > -------------]
>
> There should be a block like this at the end of a "VACUUM VERBOSE" command:
> INFO: free space map contains 33 pages in 74 relations
> DETAIL: A total of 1184 page slots are in use (including overhead).
> 1184 page slots are required to track all free space.
> Current limits are: 153600 page slots, 1000 relations, using 965 kB.
> VACUUM

Nope, there's no line with such useful info. The entire verbose output was included in my note, and I did not see the words "Current limits are". Do I need to enable something in the conf file to get more verbose output -- "debug2", "debug3", that kind of thing?

TIA
On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
> Phoenix Kiula wrote:
>
> There are ways to do this, but if you can't just use timeouts to expire
> entries from the cache, things can become pretty complicated pretty fast.
> But perhaps you can isolate some kinds of queries that can be cached for
> <n> minutes, and keep the rest without caching?

Thanks. In fact we need caching on a very specific part of our application, for only three queries which hit the DB hard with thousands of simultaneous SELECTs.

Do pgmemcache or pgbouncer allow for very specific usage? Both look way too complex. I don't mind the initial headachy setup and config, but after that I would like the system to hum on its own, and the querying should be simple and intuitive.

I need a simple mechanism to query the cache, and to invalidate a specific query in the cache when the underlying table is UPDATEd, so that the query gets cached afresh when issued later. (And a way to use this mechanism through PHP or Perl would be splendid.)

TIA for any tips!
Phoenix Kiula wrote:
> On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
>> Phoenix Kiula wrote:
>
> .....snipped....
>
>> Do you actually run VACUUM FULL, or do you just mean you run VACUUM over
>> the full database? If you run VACUUM FULL, you need to stop doing that
>> :-) However, you will need to run it at least once over the whole
>> database after you've fixed your max_fsm_pages setting.
>
> No, we only do a "vacuum analyze" when we do something manually.
> Otherwise, it's all autovacuum. We've never done a "vacuum full" --

OK. That's good.

> should we do one now, given that we've overrun our max_fsm_pages?

Yes, but not until you've fixed it. And only once.

>> There should be a block like this at the end of a "VACUUM VERBOSE" command:
>> INFO: free space map contains 33 pages in 74 relations
>> DETAIL: A total of 1184 page slots are in use (including overhead).
>> 1184 page slots are required to track all free space.
>> Current limits are: 153600 page slots, 1000 relations, using 965 kB.
>> VACUUM
>
> Nope, there's no line with such useful info. The entire verbose output
> was included in my note, and I did not see the words "Current limits
> are". Do I need to enable something in the conf file to get more
> verbose output -- "debug2", "debug3", that kind of thing?

Strange. It comes out at level INFO, and you do see other stuff at INFO level. Any chance this just got mixed up with an autovacuum run and that it ended up somewhere in the middle of your output (i.e. that this "traders" info is from autovacuum)?

//Magnus
Phoenix Kiula wrote:
> On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
>
> Thanks. In fact we need caching on a very specific part of our
> application, for only three queries which hit the DB hard with
> thousands of simultaneous SELECTs.
>
> Do pgmemcache or pgbouncer allow for very specific usage? Both look
> way too complex. I don't mind the initial headachy setup and config,
> but after that I would like the system to hum on its own, and the
> querying should be simple and intuitive.
>
> I need a simple mechanism to query the cache, and to invalidate a
> specific query in the cache when the underlying table is UPDATEd, so
> that the query gets cached afresh when issued later. (And a way to use
> this mechanism through PHP or Perl would be splendid.)

You can use LISTEN and NOTIFY to clear the cache if you have many clients that can cause cache invalidations. If you only have a single app that can update the database, you can invalidate the cache from that application's code directly (such as using ASP.NET output caching, if you were doing it in .NET).

I've implemented the former a couple of times, but it does get a bit complex. The latter would be easier, but I don't have any direct pointers on it, since it depends on the app development framework you're using.

//Magnus
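To make the LISTEN/NOTIFY idea concrete, here is a minimal sketch. The function, trigger, and channel names are made up for illustration, and on 8.2 NOTIFY carries no payload, so the listening client simply drops or refreshes the affected cache entries when it sees the notification:

-- Assumes the plpgsql language is installed in the database.
CREATE OR REPLACE FUNCTION traders_changed_notify() RETURNS trigger AS $$
BEGIN
    NOTIFY traders_changed;   -- no payload on 8.2; listeners decide what to invalidate
    RETURN NULL;              -- return value is ignored for AFTER ... FOR EACH STATEMENT
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER traders_cache_invalidate
    AFTER INSERT OR UPDATE OR DELETE ON traders
    FOR EACH STATEMENT
    EXECUTE PROCEDURE traders_changed_notify();

-- In the cache-maintaining client connection (the PHP and Perl drivers expose this):
LISTEN traders_changed;

The client then polls for notifications and purges the matching memcached keys, which keeps the invalidation logic out of every code path that writes to the table.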
On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
> Phoenix Kiula wrote:
> > On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
>
> > should we do one (VACUUM FULL) now, given that we've overrun our max_fsm_pages?
>
> Yes, but not until you've fixed it. And only once.

Fixed what - the max_fsm_pages? That was my question: how do I know what value to set for it? If "vacuum verbose" won't give me the info you suggested because it is likely overlapping with autovacuum, should I temporarily turn autovacuum off and then run vacuum verbose? Also, while running vacuum full, are there any precautions to take?
Phoenix Kiula wrote:
> On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
>
> Fixed what - the max_fsm_pages? That was my question: how do I know
> what value to set for it? If "vacuum verbose" won't give me the info
> you suggested because it is likely overlapping with autovacuum, should
> I temporarily turn autovacuum off and then run vacuum verbose? Also,
> while running vacuum full, are there any precautions to take?

Yeah, you can do that - or you can just trawl back through the logs to find that information; it's there somewhere, and grep would be helpful to find it.

vacuum full will take out blocking locks on your database, so run it during a maintenance window, or at least during a low-traffic time.

//Magnus
On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
> Phoenix Kiula wrote:

...snip....

> There should be a block like this at the end of a "VACUUM VERBOSE" command:
> INFO: free space map contains 33 pages in 74 relations
> DETAIL: A total of 1184 page slots are in use (including overhead).
> 1184 page slots are required to track all free space.
> Current limits are: 153600 page slots, 1000 relations, using 965 kB.
> VACUUM

I ran the vacuum analyze verbose again, and did not see anything like that. Should I run a plain vacuum instead? In any case, in your example, which number would I take note of to derive max_fsm_pages from? I do notice this in my own output:

There were 2959498 unused item pointers.
133616 pages contain useful free space.
0 pages are entirely empty.

Does this mean I should have over 133,616 in my max_fsm_pages? Should I set it to, say, 150,000?

Secondly, the max_fsm_relations: if I have about 150 "relations" in my database (relations as per PostgreSQL lingo), can this figure be, say, 200? Or does it have to match max_fsm_pages in some way?

Many thanks
By the way, related to one of my earlier questions: where can I see how many connections are being made to the DB, what the maximum number attempted at any given time was, and so on -- the connections-related info? Thanks!
Phoenix Kiula wrote:
> On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:

...snip....

> I ran the vacuum analyze verbose again, and did not see anything like
> that. Should I run a plain vacuum instead?

It shows up in both variants for me. Oh, hang on - I think it only shows up if you're logged in as a superuser; being the table owner is not enough. Check that.

> In any case, in your example, which number would I take note of to
> derive max_fsm_pages from?

The 1184 number (the one for "page slots are required to track").

> I do notice this in my own output:
>
> There were 2959498 unused item pointers.
> 133616 pages contain useful free space.
> 0 pages are entirely empty.
>
> Does this mean I should have over 133,616 in my max_fsm_pages? Should
> I set it to, say, 150,000?

Probably not enough - that's for a single table, no?

> Secondly, the max_fsm_relations: if I have about 150 "relations" in
> my database (relations as per PostgreSQL lingo), can this figure be,
> say, 200? Or does it have to match max_fsm_pages in some way?

No need to match. If you have 150 relations, 200 is a reasonable value. But once you get the proper output from the vacuum command, it tells you that number as well (74 in my example above).

As for your other question, how to view connections: use "SELECT * FROM pg_stat_activity". See http://www.postgresql.org/docs/8.2/static/monitoring-stats.html.

//Magnus
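Both numbers can be sanity-checked from psql. A sketch (the free space map is shared across the whole cluster, so run the first query in each database and add up the results):

-- Relations the free space map may need to track in this database:
-- ordinary tables ('r'), indexes ('i') and TOAST tables ('t').
SELECT count(*) AS relations FROM pg_class WHERE relkind IN ('r', 'i', 't');

-- Current connections, broken down by database and user:
SELECT datname, usename, count(*) AS connections
FROM pg_stat_activity
GROUP BY datname, usename
ORDER BY connections DESC;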
On 19/08/07, Magnus Hagander <magnus@hagander.net> wrote:
> Phoenix Kiula wrote:
>
> No need to match. If you have 150 relations, 200 is a reasonable value.
> But once you get the proper output from the vacuum command, it tells you
> that number as well (74 in my example above).

Found it! You get those lines if you do a generic "vacuum verbose", not a specific "vacuum verbose MYTABLE". In hindsight that makes sense: the conf variable is for the entire database, so a database-wide vacuum is the one that reports it. Here is my output:

[----------
INFO: free space map contains 76059 pages in 32 relations
DETAIL: A total of 136688 page slots are in use (including overhead).
136688 page slots are required to track all free space.
Current limits are: 150000 page slots, 200 relations, using 893 kB.
VACUUM
Time: 202065.807 ms
----------]

Now, will this value of 136688 keep incrementing, or will autovacuum keep it in check? I have increased my max_fsm_pages to 150,000, as you can see.

Thanks for the "Monitoring Stats" link. Looks like PostgreSQL is a bit more involved here. I was looking for information that would allow me to set my max_connections well. The command you gave only shows currently active users, not, for instance, the historic peak of connections. I'll keep digging through the manual, but would love any nudges in the right direction. Thanks!
> The command you gave only shows currently
> active users, not, for instance, the historic peak of connections. I'll
> keep digging through the manual, but would love any nudges in the right
> direction. Thanks!

Can you set up a snapshot in a cronjob? It would still only be a sample of a sample, but perhaps that's enough?
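A sketch of that cronjob idea, using a hypothetical history table; the INSERT is what cron would run (for example every minute, via psql -c):

CREATE TABLE connection_samples (
    sampled_at  timestamptz NOT NULL DEFAULT now(),
    connections integer     NOT NULL
);

-- Run from cron:
INSERT INTO connection_samples (connections)
SELECT count(*) FROM pg_stat_activity;

-- Later, find the observed peak and when it happened:
SELECT sampled_at, connections
FROM connection_samples
ORDER BY connections DESC, sampled_at
LIMIT 1;

It will miss spikes between samples, but over a few days it gives a usable picture of how close the real load gets to max_connections.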