Thread: Problems with autovacuum
Hi, Recent change postgresql server from Amazon EC2 small into large one. That gives me x86_64 arch, two core cpu and 7.5GB ram. Atm got almost ~2000 small databases at that server and autovacuum working hole time (witch isn't good for performance as I notice at cacti, one core is busy in 60% hole time). How can I tweak postgresql.conf to get better performance ? Maybe number of database is huge but most of them are unused most of time and others (~400-500) do mainly selects only with small number of inserts or deletes from time to time. My configuration is Fedora Core 10: postgresql-libs-8.3.7-1.fc10.x86_64 postgresql-8.3.7-1.fc10.x86_64 postgresql-server-8.3.7-1.fc10.x86_64 postgresql-devel-8.3.7-1.fc10.x86_64 postgresql.conf: #v+ max_connections = 500 shared_buffers = 200MB work_mem = 4096 maintenance_work_mem = 256MB max_fsm_pages = 204800 max_fsm_relations = 4000 vacuum_cost_delay = 0 vacuum_cost_page_hit = 1 vacuum_cost_page_miss = 10 vacuum_cost_page_dirty = 20 vacuum_cost_limit = 200 effective_cache_size = 2048MB logging_collector = on log_truncate_on_rotation = on log_rotation_age = 1d log_rotation_size = 0 track_activities = off track_counts = on log_parser_stats = off log_planner_stats = off log_executor_stats = off log_statement_stats = off autovacuum = on log_autovacuum_min_duration = -1 autovacuum_max_workers = 3 autovacuum_naptime = 10min autovacuum_vacuum_threshold = 10000 autovacuum_analyze_threshold = 10000 autovacuum_vacuum_scale_factor = 0.5 autovacuum_analyze_scale_factor = 0.4 autovacuum_freeze_max_age = 200000000 autovacuum_vacuum_cost_delay = 20 autovacuum_vacuum_cost_limit = -1 #v- Regards -- Łukasz Jagiełło System Administrator G-Forces Web Management Polska sp. z o.o. (www.gforces.pl) Ul. Kruczkowskiego 12, 80-288 Gdańsk Spółka wpisana do KRS pod nr 246596 decyzją Sądu Rejonowego Gdańsk-Północ
2009/5/24 Łukasz Jagiełło <lukasz.jagiello@gforces.pl>: > Hi, > > Recent change postgresql server from Amazon EC2 small into large one. > That gives me x86_64 arch, two core cpu and 7.5GB ram. Atm got almost > ~2000 small databases at that server and autovacuum working hole time > postgresql.conf: > max_fsm_pages = 204800 > max_fsm_relations = 4000 So, in 2000 databases, there's only an average of 2 relations per db and 102 dead rows? Cause that's all you got room for with those settings. Whats the last 20 or so lines of vacuum verbose as run by a superuser say?
2009/5/25 Scott Marlowe <scott.marlowe@gmail.com>: > > So, in 2000 databases, there's only an average of 2 relations per db > and 102 dead rows? Cause that's all you got room for with those > settings. > > Whats the last 20 or so lines of vacuum verbose as run by a superuser say? according to http://www.postgresql.org/docs/8.1/interactive/runtime-config-resource.html max_fsm_relations applies only to tables and indices, and it says "in database", so I presume that means per database. In which case, those settings are ok. It would be nice, to see if vacuum actually complains about it. -- GJ
2009/5/25 Grzegorz Jaśkiewicz <gryzman@gmail.com>: > 2009/5/25 Scott Marlowe <scott.marlowe@gmail.com>: > >> >> So, in 2000 databases, there's only an average of 2 relations per db >> and 102 dead rows? Cause that's all you got room for with those >> settings. >> >> Whats the last 20 or so lines of vacuum verbose as run by a superuser say? > > according to http://www.postgresql.org/docs/8.1/interactive/runtime-config-resource.html > max_fsm_relations applies only to tables and indices, and it says "in > database", so I presume that means per database. In which case, those > settings are ok. > It would be nice, to see if vacuum actually complains about it. The docs say: "These parameters control the size of the shared free space map," Key word being shared.
W dniu 25 maja 2009 17:32 użytkownik Scott Marlowe <scott.marlowe@gmail.com> napisał: >> Recent change postgresql server from Amazon EC2 small into large one. >> That gives me x86_64 arch, two core cpu and 7.5GB ram. Atm got almost >> ~2000 small databases at that server and autovacuum working hole time > >> postgresql.conf: >> max_fsm_pages = 204800 >> max_fsm_relations = 4000 > > So, in 2000 databases, there's only an average of 2 relations per db > and 102 dead rows? Cause that's all you got room for with those > settings. > > Whats the last 20 or so lines of vacuum verbose as run by a superuser say? Guess you was right #v+ Total free space (including removable row versions) is 2408 bytes. 0 pages are or will become empty, including 0 at the end of the table. 1 pages containing 2092 free bytes are potential move destinations. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "pg_depend_depender_index" now contains 5267 row versions in 30 pages SZCZEGÓŁY: 4 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "pg_depend_reference_index" now contains 5267 row versions in 32 pages SZCZEGÓŁY: 4 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "pg_depend": moved 0 row versions, truncated 39 to 39 pages SZCZEGÓŁY: CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: analyzing "pg_catalog.pg_depend" INFO: "pg_depend": scanned 39 of 39 pages, containing 5267 live rows and 0 dead rows; 3000 rows in sample, 5267 estimated total rows INFO: free space map contains 3876 pages in 4000 relations SZCZEGÓŁY: A total of 67824 page slots are in use (including overhead). 67824 page slots are required to track all free space. Current limits are: 204800 page slots, 4000 relations, using 1612 kB. NOTICE: max_fsm_relations(4000) equals the number of relations checked PODPOWIEDŹ: You have at least 4000 relations. Consider increasing the configuration parameter "max_fsm_relations". VACUUM #v- Change: max_fsm_pages = 6400000 max_fsm_relations = 400000 -- Łukasz Jagiełło System Administrator G-Forces Web Management Polska sp. z o.o. (www.gforces.pl) Ul. Kruczkowskiego 12, 80-288 Gdańsk Spółka wpisana do KRS pod nr 246596 decyzją Sądu Rejonowego Gdańsk-Północ
W dniu 25 maja 2009 17:50 użytkownik Grzegorz Jaśkiewicz <gryzman@gmail.com> napisał: >> So, in 2000 databases, there's only an average of 2 relations per db >> and 102 dead rows? Cause that's all you got room for with those >> settings. >> >> Whats the last 20 or so lines of vacuum verbose as run by a superuser say? > > according to http://www.postgresql.org/docs/8.1/interactive/runtime-config-resource.html > max_fsm_relations applies only to tables and indices, and it says "in > database", so I presume that means per database. In which case, those > settings are ok. > It would be nice, to see if vacuum actually complains about it. Vacuum did complain about that but still got: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 609 postgres 15 0 537m 105m 676 S 21.5 1.4 1:15.00 postgres: autovacuum launcher process 610 postgres 15 0 241m 88m 500 S 20.5 1.1 1:50.87 postgres: stats collector process That autovacuum working hole time, shoudn't be run only when db needs ? -- Łukasz Jagiełło System Administrator G-Forces Web Management Polska sp. z o.o. (www.gforces.pl) Ul. Kruczkowskiego 12, 80-288 Gdańsk Spółka wpisana do KRS pod nr 246596 decyzją Sądu Rejonowego Gdańsk-Północ
=?UTF-8?B?xYF1a2FzeiBKYWdpZcWCxYJv?= <lukasz.jagiello@gforces.pl> writes: > That autovacuum working hole time, shoudn't be run only when db needs ? With 2000 databases to cycle through, autovac is going to be spending quite a lot of time just finding out whether it needs to do anything. I believe the interpretation of autovacuum_naptime is that it should examine each database that often, ie once a minute by default. So it's got more than 30 databases per second to look through. Maybe it would make more sense to have one database (or at least, many fewer databases) and 2000 schemas within it? If you really want to stick with this layout, you're going to have to increase autovacuum_naptime. regards, tom lane
2009/5/25 Tom Lane <tgl@sss.pgh.pa.us>: > With 2000 databases to cycle through, autovac is going to be spending > quite a lot of time just finding out whether it needs to do anything. > I believe the interpretation of autovacuum_naptime is that it should > examine each database that often, ie once a minute by default. So > it's got more than 30 databases per second to look through. > > Maybe it would make more sense to have one database (or at least, > many fewer databases) and 2000 schemas within it? It's rather impossible in my case. > If you really want to stick with this layout, you're going to have to > increase autovacuum_naptime. Thanks for hint. -- Łukasz Jagiełło System Administrator G-Forces Web Management Polska sp. z o.o. (www.gforces.pl) Ul. Kruczkowskiego 12, 80-288 Gdańsk Spółka wpisana do KRS pod nr 246596 decyzją Sądu Rejonowego Gdańsk-Północ
2009/5/25 Łukasz Jagiełło <lukasz.jagiello@gforces.pl>: > W dniu 25 maja 2009 17:32 użytkownik Scott Marlowe > <scott.marlowe@gmail.com> napisał: >>> Recent change postgresql server from Amazon EC2 small into large one. >>> That gives me x86_64 arch, two core cpu and 7.5GB ram. Atm got almost >>> ~2000 small databases at that server and autovacuum working hole time >> >>> postgresql.conf: >>> max_fsm_pages = 204800 >>> max_fsm_relations = 4000 >> >> So, in 2000 databases, there's only an average of 2 relations per db >> and 102 dead rows? Cause that's all you got room for with those >> settings. >> >> Whats the last 20 or so lines of vacuum verbose as run by a superuser say? > > Guess you was right > For future reference, if you don't log postgresql's messages, please turn at least basic logging on, and things like that you would find in $PGDATA/pg_log/ logs. The value suggested by postgresql, is the minimum. I usually put in 1.5 the suggestion, which covers me from worse case hopefully. -- GJ
Tom Lane escribió: > =?UTF-8?B?xYF1a2FzeiBKYWdpZcWCxYJv?= <lukasz.jagiello@gforces.pl> writes: > > That autovacuum working hole time, shoudn't be run only when db needs ? > > With 2000 databases to cycle through, autovac is going to be spending > quite a lot of time just finding out whether it needs to do anything. > I believe the interpretation of autovacuum_naptime is that it should > examine each database that often, ie once a minute by default. So > it's got more than 30 databases per second to look through. Note that this is correct in 8.1 and 8.2 but not 8.3 onwards. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane escribi�: >> I believe the interpretation of autovacuum_naptime is that it should >> examine each database that often, ie once a minute by default. So >> it's got more than 30 databases per second to look through. > Note that this is correct in 8.1 and 8.2 but not 8.3 onwards. Oh? The current documentation still defines the variable thusly: Specifies the minimum delay between autovacuum runs on any given database. In each round the daemon examines the database and issues VACUUM and ANALYZE commands as needed for tables in that database. I suppose the use of "minimum" means that this is not technically incorrect, but it's sure not very helpful if there is some other rule involved that causes it to not behave as I said. (And if there is some other rule, what is that?) Please improve the docs. regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane escribi�: > >> I believe the interpretation of autovacuum_naptime is that it should > >> examine each database that often, ie once a minute by default. So > >> it's got more than 30 databases per second to look through. > > > Note that this is correct in 8.1 and 8.2 but not 8.3 onwards. > > Oh? The current documentation still defines the variable thusly: > > Specifies the minimum delay between autovacuum runs on any given > database. In each round the daemon examines the database and > issues VACUUM and ANALYZE commands as needed for tables in that > database. Sorry, it's the other way around actually -- correct for 8.3 onwards, wrong for 8.1 and 8.2. In the earlier versions, it would do one run in a chosen database, sleep during "naptime", then do another run. > I suppose the use of "minimum" means that this is not technically > incorrect, but it's sure not very helpful if there is some other > rule involved that causes it to not behave as I said. (And if there > is some other rule, what is that?) The word "minimum" is there because it's possible that all workers are busy with some other database(s). > Please improve the docs. I'll see about that. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
W dniu 26 maja 2009 20:28 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał: >>> I believe the interpretation of autovacuum_naptime is that it should >>> examine each database that often, ie once a minute by default. So >>> it's got more than 30 databases per second to look through. > >> Note that this is correct in 8.1 and 8.2 but not 8.3 onwards. > > Oh? The current documentation still defines the variable thusly: > > Specifies the minimum delay between autovacuum runs on any given > database. In each round the daemon examines the database and > issues VACUUM and ANALYZE commands as needed for tables in that > database. > > I suppose the use of "minimum" means that this is not technically > incorrect, but it's sure not very helpful if there is some other > rule involved that causes it to not behave as I said. (And if there > is some other rule, what is that?) Please improve the docs. After change autovacuum_naptime postgresql behave like you wrote before. -- Łukasz Jagiełło System Administrator G-Forces Web Management Polska sp. z o.o. (www.gforces.pl) Ul. Kruczkowskiego 12, 80-288 Gdańsk Spółka wpisana do KRS pod nr 246596 decyzją Sądu Rejonowego Gdańsk-Północ
Alvaro Herrera <alvherre@commandprompt.com> writes: > Sorry, it's the other way around actually -- correct for 8.3 onwards, > wrong for 8.1 and 8.2. In the earlier versions, it would do one run in > a chosen database, sleep during "naptime", then do another run. > Tom Lane escribió: >> I suppose the use of "minimum" means that this is not technically >> incorrect, but it's sure not very helpful if there is some other >> rule involved that causes it to not behave as I said. (And if there >> is some other rule, what is that?) > The word "minimum" is there because it's possible that all workers are > busy with some other database(s). >> Please improve the docs. > I'll see about that. Hmm, maybe we need to improve the code too. This example suggests that there needs to be some limit on the worker launch rate, even if there are so many databases that that means we don't meet naptime exactly. regards, tom lane
Tom Lane escribió: > Hmm, maybe we need to improve the code too. This example suggests that > there needs to be some limit on the worker launch rate, even if there > are so many databases that that means we don't meet naptime exactly. We already have a 100ms lower bound on the sleep time (see launcher_determine_sleep()). Maybe that needs to be increased? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane escribi�: >> Hmm, maybe we need to improve the code too. This example suggests that >> there needs to be some limit on the worker launch rate, even if there >> are so many databases that that means we don't meet naptime exactly. > We already have a 100ms lower bound on the sleep time (see > launcher_determine_sleep()). Maybe that needs to be increased? Maybe. I hesitate to suggest a GUC variable ;-) One thought is that I don't trust the code implementing the minimum too much: /* 100ms is the smallest time we'll allow the launcher to sleep */ if (nap->tv_sec <= 0 && nap->tv_usec <= 100000) { nap->tv_sec = 0; nap->tv_usec = 100000; /* 100 ms */ } What would happen if tv_sec is negative and tv_usec is say 500000? Maybe negative tv_sec is impossible here, but ... regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Tom Lane escribi�: > >> Hmm, maybe we need to improve the code too. This example suggests that > >> there needs to be some limit on the worker launch rate, even if there > >> are so many databases that that means we don't meet naptime exactly. > > > We already have a 100ms lower bound on the sleep time (see > > launcher_determine_sleep()). Maybe that needs to be increased? > > Maybe. I hesitate to suggest a GUC variable ;-) Heh :-) > One thought is that I don't trust the code implementing the minimum > too much: > > /* 100ms is the smallest time we'll allow the launcher to sleep */ > if (nap->tv_sec <= 0 && nap->tv_usec <= 100000) > { > nap->tv_sec = 0; > nap->tv_usec = 100000; /* 100 ms */ > } > > What would happen if tv_sec is negative and tv_usec is say 500000? > Maybe negative tv_sec is impossible here, but ... I don't think it's possible to get negative tv_sec here currently, but perhaps you're right that we could make this code more future-proof. However I think there's a bigger problem here, which is that if the user has set naptime too low, i.e. to a value lower than number-of-databases * 100ms, we'll be running the (expensive) rebuild_database_list function on each iteration ... maybe we oughta put a lower bound on naptime based on the number of databases to avoid this problem. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > However I think there's a bigger problem here, which is that if the user > has set naptime too low, i.e. to a value lower than > number-of-databases * 100ms, we'll be running the (expensive) > rebuild_database_list function on each iteration ... maybe we oughta put > a lower bound on naptime based on the number of databases to avoid this > problem. Bingo, that's surely exactly what was happening to the OP. He had 2000 databases and naptime at (I assume) the default; so he was rerunning rebuild_database_list every 100ms. So that recovery code path needs some more thought. Maybe a lower bound on how often to do rebuild_database_list? And/or don't set adl_next_worker to less than 100ms in the future to begin with? regards, tom lane
2009/5/26 Tom Lane <tgl@sss.pgh.pa.us>: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> However I think there's a bigger problem here, which is that if the user >> has set naptime too low, i.e. to a value lower than >> number-of-databases * 100ms, we'll be running the (expensive) >> rebuild_database_list function on each iteration ... maybe we oughta put >> a lower bound on naptime based on the number of databases to avoid this >> problem. > > Bingo, that's surely exactly what was happening to the OP. He had 2000 > databases and naptime at (I assume) the default; so he was rerunning > rebuild_database_list every 100ms. > > So that recovery code path needs some more thought. Maybe a lower bound > on how often to do rebuild_database_list? And/or don't set adl_next_worker > to less than 100ms in the future to begin with? I'd be happy with logging telling me when things are getting pathological.
Tom Lane escribió: > Bingo, that's surely exactly what was happening to the OP. He had 2000 > databases and naptime at (I assume) the default; so he was rerunning > rebuild_database_list every 100ms. > > So that recovery code path needs some more thought. Maybe a lower bound > on how often to do rebuild_database_list? And/or don't set adl_next_worker > to less than 100ms in the future to begin with? I've been giving this some thought and tried several approaches. In the end the one that I like the most is raising autovacuum_naptime to a reasonable value for the exiting number of databases. The only problem I have with it is that it's trivial to change it in the autovacuum launcher process and have it stick, but there's no way to propagate the value out to backends or postmaster to that they SHOW the actual value in use by the launcher. The best I can do is emit a WARNING with the new value. I have experimented with other choices such as not rebuilding the database list if the time elapsed since last rebuild is not very long, but there were small problems with that so I'd prefer to avoid it. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Alvaro Herrera <alvherre@commandprompt.com> writes: > I've been giving this some thought and tried several approaches. In the > end the one that I like the most is raising autovacuum_naptime to a > reasonable value for the exiting number of databases. The only problem > I have with it is that it's trivial to change it in the autovacuum > launcher process and have it stick, but there's no way to propagate the > value out to backends or postmaster to that they SHOW the actual value > in use by the launcher. The best I can do is emit a WARNING with the > new value. Well, that code isn't even correct I think; you're not supposed to modify a GUC variable directly. I think you should just silently use a naptime of at least X without changing the nominal GUC variable. And definitely without the WARNING --- that's nothing but log spam. regards, tom lane
Tom Lane escribió: > Well, that code isn't even correct I think; you're not supposed to > modify a GUC variable directly. I think you should just silently > use a naptime of at least X without changing the nominal GUC variable. > And definitely without the WARNING --- that's nothing but log spam. Glitches fixed in this version; will apply shortly to 8.3 and HEAD. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Attachment
Alvaro Herrera escribió: > Tom Lane escribió: > > > Well, that code isn't even correct I think; you're not supposed to > > modify a GUC variable directly. I think you should just silently > > use a naptime of at least X without changing the nominal GUC variable. > > And definitely without the WARNING --- that's nothing but log spam. > > Glitches fixed in this version; will apply shortly to 8.3 and HEAD. Committed. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Glitches fixed in this version; will apply shortly to 8.3 and HEAD. Looks sane; one trivial grammar correction: > + /* the minimum allowed time between two awakening of the launcher */ Should read "two awakenings". regards, tom lane