Thread: is pg_autovacuum really effective?
Hi all, I have been running for a week now without any VACUUM FULL, using only pg_autovacuum. I expected disk usage to reach a steady state, but it has not. PG engine: 7.4.5.

Example: the messages table is touched by pg_autovacuum at least twice a day:

$ cat pg_autovacuum.log | grep VACUUM | grep messages
[2005-02-15 16:41:00 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-16 03:31:47 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-16 12:44:18 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-16 23:26:09 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-17 09:25:41 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-17 19:57:11 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-18 05:38:46 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-18 14:28:55 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-19 02:22:20 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-19 13:43:02 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-20 02:05:40 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-20 14:06:33 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-20 23:54:32 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-21 08:57:20 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-21 19:24:53 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-22 05:25:03 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-22 15:20:39 CET] Performing: VACUUM ANALYZE "public"."messages"

This is what a VACUUM FULL gave me on that table:

# vacuum full verbose messages;
INFO: vacuuming "public.messages"
INFO: "messages": found 77447 removable, 1606437 nonremovable row versions in 69504 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 97 to 2033 bytes long.
There were 633541 unused item pointers.
Total free space (including removable row versions) is 52819600 bytes.
1690 pages are or will become empty, including 0 at the end of the table.
22217 pages containing 51144248 free bytes are potential move destinations.
CPU 2.39s/0.55u sec elapsed 31.90 sec.
INFO: index "idx_type_message" now contains 1606437 row versions in 7337 pages
DETAIL: 77447 index row versions were removed.
446 index pages have been deleted, 446 are currently reusable.
CPU 0.33s/0.75u sec elapsed 16.56 sec.
INFO: index "messages_pkey" now contains 1606437 row versions in 5628 pages
DETAIL: 77447 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.15s/0.80u sec elapsed 4.22 sec.
INFO: index "idx_service_message" now contains 1606437 row versions in 6867 pages
DETAIL: 77447 index row versions were removed.
499 index pages have been deleted, 499 are currently reusable.
CPU 0.67s/0.99u sec elapsed 8.85 sec.
INFO: index "idx_service_message_expired" now contains 135313 row versions in 3308 pages
DETAIL: 77375 index row versions were removed.
512 index pages have been deleted, 512 are currently reusable.
CPU 0.21s/0.32u sec elapsed 6.88 sec.
INFO: index "idx_expired_messages" now contains 1606437 row versions in 7070 pages
DETAIL: 77447 index row versions were removed.
448 index pages have been deleted, 448 are currently reusable.
CPU 0.34s/1.10u sec elapsed 29.77 sec.
INFO: index "idx_messages_target" now contains 1606437 row versions in 14480 pages
DETAIL: 77447 index row versions were removed.
643 index pages have been deleted, 643 are currently reusable.
CPU 0.84s/1.61u sec elapsed 25.72 sec.
INFO: index "idx_messages_source" now contains 1606437 row versions in 10635 pages
DETAIL: 77447 index row versions were removed.
190 index pages have been deleted, 190 are currently reusable.
CPU 0.68s/1.04u sec elapsed 31.96 sec.
INFO: "messages": moved 55221 row versions, truncated 69504 to 63307 pages
DETAIL: CPU 5.46s/25.14u sec elapsed 280.20 sec.
INFO: index "idx_type_message" now contains 1606437 row versions in 7337 pages
DETAIL: 55221 index row versions were removed.
2304 index pages have been deleted, 2304 are currently reusable.
CPU 0.42s/0.49u sec elapsed 53.35 sec.
INFO: index "messages_pkey" now contains 1606437 row versions in 5628 pages
DETAIL: 55221 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.31s/0.34u sec elapsed 13.27 sec.
INFO: index "idx_service_message" now contains 1606437 row versions in 6867 pages
DETAIL: 55221 index row versions were removed.
2024 index pages have been deleted, 2024 are currently reusable.
CPU 0.51s/0.57u sec elapsed 16.60 sec.
INFO: index "idx_service_message_expired" now contains 135313 row versions in 3308 pages
DETAIL: 41411 index row versions were removed.
1918 index pages have been deleted, 1918 are currently reusable.
CPU 0.30s/0.31u sec elapsed 36.01 sec.
INFO: index "idx_expired_messages" now contains 1606437 row versions in 7064 pages
DETAIL: 55221 index row versions were removed.
2166 index pages have been deleted, 2166 are currently reusable.
CPU 0.94s/0.58u sec elapsed 34.97 sec.
INFO: index "idx_messages_target" now contains 1606437 row versions in 14480 pages
DETAIL: 55221 index row versions were removed.
3404 index pages have been deleted, 3404 are currently reusable.
CPU 0.99s/1.03u sec elapsed 50.53 sec.
INFO: index "idx_messages_source" now contains 1606437 row versions in 10635 pages
DETAIL: 55221 index row versions were removed.
1809 index pages have been deleted, 1809 are currently reusable.
CPU 0.84s/1.04u sec elapsed 35.44 sec.
INFO: vacuuming "pg_toast.pg_toast_18376"
INFO: "pg_toast_18376": found 0 removable, 1 nonremovable row versions in 1 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 1976 to 1976 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 6192 bytes.
0 pages are or will become empty, including 0 at the end of the table.
1 pages containing 6192 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: index "pg_toast_18376_index" now contains 1 row versions in 2 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.03 sec.
INFO: "pg_toast_18376": moved 0 row versions, truncated 1 to 1 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

pg_class after the VACUUM FULL for that table:

 relfilenode | relname  | relpages |  reltuples
-------------+----------+----------+-------------
       18376 | messages |    63307 | 1.60644e+06

pg_class before the VACUUM FULL for that table:

 relfilenode | relname  | relpages |  reltuples
-------------+----------+----------+-------------
       18376 | messages |    69472 | 1.60644e+06

How was it possible to accumulate 6000 wasted pages on that table? Between these two calls:

[2005-02-22 05:25:03 CET] Performing: VACUUM ANALYZE "public"."messages"
[2005-02-22 15:20:39 CET] Performing: VACUUM ANALYZE "public"."messages"

1768 rows were inserted, and those rows received 21578 updates (each row has a counter that is incremented on every update), so the table is not that heavily updated.

I'm running autovacuum with these parameters:

pg_autovacuum -d 3 -v 300 -V 0.1 -S 0.8 -a 200 -A 0.1 -D

Shall I run it in a more aggressive way? Maybe I'm missing something.

Regards
Gaetano Mendola
Gaetano Mendola wrote:

> pg_class after the vacuum full for that table:
>
>  relfilenode | relname  | relpages |  reltuples
> -------------+----------+----------+-------------
>        18376 | messages |    63307 | 1.60644e+06
>
> pg_class before the vacuum full for that table:
>
>  relfilenode | relname  | relpages |  reltuples
> -------------+----------+----------+-------------
>        18376 | messages |    69472 | 1.60644e+06
>
> How was it possible to accumulate 6000 wasted pages on that table?
>
> Between these two calls:
> [2005-02-22 05:25:03 CET] Performing: VACUUM ANALYZE "public"."messages"
> [2005-02-22 15:20:39 CET] Performing: VACUUM ANALYZE "public"."messages"
>
> 1768 rows were inserted, and those rows received 21578 updates (each
> row has a counter that is incremented on every update), so the table
> is not that heavily updated.
>
> I'm running autovacuum with these parameters:
> pg_autovacuum -d 3 -v 300 -V 0.1 -S 0.8 -a 200 -A 0.1 -D
>
> Shall I run it in a more aggressive way? Maybe I'm missing something.

Well, without thinking too much, I would first ask about your FSM settings. If they aren't big enough, that will cause bloat. Try bumping your FSM settings and then see if you reach a steady state.
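To follow up on the FSM question: in this era of PostgreSQL (7.4) there is no system view exposing free-space-map usage, but a database-wide VACUUM VERBOSE ends with an INFO summary that can be compared against the settings. A minimal check, sketched under the assumption of superuser access:

```sql
-- Current FSM settings:
SHOW max_fsm_pages;
SHOW max_fsm_relations;

-- A database-wide VACUUM VERBOSE finishes with an INFO line similar to
--   INFO:  free space map: N relations, M pages stored; P total pages needed
-- If "pages needed" exceeds max_fsm_pages, free space goes untracked and
-- tables bloat between vacuums.
VACUUM VERBOSE;
```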
Gaetano Mendola <mendola@bigfoot.com> writes:

> I'm using only pg_autovacuum. I expected that disk usage would reach
> a steady state, but it has not. PG engine: 7.4.5

One data point doesn't prove that you're not at a steady state.

> # vacuum full verbose messages;
> INFO: vacuuming "public.messages"
> INFO: "messages": found 77447 removable, 1606437 nonremovable row versions in 69504 pages
> ...
> INFO: "messages": moved 55221 row versions, truncated 69504 to 63307 pages

10% overhead sounds fairly reasonable to me. How does that compare to the amount of updating you do on the table --- i.e., do you turn over 10% of the table in a day?

			regards, tom lane
Tom Lane wrote:

> Gaetano Mendola <mendola@bigfoot.com> writes:
>> I'm using only pg_autovacuum. I expected that disk usage would reach
>> a steady state, but it has not. PG engine: 7.4.5
>
> One data point doesn't prove that you're not at a steady state.

I graph my disk usage, and it has been a ramp for a week now; I'll continue to wait and see whether it decreases. I was expecting a steady state at something like 4 GB (after a full vacuum and reindex) + 10% = 4.4 GB. I'm at 4.6 GB and increasing. I'll see how it continues.

>> # vacuum full verbose messages;
>> INFO: vacuuming "public.messages"
>> INFO: "messages": found 77447 removable, 1606437 nonremovable row versions in 69504 pages
>> ...
>> INFO: "messages": moved 55221 row versions, truncated 69504 to 63307 pages
>
> 10% overhead sounds fairly reasonable to me. How does that compare to
> the amount of updating you do on the table --- i.e., do you turn over
> 10% of the table in a day?

Less than that: the table has 1.6 million rows, and I insert about 2000 rows a day, with roughly ~40000 updates a day. So it's something like 2.5%.

Regards
Gaetano Mendola
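As a sanity check of the turnover figure above, the arithmetic can be done right in psql (the numbers are the ones quoted in the thread; recall that every UPDATE in PostgreSQL creates a new row version, so inserts and updates both consume new space until a plain VACUUM reclaims it):

```sql
-- ~2000 inserts + ~40000 updates per day against ~1.6M live rows:
SELECT round(100.0 * (2000 + 40000) / 1606437, 1) AS daily_turnover_pct;
-- yields roughly 2.6, consistent with the ~2.5% estimate above
```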
Matthew T. O'Connor wrote:

> Well, without thinking too much, I would first ask about your FSM
> settings. If they aren't big enough, that will cause bloat. Try
> bumping your FSM settings and then see if you reach a steady state.

The FSM settings are big enough:

max_fsm_pages     | 2000000
max_fsm_relations | 1000

At least after a VACUUM FULL, I see that these numbers are overkill...

Regards
Gaetano Mendola
Gaetano Mendola <mendola@bigfoot.com> writes:

> Tom Lane wrote:
>> Gaetano Mendola <mendola@bigfoot.com> writes:
>>> I'm using only pg_autovacuum. I expected that disk usage would
>>> reach a steady state, but it has not. PG engine: 7.4.5
>>
>> One data point doesn't prove that you're not at a steady state.
>
> I graph my disk usage, and it has been a ramp for a week now;
> I'll continue to wait and see whether it decreases.
> I was expecting a steady state at something like 4 GB
> (after a full vacuum and reindex) + 10% = 4.4 GB.
> I'm at 4.6 GB and increasing. I'll see how it continues.

You probably want the "experiment" to last more than a week.

After all, it might actually be that with your usage patterns, that table would stabilize at 15% "overhead," and that might take two or three weeks.

Unless it's clear that it's growing perilously quickly, just leave it alone so that there's actually some possibility of reaching an equilibrium. Any time you VACUUM FULL it, that _destroys_ any experimental results or noticeable patterns, and it guarantees that you'll see "seemingly perilous growth" for a while.

And if the table is _TRULY_ growing "perilously quickly," then you should likely add some scheduled vacuums of the table. Not VACUUM FULLs; just plain VACUUMs.

I revised cron scripts yet again today to do hourly and "4x/day" vacuums of certain tables in some of our systems where we know they need the attention. I didn't schedule any VACUUM FULLs; that's unnecessary, and would lead directly to system outages, which is totally unacceptable.

--
"cbbrowne","@","ca.afilias.info" <http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)
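Scheduled plain VACUUMs of the kind described here are typically driven from cron. A hypothetical sketch in system-crontab style (the database name, table names, and schedule are placeholders, not taken from this thread):

```cron
# Hourly plain VACUUM on a heavily updated table
0 * * * *     postgres  psql -d mydb -c 'VACUUM ANALYZE public.hot_table'
# Four times a day on a less active table
30 */6 * * *  postgres  psql -d mydb -c 'VACUUM ANALYZE public.warm_table'
```

Plain VACUUM only takes brief locks and can run alongside normal traffic, which is why it is safe to schedule this way while VACUUM FULL is not.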
Gaetano Mendola <mendola@bigfoot.com> writes:

> Matthew T. O'Connor wrote:
>> Well, without thinking too much, I would first ask about your FSM
>> settings. If they aren't big enough, that will cause bloat. Try
>> bumping your FSM settings and then see if you reach a steady state.
>
> The FSM settings are big enough:
>
> max_fsm_pages     | 2000000
> max_fsm_relations | 1000
>
> At least after a VACUUM FULL, I see that these numbers are overkill...

When you do a VACUUM FULL, the FSM is made irrelevant, because VACUUM FULL takes the time to reclaim all possible space without resorting to _any_ use of the FSM. If you VACUUM FULL, there is little value in having a free space map at all, because you're obviating the need to use it.

In any case, the FSM figures you get out of a VACUUM are only really meaningful if you're moving towards the "equilibrium point" where the FSM is large enough to cope with the growth between VACUUM cycles. VACUUM FULL pushes the system away from equilibrium, thereby making the FSM estimates less useful.

--
"cbbrowne","@","ca.afilias.info" <http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)
Christopher Browne wrote:

> Gaetano Mendola <mendola@bigfoot.com> writes:
>> Tom Lane wrote:
>>> Gaetano Mendola <mendola@bigfoot.com> writes:
>>>> I'm using only pg_autovacuum. I expected that disk usage would
>>>> reach a steady state, but it has not. PG engine: 7.4.5
>>>
>>> One data point doesn't prove that you're not at a steady state.
>>
>> I graph my disk usage, and it has been a ramp for a week now;
>> I'll continue to wait and see whether it decreases.
>> I was expecting a steady state at something like 4 GB
>> (after a full vacuum and reindex) + 10% = 4.4 GB.
>> I'm at 4.6 GB and increasing. I'll see how it continues.
>
> You probably want the "experiment" to last more than a week.
>
> After all, it might actually be that with your usage patterns, that
> table would stabilize at 15% "overhead," and that might take two or
> three weeks.
>
> Unless it's clear that it's growing perilously quickly, just leave it
> alone so that there's actually some possibility of reaching an
> equilibrium. Any time you VACUUM FULL it, that _destroys_ any
> experimental results or noticeable patterns, and it guarantees that
> you'll see "seemingly perilous growth" for a while.
>
> And if the table is _TRULY_ growing "perilously quickly," then you
> should likely add some scheduled vacuums of the table. Not VACUUM
> FULLs; just plain VACUUMs.
>
> I revised cron scripts yet again today to do hourly and "4x/day"
> vacuums of certain tables in some of our systems where we know they
> need the attention. I didn't schedule any VACUUM FULLs; that's
> unnecessary, and would lead directly to system outages, which is
> totally unacceptable.

Yes, I'm moving in this direction too.

Regards
Gaetano Mendola
Christopher Browne wrote:

> Gaetano Mendola <mendola@bigfoot.com> writes:
>> I graph my disk usage, and it has been a ramp for a week now;
>> I'll continue to wait and see whether it decreases.
>> I was expecting a steady state at something like 4 GB
>> (after a full vacuum and reindex) + 10% = 4.4 GB.
>> I'm at 4.6 GB and increasing. I'll see how it continues.
>
> You probably want the "experiment" to last more than a week.
>
> After all, it might actually be that with your usage patterns, that
> table would stabilize at 15% "overhead," and that might take two or
> three weeks.
>
> Unless it's clear that it's growing perilously quickly, just leave it
> alone so that there's actually some possibility of reaching an
> equilibrium. Any time you VACUUM FULL it, that _destroys_ any
> experimental results or noticeable patterns, and it guarantees that
> you'll see "seemingly perilous growth" for a while.
>
> And if the table is _TRULY_ growing "perilously quickly," then you
> should likely add some scheduled vacuums of the table. Not VACUUM
> FULLs; just plain VACUUMs.
>
> I revised cron scripts yet again today to do hourly and "4x/day"
> vacuums of certain tables in some of our systems where we know they
> need the attention. I didn't schedule any VACUUM FULLs; that's
> unnecessary, and would lead directly to system outages, which is
> totally unacceptable.

Chris, is this in addition to pg_autovacuum? Or do you not use pg_autovacuum at all? And if so, why not?
Matthew T. O'Connor wrote:

> Christopher Browne wrote:
>> Gaetano Mendola <mendola@bigfoot.com> writes:
>>> I graph my disk usage, and it has been a ramp for a week now;
>>> I'll continue to wait and see whether it decreases.
>>> I was expecting a steady state at something like 4 GB
>>> (after a full vacuum and reindex) + 10% = 4.4 GB.
>>> I'm at 4.6 GB and increasing. I'll see how it continues.
>>
>> You probably want the "experiment" to last more than a week.
>>
>> After all, it might actually be that with your usage patterns, that
>> table would stabilize at 15% "overhead," and that might take two or
>> three weeks.
>>
>> Unless it's clear that it's growing perilously quickly, just leave
>> it alone so that there's actually some possibility of reaching an
>> equilibrium. Any time you VACUUM FULL it, that _destroys_ any
>> experimental results or noticeable patterns, and it guarantees that
>> you'll see "seemingly perilous growth" for a while.
>>
>> And if the table is _TRULY_ growing "perilously quickly," then you
>> should likely add some scheduled vacuums of the table. Not VACUUM
>> FULLs; just plain VACUUMs.
>>
>> I revised cron scripts yet again today to do hourly and "4x/day"
>> vacuums of certain tables in some of our systems where we know they
>> need the attention. I didn't schedule any VACUUM FULLs; that's
>> unnecessary, and would lead directly to system outages, which is
>> totally unacceptable.
>
> Chris, is this in addition to pg_autovacuum? Or do you not use
> pg_autovacuum at all? And if so, why not?

I have the same requirement too. Actually, pg_autovacuum cannot be instructed "per table", so sometimes the global settings are not good enough. I have a table of logs with 6 million rows (3 years of logs), and I insert into that table ~6000 rows per day. I'm running pg_autovacuum set to ANALYZE or VACUUM a table when 10% of it has been touched.
With this setting, pg_autovacuum will analyze that table only once every 3 months!!! So I need to analyze and/or vacuum it manually.

Regards
Gaetano Mendola
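A per-table workaround under 7.4-era pg_autovacuum is simply to schedule the statement directly. On an insert-mostly table, ANALYZE alone is cheap, since it only samples rows to refresh planner statistics rather than scanning for dead tuples ("logs" is a placeholder name for the 6-million-row table described above):

```sql
-- Refresh planner statistics for the log table without any space
-- reclamation; suitable for a daily cron job on an insert-only table.
ANALYZE public.logs;
```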
Hi, Gaetano,

Gaetano Mendola schrieb:

> I have the same requirement too. Actually, pg_autovacuum cannot be
> instructed "per table", so sometimes the global settings are not good
> enough. I have a table of logs with 6 million rows (3 years of logs),
> and I insert into that table ~6000 rows per day. I'm running
> pg_autovacuum set to ANALYZE or VACUUM a table when 10% of it has
> been touched.
> With this setting, pg_autovacuum will analyze that table only once
> every 3 months!!!

If you have only inserts, and only so few on a large table, you do not need to vacuum that often: not to reclaim space, only to prevent transaction ID wraparound (which pg_autovacuum ensures).

And if the data distribution does not change, frequently calling ANALYZE does not help much, either.

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
Hi, Gaetano,

Gaetano Mendola schrieb:

> Yes, I'm aware of that; indeed I need the ANALYZE because I usually
> run selects on that table covering the last 24 hours, so I need to
> analyze it in order to collect the statistics for this period.

If you tend to do lots of queries over the last 24 hours, and only a very small percentage of the rows are that young, partial indices could be helpful. You could include all rows that are not older than 24 hours, and recreate the index via a daily cron script, so it covers between 24 and 48 hours of data between recreations. To avoid a gap during recreation, you could first create the new index and then drop the old one, using alternating names.

BTW, a small question for the gurus: does postgres make use of other indices when creating partial indices?

HTH,
Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
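The alternating-name scheme described here might look like the sketch below. All table, column, and index names are invented for illustration; note also that a partial-index predicate must be a constant expression (it cannot call now()), so the cutoff timestamp has to be substituted in by the cron script at creation time:

```sql
-- Today's run: build the new partial index first, then drop yesterday's,
-- so queries over recent rows always have a covering index available.
CREATE INDEX idx_logs_recent_b ON logs (ts)
    WHERE ts > '2005-02-28 00:00:00';   -- cutoff filled in by the script
DROP INDEX idx_logs_recent_a;
-- Tomorrow's run creates idx_logs_recent_a with a fresh cutoff and
-- drops idx_logs_recent_b.
```

The payoff is that the index stays tiny (only the youngest rows) even though the underlying table holds years of history.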
Markus Schaber wrote:

> Hi, Gaetano,
>
> Gaetano Mendola schrieb:
>> I have the same requirement too. Actually, pg_autovacuum cannot be
>> instructed "per table", so sometimes the global settings are not
>> good enough. I have a table of logs with 6 million rows (3 years of
>> logs), and I insert into that table ~6000 rows per day. I'm running
>> pg_autovacuum set to ANALYZE or VACUUM a table when 10% of it has
>> been touched.
>> With this setting, pg_autovacuum will analyze that table only once
>> every 3 months!!!
>
> If you have only inserts, and only so few on a large table, you do
> not need to vacuum that often: not to reclaim space, only to prevent
> transaction ID wraparound (which pg_autovacuum ensures).
>
> And if the data distribution does not change, frequently calling
> ANALYZE does not help much, either.

Yes, I'm aware of that; indeed I need the ANALYZE because I usually run selects on that table covering the last 24 hours, so I need to analyze it in order to collect the statistics for this period.

Besides that, I tried to partition that table using both techniques I know of:

1) a view with UNION ALL over all the tables collecting these logs
2) inheritance

Both work in theory, but in practice they do not: the index scan is lost as soon as you use this view/table inside other views or join it with anything.

I heard that the next version of pg_autovacuum can be instructed "per table"; is that true?

Regards
Gaetano Mendola
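Technique (1) from the message above might be sketched as follows (the table and view names are illustrative only):

```sql
-- One physical table per year of logs, unified behind a view.
CREATE VIEW logs AS
    SELECT * FROM logs_2003
    UNION ALL SELECT * FROM logs_2004
    UNION ALL SELECT * FROM logs_2005;
-- Queried directly with a simple restriction, the planner can often
-- push the qual into each branch and use the per-table indexes; but,
-- as noted above, once the view is joined or nested inside other
-- views, those index scans tend to be lost in this era of the planner.
```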
Hi, Matthew,

Matthew T. O'Connor schrieb:

> The version of pg_autovacuum that I submitted for 8.0 could be
> instructed "per table", but it didn't make the cut. Aside from being
> moved out of contrib and integrated into the backend, per-table
> autovacuum settings are probably the next highest priority.

What was the reason for non-acceptance? Is it available as a standalone project?

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
Gaetano Mendola wrote:

> Yes, I'm aware of that; indeed I need the ANALYZE because I usually
> run selects on that table covering the last 24 hours, so I need to
> analyze it in order to collect the statistics for this period.
> Besides that, I tried to partition that table using both techniques
> I know of:
>
> 1) a view with UNION ALL over all the tables collecting these logs
> 2) inheritance
>
> Both work in theory, but in practice they do not: the index scan is
> lost as soon as you use this view/table inside other views or join
> it with anything.
>
> I heard that the next version of pg_autovacuum can be instructed
> "per table"; is that true?

The version of pg_autovacuum that I submitted for 8.0 could be instructed "per table", but it didn't make the cut. Aside from being moved out of contrib and integrated into the backend, per-table autovacuum settings are probably the next highest priority.
On Mon, Feb 28, 2005 at 16:46:34 +0100,
  Markus Schaber <schabios@logi-track.com> wrote:

> Hi, Matthew,
>
> Matthew T. O'Connor schrieb:
>> The version of pg_autovacuum that I submitted for 8.0 could be
>> instructed "per table", but it didn't make the cut. Aside from being
>> moved out of contrib and integrated into the backend, per-table
>> autovacuum settings are probably the next highest priority.
>
> What was the reason for non-acceptance?

It wasn't reviewed until very close to the freeze, because the people who could do the review were busy, and then there wasn't enough time to iron things out before the freeze hit.