Thread: VACUUM FULL on 24/7 server
Hello! I have 24/7 production server under high load. I need to perform vacuum full on several tables to recover disk space / memory usage frequently ( the server must be online during vacuum time ) The one trick that i see is to try to vacuum duplicate of production database ( or just some tables ). But there are some pitfalls: http://www.postgresql.org/docs/7.4/interactive/backup-file.html "If you have dug into the details of the file system layout of the data you may be tempted to try to back up or restore only certain individual tables or databases from their respective files or directories. This will not work because the information contained in these files contains only half the truth. The other half is in the commit log files pg_clog/*, which contain the commit status of all transactions. A table file is only usable with this information. Of course it is also impossible to restore only a table and the associated pg_clog data because that would render all other tables in the database cluster useless." Any thoughts? Thanks in advance, Aleksey
Aleksey Serba <aserba@gmail.com> writes: > I have 24/7 production server under high load. > I need to perform vacuum full on several tables to recover disk > space / memory usage frequently ( the server must be online during > vacuum time ) Don't use VACUUM FULL; plain VACUUM should be enough if you are managing the database properly (for instance, make sure the FSM settings are high enough). regards, tom lane
aserba@gmail.com (Aleksey Serba) wrote: > Hello! > > I have 24/7 production server under high load. > I need to perform vacuum full on several tables to recover disk > space / memory usage frequently ( the server must be online during > vacuum time ) The main thought is: "Don't do that." It is almost certainly the wrong idea to do a VACUUM FULL. Assuming that the tables in question aren't so large that they cause mass eviction of buffers, it should suffice to do a plain VACUUM (and NOT a "VACUUM FULL") on the tables in question quite frequently. If the tables _are_ rather large, then you might want to look into the "lazier" vacuum patch which causes the vacuum process to sleep once in a while so as to diminish the "eviction" effects. A version of this was proposed for 7.3, and I know Jan Wieck has had a patch for 7.4. -- let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;; http://linuxfinances.info/info/spiritual.html I am not a number! I am a free man!
Christopher Browne wrote: > aserba@gmail.com (Aleksey Serba) wrote: > >> Hello! >> >> I have 24/7 production server under high load. >> I need to perform vacuum full on several tables to recover disk >> space / memory usage frequently ( the server must be online during >> vacuum time ) > > > The main thought is: "Don't do that." > > It is almost certainly the wrong idea to do a VACUUM FULL. > > Assuming that the tables in question aren't so large that they cause > mass eviction of buffers, it should suffice to do a plain VACUUM (and > NOT a "VACUUM FULL") on the tables in question quite frequently. This is easy to say and almost impraticable. I run a 7.4.5 with the autovacuum: pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8 I have also a "vacuumdb -z -v -a" running each six hours and if i don't execute a vacuum FULL for one weeks I collect almost 400 MB of dead rows :-( For this reason even with a 7.4.5 I'm obliged to run a vacuum full at least once a week and a reindex once a month. And my FSM parameters are large enough: INFO: free space map: 141 relations, 26787 pages stored; 26032 total pages needed DETAIL: Allocated FSM size: 1000 relations + 2000000 pages = 11780 kB shared memory. Regards Gaetano Mendola PS: I do not have any "idle in transaction" connections around.
Gaetano Mendola <mendola@bigfoot.com> writes: > Christopher Browne wrote: >>> Assuming that the tables in question aren't so large that they cause >>> mass eviction of buffers, it should suffice to do a plain VACUUM (and >>> NOT a "VACUUM FULL") on the tables in question quite frequently. > This is easy to say and almost impraticable. I run a 7.4.5 with the autovacuum: > pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8 I'm not very familiar at all with appropriate settings for autovacuum, but doesn't the above say to vacuum a table only when the dead space reaches 50%? That seems awfully lax to me. I've always thought one should vacuum often enough to keep dead space to maybe 10 to 25%. regards, tom lane
Tom Lane wrote: > Gaetano Mendola <mendola@bigfoot.com> writes: > >>Christopher Browne wrote: >> >>>>Assuming that the tables in question aren't so large that they cause >>>>mass eviction of buffers, it should suffice to do a plain VACUUM (and >>>>NOT a "VACUUM FULL") on the tables in question quite frequently. > > >>This is easy to say and almost impraticable. I run a 7.4.5 with the autovacuum: > > >>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8 > > > I'm not very familiar at all with appropriate settings for autovacuum, > but doesn't the above say to vacuum a table only when the dead space > reaches 50%? That seems awfully lax to me. I've always thought one > should vacuum often enough to keep dead space to maybe 10 to 25%. The problem is that I can not set these value per table and per database so, I had to find some compromise, however I will test in the next days what happen with -V 0.2 However each six hour I perform a vacuum on all database and the HD space continue to grow even with FSM parameters large enough. I'll post in a couple of day about the new settings. Regards Gaetano Mendola
On Sun, 2004-10-03 at 21:01, Gaetano Mendola wrote: > Tom Lane wrote: > > Gaetano Mendola <mendola@bigfoot.com> writes: > > > >>Christopher Browne wrote: > >>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8 > > > > I'm not very familiar at all with appropriate settings for autovacuum, > > but doesn't the above say to vacuum a table only when the dead space > > reaches 50%? That seems awfully lax to me. I've always thought one > > should vacuum often enough to keep dead space to maybe 10 to 25%. Yes that is what those options say. The default values are even more lax. I wasn't sure how best to set them, I erred on the conservative side. > The problem is that I can not set these value per table and per database > so, I had to find some compromise, however I will test in the next days > what happen with -V 0.2 > > However each six hour I perform a vacuum on all database and the HD space > continue to grow even with FSM parameters large enough. Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3 times a day buying you much. It's not a bad idea to do once in a while. Given the way Postgres works, it is normal to have slack space in your tables. The real question is do your table stop growing? At some point you should reach a stead state where you have some percentage of slack space that stops growing. You said that after running for a week you have 400M of reclaimable space. Is that a problem? If you don't do a vacuum full for two weeks is it still 400M? My guess is most of the 400M is created in the first few hours (perhaps days) after running your vacuum full. Matthew
Matthew T. O'Connor wrote: > On Sun, 2004-10-03 at 21:01, Gaetano Mendola wrote: > >>Tom Lane wrote: >> >>>Gaetano Mendola <mendola@bigfoot.com> writes: >>> >>> >>>>Christopher Browne wrote: >>>>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8 >>> >>>I'm not very familiar at all with appropriate settings for > > autovacuum, > >>>but doesn't the above say to vacuum a table only when the dead space >>>reaches 50%? That seems awfully lax to me. I've always thought one >>>should vacuum often enough to keep dead space to maybe 10 to 25%. > > > Yes that is what those options say. The default values are even more > lax. I wasn't sure how best to set them, I erred on the conservative > side. > > >>The problem is that I can not set these value per table and per > > database > >>so, I had to find some compromise, however I will test in the next > > days > >>what happen with -V 0.2 >> >>However each six hour I perform a vacuum on all database and the HD > > space > >>continue to grow even with FSM parameters large enough. > > > Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3 > times a day buying you much. It's not a bad idea to do once in a while. The reason is that I have few tables of about 5 milion with ~ 10000 insert per day. Even with setting -v 300 -V 0.1 this means these tables will be analyzed each 50 days. So I have to force it. Regards Gaetano Mendola > Given the way Postgres works, it is normal to have slack space in your > tables. The real question is do your table stop growing? At some point > you should reach a stead state where you have some percentage of slack > space that stops growing. > > You said that after running for a week you have 400M of reclaimable > space. Is that a problem? If you don't do a vacuum full for two weeks > is it still 400M? My guess is most of the 400M is created in the first > few hours (perhaps days) after running your vacuum full. > > Matthew > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
Gaetano Mendola wrote: > Matthew T. O'Connor wrote: > >> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3 >> times a day buying you much. It's not a bad idea to do once in a while. > > > The reason is that I have few tables of about 5 milion with ~ 10000 > insert per > day. Even with setting -v 300 -V 0.1 this means these tables will be > analyzed > each 50 days. So I have to force it. I understand, but 10,000 new rows increate your table size only 0.2%, so it won't significantly effect anything. Also, if they really are just inserts then vacuum is totally unnecessary. I agree that for these situations pg_autovacuum should be supplemented by vacuumdb -a -z every once in a while, all I was pointing out was that 3 times a day is probably excessive. Matthew
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Matthew T. O'Connor wrote: | Gaetano Mendola wrote: | |> Matthew T. O'Connor wrote: |> |>> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3 |>> times a day buying you much. It's not a bad idea to do once in a while. |> |> |> |> The reason is that I have few tables of about 5 milion with ~ 10000 |> insert per |> day. Even with setting -v 300 -V 0.1 this means these tables will be |> analyzed |> each 50 days. So I have to force it. | | | I understand, but 10,000 new rows increate your table size only 0.2%, so | it won't significantly effect anything. Also, if they really are just | inserts then vacuum is totally unnecessary. I agree that for these | situations pg_autovacuum should be supplemented by vacuumdb -a -z every | once in a while, all I was pointing out was that 3 times a day is | probably excessive. Right, but the table collect logs, so is mandatory have the statistics up-to-date in order to obtain index scan for queries that are involving the last 24 hours. For the vacuum vs the analyze I do vacuum because other tables are not in this category of "only update" so instead of write tons of line in my crontab I prefer only one line. Regards Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBYYm77UpzwH2SGd4RAmilAJ98skWgiKI7mqOgYIgigzgpLe0JpQCfRm8/ IPXFZwZVcdJP0RQCE1fPXpw= =CExm -----END PGP SIGNATURE-----
On Sun, 2004-10-03 at 21:01, Gaetano Mendola wrote: > Tom Lane wrote: > > Gaetano Mendola <mendola@bigfoot.com> writes: > > > >>Christopher Browne wrote: > >>pg_autovacuum -d 3 -v 300 -V 0.5 -S 0.8 -a 200 -A 0.8 > > > > I'm not very familiar at all with appropriate settings for autovacuum, > > but doesn't the above say to vacuum a table only when the dead space > > reaches 50%? That seems awfully lax to me. I've always thought one > > should vacuum often enough to keep dead space to maybe 10 to 25%. Yes that is what those options say. The default values are even more lax. I wasn't sure how best to set them, I erred on the conservative side. > The problem is that I can not set these value per table and per database > so, I had to find some compromise, however I will test in the next days > what happen with -V 0.2 > > However each six hour I perform a vacuum on all database and the HD space > continue to grow even with FSM parameters large enough. Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3 times a day buying you much. It's not a bad idea to do once in a while. Given the way Postgres works, it is normal to have slack space in your tables. The real question is do your table stop growing? At some point you should reach a stead state where you have some percentage of slack space that stops growing. You said that after running for a week you have 400M of reclaimable space. Is that a problem? If you don't do a vacuum full for two weeks is it still 400M? My guess is most of the 400M is created in the first few hours (perhaps days) after running your vacuum full. Matthew
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Matthew T. O'Connor wrote: | Gaetano Mendola wrote: | |> Matthew T. O'Connor wrote: |> |>> Since you are running autovacuum I doubt the doing vacuumdb -a -z is 3 |>> times a day buying you much. It's not a bad idea to do once in a while. |> |> |> |> The reason is that I have few tables of about 5 milion with ~ 10000 |> insert per |> day. Even with setting -v 300 -V 0.1 this means these tables will be |> analyzed |> each 50 days. So I have to force it. | | | I understand, but 10,000 new rows increate your table size only 0.2%, so | it won't significantly effect anything. Also, if they really are just | inserts then vacuum is totally unnecessary. I agree that for these | situations pg_autovacuum should be supplemented by vacuumdb -a -z every | once in a while, all I was pointing out was that 3 times a day is | probably excessive. Right, but the table collect logs, so is mandatory have the statistics up-to-date in order to obtain index scan for queries that are involving the last 24 hours. For the vacuum vs the analyze I do vacuum because other tables are not in this category of "only update" so instead of write tons of line in my crontab I prefer only one line. Regards Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBYYiV7UpzwH2SGd4RAiglAKDpPvuhEZSabwXVBJzcCezH8fYIJQCfW9UY CUkHUisIuJsbJHU5W0g0FhI= =u6Xz -----END PGP SIGNATURE-----