Thread: Size for vacuum_mem
Trying to improve my vacuum times. Currently have vacuum_mem set to 128MB. What is my best bet to improve vacuum time? Increasing this number? Have 1GB of RAM in the machine. Some of my settings:

    wal_buffers = 256          # min 4
    shared_buffers = 12288     # 96 MB (buffers * 8K)
    sort_mem = 131072          # 128 MB, measured in K
    vacuum_mem = 131072        # 128 MB, measured in K
    wal_files = 10             # range 0-64
    checkpoint_segments = 12   # in logfile segments (16MB each), min 1
    checkpoint_timeout = 900   # in seconds, range 30-3600

Currently a vacuum full takes 3+ hours, and very soon the amount of data will increase. I have a very high turnover of data because this is a copy of data on a different system. Because of that, every day I delete/merge entire ranges of dates, deleting/adding millions of records.
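A minimal sketch of the daily delete/reload pattern described above; the table, column, and file names are hypothetical, since the actual schema isn't shown in the thread:

    -- refresh one date range inside a transaction so readers
    -- never see a half-loaded table
    BEGIN;
    DELETE FROM trades
     WHERE trade_date >= '2002-11-01'
       AND trade_date <  '2002-12-01';
    COPY trades FROM '/tmp/trades_update.dat';
    COMMIT;
    -- reclaim the dead tuples left by the DELETE
    VACUUM ANALYZE trades;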
On Wed, 2002-12-04 at 19:17, Francisco Reyes wrote:
> What is my best bet to improve vacuum time?

If you VACUUM more frequently, each individual VACUUM takes less time.

> Currently a vacuum full takes 3+ hours and very soon the amount of data
> will increase.

Do you need to use VACUUM FULL?

Cheers,

Neil
--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
On 4 Dec 2002 at 19:38, Neil Conway wrote:
> > Currently a vacuum full takes 3+ hours and very soon the amount of data
> > will increase.
> Do you need to use VACUUM FULL?

Let me elaborate on this statement.

1) You need vacuum full only to recover space from deleted tuples. Unless the database has undergone major deletion somewhere, 'vacuum full' might be a cannon to kill an ant.

2) You should consider vacuuming table-wise. Vacuum is useful only for those tables which change at a fast rate. A lookup or archive table might not need vacuum. Just vacuum the tables which are heavily updated/deleted/inserted. And by heavily, I mean heavily in terms of tuples. Inserting a single 200MB BLOB and vacuuming the table might not yield any performance improvement.

HTH

Bye
 Shridhar

--
blithwapping: Using anything BUT a hammer to hammer a nail into the wall, such as shoes, lamp bases, doorstops, etc. -- "Sniglets", Rich Hall & Friends
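A sketch of what the table-wise vacuuming looks like in practice; the table names are hypothetical:

    -- vacuum only the high-churn tables
    VACUUM ANALYZE trades;      -- deleted/reloaded daily
    VACUUM ANALYZE positions;   -- heavily updated
    -- static lookup tables (e.g. currencies) are left alone,
    -- or vacuumed on a much longer schedule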
On 4 Dec 2002, Neil Conway wrote:
> > Currently a vacuum full takes 3+ hours and very soon the amount of data
> > will increase.
>
> Do you need to use VACUUM FULL?

I have a very large turnover ratio. For some tables I delete/reload the whole table daily (tables in the 500K to 1.5 million record range), while for other tables I delete/reload about 1/3 (i.e. on a 7 million record table I delete/copy 1.5 million records).

The reason for the delete/copy is that the data is coming from a system where there is no way to tell what was added or changed since the last update. So I use date ranges for some tables and do entire tables for others.
On 5 Dec 2002, Neil Conway wrote:
> For these, it would probably be faster to TRUNCATE the table and then
> load the new data, then ANALYZE.

I can't, because those tables would not be usable during the load. Right now I do the delete/copy from within a transaction. If the loads are still running when people start coming in in the morning, they can still do work.

> > while other tables I delete/reload about 1/3 (ie
> > 7 Million records table I delete/copy 1.5 Million records).
>
> For these, you can try just using a plain VACUUM and seeing how
> effective that is at reclaiming space.

I am not too concerned with space reclamation. In theory if I don't do vacuum fulls I may have some dead space, but it would get re-used daily. My concern is the performance hit I would suffer from the table scans.

> If necessary, increase max_fsm_pages.

What is this setting for? To what number could I increase it?

> You might also want to check and see if your indexes are growing in size
> (btree indexes on incrementally increasing values like timestamps can
> grow, even with VACUUM FULL); use REINDEX if that's the case.

Every once in a while I truncate the tables and re-load the whole set. Probably about every couple of months.
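For the index-growth case Neil mentions, a minimal sketch; the table and index names are hypothetical:

    -- rebuild every index on the table
    REINDEX TABLE trades;
    -- or rebuild just the bloated timestamp index
    REINDEX INDEX trades_trade_date_idx;
    -- note: REINDEX locks the table exclusively while it runs,
    -- so schedule it off-hours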
Can someone please explain the relationship between the operation of a vacuuming job and the vacuum_mem setting in a non-shared-memory configuration? I.e., from a capacity planning point of view, how is vacuum_mem's size related to this operation? The current 8k default seems so low...

Thanks

Shridhar Daithankar wrote:
> 1) You need vacuum full only to recover space from deleted tuples. Unless the
> database has undergone major deletion somewhere, 'vacuum full' might be a cannon
> to kill an ant.
>
> 2) You should consider vacuuming table-wise. Vacuum is useful only for those
> tables which change at a fast rate. [...]
"Francisco Reyes" <lists@natserv.com> writes: > On 4 Dec 2002, Neil Conway wrote: >> Do you need to use VACUUM FULL? > I have a very large turnover ratio. > For some tables I delete/reload the whole table daily (tables in the 500K > to 1.5 Million records), while other tables I delete/reload about 1/3 (ie > 7 Million records table I delete/copy 1.5 Million records). For the delete/copy scenario, you could avoid the need for VACUUM by using TRUNCATE, if you don't mind the fact that TRUNCATE isn't rollbackable (yet). regards, tom lane
Hi all,

I know triggers are executed in alphabetical order, but which of constraints, triggers and rules get executed first, and which one is last?

Thx

JLL
On Thu, 2002-12-05 at 12:57, Francisco Reyes wrote:
> > For these, you can try just using a plain VACUUM and seeing how
> > effective that is at reclaiming space.
>
> I am not too concerned with space reclamation. In theory if I don't do
> vacuum fulls I may have some dead space, but it would get re-used daily.
> My concern is the performance hit I would suffer with the table scans.

You should see very little performance impact from lazy vacuuming. If there is a performance hit, you can offset some of it with quicker queries (if you do vacuum analyze). And remember, lazy vacuums are non-blocking, so users won't see an impact from that standpoint.

The trick is to find a good interval that will keep your tables from growing too big. I have one table that updates every 10 minutes (the whole content of the table gets updated within 15 minutes), which keeps the size very manageable (it's not a huge table, or I would do it more often). In this scenario, you can still do vacuum fulls if you feel the need, but they should take much less time.

Robert Treat
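One common way to get a regular interval like this is cron plus vacuumdb; a sketch, assuming a database named "mydb" and a hot table named "trades" (both hypothetical):

    # crontab: lazy-vacuum the high-churn table every 15 minutes,
    # and vacuum-analyze the whole database nightly
    */15 * * * *  vacuumdb --table=trades mydb
    0 3 * * *     vacuumdb --analyze mydb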
A "lazy vacuum" can hurt If you have lots of i/o. If we try to run it during the day it kills us. This is because to vacuum all the tables postgres has to read them from the disk. While it doesn't not lock rows it does block other rows from reading/writing to/from the disk. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Robert Treat Sent: Thursday, December 05, 2002 2:58 PM To: Francisco Reyes Cc: neilc@samurai.com; pgsql-general@postgresql.org Subject: Re: [GENERAL] Size for vacuum_mem On Thu, 2002-12-05 at 12:57, Francisco Reyes wrote: > > For these, you can try just using a plain VACUUM and seeing how > > effective that is at reclaiming space. > > I am not too concerned with space reclamation. In theory if I don't do > vacuum fulls I may have some dead space, but it would get re-used daily. > My concern is the performance hit I would suffer with the table scans. > you should see very little performance impact from lazy vacuuming. If there is a performance hit, you can gain some offset by quicker queries (if you do vacuum analyze). And remember, lazy vacuums are non-blocking so users won't see an impact from that standpoint. The trick is to find a good interval that will keep your tables from growing too big. I have one table that updates every 10 minutes (the whole content of the table gets updated within 15 minutes), which keeps the size very manageable (it's not a huge table, or I would do it more). In this scenario, you can still do vacuum fulls if you feel the need, but they should take much less time. Robert Treat ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
On Thu, 5 Dec 2002, David Blood wrote:
> A "lazy vacuum" can hurt if you have lots of I/O. If we try to run it
> during the day it kills us. This is because to vacuum all the tables
> postgres has to read them from the disk. While it doesn't lock rows,
> it does block other reads/writes to/from the disk.

How much shared memory do you have allocated to PostgreSQL? I've found that if I have a couple hundred megs of shared buffers on a machine with 1 gig or more of RAM, lazy vacuums (in 7.2.x and later; 7.1 has massive problems with lazy vacuums acting up) don't seem to affect performance much at all. Vacuum on most of my boxen results in no more than a 5% performance loss for other queries (all types: select, update, delete, insert) but keeps the database running well, even if they are running one vacuum right after another.
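For reference, the arithmetic behind "a couple hundred megs of shared buffer": shared_buffers is counted in 8 KB pages, so in postgresql.conf a ~200 MB allocation would look like:

    shared_buffers = 25600    # 25600 * 8 KB = 200 MB
    # (may also require raising the kernel's SHMMAX limit)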
Any takers on this one?

Jean-Luc Lachance wrote:
> I know triggers are executed in alphabetical order, but which of
> constraints, triggers and rules get executed first and which one is
> last?
On Fri, 6 Dec 2002, Jean-Luc Lachance wrote:
> > I know triggers are executed in alphabetical order, but which of
> > constraints, triggers and rules get executed first and which one is
> > last?

I think it's:

1. Rules are processed (since they're rewrites).
2. Before triggers (in order)
3. Check constraints / not null / unique / primary key
4. After triggers (before the foreign key triggers)
5. Foreign keys
6. After triggers (after the foreign key triggers)
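A small sketch to watch part of that ordering; all names are hypothetical, it assumes plpgsql is installed (createlang plpgsql mydb), and RETURNS opaque is the pre-7.3 spelling (use RETURNS trigger on 7.3):

    CREATE TABLE t (id int PRIMARY KEY, val int CHECK (val > 0));

    CREATE FUNCTION announce() RETURNS opaque AS '
    BEGIN
      RAISE NOTICE ''% trigger % fired'', TG_WHEN, TG_NAME;
      RETURN NEW;
    END;' LANGUAGE 'plpgsql';

    CREATE TRIGGER t_before BEFORE INSERT ON t
      FOR EACH ROW EXECUTE PROCEDURE announce();
    CREATE TRIGGER t_after AFTER INSERT ON t
      FOR EACH ROW EXECUTE PROCEDURE announce();

    INSERT INTO t VALUES (1, 0);
    -- t_before fires, then the CHECK constraint rejects the row,
    -- so t_after never fires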
"David Blood" <david@matraex.com> writes: > A "lazy vacuum" can hurt If you have lots of i/o. If we try to run it > during the day it kills us. This is because to vacuum all the tables > postgres has to read them from the disk. While it doesn't not lock rows > it does block other rows from reading/writing to/from the disk. On the other hand, I have watched people lazy-vacuum production databases in 7.2.* and not seen any visible hit on system load (as far as top or vmstat could show, anyway). I think it may be a matter of whether you have disk bandwidth to spare. If the disk farm is marginal, the extra demand from a vacuum may push you over the knee of the performance curve. But that's just a guess. It would be interesting if some folks from the "it doesn't hurt" and the "it does hurt" camps could compare notes and try to understand the reason for the difference in their results. regards, tom lane
> "Tom Lane" <tgl@sss.pgh.pa.us> writes: > On the other hand, I have watched people lazy-vacuum production > databases in 7.2.* and not seen any visible hit on system load > (as far as top or vmstat could show, anyway). From my experience lazy-vacuum never strains system load with modern processors. It is defenitely IO-bound. regards, Nic
On Fri, Dec 06, 2002 at 02:56:47PM -0500, Tom Lane wrote:
> On the other hand, I have watched people lazy-vacuum production
> databases in 7.2.* and not seen any visible hit on system load
> (as far as top or vmstat could show, anyway).
>
> I think it may be a matter of whether you have disk bandwidth to
> spare. If the disk farm is marginal, the extra demand from a vacuum
> may push you over the knee of the performance curve. But that's just
> a guess. It would be interesting if some folks from the "it doesn't
> hurt" and the "it does hurt" camps could compare notes and try to
> understand the reason for the difference in their results.

I'm firmly in the "devastating to performance" camp: 7.2.3, reasonably well-tuned, on a not overspecced but adequate Solaris box. (Built with a non-Solaris qsort, though I doubt that's relevant.) Several large-ish (hundreds of thousands to millions of rows), fairly heavily updated tables, with some text fields large enough to push data out to TOAST.

The tables were vacuumed pretty much continuously while data was being updated, so they didn't get too far out to lunch. Then the processes updating the tables were shut down, so the system was basically idle, and the tables were vacuumed. Simple selects (from some small tables, via psql) slowed to a crawl - tens of seconds to get any response. There was a lot of I/O but also high CPU usage - including a fair fraction of system time. It felt like the system was I/O-starved, yet it would run a very intensive DB app quite happily when it wasn't vacuuming.

(I finally rewrote the algorithm to avoid UPDATE, instead storing deltas in a daily table, then every night reading all the deltas and all the archived data and inserting the merged data into a new archive table, then indexing it and renaming it to replace the old archive table. Ugly, and it offended my SQL sensibilities, but it avoided having to keep that table vacuumed.)

Cheers,
Steve
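A sketch of the nightly merge Steve describes: deltas accumulate in a daily table, and each night a fresh archive is built, indexed, and swapped in (all table and column names hypothetical):

    BEGIN;
    -- build the merged archive: old rows not superseded by a delta,
    -- plus all of today's deltas
    CREATE TABLE archive_new AS
      SELECT * FROM archive a
       WHERE NOT EXISTS (SELECT 1 FROM deltas d WHERE d.id = a.id)
      UNION ALL
      SELECT * FROM deltas;
    CREATE INDEX archive_new_id_idx ON archive_new (id);
    -- swap the new table in under the old name
    DROP TABLE archive;
    ALTER TABLE archive_new RENAME TO archive;
    COMMIT;
    -- start the next day's deltas from scratch
    TRUNCATE TABLE deltas;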
Friends,

I just want to throw my support behind what David said. We have particularly slow drives but a lot of memory (our data all fits into the cache), because of the failure of our fast drive and our inability to replace it (we're just using a single slow IDE drive now), and the system flies except while doing a vacuum. Just showing an example of when lazy vacuum can significantly slow things.

Thanks,
Peter Darley