Thread: Vacuum Daemon
From the ToDo list: Vacuum: * Provide automatic running of vacuum in the background (Tom) As of 7.2 we have lazy vacuum. The next logical step is setting up vacuum to run automatically in the background either as some type of daemon or as something kicked off by the postmaster. I am interested in working on this to do item, although I see it is assigned to Tom right now. First: is this something we still want (I assume it is since its in the todo.). Second: There was some discussion (http://archives.postgresql.org/pgsql-hackers/2002-05/msg00970.php) about this not being neede once UNDO is on place, what is the current view on this? Matthew
"Matthew T. O'Connor" <matthew@zeut.net> writes: > As of 7.2 we have lazy vacuum. The next logical step is setting up vacuum to > run automatically in the background either as some type of daemon or as > something kicked off by the postmaster. > I am interested in working on this to do item, although I see it is assigned > to Tom right now. It's sufficiently far down my to-do list that I'm happy to let someone else do it ;-). > Second: There was some discussion > (http://archives.postgresql.org/pgsql-hackers/2002-05/msg00970.php) about > this not being neede once UNDO is on place, what is the current view on this? I do not think that is the case; and anyway we've pretty much rejected Vadim's notion of going to an Oracle-style UNDO buffer. I don't foresee VACUUM going away anytime soon --- what we need is to make it less obtrusive. 7.2 made some progress in that direction, but we need more. Launching VACUUMs on some automatic schedule, preferably using feedback about where space needs to be reclaimed, seems like a pretty straightforward small-matter-of-programming. The thing that would really be needed to make it unobtrusive is to find a way to run the vacuum processing at low priority, or at least when the system is not heavily loaded. I don't know a good way to do that. Nice'ing the vacuum process won't work because of priority-inversion problems. Making it suspend itself when load gets high might do; but how to detect that in a reasonably portable fashion? regards, tom lane
On Sat, 2002-06-29 at 20:14, Tom Lane wrote: > "Matthew T. O'Connor" <matthew@zeut.net> writes: > > Second: There was some discussion > > (http://archives.postgresql.org/pgsql-hackers/2002-05/msg00970.php) about > > this not being neede once UNDO is on place, what is the current view on this? > > I do not think that is the case; and anyway we've pretty much rejected > Vadim's notion of going to an Oracle-style UNDO buffer. I don't foresee > VACUUM going away anytime soon --- what we need is to make it less > obtrusive. 7.2 made some progress in that direction, but we need more. > Could someone point me to this discussion, or summarize what the problem was? Was his proposal to keep tuple versions in the UNDO AM, or only pointers to them? The referred-to message seems to be about something else. ;jrnield -- J. R. Nield jrnield@usol.com
"J. R. Nield" <jrnield@usol.com> writes: >> I do not think that is the case; and anyway we've pretty much rejected >> Vadim's notion of going to an Oracle-style UNDO buffer. > Could someone point me to this discussion, or summarize what the problem > was? I'm too lazy to dig through the archives at the moment, but the main points were (a) a finite-size UNDO buffer chokes on large transactions and (b) the Oracle approach requires live transaction processing to do the cleanup work that our approach can push off to hopefully-not- time-critical vacuum processing. UNDO per se doesn't eliminate VACUUM anyhow; it only reclaims space from tuples written by aborted transactions. If you want to get rid of VACUUM then you need another way to get rid of the old versions of perfectly good committed tuples that are obsoleted by updates from later transactions. That essentially means you need an overwriting storage manager, which is a concept that doesn't mix well with MVCC. Oracle found a solution to that conundrum, but it's really not obvious to me that their solution is better than ours. Also, they have patents that we'd probably run afoul of if we try to imitate their approach too closely. regards, tom lane
Tom Lane wrote: > "J. R. Nield" <jrnield@usol.com> writes: > >> I do not think that is the case; and anyway we've pretty much rejected > >> Vadim's notion of going to an Oracle-style UNDO buffer. > > > Could someone point me to this discussion, or summarize what the problem > > was? > > I'm too lazy to dig through the archives at the moment, but the main > points were (a) a finite-size UNDO buffer chokes on large transactions > and (b) the Oracle approach requires live transaction processing to > do the cleanup work that our approach can push off to hopefully-not- > time-critical vacuum processing. > > UNDO per se doesn't eliminate VACUUM anyhow; it only reclaims space > from tuples written by aborted transactions. If you want to get rid > of VACUUM then you need another way to get rid of the old versions of > perfectly good committed tuples that are obsoleted by updates from > later transactions. That essentially means you need an overwriting > storage manager, which is a concept that doesn't mix well with MVCC. > > Oracle found a solution to that conundrum, but it's really not obvious > to me that their solution is better than ours. Also, they have > patents that we'd probably run afoul of if we try to imitate their > approach too closely. Don't forget reclaiming space from transactions that delete tuples. UNDO doesn't help there either. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Sat, 2002-06-29 at 21:55, Tom Lane wrote: > "J. R. Nield" <jrnield@usol.com> writes: > >> I do not think that is the case; and anyway we've pretty much rejected > >> Vadim's notion of going to an Oracle-style UNDO buffer. > > > Could someone point me to this discussion, or summarize what the problem > > was? > > I'm too lazy to dig through the archives at the moment, but the main > points were (a) a finite-size UNDO buffer chokes on large transactions > Yes this is a good point. Oracle was always lame with its "ROLLBACK SEGMENTS". SolidDB (SolidWorks? It's been a while...) used a btree-like structure for this that was not of fixed size. Oracle supposedly moved to the same method in its 9i release, but I don't know the details. I could never figure out how they did this, until I realized that UNDO doesn't need to be in the WAL log. You just use any access method you feel like, and make sure the method is itself protected by REDO. Just insert REDO entries to protect the insert into the UNDO AM, and REDO log when you delete. That makes it easy to have the recovery code be idempotent, to catch the case of a system crash during recovery. > and (b) the Oracle approach requires live transaction processing to > do the cleanup work that our approach can push off to hopefully-not- > time-critical vacuum processing. I'm not sure which way I'm leaning on this. On the one hand, it requires extra work to clean up while the system is live, in addition to writing the undo records, though the cleanup is not necessarily by the same transaction that committed the work (the cleanup needs to be deferred until it's out of an active snapshot anyway). On the other hand, you can clean-up without a full table scan, because you know which tuples need to be changed. This can be a big advantage on gigantic tables. Also, it lets you remove deleted tuples quickly, so the space can be reused, and eliminates the xid wraparound problem. Of course, any kind of undo is worse for performance with bulk inserts/updates, so you either end up committing every few thousand inserts, or you use some special extension to disable undo logging for a bulk load (or if you really want to be able to roll it back, you live with it :-) How slow is it to vacuum a >1 TB database with postgres? Do we even have any users who could test this? Also, I would never advocate that we do what I'm pretty sure Oracle does, and keep old values in the "Rollback Segment". Only (RelFileNode, ItemDataPointer) addresses would need to be kept in the UNDO AM, if we went this route. > > UNDO per se doesn't eliminate VACUUM anyhow; it only reclaims space > from tuples written by aborted transactions. If you want to get rid > of VACUUM then you need another way to get rid of the old versions of > perfectly good committed tuples that are obsoleted by updates from > later transactions. That essentially means you need an overwriting > storage manager, which is a concept that doesn't mix well with MVCC. Well, you can keep the UNDO records after commit to do a fast incremental vacuum as soon as the transaction that deleted the tuples becomes older than the oldest snapshot. If this is always done whenever an XID becomes that old, then you never need to vacuum, and you never need a full table scan. Because postgres never overwrites (except at vacuum), I think it actually makes us a BETTER candidate for this to be implemented cleanly then with an overwriting storage manager. We will never need to keep tuple values in UNDO! > > Oracle found a solution to that conundrum, but it's really not obvious > to me that their solution is better than ours. Their approach was worse, because they had an overwriting storage manager before they tried to implement it (I'm guessing). :-) > Also, they have > patents that we'd probably run afoul of if we try to imitate their > approach too closely. > Given the current state of affairs here in the US, PostgreSQL probably violates hundreds or even thousands of software patents. It probably violates tens of patents that have been upheld in court. The only thing keeping companies from shutting down postgres, linux, OpenOffice, and a hundred other projects is fear of adverse publicity, and the fact that development would move overseas and continue to be a thorn in their side. We'll see how long this lasts, given the fear some vendors have of certain maturing open-source/GPL projects, but I don't think PostgreSQL will be first, since anyone can take this code and become an instant proprietary database vendor! (No, I'm not complaining. Please, nobody start a license fight because of this) -- J. R. Nield jrnield@usol.com
On Saturday 29 June 2002 08:14 pm, Tom Lane wrote: > Launching VACUUMs on some automatic schedule, preferably using feedback > about where space needs to be reclaimed, seems like a pretty > straightforward small-matter-of-programming. The thing that would > really be needed to make it unobtrusive is to find a way to run the > vacuum processing at low priority, or at least when the system is not > heavily loaded. I don't know a good way to do that. Nice'ing the > vacuum process won't work because of priority-inversion problems. > Making it suspend itself when load gets high might do; but how to > detect that in a reasonably portable fashion? Are we sure we want it to be unobtrusive? If vacuum is performed only where and when it's needed, it might be better for overall throughput to have it run even when the system is loaded. Such as a constantly updated table. As for a portable way to identify system load (if this is what we want) I was thinking of looking at the load average (such as the one reported by the top command) but I don't know much about portability issues. Since there appears to be sufficient interest in some solution, I'll start working on it. I would like to hear a quick description of what small-matter-of-programming means. Do you have specific ideas about what how best to get that feedback? Matthew
Matthew T. O'Connor wrote: > On Saturday 29 June 2002 08:14 pm, Tom Lane wrote: > > Launching VACUUMs on some automatic schedule, preferably using feedback > > about where space needs to be reclaimed, seems like a pretty > > straightforward small-matter-of-programming. The thing that would > > really be needed to make it unobtrusive is to find a way to run the > > vacuum processing at low priority, or at least when the system is not > > heavily loaded. I don't know a good way to do that. Nice'ing the > > vacuum process won't work because of priority-inversion problems. > > Making it suspend itself when load gets high might do; but how to > > detect that in a reasonably portable fashion? > > Are we sure we want it to be unobtrusive? If vacuum is performed only where > and when it's needed, it might be better for overall throughput to have it > run even when the system is loaded. Such as a constantly updated table. > > As for a portable way to identify system load (if this is what we want) I was > thinking of looking at the load average (such as the one reported by the top > command) but I don't know much about portability issues. > > Since there appears to be sufficient interest in some solution, I'll start > working on it. I would like to hear a quick description of what > small-matter-of-programming means. Do you have specific ideas about what how > best to get that feedback? Another idea is that the statistics tables keep information on table activity, so that could be used to determine what needs vacuuming. As far as collecting info on which rows are expired, I think a table scan is pretty quick and the cleanest solution to finding them. Trying to track the exact tuples and when they aren't visible to anyone is just a major pain, while with a table scan it is very easy. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026