Thread: a vacuum thread is not the answer
I was just toying around with things, and you know, running vacuum in the background doesn't work. It slows things down too much. The worst case senario is when one does this: update accounts set abalance = abalance + 1 ; This takes forever to run and doubles the size of the table. Is there a way that a separate thread managing the freelist can perform a "per row" vacuum concurrently? Maybe I am stating the problem incorrectly, but we need to be able to recover rows already in memory for performance.
mlw wrote: > I was just toying around with things, and you know, running vacuum in the > background doesn't work. It slows things down too much. > > The worst case senario is when one does this: > > update accounts set abalance = abalance + 1 ; > > This takes forever to run and doubles the size of the table. > > Is there a way that a separate thread managing the freelist can perform a "per > row" vacuum concurrently? Maybe I am stating the problem incorrectly, but we > need to be able to recover rows already in memory for performance. So you want to reuse space from rows before your transaction committed? Fine, I'm all for it, as long as begin ; update accounts set abalance = abalance + 1 ; rollback ; still works. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
On Thu, 2002-03-07 at 20:15, mlw wrote: > I was just toying around with things, and you know, running vacuum in the > background doesn't work. It slows things down too much. > > The worst case senario is when one does this: > > update accounts set abalance = abalance + 1 ; > > This takes forever to run and doubles the size of the table. How is this related to running vacuum in background ? Does it run fast when vacuum is not running ? > Is there a way that a separate thread managing the freelist can perform a "per > row" vacuum concurrently? Maybe I am stating the problem incorrectly, but we > need to be able to recover rows already in memory for performance. What could be possibly done (and is probably not very useful anyway) is reusing the row modified _in_the_same_transaction_ so that begin; abalance = abalance + 1 ; abalance = abalance + 1 ; abalance = abalance + 1 ; end; would consume just 2x the tablespace and not 4x. But this does not require a separate thread, just some changes in update logic. OTOH, this will probably interfere with some transaction modes that make use of command ids. -------------- Hannu
Hannu Krosing wrote: > > On Thu, 2002-03-07 at 20:15, mlw wrote: > > I was just toying around with things, and you know, running vacuum in the > > background doesn't work. It slows things down too much. > > > > The worst case senario is when one does this: > > > > update accounts set abalance = abalance + 1 ; > > > > This takes forever to run and doubles the size of the table. > > How is this related to running vacuum in background ? > > Does it run fast when vacuum is not running ? The problem is that it doubles the size of a table. This invariably means that you have more I/O. If there were a way to reuse old tulples, while they are still in the buffer cache, then PostgreSQL could handle this query faster. It was, however, pointed out that (obviously) you can't do reclaimation during a transaction because if it fails or someone issues "rollback" you have broken the database. So, I guess I'm saying ignore that part. > > > Is there a way that a separate thread managing the freelist can perform a "per > > row" vacuum concurrently? Maybe I am stating the problem incorrectly, but we > > need to be able to recover rows already in memory for performance. > > What could be possibly done (and is probably not very useful anyway) is > reusing the row modified _in_the_same_transaction_ so that > > begin; > abalance = abalance + 1 ; > abalance = abalance + 1 ; > abalance = abalance + 1 ; > end; > > would consume just 2x the tablespace and not 4x. But this does not > require a separate thread, just some changes in update logic. > > OTOH, this will probably interfere with some transaction modes that make > use of command ids. I haven't looked at the code, so I don't even know if it is doable. Could a small vacuum thread run in the background and monitor the buffer cache? When it finds a buffer with an unreferenced tuple, do what vacuum does, but only to that block? Here is my problem with vacuum. It scans the whole damn table and it takes a long time. In many, dare I say most, SQL databases, the rows which are updated are likely a small percent. If a small vacuum routine can be run against the blocks that are already in the buffer, this will eliminate a block read, and focus more on blocks which are likely to have been modified.