Re: Single pass vacuum - take 1 - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Single pass vacuum - take 1 |
Date | |
Msg-id | 4E1F1CBB.5030503@enterprisedb.com |
In response to | Re: Single pass vacuum - take 1 (Pavan Deolasee <pavan.deolasee@gmail.com>) |
Responses |
Re: Single pass vacuum - take 1
Re: Single pass vacuum - take 1 |
List | pgsql-hackers |
On 14.07.2011 18:57, Pavan Deolasee wrote:
> On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs<simon@2ndquadrant.com> wrote:
>> I'd say that seems way too complex for such a small use case and we've
>> only just fixed the bugs from 8.4 vacuum map complexity. The code's
>> looking very robust now and I'm uneasy that such changes are really
>> worth it.
>>
> Thanks Simon for looking at the patch.
>
> I am not sure if the use case is really narrow. Today, we dirty the pages in
> both the passes and also emit WAL records. Just the heap scan can take a
> very long time for large tables, blocking the autovacuum worker threads from
> doing useful work on other tables. If I am not wrong, we use ring buffers
> for vacuum which would most-likely force those buffers to be written/read
> twice to the disk.

Seems worthwhile to me. What bothers me a bit is the need for the new
64-bit LSN value on each heap page. Also, note that temporary tables are
not WAL-logged, so there are no LSNs.

How does this interact with the visibility map? If you set the visibility
map bit after vacuuming indexes, a subsequent vacuum will not visit the
page. The second vacuum will update relindxvacxlogid/off, but it will not
clean up the dead line pointers left behind by the first vacuum. Now the
LSN on the page differs from the one stored in pg_class, so subsequent
pruning will not remove the dead line pointers either. I think you can
sidestep that if you check that the page's vacuum LSN <= vacuum LSN in
pg_class, instead of equality.

Ignoring the issue stated in the previous paragraph, I think you wouldn't
actually need a 64-bit LSN. A smaller counter is enough, as wrap-around
doesn't matter. In fact, a single bit would be enough. After a successful
vacuum, the counter on each heap page (with dead line pointers) is N, and
the value in pg_class is N. There are no other values on the heap, because
vacuum will have cleaned them up. When you begin the next vacuum, it will
stamp pages with N+1. So at any stage, there is only one of two values on
any page, so a single bit is enough. (But as I said, that doesn't hold if
vacuum skips some pages thanks to the visibility map.)

Is there something in place to make sure that pruning uses an up-to-date
relindxvacxlogid/off value? I guess it doesn't matter if it's out-of-date;
you'll just miss the opportunity to remove some dead tuples.

Seems odd to store relindxvacxlogid/off as two int32 columns. Store it in
one uint64 column, or invent a new datatype for LSNs, or store it as text
in %X/%X format.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
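
To make two of the suggestions in the message above concrete, here is a minimal sketch in Python (chosen for brevity; it is not PostgreSQL code and not taken from the patch). It assumes an LSN is a pair of unsigned 32-bit halves, as the two-column relindxvacxlogid/off naming implies, and shows packing that pair into one 64-bit value, rendering it in %X/%X form, and using `<=` instead of equality for the pruning check. All function names are hypothetical.

```python
# Illustrative sketch only; every name here is hypothetical.

MAX_UINT32 = 0xFFFFFFFF


def make_lsn(xlogid: int, xrecoff: int) -> int:
    """Pack the two unsigned 32-bit halves into one 64-bit integer."""
    if not (0 <= xlogid <= MAX_UINT32 and 0 <= xrecoff <= MAX_UINT32):
        raise ValueError("both halves must fit in an unsigned 32-bit integer")
    return (xlogid << 32) | xrecoff


def format_lsn(lsn: int) -> str:
    """Render a packed LSN as text in %X/%X format."""
    return "%X/%X" % (lsn >> 32, lsn & MAX_UINT32)


def parse_lsn(text: str) -> int:
    """Parse text in %X/%X format back into a packed LSN."""
    high, low = text.split("/")
    return make_lsn(int(high, 16), int(low, 16))


def can_prune_dead_line_pointers(page_vacuum_lsn: int,
                                 rel_index_vacuum_lsn: int) -> bool:
    """Return True if the page's dead line pointers may be removed.

    The page was stamped at or before the last vacuum whose index pass
    completed (the value kept in pg_class). Using <= rather than ==
    still allows pruning on pages that a later vacuum skipped because
    their visibility map bit was set.
    """
    return page_vacuum_lsn <= rel_index_vacuum_lsn
```

With `<=`, a page stamped by the first vacuum remains prunable after a second vacuum advances the pg_class value without visiting that page, while a page stamped by a vacuum that has not yet finished its index pass compares greater and is left alone.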