Re: Single pass vacuum - take 1 - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Single pass vacuum - take 1
Date
Msg-id 4E1F1CBB.5030503@enterprisedb.com
In response to Re: Single pass vacuum - take 1  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: Single pass vacuum - take 1
Re: Single pass vacuum - take 1
List pgsql-hackers
On 14.07.2011 18:57, Pavan Deolasee wrote:
> On Thu, Jul 14, 2011 at 11:46 AM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>> I'd say that seems way too complex for such a small use case and we've
>> only just fixed the bugs from 8.4 vacuum map complexity. The code's
>> looking very robust now and I'm uneasy that such changes are really
>> worth it.
>>
> Thanks Simon for looking at the patch.
>
> I am not sure if the use case is really narrow. Today, we dirty the pages in
> both the passes and also emit WAL records. Just the heap scan can take a
> very long time for large tables, blocking the autovacuum worker threads from
> doing useful work on other tables. If I am not wrong, we use ring buffers
> for vacuum which would most-likely force those buffers to be written/read
> twice to the disk.

Seems worthwhile to me. What bothers me a bit is the need for the new
64-bit LSN value on each heap page. Also, note that temporary tables are
not WAL-logged, so they have no LSNs.

How does this interact with the visibility map? If you set the 
visibility map bit after vacuuming indexes, a subsequent vacuum will not 
visit the page. The second vacuum will update relindxvacxlogid/off, but 
it will not clean up the dead line pointers left behind by the first 
vacuum. Now the LSN on the page differs from the one stored in pg_class, 
so subsequent pruning will not remove the dead line pointers either. I
think you can sidestep that if you check that the page's vacuum LSN <=
vacuum LSN in pg_class, instead of checking for equality.
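
A toy model of the scenario above may make the difference between the two checks concrete. This is only a sketch: the function names and the plain integers standing in for LSNs are assumptions for illustration, not code from the patch.

```python
# Toy model, not PostgreSQL code: LSNs are plain integers and all names
# are hypothetical.

def can_reclaim_equality(page_vacuum_lsn: int, rel_index_vacuum_lsn: int) -> bool:
    """Reclaim dead line pointers only if the page was stamped by the
    most recent vacuum that finished its index pass."""
    return page_vacuum_lsn == rel_index_vacuum_lsn


def can_reclaim_at_most(page_vacuum_lsn: int, rel_index_vacuum_lsn: int) -> bool:
    """Reclaim if the page was stamped by that vacuum or by any earlier one."""
    return page_vacuum_lsn <= rel_index_vacuum_lsn


# First vacuum stamps the page with LSN 100, finishes the index pass,
# records 100 for the relation, and the visibility map bit gets set.
page_lsn = 100
rel_lsn = 100

# Second vacuum skips the page because of the visibility map, but still
# advances the value recorded for the relation.
rel_lsn = 200

print(can_reclaim_equality(page_lsn, rel_lsn))  # False: pointers stay forever
print(can_reclaim_at_most(page_lsn, rel_lsn))   # True: pruning can remove them
```

With the equality check the page is stuck, because no later vacuum will restamp it while the visibility map bit is set; the `<=` check has no such problem.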

Ignoring the issue stated in the previous paragraph, I think you wouldn't
actually need a 64-bit LSN. A smaller counter is enough, as wrap-around
doesn't matter. In fact, a single bit would be enough. After a
successful vacuum, the counter on each heap page (with dead line 
pointers) is N, and the value in pg_class is N. There are no other 
values on the heap, because vacuum will have cleaned them up. When you 
begin the next vacuum, it will stamp pages with N+1. So at any stage, 
there is only one of two values on any page, so a single bit is enough. 
(But as I said, that doesn't hold if vacuum skips some pages thanks to
the visibility map.)
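
The single-bit argument can be sketched as a small simulation. It assumes, as the paragraph above does, that vacuum visits every page (no visibility map skips); the function names and the list-of-bits page model are hypothetical, not taken from the patch.

```python
# Toy model, not PostgreSQL code: every name here is hypothetical.

def stamp_for_running_vacuum(rel_bit: int) -> int:
    """Bit that a vacuum in progress writes to the pages it visits."""
    return rel_bit ^ 1


def dead_pointers_reclaimable(page_bit: int, rel_bit: int) -> bool:
    """True if the page was stamped by a vacuum whose index pass completed."""
    return page_bit == rel_bit


rel_bit = 0
pages = [0, 0, 0]  # all stamped by the last completed vacuum

# A new vacuum has visited pages 0 and 1 so far.
new_bit = stamp_for_running_vacuum(rel_bit)
pages[0] = new_bit
pages[1] = new_bit
mid_vacuum = [dead_pointers_reclaimable(p, rel_bit) for p in pages]
print(mid_vacuum)    # [False, False, True]

# The vacuum visits the last page, finishes the index pass, and
# publishes its bit for the relation.
pages[2] = new_bit
rel_bit = new_bit
after_vacuum = [dead_pointers_reclaimable(p, rel_bit) for p in pages]
print(after_vacuum)  # [True, True, True]
```

The next vacuum would stamp pages with 0 again, which is the wrap-around that doesn't matter: at any moment a page holds either the completed vacuum's bit or the running vacuum's bit.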

Is there something in place to make sure that pruning uses an up-to-date
relindxvacxlogid/off value? I guess it doesn't matter if it's
out-of-date; you'll just miss the opportunity to remove some dead tuples.

Seems odd to store relindxvacxlogid/off as two int32 columns. Store it 
in one uint64 column, or invent a new datatype for LSNs, or store it as 
text in %X/%X format.
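
To illustrate the alternatives, here is a rough sketch of how the two 32-bit halves map to a single 64-bit value and to the %X/%X text form. The helper names are made up for this example, and the sample LSN is arbitrary.

```python
# Illustrative helpers only; the names are hypothetical, not PostgreSQL APIs.

UINT32_MAX = 0xFFFFFFFF


def lsn_to_uint64(xlogid: int, xrecoff: int) -> int:
    """Pack the two 32-bit halves into one 64-bit value (xlogid is the high half)."""
    if not (0 <= xlogid <= UINT32_MAX and 0 <= xrecoff <= UINT32_MAX):
        raise ValueError("both halves must fit in an unsigned 32-bit integer")
    return (xlogid << 32) | xrecoff


def lsn_to_text(xlogid: int, xrecoff: int) -> str:
    """Render the LSN in %X/%X format."""
    return "%X/%X" % (xlogid, xrecoff)


def text_to_lsn(text: str) -> tuple:
    """Parse %X/%X text back into (xlogid, xrecoff)."""
    high, low = text.split("/")
    return int(high, 16), int(low, 16)


print(lsn_to_text(0x16, 0xB374D848))         # 16/B374D848
print(hex(lsn_to_uint64(0x16, 0xB374D848)))  # 0x16b374d848
print(text_to_lsn("16/B374D848") == (0x16, 0xB374D848))  # True
```

Packed this way, comparing two 64-bit values orders LSNs the same way as comparing the (xlogid, offset) pairs, which is what a `<=` check needs.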

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

