Re: Single pass vacuum - take 1 - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Single pass vacuum - take 1
Date
Msg-id CABOikdP7t9GfF6e4jubrANaW7epUJg5=iuJ=OpXrYQefOj=XQg@mail.gmail.com
Whole thread Raw
In response to Re: Single pass vacuum - take 1  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Thu, Jul 21, 2011 at 11:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jul 12, 2011 at 4:47 PM, Pavan Deolasee
> Comments ?

I was going to spend some time reviewing this, but I see that (1) it
has bit-rotted slightly - there is a failing hunk in pg_class.h and
(2) some of the comments downthread seem to suggest that you're
thinking about whether to revise this somewhat, in particular by using
some counter other than an LSN.  Are you planning to submit an updated
version?


Yeah, I would submit an updated version. I was just waiting to see if there are more comments about the general design. But I think I can now proceed.

I wonder if we can just ignore the wrap-around issue and use a 32 bit counter. The counter can be stored in the pg_class itself since its use is limited for the given table. At the start of vacuum, we get the current value. We then increment the counter (taking care of wrap-around) and use the incremented value as a marker in the page special area. If the vacuum runs to completion, we store the new value back in the pg_class row. Since vacuums are serialized for a given table, we don't need to worry about concurrent updates to the value.

While collecting dead-vacuum line pointers, either during HOT-prune or subsequent vacuum, we check if the current pg_class value and if the value is equal to the page counter, we can safely collect the dead-vacuum line pointers. For a moment, I thought we can just do away with a bit as Heikki suggested up thread, but the problem comes with the backends which might be running with stale value of the counter in the pg_class and the counter should be large enough so that it does not quickly wrap-around for all practical purposes.

 
A few comments on this version just reading through it:

- In lazy_scan_heap, where you've made the call to
RecordPageWithFreeSpace() unconditional, the comment change you made
immediately above is pretty half-baked.  It still refers to
lazy_vacuum_heap, which you've meanwhile removed.  You need to rewrite
the whole comment, I think.

- Instead of passing bool need_vaclsn to PageRepairFragmentation(),
how about passing Offset new_special_size?  Currently,
PageRepairFragmentation() doesn't know whether it's looking at a heap
page or an index page, and it would be nice to keep it that way.  It's
even possible that expanding the special space opportunistically
during page defragmentation could be useful in other contexts besides
this.  Or perhaps contracting it.

- itemid.h seems a bit schizophrenic about dead line pointers.  Here,
you've decided that it's OK for lp_flags == LP_DEAD && lp_off == 1 to
mean dead-vacuumed, but there existing code says:

#define LP_DEAD                 3               /* dead, may or may
not have storage */

AFAICT, the actual situation here is that indexes sometimes use dead
line pointers with storage, but the heap doesn't; thus, the heap can
safely use the storage bits of dead line pointers to mean something
else, but indexes can't.  I think the comments throughout itemid.h
should be adjusted to bring this out a bit more clearly, though.



I will take care of these issues in the revised patch. Thanks for looking at the patch.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: fixing PQsetvalue()
Next
From: Pavan Deolasee
Date:
Subject: Re: Single pass vacuum - take 1