Re: lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples
Date
Msg-id CABOikdMi4NROp1rSjDN6F75JcyjWMUzPtmiuu98ZbFvTc31MCg@mail.gmail.com
Whole thread Raw
In response to Re: lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples  (Noah Misch <noah@leadboat.com>)
Responses Re: lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples
List pgsql-hackers
On Wed, Jan 30, 2013 at 7:34 AM, Noah Misch <noah@leadboat.com> wrote:
> On Mon, Jan 28, 2013 at 07:24:04PM +0530, Pavan Deolasee wrote:
>> On Wed, Jan 23, 2013 at 10:05 AM, Noah Misch <noah@leadboat.com> wrote:
>>
>> > You're the second commentator to be skittish about the patch's correctness, so
>> > I won't argue against a conservatism-motivated bounce of the patch.
>>
>> Can you please rebase the patch against the latest head ? I see
>> Alvaro's and Simon's recent changes has bit-rotten the patch.
>
> Attached.

Thanks. Here are a few comments.

1. I saw you took care of the bug that I reported in the first email
by allowing overwriting a LP_DEAD itemid during recovery. While this
should be OK given that we had never seen reports of recovery trying
to accidentally overwrite a LP_DEAD itemid, are you sure after this
change we can't get into this situation post reachedConsistency ? If
we ever do, I think the recovery would just fail.

2. In lazy_scan_heap() after detecting a DEAD tuple you now try to
confirm that the tuple must not require a freeze. Is that really safe
? I think this assumes that the HOT prune would have already truncated
*every* DEAD tuple to LP_DEAD except an INSERT_IN_PROGRESS which may
have turned into DEAD between two calls to HeapTupleSatisfiesVacuum().
While this might be true, but we never tried hard to prove that before
because it wasn't necessary. I remember Heikki raising this concern
when I proposed setting the VM bit after HOT prune. So a proof of that
would probably help.

3. Converting LP_DEAD to LP_UNUSED in lazy_vacuum_page() while holding
just a SHARE lock seems to be OK, but is a bit adventurous. I would
rather just get an EX lock and do it. Also, its probably more
appropriate to just mark the buffer dirty instead of
SetBufferCommitInfoNeedsSave(). It may cause line pointer bloat and
vacuum may not come to process this page again. This will also be kind
of unnecessary after the patch to set VM bit in the second phase gets
in.

4. Are changes in the variable names and logic around them in
lazy_scan_heap() really required ? Just makes me a bit uncomfortable.
See if we can minimize those changes or do it in a separate patch, if
possible.

I haven't run tests with the patch yet. Will see if I can try a few. I
share other's views on making these changes late in the cycle, but if
we can reduce the foot-print of the patch, we should be OK.

I see the following (and similar) messages while applying the patch,
but may be they are harmless.

(Stripping trailing CRs from patch.)

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee



pgsql-hackers by date:

Previous
From: Dimitri Fontaine
Date:
Subject: Re: sql_drop Event Trigger
Next
From: Kevin Grittner
Date:
Subject: Re: autovacuum not prioritising for-wraparound tables