Re: Vacuum/visibility is busted - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Vacuum/visibility is busted
Msg-id CABOikdPtpVdv08L4w_HO9+Og1igNxxsFr5scoArQg7bFde6Hzw@mail.gmail.com
In response to Re: Vacuum/visibility is busted  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Vacuum/visibility is busted  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Fri, Feb 8, 2013 at 10:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Alvaro Herrera wrote:
>> Alvaro Herrera wrote:
>>
>> > Hm, if the foreign key patch is to blame, this sounds like these tuples
>> > had a different set of XMAX hint bits and a different Xmax, and they
>> > were clobbered by something like vacuum or page pruning.
>>
>> Hm, I think heap_freeze_tuple is busted, yes.
>
> This patch seems to fix the problem for me.  Please confirm.
>

I'm trying to reason about how this bug explains what we saw. In the
test, we were left with duplicate tuples. If I just take index 219 in
the table as an example, that tuple had three duplicates. The tuple
with CTID (150, 126) had the index pointer and the other two were
dangling tuples in the heap. I wonder how the index pointers to those
tuples got removed:

1. Maybe HOT prune saw those tuples as DEAD and adjusted the HOT
chain by removing them. But then HOT prune would also have reclaimed
those tuples by setting their line pointers (lp) to UNUSED.

2. An index scan saw the HOT chain as DEAD and hence killed the index
tuple. That looks unlikely because it would require an intermediate
non-HOT update to the tuple. Given that the latest live tuple with the
same index value is in the same block, I seriously doubt there was a
non-HOT update to those tuples.
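
To make the objection in point 1 concrete, here is a toy model of
pruning one HOT chain. It is only a sketch under my own simplifying
assumptions: the names (ToyItem, toy_prune_chain, and so on) are
hypothetical and not PostgreSQL code, the "dead" flag stands in for a
HEAPTUPLE_DEAD verdict, and the real pruning code is considerably more
involved. The point it illustrates is that the same step that unlinks
a DEAD heap-only tuple from the chain also frees its line pointer, so
pruning alone should not leave dangling tuples behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Line pointer states, loosely named after PostgreSQL's LP_* flags. */
typedef enum {
    TOY_LP_UNUSED,
    TOY_LP_NORMAL,
    TOY_LP_REDIRECT,
    TOY_LP_DEAD
} ToyLpState;

/* Hypothetical, heavily simplified view of one item on a heap page. */
typedef struct {
    ToyLpState state;
    bool dead;       /* toy stand-in for a HEAPTUPLE_DEAD verdict */
    bool heap_only;  /* true for non-root members of a HOT chain */
    int  next;       /* next chain member or redirect target, -1 = none */
} ToyItem;

/*
 * Prune the leading run of dead tuples in the chain starting at root.
 * Returns the number of line pointers that were set to unused.
 */
int toy_prune_chain(ToyItem *page, int root)
{
    int reclaimed = 0;
    int cur = root;

    assert(page != NULL && root >= 0);

    while (cur != -1 && page[cur].state == TOY_LP_NORMAL && page[cur].dead) {
        int next = page[cur].next;

        if (cur != root) {
            /* Unlinking and reclaiming happen together. */
            page[cur].state = TOY_LP_UNUSED;
            page[cur].next = -1;
            reclaimed++;
        }
        cur = next;
    }

    if (cur == root)
        return 0;                       /* root is not dead: nothing to do */

    if (cur == -1) {
        page[root].state = TOY_LP_DEAD; /* whole chain was dead */
        page[root].next = -1;
    } else {
        page[root].state = TOY_LP_REDIRECT;
        page[root].next = cur;          /* redirect to first surviving member */
    }
    return reclaimed;
}

/* Count items that still hold storage although they were judged dead. */
int toy_count_dead_normal(const ToyItem *page, int n)
{
    int count = 0;

    for (int i = 0; i < n; i++)
        if (page[i].state == TOY_LP_NORMAL && page[i].dead)
            count++;
    return count;
}
```

In this model, a chain of three dead tuples followed by one live tuple
ends up with the root redirected to the live tuple and no dead tuple
still holding storage, which is unlike the state we observed.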

Also, there are a couple of other things to notice.

1. For VACUUM to freeze those tuples as you are suspecting, they
should be seen as LIVE when HeapTupleSatisfiesVacuum is run by VACUUM.
But for them to be removed from the HOT chain, they must be seen as
DEAD to someone else and that must happen before VACUUM is run.

2. Tuple (150, 98) links to (150, 101) and both of them are unwanted
duplicates. I can't see how we end up in this state.
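
For what it's worth, the state in point 2 can be described as "heap-only
tuples that are chained to each other but not reachable from any tuple
that an index entry could point at". The sketch below finds such tuples
in a toy page. Everything in it is an assumption made for illustration:
ToyTuple and toy_find_dangling are hypothetical names, offsets are
plain array indexes rather than real line pointer numbers, and the
xmin/xmax matching that a real chain walk performs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ITEMS 256

/* Hypothetical, simplified view of one tuple slot on a single heap page. */
typedef struct {
    bool in_use;     /* slot currently holds a tuple */
    bool heap_only;  /* tuple has no index entry of its own */
    int  next;       /* same-page successor (toy t_ctid), -1 = none */
} ToyTuple;

/*
 * Walk every chain that starts at a tuple an index could point to
 * (in this toy model: any in-use tuple that is not heap-only) and
 * report the in-use tuples that no such walk reaches.
 *
 * Writes the offsets of those tuples to out (which must have room for
 * n entries) and returns how many there are, or -1 if n is too large.
 */
int toy_find_dangling(const ToyTuple *page, int n, int *out)
{
    bool reachable[TOY_MAX_ITEMS] = { false };
    int found = 0;

    assert(page != NULL && out != NULL);
    if (n > TOY_MAX_ITEMS)
        return -1;

    for (int root = 0; root < n; root++) {
        if (!page[root].in_use || page[root].heap_only)
            continue;

        /* Follow the chain; the reachable[] check also stops on cycles. */
        int cur = root;
        while (cur >= 0 && cur < n && page[cur].in_use && !reachable[cur]) {
            reachable[cur] = true;
            cur = page[cur].next;
        }
    }

    for (int i = 0; i < n; i++)
        if (page[i].in_use && !reachable[i])
            out[found++] = i;

    return found;
}
```

With one index-visible tuple and a separate two-member heap-only chain
on the page, the function reports both members of that chain, which is
the shape of the (150, 98) -> (150, 101) pair.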

Jeff mentioned that he thinks this issue could be reproducible
without any crash recovery. Alvaro, I did not try to reproduce the
problem using your patch, but can you please check whether you see
duplicates in a state similar to what we saw in Jeff's case? Or can
someone explain how we could end up in this state because of
heap_freeze_tuple() freezing a potentially DEAD tuple?

Thanks,
Pavan
--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


