Thread: Incorrect assumption in heap_prepare_freeze_tuple

Incorrect assumption in heap_prepare_freeze_tuple

From
Kuntal Ghosh
Date:
Hello hackers,

In heap_prepare_freeze_tuple, we make the following assumption:

 * It is assumed that the caller has checked the tuple with
 * HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
 * (else we should be removing the tuple, not freezing it).

Thus, when we see a committed xmax that precedes the cutoff_xid, we throw the following data corruption error:
errmsg_internal("cannot freeze committed xmax %u", xid)

However, in the caller (lazy_scan_heap), HeapTupleSatisfiesVacuum may return HEAPTUPLE_DEAD for an updated/deleted tuple that got modified by a transaction older than OldestXmin. And, if the tuple is HOT-updated, it should only be removed by a hot-chain prune operation. So, we treat the tuple as RECENTLY_DEAD and don't remove the tuple.

So, it may lead to an incorrect data corruption error. IIUC, following will be the exact scenario where the error may happen,

An updated/deleted tuple whose xamx is in between cutoff_xid and OldestXmin. Since cutoff_xid depends on vacuum_freeze_min_age and autovacuum_freeze_max_age, it'll not be encountered  easily. But, I think it can be reproduced with some xid burner patch.

I think the fix should be something like following:
            if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
-               TransactionIdDidCommit(xid))
+               TransactionIdDidCommit(xid) &&
+               !HeapTupleHeaderIsHotUpdated(tuple))
                ereport(ERROR,
                        (errcode(ERRCODE_DATA_CORRUPTED),
                         errmsg_internal("cannot freeze committed xmax %u",
                                         xid)));
-           freeze_xmax = true;
+
+           freeze_xmax = HeapTupleHeaderIsHotUpdated(tuple) ? false : true;

Attached a patch for the same. Thoughts?

--
Thanks & Regards,
Kuntal Ghosh
Attachment

Re: Incorrect assumption in heap_prepare_freeze_tuple

From
Andres Freund
Date:
Hi,

On 2020-10-02 23:26:05 +0530, Kuntal Ghosh wrote:
> In heap_prepare_freeze_tuple, we make the following assumption:
> 
>  * It is assumed that the caller has checked the tuple with
>  * HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
>  * (else we should be removing the tuple, not freezing it).
> 
> Thus, when we see a committed xmax that precedes the cutoff_xid, we throw
> the following data corruption error:
> errmsg_internal("cannot freeze committed xmax %u", xid)
> 
> However, in the caller (lazy_scan_heap), HeapTupleSatisfiesVacuum may
> return HEAPTUPLE_DEAD for an updated/deleted tuple that got modified by a
> transaction older than OldestXmin. And, if the tuple is HOT-updated, it
> should only be removed by a hot-chain prune operation. So, we treat the
> tuple as RECENTLY_DEAD and don't remove the tuple.

This code is so terrible :(

We really should just merge the HOT pruning and lazy_scan_heap()
removal/freeze operations. That'd avoid this corner case and *also*
would significantly reduce the WAL volume of VACUUM. And safe a good bit
of CPU time.


> So, it may lead to an incorrect data corruption error. IIUC, following will
> be the exact scenario where the error may happen,
> 
> An updated/deleted tuple whose xamx is in between cutoff_xid and
> OldestXmin. Since cutoff_xid depends on vacuum_freeze_min_age and
> autovacuum_freeze_max_age, it'll not be encountered  easily. But, I think
> it can be reproduced with some xid burner patch.

I don't think this case is possible (*). By definition, there cannot be any
transactions needing tuples from before OldestXmin. Which means that the
heap_page_prune() earlier in lazy_scan_heap() would have pruned away a
DEAD tuple version that is part of a hot chain.

The HEAPTUPLE_DEAD branch you're referring to really can only be hit for
tuples that are *newer* than OldestXmin but become DEAD (instead of
RECENTLY_DEAD) because the inserting transaction aborted.


(*) with the exception of temp tables due to some recent changes, I am
currently working on a fix for that.


> I think the fix should be something like following:
>             if (!HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask) &&
> -               TransactionIdDidCommit(xid))
> +               TransactionIdDidCommit(xid) &&
> +               !HeapTupleHeaderIsHotUpdated(tuple))
>                 ereport(ERROR,
>                         (errcode(ERRCODE_DATA_CORRUPTED),
>                          errmsg_internal("cannot freeze committed xmax %u",
>                                          xid)));
> -           freeze_xmax = true;
> +
> +           freeze_xmax = HeapTupleHeaderIsHotUpdated(tuple) ? false : true;


I don't think that would be correct - we'd end up with an xmax that's
older than cutoff_xid left in the table. Breaking relfrozenxid /
creating wraparound and clog lookup dangers. This branch is only entered
when xmax precedes cutoff_xid - which is what we may set relfrozenxid
to.


What made you look at this? Did you hit the error?

Greetings,

Andres Freund



Re: Incorrect assumption in heap_prepare_freeze_tuple

From
Kuntal Ghosh
Date:
On Sat, Oct 3, 2020 at 12:05 AM Andres Freund <andres@anarazel.de> wrote:
Thank you for your quick response and detailed explanation.

>
> >  * It is assumed that the caller has checked the tuple with
> >  * HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
> >  * (else we should be removing the tuple, not freezing it).
> >
> > Thus, when we see a committed xmax that precedes the cutoff_xid, we throw
> > the following data corruption error:
> > errmsg_internal("cannot freeze committed xmax %u", xid)
> >
> > However, in the caller (lazy_scan_heap), HeapTupleSatisfiesVacuum may
> > return HEAPTUPLE_DEAD for an updated/deleted tuple that got modified by a
> > transaction older than OldestXmin. And, if the tuple is HOT-updated, it
> > should only be removed by a hot-chain prune operation. So, we treat the
> > tuple as RECENTLY_DEAD and don't remove the tuple.
>
> This code is so terrible :(
>
> We really should just merge the HOT pruning and lazy_scan_heap()
> removal/freeze operations. That'd avoid this corner case and *also*
> would significantly reduce the WAL volume of VACUUM. And safe a good bit
> of CPU time.
>
+1

>
> > So, it may lead to an incorrect data corruption error. IIUC, following will
> > be the exact scenario where the error may happen,
> >
> > An updated/deleted tuple whose xamx is in between cutoff_xid and
> > OldestXmin. Since cutoff_xid depends on vacuum_freeze_min_age and
> > autovacuum_freeze_max_age, it'll not be encountered  easily. But, I think
> > it can be reproduced with some xid burner patch.
>
> I don't think this case is possible (*). By definition, there cannot be any
> transactions needing tuples from before OldestXmin. Which means that the
> heap_page_prune() earlier in lazy_scan_heap() would have pruned away a
> DEAD tuple version that is part of a hot chain.
>
> The HEAPTUPLE_DEAD branch you're referring to really can only be hit for
> tuples that are *newer* than OldestXmin but become DEAD (instead of
> RECENTLY_DEAD) because the inserting transaction aborted.
>
>
> (*) with the exception of temp tables due to some recent changes, I am
> currently working on a fix for that.
>
IIUC, there can be a HOT-updated tuple which is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum (Since OldestXmin can be updated by the time
we call HeapTupleSatisfiesVacuum and xmax becomes older than
OldestXmin). So, there can be deleted tuples with xmax older than
OldestXmin. But, you're right that any tuple older than cutoff_xid
(since it was set earlier) will be pruned by heap_page_prune and hence
we shouldn't encounter the error. I'll study the rewrite_heap_tuple
path as well.

> What made you look at this? Did you hit the error?
Nope, I haven't encountered the error. Just trying to understand the code. :-)

-- 
Thanks & Regards,
Kuntal Ghosh



Re: Incorrect assumption in heap_prepare_freeze_tuple

From
Andres Freund
Date:
Hi,

On 2020-10-03 12:58:01 +0530, Kuntal Ghosh wrote:
> IIUC, there can be a HOT-updated tuple which is not initially pruned
> by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
> HeapTupleSatisfiesVacuum (Since OldestXmin can be updated by the time
> we call HeapTupleSatisfiesVacuum and xmax becomes older than
> OldestXmin).

Hm? OldestXmin is constant over the course of vacuum, no?

Greetings,

Andres Freund



Re: Incorrect assumption in heap_prepare_freeze_tuple

From
Kuntal Ghosh
Date:
On Sat, Oct 3, 2020 at 1:06 PM Andres Freund <andres@anarazel.de> wrote:
> On 2020-10-03 12:58:01 +0530, Kuntal Ghosh wrote:
> > IIUC, there can be a HOT-updated tuple which is not initially pruned
> > by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
> > HeapTupleSatisfiesVacuum (Since OldestXmin can be updated by the time
> > we call HeapTupleSatisfiesVacuum and xmax becomes older than
> > OldestXmin).
>
> Hm? OldestXmin is constant over the course of vacuum, no?
>
Yeah, it's constant. And, my understanding was completely wrong.

Actually, I misunderstood the following commit message:

commit dd69597988859c51131e0cbff3e30432db4259e1
Date:   Thu May 2 10:07:13 2019 -0400

Fix some problems with VACUUM (INDEX_CLEANUP FALSE)

...
Change the logic for the case where a tuple is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum.
.....
I thought the reason is OldestXmin will be updated in between, but
that's just a stupid assumption I've made. :-(

I still need to understand the scenario that the above commit is
referring to. If those kind of tuples can't have committed xmax older
than OldestXmin/cutoff_xid, then we're good. Otherwise, these kind of
tuples can create problems if we're performing vacuum with index
cleanup disabled. Because, if index cleanup is disabled, we set
tupgone as false although the status of the tuple is HEAPTUPLE_DEAD
and try to freeze those tuple later.

You've also mentioned that HEAPTUPLE_DEAD case I'm referring to can
only be hit for for tuples that are *newer* than OldestXmin but become
DEAD (instead of RECENTLY_DEAD) because the inserting transaction
aborted. But, I don't see that's the only case when
HeapTupleSatisfiesVacuum returns HEAPTUPLE_DEAD. If
HeapTupleSatisfiesVacuumHorizon returns HEAPTUPLE_RECENTLY_DEAD and if
tuple xmax(dead_after) precedes OlestXmin, we set it as
HEAPTUPLE_DEAD.

    res = HeapTupleSatisfiesVacuumHorizon(htup, buffer, &dead_after);

    if (res == HEAPTUPLE_RECENTLY_DEAD)
    {
        Assert(TransactionIdIsValid(dead_after));

        if (TransactionIdPrecedes(dead_after, OldestXmin))
            res = HEAPTUPLE_DEAD;
    }
    else
        Assert(!TransactionIdIsValid(dead_after));

Am I missing something here?


--
Thanks & Regards,
Kuntal Ghosh



Re: Incorrect assumption in heap_prepare_freeze_tuple

From
Andres Freund
Date:
Hi,

On 2020-10-03 19:57:03 +0530, Kuntal Ghosh wrote:
> You've also mentioned that HEAPTUPLE_DEAD case I'm referring to can
> only be hit for for tuples that are *newer* than OldestXmin but become
> DEAD (instead of RECENTLY_DEAD) because the inserting transaction
> aborted. But, I don't see that's the only case when
> HeapTupleSatisfiesVacuum returns HEAPTUPLE_DEAD. If
> HeapTupleSatisfiesVacuumHorizon returns HEAPTUPLE_RECENTLY_DEAD and if
> tuple xmax(dead_after) precedes OlestXmin, we set it as
> HEAPTUPLE_DEAD.
> 
>     res = HeapTupleSatisfiesVacuumHorizon(htup, buffer, &dead_after);
> 
>     if (res == HEAPTUPLE_RECENTLY_DEAD)
>     {
>         Assert(TransactionIdIsValid(dead_after));
> 
>         if (TransactionIdPrecedes(dead_after, OldestXmin))
>             res = HEAPTUPLE_DEAD;
>     }
>     else
>         Assert(!TransactionIdIsValid(dead_after));
> 
> Am I missing something here?

To get to this point heap_page_prune() has to have been called for the
page. That removes all tuple [versions] that are DEAD. But not
RECENTLY_DEAD. But RECENTLY_DEAD can only happen for tuples that are
newere than OldestXmin. Thus the only tuples that the HTSV() we're
talking about can return DEAD for are ones that were RECENTLY_DEAD
in heap_page_prune().

Greetings,

Andres Freund



Re: Incorrect assumption in heap_prepare_freeze_tuple

From
Kuntal Ghosh
Date:
On Sun, Oct 4, 2020 at 12:33 AM Andres Freund <andres@anarazel.de> wrote:
>
> To get to this point heap_page_prune() has to have been called for the
> page. That removes all tuple [versions] that are DEAD. But not
> RECENTLY_DEAD. But RECENTLY_DEAD can only happen for tuples that are
> newere than OldestXmin. Thus the only tuples that the HTSV() we're
> talking about can return DEAD for are ones that were RECENTLY_DEAD
> in heap_page_prune().
>
Got it. Thank you for the explanations. :-)



-- 
Thanks & Regards,
Kuntal Ghosh