Hi,
On 2020-09-21 16:02:29 -0400, Robert Haas wrote:
> On Mon, Sep 21, 2020 at 2:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Right, but what we end up with is that the very same tuple xmin and
> > xmax might result in pruning/deletion, or not, depending on whether
> > it's part of a HOT chain or not. That's at best pretty weird, and
> > at worst it means that corner-case bugs in other places are triggered
> > in only one of the two scenarios ... which is what we have here.
>
> I'm not sure I really understand how that's happening, because surely
> HOT chains and non-HOT chains are pruned by the same code, but it
> doesn't sound good.
Not necessarily, unfortunately:
case HEAPTUPLE_DEAD:
/*
* Ordinarily, DEAD tuples would have been removed by
* heap_page_prune(), but it's possible that the tuple
* state changed since heap_page_prune() looked. In
* particular an INSERT_IN_PROGRESS tuple could have
* changed to DEAD if the inserter aborted. So this
* cannot be considered an error condition.
*
* If the tuple is HOT-updated then it must only be
* removed by a prune operation; so we keep it just as if
* it were RECENTLY_DEAD. Also, if it's a heap-only
* tuple, we choose to keep it, because it'll be a lot
* cheaper to get rid of it in the next pruning pass than
* to treat it like an indexed tuple. Finally, if index
* cleanup is disabled, the second heap pass will not
* execute, and the tuple will not get removed, so we must
* treat it like any other dead tuple that we choose to
* keep.
*
* If this were to happen for a tuple that actually needed
* to be deleted, we'd be in trouble, because it'd
* possibly leave a tuple below the relation's xmin
* horizon alive. heap_prepare_freeze_tuple() is prepared
* to detect that case and abort the transaction,
* preventing corruption.
*/
if (HeapTupleIsHotUpdated(&tuple) ||
HeapTupleIsHeapOnly(&tuple) ||
params->index_cleanup == VACOPT_TERNARY_DISABLED)
nkeep += 1;
else
tupgone = true; /* we can delete the tuple */
all_visible = false;
So if e.g. a transaction aborts between the heap_page_prune and this
check the pruning behaviour depends on whether the tuple is part of a
HOT chain or not.
Greetings,
Andres Freund