Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers

From Robert Haas
Subject Re: HOT chain validation in verify_heapam()
Date
Msg-id CA+TgmoZJqis=x7S6g5xoUEC0CTcXFKa6opvmjpC5e9Mb4jFa1g@mail.gmail.com
Whole thread Raw
In response to Re: HOT chain validation in verify_heapam()  (Andres Freund <andres@anarazel.de>)
Responses Re: HOT chain validation in verify_heapam()
List pgsql-hackers
On Tue, Nov 15, 2022 at 2:50 PM Andres Freund <andres@anarazel.de> wrote:
> On 2022-11-15 11:36:21 -0500, Robert Haas wrote:
> > On Mon, Nov 14, 2022 at 5:02 PM Andres Freund <andres@anarazel.de> wrote:
> > > It seems like we should do a bit more validation within a chain of
> > > tuples. E.g. that no live tuple can follow an !DidCommit xmin?
> >
> > I think this check is already present in stronger form. If we see a
> > !DidCommit xmin, the xmin of the next tuple in the chain not only can't be
> > committed, but had better be the same.
>
> As I think I mentioned before, I don't think the "better be the same" aspect
> is correct, think subxacts. E.g.
>
> off 0: xmin: top, xmax: child_1
> off 1: xmin: child_1, xmax: invalid
>
> If top hasn't committed yet, the current logic afaict will warn about this
> situation, no? And I don't think we can generally the subxid parent at this
> point, unfortunately (might have truncated subtrans).

Woops, you're right.

> Different aspect: Is it ok that we use TransactionIdDidCommit() without a
> preceding IsInProgress() check?

Well, the code doesn't match the comments here, sadly. The comments
claim that we want to check that if the prior tuple's xmin was aborted
or in progress the current one is in the same state. If that's
actually what the code checked, we'd definitely need to check both
TransactionIdInProgress and TransactionIdCommit and be wary of the
possibility of the value changing concurrently. But the code doesn't
actually check the status of more than one XID, nor does it care about
the distinction between aborted and in progress, so I don't think that
the current code is buggy in that particular way, just in a bunch of
other ways.

> I do think there's some potential for additional checks that don't run into
> the above issue, e.g. checking that no in-progress xids follow an explicitly
> aborted xact, that a committed xid can't follow an uncommitted xid etc.

Yeah, maybe so.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Schema variables - new implementation for Postgres 15
Next
From: Andres Freund
Date:
Subject: Re: Standardizing how pg_waldump presents recovery conflict XID cutoffs