Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers

From Robert Haas
Subject Re: HOT chain validation in verify_heapam()
Date
Msg-id CA+Tgmoas7DOU9umrrvhZO1VMUc74eQsyKkeVEKeZZD4ODBLH8w@mail.gmail.com
Whole thread Raw
In response to Re: HOT chain validation in verify_heapam()  (Aleksander Alekseev <aleksander@timescale.com>)
Responses Re: HOT chain validation in verify_heapam()
List pgsql-hackers
On Tue, Sep 6, 2022 at 9:38 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
> > Please correct me if I'm wrong, but don't we have a race condition here:
> >
> > ```
> > +            if ((TransactionIdDidAbort(pred_xmin) ||
> > TransactionIdIsInProgress(pred_xmin))
> > +                && !TransactionIdEquals(pred_xmin, curr_xmin))
> >              {
> > ```
>
> It looks like I had a slight brain fade here.
>
> In order to report a false error either TransactionIdDidAbort() or
> TransactionIdIsInProgress() should return true and
> TransactionIdEquals() should be false. So actually under rare
> conditions the error will NOT be reported while it should. Other than
> that we seem to be safe from the concurrency perspective, unless I'm
> missing something again.
>
> Personally I don't have a strong opinion on whether we should bother
> about this scenario. Probably an explicit comment will not hurt.
>
> Also I suggest checking TransactionIdEquals() first though since it's cheaper.

I think the check should be written like this:

!TransactionIdEquals(pred_xmin, curr_xmin) && !TransctionIdDidCommit(pred_xmin)

The TransactionIdEquals check should be done first for the reason you
state: it's cheaper.

I think that we shouldn't be using TransactionIdDidAbort() at all,
because it can sometimes return false even when the transaction
actually did abort. See test_lockmode_for_conflict() and
TransactionIdIsInProgress() for examples of logic that copes with
this. What's happening here is that TransactionIdDidAbort doesn't ever
get called if the system crashes while a transaction is running. So we
can use TransactionIdDidAbort() only as an optimization: if it returns
true, the transaction is definitely aborted, but if it returns false,
we have to check whether it's still running. If not, it aborted
anyway.

TransactionIdDidCommit() does not have the same issue. A transaction
can abort without updating CLOG if the system crashes, but it can
never commit without updating CLOG. If the transaction didn't commit,
then it is either aborted or still in progress (and we don't care
which, because neither is an error here).

As to whether the existing formulation of the test has an error
condition, you're generally right that we should test
TransactionIdIsInProgress() before TransactionIdDidCommit/Abort,
because we during commit or abort, we first set the status in CLOG -
which is queried by TransactionIdDidCommit/Abort - and only afterward
update the procarray - which is queried by TransactionIdIsInProgress.
So normally TransactionIdIsInProgress should be checked first, and
TransactionIdDidCommit/Abort should only be checked if it returns
false, at which point we know that the return values of the latter
calls can't ever change. Possibly there is an argument for including
the TransactionIdInProgress check here too:

!TransactionIdEquals(pred_xmin, curr_xmin) &&
(TransactionIdIsInProgress(pred_xmin) ||
!TransctionIdDidCommit(pred_xmin))

...but I don't think it could change the answer. Most places that
check TransactionIdIsInProgress() first are concerned with MVCC
semantics, and here we are not. I think the only effects of including
or excluding the TransactionIdIsInProgress() test are (1) performance,
in that searching the procarray might avoid expense if it's cheaper
than searching clog, or add expense if the reverse is true and (2)
slightly changing the time at which we're first able to detect this
form of corruption. I am inclined to prefer the simpler form of the
test without TransactionIdIsInProgress() unless someone can say why we
shouldn't go that route.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: predefined role(s) for VACUUM and ANALYZE
Next
From: Jeff Davis
Date:
Subject: Re: allowing for control over SET ROLE