Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers

From Himanshu Upadhyaya
Subject Re: HOT chain validation in verify_heapam()
Date
Msg-id CAPF61jBjb+VafRhsmRg2gN1mX89+iPyPrfa6ZdUVtoX7T35TOg@mail.gmail.com
Whole thread Raw
In response to Re: HOT chain validation in verify_heapam()  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: HOT chain validation in verify_heapam()
List pgsql-hackers


On Wed, Sep 7, 2022 at 12:11 AM Robert Haas <robertmhaas@gmail.com> wrote:

I think the check should be written like this:

!TransactionIdEquals(pred_xmin, curr_xmin) && !TransctionIdDidCommit(pred_xmin)

The TransactionIdEquals check should be done first for the reason you
state: it's cheaper.

I think that we shouldn't be using TransactionIdDidAbort() at all,
because it can sometimes return false even when the transaction
actually did abort. See test_lockmode_for_conflict() and
TransactionIdIsInProgress() for examples of logic that copes with
this. What's happening here is that TransactionIdDidAbort doesn't ever
get called if the system crashes while a transaction is running. So we
can use TransactionIdDidAbort() only as an optimization: if it returns
true, the transaction is definitely aborted, but if it returns false,
we have to check whether it's still running. If not, it aborted
anyway.

TransactionIdDidCommit() does not have the same issue. A transaction
can abort without updating CLOG if the system crashes, but it can
never commit without updating CLOG. If the transaction didn't commit,
then it is either aborted or still in progress (and we don't care
which, because neither is an error here).

As to whether the existing formulation of the test has an error
condition, you're generally right that we should test
TransactionIdIsInProgress() before TransactionIdDidCommit/Abort,
because we during commit or abort, we first set the status in CLOG -
which is queried by TransactionIdDidCommit/Abort - and only afterward
update the procarray - which is queried by TransactionIdIsInProgress.
So normally TransactionIdIsInProgress should be checked first, and
TransactionIdDidCommit/Abort should only be checked if it returns
false, at which point we know that the return values of the latter
calls can't ever change. Possibly there is an argument for including
the TransactionIdInProgress check here too:

!TransactionIdEquals(pred_xmin, curr_xmin) &&
(TransactionIdIsInProgress(pred_xmin) ||
!TransctionIdDidCommit(pred_xmin))

...but I don't think it could change the answer. Most places that
check TransactionIdIsInProgress() first are concerned with MVCC
semantics, and here we are not. I think the only effects of including
or excluding the TransactionIdIsInProgress() test are (1) performance,
in that searching the procarray might avoid expense if it's cheaper
than searching clog, or add expense if the reverse is true and (2)
slightly changing the time at which we're first able to detect this
form of corruption. I am inclined to prefer the simpler form of the
test without TransactionIdIsInProgress() unless someone can say why we
shouldn't go that route.

Done, updated in the v3 patch.


--
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Himanshu Upadhyaya
Date:
Subject: Re: HOT chain validation in verify_heapam()
Next
From: Aleksander Alekseev
Date:
Subject: Re: Add common function ReplicationOriginName.