[HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly - Mailing list pgsql-hackers

From Tom Lane
Subject [HACKERS] StandbyRecoverPreparedTransactions recovers subtrans links incorrectly
Date
Msg-id 20110.1492905318@sss.pgh.pa.us
Responses Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly  (Andres Freund <andres@anarazel.de>)
Re: [HACKERS] StandbyRecoverPreparedTransactions recovers subtranslinks incorrectly  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Now that we've got consistent failure reports about the 009_twophase.pl
recovery test, I set out to find out why it's failing.  It looks to me
like the reason is that this (twophase.c:2145):
           SubTransSetParent(xid, subxid, overwriteOK);

ought to be this:
           SubTransSetParent(subxid, xid, overwriteOK);

because the definition of SubTransSetParent is

void
SubTransSetParent(TransactionId xid, TransactionId parent, bool overwriteOK)

not the other way 'round: the child XID comes first and its parent second.
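To make the effect of the swapped arguments concrete, here is a minimal toy sketch. It is not PostgreSQL's pg_subtrans code: the array-backed table and the names toy_parent, toy_reset, toy_SubTransSetParent, toy_SubTransGetParent and TOY_MAX_XID are all hypothetical, and the overwriteOK argument is left out. Only the parameter order (child first, parent second) is taken from the prototype quoted above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)
#define TOY_MAX_XID 16			/* hypothetical bound for this toy table */

/* Toy stand-in for the subtrans map: toy_parent[xid] is the parent of xid. */
static TransactionId toy_parent[TOY_MAX_XID];

/* Clear every link, so each experiment starts from an empty map. */
static void
toy_reset(void)
{
	for (int i = 0; i < TOY_MAX_XID; i++)
		toy_parent[i] = InvalidTransactionId;
}

/* Same parameter order as the quoted prototype: child first, parent second. */
static void
toy_SubTransSetParent(TransactionId xid, TransactionId parent)
{
	assert(xid < TOY_MAX_XID && parent < TOY_MAX_XID);
	toy_parent[xid] = parent;
}

/* Look up the recorded parent of xid, or InvalidTransactionId if none. */
static TransactionId
toy_SubTransGetParent(TransactionId xid)
{
	assert(xid < TOY_MAX_XID);
	return toy_parent[xid];
}
```

With a top-level xid of 5 and a subxid of 7, calling toy_SubTransSetParent(xid, subxid) records 7 as the parent of 5 and leaves 7 with no parent at all, whereas toy_SubTransSetParent(subxid, xid) records 5 as the parent of 7, which is the link a later parent lookup on the subtransaction needs.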

While "git blame" blames this line on the recent commit 728bd991c,
that just moved the call from somewhere else.  AFAICS this has actually
been wrong since StandbyRecoverPreparedTransactions was written,
in 361bd1662 of 2010-04-13.

It's not clear to me how much potential this has to create user data
corruption, but it doesn't look good at first glance.  Discuss.

Also, when I fix that, it gets further but still crashes at the same
Assert in SubTransSetParent.  The proximate cause this time seems to be
that RecoverPreparedTransactions's calculation of overwriteOK is wrong:
it's computing that as "false", but in reality the subtrans link in
question has already been set.
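The second failure can be sketched the same way. The exact condition inside the real Assert is not quoted in this message, so the check below is an assumption about its general shape: writing a link is allowed when the slot is still empty or when the caller passed overwriteOK. The names toy_set_parent_allowed and toy_set_parent are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)

/*
 * Assumed shape of the check (not copied from subtrans.c): a slot may be
 * written if it is still empty, or if the caller said overwriting is OK.
 */
static bool
toy_set_parent_allowed(TransactionId current, bool overwriteOK)
{
	return current == InvalidTransactionId || overwriteOK;
}

/* Write the link; aborts in an assert-enabled build if the check fails. */
static void
toy_set_parent(TransactionId *slot, TransactionId parent, bool overwriteOK)
{
	assert(toy_set_parent_allowed(*slot, overwriteOK));
	*slot = parent;
}
```

Under that assumption, a caller that computes overwriteOK as false for a slot that already holds a parent trips the assertion, which matches the symptom described above: the link has already been set, but the caller claims it has not.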
        regards, tom lane


