Thread: Do XID sequences need to be contiguous?

Do XID sequences need to be contiguous?

From
Mark Dilger
Date:
Hackers,

While working on the problem of XID wraparound within the LISTEN/NOTIFY
system, I tried to increment XIDs by more than one per transaction. 
This leads to a number of test failures, many which look like:

+ERROR:  could not access status of transaction 7485
+DETAIL:  Could not read from file "pg_subtrans/0000" at offset 24576: 
read too few bytes.

I might not have read the right documentation, but....

I do not see anything in src/backend/access/transam/README nor elsewhere
documenting a design decision or assumption that transaction IDs must
be assigned contiguously.  I suppose this is such a fundamental
assumption that it is completely implicit and nobody thought to document
it, but I'd like to check for two reasons:

First, I'd like a good method of burning through transaction ids in
tests designed to check for problems in XID wrap-around.

Second, I'd like to add Asserts where appropriate regarding this
assumption.  It seems strange to me that I should have gotten as far
as a failing read() without having tripped an Assert somewhere along the
way.

To duplicate the errors I hit, you can either apply this simple change:


diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 33fd052156..360b7335bb 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -83,7 +83,7 @@ FullTransactionIdFromEpochAndXid(uint32 epoch, 
TransactionId xid)
  static inline void
  FullTransactionIdAdvance(FullTransactionId *dest)
  {
-       dest->value++;
+       dest->value += 2;
         while (XidFromFullTransactionId(*dest) < FirstNormalTransactionId)
                 dest->value++;
  }


or apply the much larger WIP patch, attached, and then be sure to
provide the --enable-xidcheck flag to configure before building.

-- 
Mark Dilger

Attachment

Re: Do XID sequences need to be contiguous?

From
Tom Lane
Date:
Mark Dilger <hornschnorter@gmail.com> writes:
> While working on the problem of XID wraparound within the LISTEN/NOTIFY
> system, I tried to increment XIDs by more than one per transaction. 
> This leads to a number of test failures, many which look like:

IIRC, the XID-creation logic is designed to initialize the next clog
page whenever it allocates an exact-multiple-of-BLCKSZ*4 transaction
number.  Skipping over such numbers would create trouble.

> First, I'd like a good method of burning through transaction ids in
> tests designed to check for problems in XID wrap-around.

Don't "burn through them".  Stop the cluster and use pg_resetwal to
set the XID counter wherever you want it.  (You might need to set it
just before a page or segment boundary; I'm not sure if pg_resetwal
has any logic of its own to initialize a new CLOG page/file when you
move the counter this way.  Perhaps it's worth improving that.)

> Second, I'd like to add Asserts where appropriate regarding this
> assumption.

I'm not excited about that, and it's *certainly* not a problem that
justifies additional configure infrastructure.

            regards, tom lane



Re: Do XID sequences need to be contiguous?

From
Amit Kapila
Date:
On Fri, Nov 29, 2019 at 12:22 AM Mark Dilger <hornschnorter@gmail.com> wrote:
>
> Hackers,
>
> While working on the problem of XID wraparound within the LISTEN/NOTIFY
> system, I tried to increment XIDs by more than one per transaction.
> This leads to a number of test failures, many which look like:
>
> +ERROR:  could not access status of transaction 7485
> +DETAIL:  Could not read from file "pg_subtrans/0000" at offset 24576:
> read too few bytes.
>
> I might not have read the right documentation, but....
>
> I do not see anything in src/backend/access/transam/README nor elsewhere
> documenting a design decision or assumption that transaction IDs must
> be assigned contiguously.  I suppose this is such a fundamental
> assumption that it is completely implicit and nobody thought to document
> it, but I'd like to check for two reasons:
>
> First, I'd like a good method of burning through transaction ids in
> tests designed to check for problems in XID wrap-around.
>

As Tom pointed out and as mentioned in the comments "If we are
allocating the first XID of a new page of the commit log, zero out
that commit-log page before returning.", we need to take care of
extending the CLOG while advancing TransactionIds.   I have some old
script for burning transactionid's which I am attaching here.  It
might help you.  I think this is provided long back by Jeff Janes.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment