Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15) - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)
Date
Msg-id CAJ7c6TP6i7wsYoH-_A5rnKDJ9CZWZzDQ7DgkF4X2Fj7T-J5mig@mail.gmail.com
Whole thread Raw
In response to Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)  (Jacob Champion <jchampion@timescale.com>)
Responses Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)  (Aleksander Alekseev <aleksander@timescale.com>)
List pgsql-hackers
Hi hackers,

> Yeah, pg_upgrade will briefly start and stop the old server to make
> sure all the WAL is replayed, and won't transfer any of the files
> over. AFAIK, major-version WAL changes are fine; it was the previous
> claim that we could do it in a minor version that I was unsure about.

OK, here is the patchset v53 where I mostly modified the commit
messages. It is explicitly said that 0001 modifies the WAL records and
why we decided to do it in this patch. Additionally any mention of
64-bit XIDs is removed since it is not guaranteed that the rest of the
patches are going to be accepted. 64-bit SLRU page numbering is a
valuable change per se.

Changing the status of the CF entry to RfC apparently was a bit
premature. It looks like the patchset can use a few more rounds of
review.

In 0002:

```
-#define TransactionIdToCTsPage(xid) \
-    ((xid) / (TransactionId) COMMIT_TS_XACTS_PER_PAGE)
+static inline int64
+TransactionIdToCTsPageInternal(TransactionId xid, bool lock)
+{
+    FullTransactionId    fxid,
+                        nextXid;
+    uint32                epoch;
+
+    if (lock)
+        LWLockAcquire(XidGenLock, LW_SHARED);
+
+    /* make a local copy */
+    nextXid = ShmemVariableCache->nextXid;
+
+    if (lock)
+        LWLockRelease(XidGenLock);
+
+    epoch = EpochFromFullTransactionId(nextXid);
+    if (xid > XidFromFullTransactionId(nextXid))
+        --epoch;
+
+    fxid = FullTransactionIdFromEpochAndXid(epoch, xid);
+
+    return fxid.value / (uint64) COMMIT_TS_XACTS_PER_PAGE;
+}
```

I'm pretty confident that shared memory can't be accessed like this,
without taking a lock. Although it may work on x64 generally we can
get garbage, unless nextXid is accessed atomically and has a
corresponding atomic type. On top of that I'm pretty sure
TransactionIds can't be compared with the regular comparison
operators. All in all, so far I don't understand why this piece of
code should be so complicated.

The same applies to:

```
-#define TransactionIdToPage(xid) ((xid) / (TransactionId)
SUBTRANS_XACTS_PER_PAGE)
+static inline int64
+TransactionIdToPageInternal(TransactionId xid, bool lock)
```

... in subtrans.c

Maxim, perhaps you could share with us what your reasoning was here?

-- 
Best regards,
Aleksander Alekseev

Attachment

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: SQL/JSON revisited
Next
From: Andrew Dunstan
Date:
Subject: Re: Extracting cross-version-upgrade knowledge from buildfarm client