Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)? - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?
Date
Msg-id CAPpHfduJO6p=7wRSMncwoqEMrP9QDmqCfv3UxaLFzQ7PUaC53Q@mail.gmail.com
In response to Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)?
List pgsql-hackers
On Wed, Jun 7, 2017 at 10:47 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 06/06/2017 07:24 AM, Ashutosh Bapat wrote:
On Tue, Jun 6, 2017 at 9:48 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 6 June 2017 at 12:13, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

What happens when the epoch is so low that the rest of the XID does
not fit in 32bits of tuple header? Or such a case should never arise?

Storing an epoch implies that rows can't have (xmin,xmax) different by
more than one epoch. So if you're updating/deleting an extremely old
tuple you'll presumably have to set xmin to FrozenTransactionId if it
isn't already, so you can set a new epoch and xmax.

If the page has multiple such tuples, updating one tuple will mean
updating headers of other tuples as well? This means that those tuples
need to be locked against concurrent scans? Maybe not, since such tuples
will be visible to any concurrent scans anyway, and updating xmin/xmax
doesn't change the visibility. But we might have to prevent multiple
updates to the xmin/xmax because of concurrent updates on the same
page.

"Store the epoch in the page header" is actually a slightly simpler-to-visualize, but incorrect, version of what we actually need to do. If you only store the epoch, then all the XIDs on a page need to belong to the same epoch, which causes trouble when the current epoch changes. Just after the epoch changes, you cannot necessarily freeze all the tuples from the previous epoch, because they would not yet be visible to everyone.

The full picture is that we need to store one 64-bit XID "base" value in the page header, and all the xmin/xmax values in the tuple headers are offsets relative to that base. With that, you effectively have 64-bit XIDs, as long as the *difference* between any two XIDs on a page is not greater than 2^32. That can be guaranteed, as long as we don't allow a transaction to be in-progress for more than 2^32 XIDs. That seems like a reasonable limitation.
 
Right.  I used the term "64-bit epoch" during the developer unconference, but that was ambiguous.  It would be more correct to call it a "64-bit base".
BTW, we will have to store two 64-bit bases, one for XIDs and one for multixacts, because they are completely independent counters.
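
To make the base-plus-offset idea concrete, here is a minimal sketch in C. All type and function names (SketchPageHeader, SketchTupleHeader, sketch_full_xid, sketch_xid_fits) are hypothetical and are not PostgreSQL's actual structures; the sketch also ignores special XIDs such as FrozenTransactionId and assumes a plain unsigned 32-bit offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page header: one 64-bit base per independent counter. */
typedef struct SketchPageHeader {
    uint64_t xid_base;    /* base for every xmin/xmax offset on the page */
    uint64_t multi_base;  /* separate base for multixact offsets */
} SketchPageHeader;

/* Hypothetical tuple header: 32-bit offsets relative to xid_base. */
typedef struct SketchTupleHeader {
    uint32_t xmin_off;
    uint32_t xmax_off;
} SketchTupleHeader;

/* Reconstruct the full 64-bit XID from a stored 32-bit offset. */
static uint64_t sketch_full_xid(const SketchPageHeader *page, uint32_t off)
{
    return page->xid_base + (uint64_t) off;
}

/*
 * Can this 64-bit XID be stored on the page without changing the base?
 * It can if it is not older than the base and the distance from the
 * base fits in an unsigned 32-bit offset.
 */
static bool sketch_xid_fits(const SketchPageHeader *page, uint64_t xid)
{
    return xid >= page->xid_base && xid - page->xid_base <= UINT32_MAX;
}
```

When sketch_xid_fits() returns false for the current XID, the page is in the situation described next: the base has to be moved forward before the tuple can be updated.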

But yes, when the "current XID - base XID in page header" becomes greater than 2^32, and you need to update a tuple on that page, you need to first freeze the page, update the base XID on the page header to a more recent value, and update the XID offsets on every tuple on the page accordingly. And to do that, you need to hold a lock on the page. If you don't move any tuples around at the same time, but just update the XID fields, an exclusive lock on the page is enough, i.e. you don't need to take a super-exclusive or vacuum lock. In any case, it happens so infrequently that it should not become a serious burden.

Yes, an exclusive lock seems to be enough for a single-page freeze.
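
The freeze-and-rebase step could be sketched roughly as follows. This is an illustration only: SketchTuple, SketchPage and sketch_rebase_page are hypothetical names, only xmin is modelled (xmax would need its own handling), locking is left to the caller, and the sketch assumes the caller picks a new base that is no newer than the oldest XID any snapshot could still see as in progress, so everything older than it is safe to freeze.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuple: xmin as an offset from the page base, plus a frozen flag. */
typedef struct SketchTuple {
    uint32_t xmin_off;
    bool     xmin_frozen;
} SketchTuple;

/* Hypothetical page: a 64-bit XID base and a small fixed tuple array. */
typedef struct SketchPage {
    uint64_t    xid_base;
    size_t      ntuples;
    SketchTuple tuples[8];
} SketchPage;

/*
 * Move the page base forward to new_base.  The caller is assumed to hold
 * an exclusive lock on the page.  Tuples whose xmin is older than new_base
 * are marked frozen; all other offsets are shifted so that
 * xid_base + xmin_off still yields the same 64-bit XID.
 * Returns false and leaves the page untouched if new_base would move the
 * base backwards.
 */
static bool sketch_rebase_page(SketchPage *page, uint64_t new_base)
{
    if (new_base < page->xid_base)
        return false;

    uint64_t shift = new_base - page->xid_base;

    for (size_t i = 0; i < page->ntuples; i++) {
        SketchTuple *t = &page->tuples[i];

        if (t->xmin_frozen)
            continue;                   /* already frozen, offset unused */

        if ((uint64_t) t->xmin_off < shift) {
            t->xmin_frozen = true;      /* older than the new base */
            t->xmin_off = 0;
        } else {
            t->xmin_off = (uint32_t) ((uint64_t) t->xmin_off - shift);
        }
    }

    page->xid_base = new_base;
    return true;
}
```

Since only the XID fields change and no tuple moves, the loop touches nothing that a concurrent reader holding a pin would rely on for tuple placement, which is the reason an exclusive lock is argued to be sufficient above.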

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
